Huge thanks to KiwiCo for sponsoring today’s video! Go to https://www.kiwico.com/welchlabs and use code WELCHLABS for 50% off your first monthly crate or for 20% off your first Panda Crate!
New Book Available for Preorder Now! The Welch Labs Illustrated Guide to AI (30:47):
http://www.welchlabs.com/resources/ai-book
Sections
0:00 - Intro
3:43 - AlexNet & Overfitting
5:19 - Overfitting
6:45 - Rethinking Generalization
11:05 - KiwiCo is Awesome
12:28 - The Double Descent Hypothesis
13:57 - Double Descent is Real!
16:01 - Double Descent with Polynomial Curvefitting?!
20:36 - But why?
22:35 - Should I throw out my books?
24:28 - The Bias-Variance Tradeoff
28:30 - My take
30:47 - I’ve written a new book on AI!
Books with U-shaped test set error curves:
Murphy, Kevin P. *Probabilistic Machine Learning: An Introduction*. MIT Press, 2022.
Goodfellow, Ian, et al. *Deep Learning*. Cambridge: MIT Press, 2016.
Russell, Stuart, and Peter Norvig. *Artificial Intelligence: A Modern Approach*. Englewood Cliffs, NJ: Prentice Hall, 1995.
Bishop, Christopher M. *Pattern Recognition and Machine Learning*. New York: Springer, 2006.
Mitchell, Tom M. *Machine Learning*. McGraw Hill, 1997.
Hastie, Trevor, Robert Tibshirani, and Jerome Friedman. *The Elements of Statistical Learning*. Springer, 2009.
James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. *An Introduction to Statistical Learning*. New York: Springer, 2013.
Abu-Mostafa, Yaser S., Malik Magdon-Ismail, and Hsuan-Tien Lin. *Learning from Data*. New York: AMLBook, 2012.
MacKay, David J. C. *Information Theory, Inference and Learning Algorithms*. Cambridge University Press, 2003.
Harvard Team’s code & results:
https://gitlab.com/harvard-machine-learning/double-descent
Great repo showing polynomial double descent:
https://github.com/RylanSchaeffer/Stanford-AI-Alignment-Double-Descent-Tutorial
Technical Notes
- 26:25 For these linear fits, we’re using N=15 instead of N=5 points. This increases the bias and reduces the variance of these fits, making the bias-variance trade-off clearer, but also pushes out the interpolation threshold. Full results are here: https://github.com/stephencwelch/manim_videos/blob/master/_2025/generalization/Final%20Video%20Polynomial%20Examples.ipynb
- 27:38 It’s tricky to show the full bias-variance results here, as the variance explodes at Degree=4. Instead we’ve chosen to show qualitative breakdowns, showing which terms dominate the overall error at each degree. Full results can be seen here: https://github.com/stephencwelch/manim_videos/blob/master/_2025/generalization/Final%20Video%20Polynomial%20Examples.ipynb
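For anyone who wants to poke at the numbers behind these two notes, here is a rough sketch of how the bias and variance terms can be estimated for polynomial fits. It is not the notebook's code. The sine target, the noise level of 0.3, the evenly spaced training inputs, the Legendre basis, and the function names `true_fn` and `bias_variance` are all assumptions made for illustration. Fits use NumPy's least-squares solver, which returns the minimum-norm solution once the polynomial has more coefficients than there are points.

```python
import numpy as np
from numpy.polynomial import legendre


def true_fn(x):
    # Hypothetical target curve (an assumption, not the curve from the video).
    return np.sin(np.pi * x)


def bias_variance(degree, n_points=15, n_trials=200, noise=0.3, seed=0):
    """Estimate squared bias and variance of a degree-`degree` polynomial fit.

    Each trial keeps the same evenly spaced training inputs and redraws
    only the label noise, then refits the polynomial.
    """
    rng = np.random.default_rng(seed)
    x_train = np.linspace(-1.0, 1.0, n_points)
    x_test = np.linspace(-1.0, 1.0, 100)
    # Legendre design matrices, shape (num_samples, degree + 1).
    A_train = legendre.legvander(x_train, degree)
    A_test = legendre.legvander(x_test, degree)

    preds = np.empty((n_trials, x_test.size))
    for t in range(n_trials):
        y = true_fn(x_train) + rng.normal(0.0, noise, n_points)
        # Least-squares fit; minimum-norm solution when underdetermined.
        coef, *_ = np.linalg.lstsq(A_train, y, rcond=None)
        preds[t] = A_test @ coef

    mean_pred = preds.mean(axis=0)
    # Squared bias: gap between the average fit and the true curve.
    bias_sq = float(np.mean((mean_pred - true_fn(x_test)) ** 2))
    # Variance: spread of individual fits around the average fit.
    variance = float(np.mean(preds.var(axis=0)))
    return bias_sq, variance


if __name__ == "__main__":
    for degree in (1, 2, 3, 5, 14):
        b, v = bias_variance(degree)
        print(f"degree={degree:2d}  bias^2={b:.4f}  variance={v:.4f}")
```

Calling `bias_variance(1, n_points=5)` and `bias_variance(1, n_points=15)` gives a quick way to compare the variance of the linear fit at the two sample sizes under these assumptions. With N=15 points, a degree-14 polynomial has exactly as many coefficients as points, which is where this sketch reaches the interpolation threshold.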
Special Thanks to Patrons https://www.patreon.com/welchlabs
Juan Benet, Ross Hanson, Yan Babitski, AJ Englehardt, Alvin Khaled, Eduardo Barraza, Hitoshi Yamauchi, Jaewon Jung, Mrgoodlight, Shinichi Hayashi, Sid Sarasvati, Dominic Beaumont, Shannon Prater, Ubiquity Ventures, Matias Forti, Brian Henry, Tim Palade, Petar Vecutin, Nicolas baumann, Jason Singh, Robert Riley, vornska, Barry Silverman, Jake Ehrlich, Mitch Jacobs, Lauren Steely, Jeff Eastman, Rodolfo Ibarra, Clark Barrus, Rob Napier, Andrew White, Richard B Johnston, abhiteja mandava, Burt Humburg, Kevin Mitchell, Daniel Sanchez, Ferdie Wang, Tripp Hill, Richard Harbaugh Jr, Prasad Raje, Kalle Aaltonen, Midori Switch Hound, Zach Wilson, Chris Seltzer, Ven Popov, Hunter Nelson, Amit Bueno, Scott Olsen, Johan Rimez, Shehryar Saroya, Tyler Christensen, Beckett Madden-Woods, Darrell Thomas, Javier Soto, U007D, Caleb Begly, Rick Rubenstein, Brent Hunsaker, Dan Patterson, Tchsurvives, Alex Adai, Walter Reade, Zyansheep, Walter Reade, Duncan Stannett, Reginald Carey, Jean-Manuel Izaret, dh71633, Adrian Rodriguez, Dimitar Stojanovski, Michael Harder, Peter Maldonado, Emily Pesce, David Johnston, Insang Song, FaeTheWolf, Stephen Taylor, KittenKaboodle, EMatter, PATRICKMCCORMACK, John Beahan, Cameron, Cole Jones, Garrett Thornburg, Jeroen W, Rohit Sharma, GlennB, Emmanuel Cortes, Katie Quinn, Karina C, Cakra WW, Mike Ton, Eric Gometz, MacCallister Higgins, Niko Drossos, David Eraso, Tom Zehle, Steve, Brian Lineburg, rjbl, Michael Loh, Perry Vais, Bengal0, Farhad Manjoo, Sara Chipps, Ellis Driscoll, William Taysom, Will Harmon, CK, Abdullah, Peter Cho, Leo Nikora, Griffin Smith, Ash Katnoria, Alex, Markus Hays Nielsen
Special thanks to: Mikhail Belkin, Preetum Nakkiran, Emily Zhang, Varun Reddy
Code for Welch Labs Videos: https://github.com/stephencwelch/manim_videos
Written by: Stephen Welch
Produced by: Stephen Welch, Sam Baskin, and Pranav Gundu
Premium Beat IDs
EEDYZ3FP44YX8OWT