The bias-variance tradeoff

\n", " \n", "\n", "A model will create predictions, and those predictions will be wrong to some degree when we generalize outside our initial data.\n", "\n", "It turns out we can decompose the expected error of _a model_ like this[^biasform]\n", "\n", "$$\n", "E[\\text{model error risk}] = \\text{model bias}^2+\\text{model variance}+\\text{noise}\n", "$$\n", "\n", "[^biasform]: _(If you want to see the derivation of this, you can go to the [wiki page](https://en.wikipedia.org/wiki/Bias%E2%80%93variance_tradeoff) or [DS100](https://www.textbook.ds100.org/ch/21/bias_modeling.html). The former's notation is a little simpler but the latter is more helpful with intuition)_\n", "\n", "**\"Model bias\"**\n", "- Def: Is errors stemming from the model's assumptions in how it predicts the outcome variable. (It is the opposite of model accuracy.)\n", "- Adding more variables or polynomial transformations of existing variables **will usually reduce bias** \n", "- Adding more data to the training dataset can (but might not) reduce bias\n", "\n", "**\"Model variance\"**\n", "- Def: Is extent to which estimated model varies from sample to sample\n", "- Adding more variables or polynomial transformations of existing variables **will usually increase model variance** \n", "- Adding more data to the training dataset will reduce variance\n", "\n", "**Noise**\n", "- Def: Randomness in the data generating process beyond our understanding\n", "- To reduce the noise term, you need more data, better data collection, and more accurate measurements\n", "\n", "```{admonition} Key takeaway #2\n", ":class: tip\n", "**We tune the complexity of our models to change our models' bias and variance, and there is an optimal amount of complexity.**\n", "```\n", "\n", "> **THE FUNDAMENTAL TRADEOFF: Increasing model complexity increases its variance but reduces its bias** \n", "> - Models that are too simple have high bias but low variance \n", "> - Models that are too complex have the opposite problem\n", "> - Collecting a TON of data can allow you to use complex models with less variance \n", ">\n", "> This is the essence of the bias-variance tradeoff, a fundamental issue that we face in choosing models for prediction.\"[^tradeoff]\n", " \n", "[^tradeoff]: _(This is adapted from [DS100](https://www.textbook.ds100.org/ch/15/bias_modeling.html))_\n", "\n", "I like these two graphs to see all of this discussion visually:\n", "\n", "````{tabbed} The classic bias-variance tradeoff\n", "- Models that are too simple are said to be **\"underfit\"** (take steps to reduce bias)\n", "- Models that are too complicated are said to be **\"overfit\"** (take steps to reduce variance)\n", "\n", "```{image} img/bias_modeling_bias_var_plot.png\n", ":alt: bias_var\n", ":width: 500px\n", "```\n", "````\n", "\n", "````{tabbed} Optimizing model performance via complexity \n", "- Models that are too simple perform poorly (low scores, high bias)\n", "- Models that are too complex perform well in training but poorly outside that sample (high variance)\n", "\n", "```{image} img/validation-curve.png\n", ":alt: validation\n", ":width: 500px\n", "```\n", "\n", "[Source](https://jakevdp.github.io/PythonDataScienceHandbook/figures/05.03-validation-curve.png)\n", "\n", "````" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Video: The bias-variance tradeoff

\n", "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Minimizing model risk

\n", "\n", "Our tools to minimize model risk are\n", "1. More data\n", "2. [Feature engineering (adding, cleaning, and selecting features; dimensionality reduction).](https://jakevdp.github.io/PythonDataScienceHandbook/05.04-feature-engineering.html) \n", "3. [Model selection](https://jakevdp.github.io/PythonDataScienceHandbook/05.03-hyperparameters-and-model-validation.html)\n", "4. Model evaluation [via cross validation (CV)](03c_ModelEval) or [out-of-sample (OOS) forecasting](03c1_OOS)\n", "\n", "I added some good external resources in the links above on feature engineering and model selection. The next pages here will dig into model evaluation because it gets at the flow of testing a model." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.12" } }, "nbformat": 4, "nbformat_minor": 4 }