.. rst-class:: sphx-glr-example-title

.. _sphx_glr_packages_scikit-learn_auto_examples_plot_variance_linear_regr.py:

==================================================
Plot variance and regularization in linear models
==================================================

.. code-block:: python

    import numpy as np

    # Smaller figures
    from matplotlib import pyplot as plt
    plt.rcParams['figure.figsize'] = (3, 2)

We consider the situation where we have only two data points:

.. code-block:: python

    X = np.c_[.5, 1].T
    y = [.5, 1]
    X_test = np.c_[0, 2].T

Without noise, linear regression fits the data perfectly:

.. code-block:: python

    from sklearn import linear_model

    regr = linear_model.LinearRegression()
    regr.fit(X, y)
    plt.plot(X, y, 'o')
    plt.plot(X_test, regr.predict(X_test))

.. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_001.png
    :class: sphx-glr-single-img

In real-life situations, we have noise (e.g. measurement noise) in our data:

.. code-block:: python

    np.random.seed(0)
    for _ in range(6):
        # Perturb the inputs, then refit and plot the resulting line
        noisy_X = X + np.random.normal(loc=0, scale=.1, size=X.shape)
        plt.plot(noisy_X, y, 'o')
        regr.fit(noisy_X, y)
        plt.plot(X_test, regr.predict(X_test))

.. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_002.png
    :class: sphx-glr-single-img

As we can see, our linear model captures and amplifies the noise in the data:
the fitted line changes considerably from one noisy sample to the next. In
other words, it displays a lot of variance.

We can use another linear estimator that uses regularization, the
:class:`~sklearn.linear_model.Ridge` estimator. This estimator regularizes the
coefficients by shrinking them toward zero, under the assumption that very high
correlations are often spurious. The ``alpha`` parameter controls the amount of
shrinkage used.

.. code-block:: python

    regr = linear_model.Ridge(alpha=.1)
    np.random.seed(0)
    for _ in range(6):
        # Same noisy inputs as before, now fitted with the ridge estimator
        noisy_X = X + np.random.normal(loc=0, scale=.1, size=X.shape)
        plt.plot(noisy_X, y, 'o')
        regr.fit(noisy_X, y)
        plt.plot(X_test, regr.predict(X_test))

    plt.show()

.. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_variance_linear_regr_003.png
    :class: sphx-glr-single-img
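
The difference in variance between the two estimators can also be measured
numerically rather than read off the plots. The following is a minimal sketch,
not part of the original example: the helper ``slope_spread``, the number of
draws, and the use of a separate seeded ``RandomState`` are all assumptions
made for illustration. It refits each estimator on many noisy copies of the two
data points and reports the standard deviation of the fitted slope.

.. code-block:: python

    import numpy as np
    from sklearn import linear_model

    # Same two data points as in the example above
    X = np.c_[.5, 1].T
    y = [.5, 1]

    # Assumption: a dedicated seeded generator, so the sketch is reproducible
    rng = np.random.RandomState(0)


    def slope_spread(estimator, n_draws=200):
        """Hypothetical helper: std of the fitted slope over noisy refits."""
        slopes = []
        for _ in range(n_draws):
            # Same noise model as the example: Gaussian noise on the inputs
            noisy_X = X + rng.normal(loc=0, scale=.1, size=X.shape)
            estimator.fit(noisy_X, y)
            slopes.append(estimator.coef_[0])
        return np.std(slopes)


    ols_spread = slope_spread(linear_model.LinearRegression())
    ridge_spread = slope_spread(linear_model.Ridge(alpha=.1))
    print("LinearRegression slope std:", ols_spread)
    print("Ridge slope std:           ", ridge_spread)

A smaller spread for ``Ridge`` reflects the lower variance seen in the last
figure. The price is bias: the shrunken slope no longer passes exactly through
the noise-free points.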