.. note:: :class: sphx-glr-download-link-note Click :ref:`here ` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr_packages_scikit-learn_auto_examples_plot_california_prediction.py: A simple regression analysis on the California housing data =========================================================== Here we perform a simple regression analysis on the California housing data, exploring two types of regressors. .. code-block:: python from sklearn.datasets import fetch_california_housing data = fetch_california_housing(as_frame=True) Print a histogram of the quantity to predict: price .. code-block:: python import matplotlib.pyplot as plt plt.figure(figsize=(4, 3)) plt.hist(data.target) plt.xlabel('price ($1000s)') plt.ylabel('count') plt.tight_layout() .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_001.png :class: sphx-glr-single-img Print the join histogram for each feature .. code-block:: python for index, feature_name in enumerate(data.feature_names): plt.figure(figsize=(4, 3)) plt.scatter(data.data[feature_name], data.target) plt.ylabel('Price', size=15) plt.xlabel(feature_name, size=15) plt.tight_layout() .. rst-class:: sphx-glr-horizontal * .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_002.png :class: sphx-glr-multi-img * .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_003.png :class: sphx-glr-multi-img * .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_004.png :class: sphx-glr-multi-img * .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_005.png :class: sphx-glr-multi-img * .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_006.png :class: sphx-glr-multi-img * .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_007.png :class: sphx-glr-multi-img * .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_008.png :class: sphx-glr-multi-img * .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_009.png :class: sphx-glr-multi-img Simple prediction .. code-block:: python from sklearn.model_selection import train_test_split X_train, X_test, y_train, y_test = train_test_split(data.data, data.target) from sklearn.linear_model import LinearRegression clf = LinearRegression() clf.fit(X_train, y_train) predicted = clf.predict(X_test) expected = y_test plt.figure(figsize=(4, 3)) plt.scatter(expected, predicted) plt.plot([0, 50], [0, 50], '--k') plt.axis('tight') plt.xlabel('True price ($1000s)') plt.ylabel('Predicted price ($1000s)') plt.tight_layout() .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_010.png :class: sphx-glr-single-img Prediction with gradient boosted tree .. code-block:: python from sklearn.ensemble import GradientBoostingRegressor clf = GradientBoostingRegressor() clf.fit(X_train, y_train) predicted = clf.predict(X_test) expected = y_test plt.figure(figsize=(4, 3)) plt.scatter(expected, predicted) plt.plot([0, 50], [0, 50], '--k') plt.axis('tight') plt.xlabel('True price ($1000s)') plt.ylabel('Predicted price ($1000s)') plt.tight_layout() .. image:: /packages/scikit-learn/auto_examples/images/sphx_glr_plot_california_prediction_011.png :class: sphx-glr-single-img Print the error rate .. code-block:: python import numpy as np print("RMS: %r " % np.sqrt(np.mean((predicted - expected) ** 2))) plt.show() .. rst-class:: sphx-glr-script-out Out: .. code-block:: none RMS: 0.5314909993118918 **Total running time of the script:** ( 0 minutes 4.479 seconds) .. _sphx_glr_download_packages_scikit-learn_auto_examples_plot_california_prediction.py: .. only :: html .. container:: sphx-glr-footer :class: sphx-glr-footer-example .. container:: sphx-glr-download :download:`Download Python source code: plot_california_prediction.py ` .. container:: sphx-glr-download :download:`Download Jupyter notebook: plot_california_prediction.ipynb ` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery `_