{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Shrinkage covariance estimation: LedoitWolf vs OAS and max-likelihood\n\nWhen working with covariance estimation, the usual approach is to use\na maximum likelihood estimator, such as the\n:class:`~sklearn.covariance.EmpiricalCovariance`. It is unbiased, i.e. it\nconverges to the true (population) covariance when given many\nobservations. However, it can also be beneficial to regularize it, in\norder to reduce its variance; this, in turn, introduces some bias. This\nexample illustrates the simple regularization used in\n`shrunk_covariance` estimators. In particular, it focuses on how to\nset the amount of regularization, i.e. how to choose the bias-variance\ntrade-off.\n\n.. rubric:: References\n\n.. [1] \"Shrinkage Algorithms for MMSE Covariance Estimation\"\n   Chen et al., IEEE Trans. on Sign. Proc., Volume 58, Issue 10, October 2010.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Generate sample data\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import numpy as np\n\nn_features, n_samples = 40, 20\nnp.random.seed(42)\nbase_X_train = np.random.normal(size=(n_samples, n_features))\nbase_X_test = np.random.normal(size=(n_samples, n_features))\n\n# Color samples\ncoloring_matrix = np.random.normal(size=(n_features, n_features))\nX_train = np.dot(base_X_train, coloring_matrix)\nX_test = np.dot(base_X_test, coloring_matrix)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Compute the likelihood on test data\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from scipy import linalg\n\nfrom sklearn.covariance import ShrunkCovariance, empirical_covariance, log_likelihood\n\n# spanning a range of possible shrinkage coefficient values\nshrinkages = np.logspace(-2, 0, 30)\nnegative_logliks = [\n    -ShrunkCovariance(shrinkage=s).fit(X_train).score(X_test) for s in shrinkages\n]\n\n# under the ground-truth model, which we would not have access to in real\n# settings\nreal_cov = np.dot(coloring_matrix.T, coloring_matrix)\nemp_cov = empirical_covariance(X_train)\nloglik_real = -log_likelihood(emp_cov, linalg.inv(real_cov))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Compare different approaches to setting the regularization parameter\n\nHere we compare 3 approaches:\n\n* Setting the parameter by cross-validating the likelihood on three folds\n  according to a grid of potential shrinkage parameters.\n\n* A close formula proposed by Ledoit and Wolf to compute\n  the asymptotically optimal regularization parameter (minimizing a MSE\n  criterion), yielding the :class:`~sklearn.covariance.LedoitWolf`\n  covariance estimate.\n\n* An improvement of the Ledoit-Wolf shrinkage, the\n  :class:`~sklearn.covariance.OAS`, proposed by Chen et al. [1]_. Its\n  convergence is significantly better under the assumption that the data\n  are Gaussian, in particular for small samples.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from sklearn.covariance import OAS, LedoitWolf\nfrom sklearn.model_selection import GridSearchCV\n\n# GridSearch for an optimal shrinkage coefficient\ntuned_parameters = [{\"shrinkage\": shrinkages}]\ncv = GridSearchCV(ShrunkCovariance(), tuned_parameters)\ncv.fit(X_train)\n\n# Ledoit-Wolf optimal shrinkage coefficient estimate\nlw = LedoitWolf()\nloglik_lw = lw.fit(X_train).score(X_test)\n\n# OAS coefficient estimate\noa = OAS()\nloglik_oa = oa.fit(X_train).score(X_test)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Plot results\n\n\nTo quantify estimation error, we plot the likelihood of unseen data for\ndifferent values of the shrinkage parameter. We also show the choices by\ncross-validation, or with the LedoitWolf and OAS estimates.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import matplotlib.pyplot as plt\n\nfig = plt.figure()\nplt.title(\"Regularized covariance: likelihood and shrinkage coefficient\")\nplt.xlabel(\"Regularization parameter: shrinkage coefficient\")\nplt.ylabel(\"Error: negative log-likelihood on test data\")\n# range shrinkage curve\nplt.loglog(shrinkages, negative_logliks, label=\"Negative log-likelihood\")\n\nplt.plot(plt.xlim(), 2 * [loglik_real], \"--r\", label=\"Real covariance likelihood\")\n\n# adjust view\nlik_max = np.amax(negative_logliks)\nlik_min = np.amin(negative_logliks)\nymin = lik_min - 6.0 * np.log((plt.ylim()[1] - plt.ylim()[0]))\nymax = lik_max + 10.0 * np.log(lik_max - lik_min)\nxmin = shrinkages[0]\nxmax = shrinkages[-1]\n# LW likelihood\nplt.vlines(\n    lw.shrinkage_,\n    ymin,\n    -loglik_lw,\n    color=\"magenta\",\n    linewidth=3,\n    label=\"Ledoit-Wolf estimate\",\n)\n# OAS likelihood\nplt.vlines(\n    oa.shrinkage_, ymin, -loglik_oa, color=\"purple\", linewidth=3, label=\"OAS estimate\"\n)\n# best CV estimator likelihood\nplt.vlines(\n    cv.best_estimator_.shrinkage,\n    ymin,\n    -cv.best_estimator_.score(X_test),\n    color=\"cyan\",\n    linewidth=3,\n    label=\"Cross-validation best estimate\",\n)\n\nplt.ylim(ymin, ymax)\nplt.xlim(xmin, xmax)\nplt.legend()\n\nplt.show()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "<div class=\"alert alert-info\"><h4>Note</h4><p>The maximum likelihood estimate corresponds to no shrinkage,\n   and thus performs poorly. The Ledoit-Wolf estimate performs really well,\n   as it is close to the optimal and is not computationally costly. In this\n   example, the OAS estimate is a bit further away. Interestingly, both\n   approaches outperform cross-validation, which is significantly most\n   computationally costly.</p></div>\n\n"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.14"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}