{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Joint feature selection with multi-task Lasso\n\nThe multi-task lasso allows to fit multiple regression problems\njointly enforcing the selected features to be the same across\ntasks. This example simulates sequential measurements, each task\nis a time instant, and the relevant features vary in amplitude\nover time while being the same. The multi-task lasso imposes that\nfeatures that are selected at one time point are select for all time\npoint. This makes feature selection by the Lasso more stable.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Generate data\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import numpy as np\n\nrng = np.random.RandomState(42)\n\n# Generate some 2D coefficients with sine waves with random frequency and phase\nn_samples, n_features, n_tasks = 100, 30, 40\nn_relevant_features = 5\ncoef = np.zeros((n_tasks, n_features))\ntimes = np.linspace(0, 2 * np.pi, n_tasks)\nfor k in range(n_relevant_features):\n    coef[:, k] = np.sin((1.0 + rng.randn(1)) * times + 3 * rng.randn(1))\n\nX = rng.randn(n_samples, n_features)\nY = np.dot(X, coef.T) + rng.randn(n_samples, n_tasks)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Fit models\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from sklearn.linear_model import Lasso, MultiTaskLasso\n\ncoef_lasso_ = np.array([Lasso(alpha=0.5).fit(X, y).coef_ for y in Y.T])\ncoef_multi_task_lasso_ = MultiTaskLasso(alpha=1.0).fit(X, Y).coef_"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Plot support and time series\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import matplotlib.pyplot as plt\n\nfig = plt.figure(figsize=(8, 5))\nplt.subplot(1, 2, 1)\nplt.spy(coef_lasso_)\nplt.xlabel(\"Feature\")\nplt.ylabel(\"Time (or Task)\")\nplt.text(10, 5, \"Lasso\")\nplt.subplot(1, 2, 2)\nplt.spy(coef_multi_task_lasso_)\nplt.xlabel(\"Feature\")\nplt.ylabel(\"Time (or Task)\")\nplt.text(10, 5, \"MultiTaskLasso\")\nfig.suptitle(\"Coefficient non-zero location\")\n\nfeature_to_plot = 0\nplt.figure()\nlw = 2\nplt.plot(coef[:, feature_to_plot], color=\"seagreen\", linewidth=lw, label=\"Ground truth\")\nplt.plot(\n    coef_lasso_[:, feature_to_plot], color=\"cornflowerblue\", linewidth=lw, label=\"Lasso\"\n)\nplt.plot(\n    coef_multi_task_lasso_[:, feature_to_plot],\n    color=\"gold\",\n    linewidth=lw,\n    label=\"MultiTaskLasso\",\n)\nplt.legend(loc=\"upper center\")\nplt.axis(\"tight\")\nplt.ylim([-1.1, 1.1])\nplt.show()"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.14"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}