{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "\n# Displaying Pipelines\n\nThe default configuration for displaying a pipeline in a Jupyter Notebook is\n`'diagram'` where `set_config(display='diagram')`. To deactivate HTML representation,\nuse `set_config(display='text')`.\n\nTo see more detailed steps in the visualization of the pipeline, click on the\nsteps in the pipeline.\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "# Authors: The scikit-learn developers\n# SPDX-License-Identifier: BSD-3-Clause"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Displaying a Pipeline with a Preprocessing Step and Classifier\nThis section constructs a :class:`~sklearn.pipeline.Pipeline` with a preprocessing\nstep, :class:`~sklearn.preprocessing.StandardScaler`, and classifier,\n:class:`~sklearn.linear_model.LogisticRegression`, and displays its visual\nrepresentation.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from sklearn import set_config\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.preprocessing import StandardScaler\n\nsteps = [\n    (\"preprocessing\", StandardScaler()),\n    (\"classifier\", LogisticRegression()),\n]\npipe = Pipeline(steps)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To visualize the diagram, the default is `display='diagram'`.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "set_config(display=\"diagram\")\npipe  # click on the diagram below to see the details of each step"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To view the text pipeline, change to `display='text'`.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "set_config(display=\"text\")\npipe"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "Put back the default display\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "set_config(display=\"diagram\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Displaying a Pipeline Chaining Multiple Preprocessing Steps & Classifier\nThis section constructs a :class:`~sklearn.pipeline.Pipeline` with multiple\npreprocessing steps, :class:`~sklearn.preprocessing.PolynomialFeatures` and\n:class:`~sklearn.preprocessing.StandardScaler`, and a classifier step,\n:class:`~sklearn.linear_model.LogisticRegression`, and displays its visual\nrepresentation.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from sklearn.linear_model import LogisticRegression\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.preprocessing import PolynomialFeatures, StandardScaler\n\nsteps = [\n    (\"standard_scaler\", StandardScaler()),\n    (\"polynomial\", PolynomialFeatures(degree=3)),\n    (\"classifier\", LogisticRegression(C=2.0)),\n]\npipe = Pipeline(steps)\npipe  # click on the diagram below to see the details of each step"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Displaying a Pipeline and Dimensionality Reduction and Classifier\nThis section constructs a :class:`~sklearn.pipeline.Pipeline` with a\ndimensionality reduction step, :class:`~sklearn.decomposition.PCA`,\na classifier, :class:`~sklearn.svm.SVC`, and displays its visual\nrepresentation.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "from sklearn.decomposition import PCA\nfrom sklearn.pipeline import Pipeline\nfrom sklearn.svm import SVC\n\nsteps = [(\"reduce_dim\", PCA(n_components=4)), (\"classifier\", SVC(kernel=\"linear\"))]\npipe = Pipeline(steps)\npipe  # click on the diagram below to see the details of each step"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Displaying a Complex Pipeline Chaining a Column Transformer\nThis section constructs a complex :class:`~sklearn.pipeline.Pipeline` with a\n:class:`~sklearn.compose.ColumnTransformer` and a classifier,\n:class:`~sklearn.linear_model.LogisticRegression`, and displays its visual\nrepresentation.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import numpy as np\n\nfrom sklearn.compose import ColumnTransformer\nfrom sklearn.impute import SimpleImputer\nfrom sklearn.linear_model import LogisticRegression\nfrom sklearn.pipeline import Pipeline, make_pipeline\nfrom sklearn.preprocessing import OneHotEncoder, StandardScaler\n\nnumeric_preprocessor = Pipeline(\n    steps=[\n        (\"imputation_mean\", SimpleImputer(missing_values=np.nan, strategy=\"mean\")),\n        (\"scaler\", StandardScaler()),\n    ]\n)\n\ncategorical_preprocessor = Pipeline(\n    steps=[\n        (\n            \"imputation_constant\",\n            SimpleImputer(fill_value=\"missing\", strategy=\"constant\"),\n        ),\n        (\"onehot\", OneHotEncoder(handle_unknown=\"ignore\")),\n    ]\n)\n\npreprocessor = ColumnTransformer(\n    [\n        (\"categorical\", categorical_preprocessor, [\"state\", \"gender\"]),\n        (\"numerical\", numeric_preprocessor, [\"age\", \"weight\"]),\n    ]\n)\n\npipe = make_pipeline(preprocessor, LogisticRegression(max_iter=500))\npipe  # click on the diagram below to see the details of each step"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Displaying a Grid Search over a Pipeline with a Classifier\nThis section constructs a :class:`~sklearn.model_selection.GridSearchCV`\nover a :class:`~sklearn.pipeline.Pipeline` with\n:class:`~sklearn.ensemble.RandomForestClassifier` and displays its visual\nrepresentation.\n\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "collapsed": false
      },
      "outputs": [],
      "source": [
        "import numpy as np\n\nfrom sklearn.compose import ColumnTransformer\nfrom sklearn.ensemble import RandomForestClassifier\nfrom sklearn.impute import SimpleImputer\nfrom sklearn.model_selection import GridSearchCV\nfrom sklearn.pipeline import Pipeline, make_pipeline\nfrom sklearn.preprocessing import OneHotEncoder, StandardScaler\n\nnumeric_preprocessor = Pipeline(\n    steps=[\n        (\"imputation_mean\", SimpleImputer(missing_values=np.nan, strategy=\"mean\")),\n        (\"scaler\", StandardScaler()),\n    ]\n)\n\ncategorical_preprocessor = Pipeline(\n    steps=[\n        (\n            \"imputation_constant\",\n            SimpleImputer(fill_value=\"missing\", strategy=\"constant\"),\n        ),\n        (\"onehot\", OneHotEncoder(handle_unknown=\"ignore\")),\n    ]\n)\n\npreprocessor = ColumnTransformer(\n    [\n        (\"categorical\", categorical_preprocessor, [\"state\", \"gender\"]),\n        (\"numerical\", numeric_preprocessor, [\"age\", \"weight\"]),\n    ]\n)\n\npipe = Pipeline(\n    steps=[(\"preprocessor\", preprocessor), (\"classifier\", RandomForestClassifier())]\n)\n\nparam_grid = {\n    \"classifier__n_estimators\": [200, 500],\n    \"classifier__max_features\": [\"auto\", \"sqrt\", \"log2\"],\n    \"classifier__max_depth\": [4, 5, 6, 7, 8],\n    \"classifier__criterion\": [\"gini\", \"entropy\"],\n}\n\ngrid_search = GridSearchCV(pipe, param_grid=param_grid, n_jobs=1)\ngrid_search  # click on the diagram below to see the details of each step"
      ]
    }
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.11.14"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}