ImpurityDecreaseDisplay

class skore.ImpurityDecreaseDisplay(*, importances, report_type)

Display to inspect the Mean Decrease in Impurity (MDI) of tree-based models.

Parameters:
importances : DataFrame

The importances data to display. The columns are:

  • estimator

  • split

  • feature

  • importance

report_type : {"estimator", "cross-validation", "comparison-estimator", "comparison-cross-validation"}

Report type from which the display is created.
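As a rough illustration of the long format described above, the following sketch builds a frame with the four listed columns using pandas. All values are made up, and the use of the estimator's class name in the estimator column is an assumption. In practice the display is obtained from a report (see Examples) rather than constructed by hand.

```python
import pandas as pd

# Hypothetical data: one estimator, two cross-validation splits, two features.
# One row per (estimator, split, feature) combination.
importances = pd.DataFrame(
    {
        "estimator": ["RandomForestClassifier"] * 4,  # assumed label
        "split": [0, 0, 1, 1],
        "feature": ["petal length (cm)", "petal width (cm)"] * 2,
        "importance": [0.55, 0.45, 0.60, 0.40],
    }
)
print(importances)
```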

Attributes:
ax_ : matplotlib Axes

Matplotlib Axes with the plot.

facet_ : seaborn FacetGrid

FacetGrid containing the plot.

figure_ : matplotlib Figure

Figure containing the plot.

Examples

>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>> from skore import EstimatorReport, train_test_split
>>> iris = load_iris(as_frame=True)
>>> X, y = iris.data, iris.target
>>> y = iris.target_names[y]
>>> split_data = train_test_split(
...     X=X, y=y, random_state=0, as_dict=True, shuffle=True
... )
>>> report = EstimatorReport(
...     RandomForestClassifier(random_state=0), **split_data
... )
>>> display = report.inspection.impurity_decrease()
>>> display.frame()
                feature  importance
0  sepal length (cm)     0.1...
1   sepal width (cm)     0.0...
2  petal length (cm)     0.4...
3   petal width (cm)     0.3...

frame(*, aggregate=('mean', 'std'), select_k=None, sorting_order=None)

Get the mean decrease in impurity in a dataframe format.

Parameters:
aggregate : {"mean", "std"}, ("mean", "std") or None, default=("mean", "std")

Aggregate the importances over splits. Only relevant when report_type is "cross-validation" or "comparison-cross-validation"; ignored otherwise. If None, the raw per-split values are returned.

select_k : int, default=None

Select features by importance: positive for top k, negative for bottom k. Selection is per estimator when applicable. For cross-validation, ranking uses mean importance across splits. When aggregate is None, ranking uses mean importance per feature over splits; all split rows are kept for selected features.

sorting_order : {"descending", "ascending", None}, default=None

Sort features by importance (descending = most important first). When aggregate is None, ordering uses mean importance per feature over splits.

Returns:
DataFrame

Dataframe containing the mean decrease in impurity. When aggregate is not None and the report type involves cross-validation splits, the split column is removed and importance is replaced by aggregated columns: importance_mean and importance_std.
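The aggregation described above can be approximated with plain pandas. This is a minimal sketch on made-up per-split values, not the library's implementation. The grouping keys and the use of the sample standard deviation (pandas' default) are assumptions.

```python
import pandas as pd

# Hypothetical per-split importances, as returned when aggregate=None.
raw = pd.DataFrame(
    {
        "estimator": ["rf"] * 4,  # assumed label
        "split": [0, 0, 1, 1],
        "feature": ["petal length (cm)", "petal width (cm)"] * 2,
        "importance": [0.55, 0.45, 0.60, 0.40],
    }
)

# Collapse the split column: one row per (estimator, feature), with the
# importance column replaced by importance_mean and importance_std.
aggregated = (
    raw.groupby(["estimator", "feature"], sort=False)["importance"]
    .agg(["mean", "std"])
    .add_prefix("importance_")
    .reset_index()
)
print(aggregated)
```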

help()

Show help for the display using rich or HTML.

plot(*, select_k=None, sorting_order=None)

Plot the mean decrease in impurity for the different features.

Parameters:
select_k : int, default=None

If set, only the top (positive) or bottom (negative) k features by importance are shown. See frame() for details.

sorting_order : {"descending", "ascending", None}, default=None

Sort features by importance before plotting. See frame() for details.
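The selection and sorting rules for select_k and sorting_order can be sketched with pandas as follows. The helper select_features is hypothetical and only mimics the documented behavior for a single estimator: features are ranked by mean importance over splits, and all split rows of the selected features are kept.

```python
import pandas as pd


def select_features(frame, select_k=None, sorting_order=None):
    """Hypothetical helper mimicking the documented select_k / sorting_order rules."""
    # Rank features by their mean importance over splits.
    mean_importance = frame.groupby("feature")["importance"].mean()
    if select_k is not None:
        if select_k > 0:
            keep = mean_importance.nlargest(select_k).index  # top k
        else:
            keep = mean_importance.nsmallest(-select_k).index  # bottom k
        frame = frame[frame["feature"].isin(keep)]
    if sorting_order is not None:
        ranked = mean_importance.sort_values(ascending=(sorting_order == "ascending"))
        order = {feature: rank for rank, feature in enumerate(ranked.index)}
        # A stable sort keeps the original split order within each feature.
        frame = frame.sort_values("feature", key=lambda s: s.map(order), kind="stable")
    return frame.reset_index(drop=True)


# Made-up per-split importances for three features over two splits.
raw = pd.DataFrame(
    {
        "split": [0, 0, 0, 1, 1, 1],
        "feature": ["a", "b", "c"] * 2,
        "importance": [0.5, 0.3, 0.2, 0.6, 0.3, 0.1],
    }
)
top2 = select_features(raw, select_k=2, sorting_order="descending")
bottom1 = select_features(raw, select_k=-1)
print(top2)
print(bottom1)
```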

Examples

>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>> from skore import EstimatorReport, train_test_split
>>> iris = load_iris(as_frame=True)
>>> X, y = iris.data, iris.target
>>> y = iris.target_names[y]
>>> split_data = train_test_split(
...     X=X, y=y, random_state=0, as_dict=True, shuffle=True
... )
>>> report = EstimatorReport(RandomForestClassifier(), **split_data)
>>> display = report.inspection.impurity_decrease()
>>> display.plot()

set_style(*, policy='update', barplot_kwargs=None, stripplot_kwargs=None, boxplot_kwargs=None)

Set the style parameters for the display.

Parameters:
policy : {"override", "update"}, default="update"

Policy to use when setting the style parameters. If "override", existing settings are set to the provided values. If "update", existing settings are not changed; only settings that were previously unset are changed.

barplot_kwargs : dict, default=None

Keyword arguments to be passed to seaborn.barplot() for rendering the mean decrease in impurity with an EstimatorReport.

stripplot_kwargs : dict, default=None

Keyword arguments to be passed to seaborn.stripplot() for rendering the mean decrease in impurity with a CrossValidationReport.

boxplot_kwargs : dict, default=None

Keyword arguments to be passed to seaborn.boxplot() for rendering the mean decrease in impurity with a CrossValidationReport.

Returns:
None

Raises:
ValueError

If a style parameter is unknown.

static style_plot(plot_func)

Apply consistent style to skore displays.

This decorator:

  1. Applies default style settings.

  2. Executes plot_func.

  3. Calls plt.tight_layout() to make sure the axes do not overlap.

  4. Restores the original style settings.

Parameters:
plot_func : callable

The plot function to be decorated.

Returns:
callable

The decorated plot function.
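The apply, run, restore pattern described for style_plot can be sketched with the standard library alone. This is an illustration of the pattern, not skore's implementation: STYLE and DEFAULT_STYLE are hypothetical stand-ins for a global style registry such as matplotlib's rcParams, and tighten_layout is a placeholder for plt.tight_layout().

```python
import functools

# Hypothetical stand-ins for a global style registry and the display defaults.
STYLE = {"font.size": 10}
DEFAULT_STYLE = {"font.size": 12, "axes.grid": True}


def tighten_layout():
    """Placeholder for plt.tight_layout(); does nothing in this sketch."""


def style_plot(plot_func):
    @functools.wraps(plot_func)
    def wrapper(*args, **kwargs):
        original = dict(STYLE)  # remember the caller's settings
        STYLE.update(DEFAULT_STYLE)  # 1. apply default style settings
        try:
            result = plot_func(*args, **kwargs)  # 2. execute plot_func
            tighten_layout()  # 3. adjust the layout
            return result
        finally:
            # 4. restore the original settings, even if plotting failed
            STYLE.clear()
            STYLE.update(original)

    return wrapper


@style_plot
def plot():
    # Report the settings that are active while plotting.
    return dict(STYLE)


seen = plot()
print(seen)
print(STYLE)
```

Restoring in a finally block means the caller's settings come back even when the plot function raises.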