shapiq.imputer.GaussianImputer

class shapiq.imputer.GaussianImputer(model, data, x=None, *, sample_size=100, random_state=None, verbose=False)

Bases: Imputer

Implements the Gaussian-based approach for imputation according to [Aas21].

This approach assumes that the features of the background data form a multivariate Gaussian distribution. The missing values are imputed by drawing Monte Carlo samples from the conditional distribution given the values of the features present in a coalition.

Note that only continuous features are supported, meaning that this imputer can’t be used for datasets containing categorical or binary features.
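The conditioning step described above can be sketched in plain NumPy. This is a minimal illustration of sampling from a conditional multivariate Gaussian, not shapiq's actual implementation; the helper name `sample_conditional_gaussian` and its arguments are hypothetical.

```python
import numpy as np


def sample_conditional_gaussian(mean, cov, x, present, sample_size, rng):
    """Hypothetical helper (not part of shapiq).

    Keeps the present features of `x` fixed and draws the missing ones from
    the Gaussian conditional distribution implied by `mean` and `cov`.
    """
    x = np.asarray(x, dtype=float)
    present = np.asarray(present, dtype=bool)
    missing = ~present
    samples = np.tile(x, (sample_size, 1))
    if not missing.any():
        return samples  # nothing to impute

    mu_m = mean[missing]
    cov_mm = cov[np.ix_(missing, missing)]
    if present.any():
        cov_pp = cov[np.ix_(present, present)]
        cov_mp = cov[np.ix_(missing, present)]
        # weights = cov_mp @ inv(cov_pp), computed with a solve instead of an inverse
        weights = np.linalg.solve(cov_pp, cov_mp.T).T
        cond_mean = mu_m + weights @ (x[present] - mean[present])
        cond_cov = cov_mm - weights @ cov_mp.T
        cond_cov = (cond_cov + cond_cov.T) / 2  # remove numerical asymmetry
    else:
        # Empty coalition: fall back to the unconditional distribution.
        cond_mean, cond_cov = mu_m, cov_mm

    samples[:, missing] = rng.multivariate_normal(cond_mean, cond_cov, size=sample_size)
    return samples


# Toy example: two correlated features, only the first one is present.
rng = np.random.default_rng(0)
mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
x = np.array([2.0, 0.0])
present = np.array([True, False])
samples = sample_conditional_gaussian(mean, cov, x, present, 1000, rng)
```

In this toy setup the second feature is drawn from a Gaussian with conditional mean 0.8 * 2.0 = 1.6 and conditional variance 1 - 0.8**2 = 0.36, while the first column stays fixed at 2.0.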

References

[Aas21]

Aas, K., Jullum, M., and Løland, A. (2021). Explaining individual predictions when features are dependent: More accurate approximations to Shapley values. Artificial Intelligence 298, 103502. doi: https://doi.org/10.1016/j.artint.2021.103502

Initializes the class.

Parameters:
  • model (object | Game | Callable[[ndarray[tuple[Any, ...], dtype[floating]]], ndarray[tuple[Any, ...], dtype[floating]]]) – The model to explain, as a callable that takes data points as input and returns the model’s predictions.

  • data (ndarray[tuple[Any, ...], dtype[floating]]) – The background data to use for the explainer as a two-dimensional array with shape (n_samples, n_features).

  • x (ndarray[tuple[Any, ...], dtype[floating]] | None) – The explanation point as a np.ndarray of shape (1, n_features) or (n_features,). If None, an explanation point must be provided by calling fit() before coalitions are evaluated.

  • sample_size (int) – The number of Monte Carlo samples to draw from the conditional distribution for each imputation.

  • random_state (int | None) – An optional random seed for reproducibility.

  • verbose (bool) – A flag to enable verbose imputation, which will print a progress bar for model evaluation. Note that this can slow down the imputation process.

Raises:

CategoricalFeatureError – If the background data contains any categorical features.

value_function(coalitions)

Imputes the missing values of a data point and gets predictions for all coalitions.

Parameters:

coalitions (ndarray[tuple[Any, ...], dtype[bool]]) – A boolean array of shape (n_coalitions, n_features) indicating which features are present (True) and which are missing (False).

Return type:

ndarray[tuple[Any, ...], dtype[floating]]

Returns:

The model’s predictions on the imputed data points as an array of shape (n_coalitions,).

Raises:

RuntimeError – If no explanation point has been provided, neither in the constructor nor by calling fit().
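The input and output shapes of this method can be illustrated with a small NumPy sketch. It makes two assumptions that the text above does not state: the predictions for each coalition are averaged over the Monte Carlo samples, and the imputed values are replaced by unconditional standard-normal noise as a stand-in for the Gaussian conditional samples. The `model` function is hypothetical.

```python
import itertools

import numpy as np


def model(X):
    # Hypothetical model used only for illustration: sum of the features.
    return X.sum(axis=1)


n_features, sample_size = 3, 100
x = np.array([1.0, 2.0, 3.0])  # explanation point

# All 2**n_features coalitions, one row each; True marks a present feature.
coalitions = np.array(
    list(itertools.product([False, True], repeat=n_features)), dtype=bool
)
n_coalitions = coalitions.shape[0]

# Stand-in for the imputed data (NOT conditional Gaussian samples):
# one block of `sample_size` rows per coalition.
rng = np.random.default_rng(0)
noise = rng.normal(size=(n_coalitions, sample_size, n_features))
# Present features keep the values of the explanation point.
imputed = np.where(coalitions[:, None, :], x, noise)

# Evaluate the model on all rows at once, then average per coalition
# (the averaging is an assumption of this sketch).
flat_predictions = model(imputed.reshape(-1, n_features))
values = flat_predictions.reshape(n_coalitions, sample_size).mean(axis=1)
```

Here `coalitions` has shape (8, 3) and `values` has shape (8,), matching the (n_coalitions, n_features) input and (n_coalitions,) output described above. For the full coalition in the last row nothing is imputed, so its value is just the model's prediction at `x`.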

property cov_mat: ndarray[tuple[Any, ...], dtype[floating]]

The covariance matrix of the features.

This property is computed only once and then cached.

property mean_per_feature: ndarray[tuple[Any, ...], dtype[floating]]

The mean value for each feature.

This property is computed only once and then cached.
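A compute-once-then-cache property of this kind could look like the following sketch, built on `functools.cached_property`. The class `BackgroundStats` is hypothetical and does not reflect shapiq's internals, and using the sample covariance (NumPy's default normalization by n_samples - 1) is an assumption.

```python
from functools import cached_property

import numpy as np


class BackgroundStats:
    """Hypothetical holder for background data; not shapiq's implementation."""

    def __init__(self, data):
        # Rows are samples, columns are features.
        self.data = np.asarray(data, dtype=float)

    @cached_property
    def mean_per_feature(self):
        # Computed on first access, then stored on the instance.
        return self.data.mean(axis=0)

    @cached_property
    def cov_mat(self):
        # rowvar=False tells NumPy that each column is a feature.
        return np.cov(self.data, rowvar=False)


data = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
stats = BackgroundStats(data)
first_access = stats.cov_mat
second_access = stats.cov_mat  # returns the cached array, no recomputation
```

For this toy data the feature means are [3.0, 6.0] and the covariance matrix is [[4.0, 8.0], [8.0, 16.0]]. Both accesses return the same array object.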