shapiq.explainer.tree.base

This module contains the base class for tree model conversion.

Functions

`convert_tree_output_type`(tree_model, output_type)	Convert the output type of the tree model.

Classes

`EdgeTree`(parents, ancestors, ancestor_nodes, ...)	A dataclass for storing the information of an edge representation of the tree.
`TreeModel`(children_left, children_right, ...)	A dataclass for storing the information of a tree model.

class shapiq.explainer.tree.base.EdgeTree(parents, ancestors, ancestor_nodes, p_e_values, p_e_storages, split_weights, empty_predictions, edge_heights, max_depth, last_feature_node_in_path, interaction_height_store, has_ancestors=None)

Bases: object

A dataclass for storing the information of an edge representation of the tree.

The dataclass stores the information of an edge representation of the tree in a way that is easy to access and manipulate for the TreeSHAP-IQ algorithm.


ancestor_nodes: dict[int, ndarray[int]]

ancestors: ndarray[int]

edge_heights: ndarray[int]

empty_predictions: ndarray[float]

has_ancestors: Optional[ndarray[bool]] = None

interaction_height_store: dict[int, ndarray[int]]

last_feature_node_in_path: ndarray[int]

max_depth: int

p_e_storages: ndarray[float]

p_e_values: ndarray[float]

parents: ndarray[int]

split_weights: ndarray[float]

class shapiq.explainer.tree.base.TreeModel(children_left, children_right, features, thresholds, values, node_sample_weight, empty_prediction=None, leaf_mask=None, n_features_in_tree=None, max_feature_id=None, feature_ids=None, root_node_id=None, n_nodes=None, nodes=None, feature_map_original_internal=None, feature_map_internal_original=None, original_output_type='raw')

Bases: object

A dataclass for storing the information of a tree model.

The dataclass stores the information of a tree model in a way that is easy to access and manipulate. The dataclass is used to convert tree models from different libraries to a common format.

children_left: The left children of each node in a tree. Leaf nodes are -1.

children_right: The right children of each node in a tree. Leaf nodes are -1.

features: The feature indices of the decision nodes in a tree. Leaf nodes are assumed to be -2 but no check is performed.

thresholds: The thresholds of the decision nodes in a tree. Leaf nodes are set to NaN.

values: The values of the leaf nodes in a tree.

node_sample_weight: The sample weights of the nodes in a tree.

empty_prediction: The empty prediction of the tree model. If None (the default), the empty prediction is computed from the leaf values and the sample weights.

leaf_mask: The boolean mask of the leaf nodes in a tree. If None (the default), the leaf mask is computed from the children_left and children_right arrays.

n_features_in_tree: The number of features in the tree model. If None (the default), it is computed from the unique feature indices in the features array.

max_feature_id: The maximum feature index in the tree model. If None (the default), it is computed from the features array.

feature_ids: The feature indices of the decision nodes in the tree model. If None (the default), they are computed from the unique feature indices in the features array.

root_node_id: The root node id of the tree model. If None (the default), the root node id is set to 0.

n_nodes: The number of nodes in the tree model. If None (the default), it is computed from the children_left array.

nodes: The node ids of the tree model. If None (the default), they are computed from the number of nodes in the tree model.

feature_map_original_internal: A mapping of feature indices from the original feature indices (as in the model) to the internal feature indices (as in the tree model).

feature_map_internal_original: A mapping of feature indices from the internal feature indices (as in the tree model) to the original feature indices (as in the model).

original_output_type: The original output type of the tree model. The default value is “raw”.
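The defaults described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the library's implementation: `derive_defaults` is a hypothetical helper, plain Python lists stand in for the NumPy arrays the dataclass stores, and skipping negative feature entries (the -2 leaf marker) when collecting feature ids is an assumption of this sketch.

```python
def derive_defaults(children_left, children_right, features):
    """Derive the optional fields from the required arrays (hypothetical helper)."""
    n_nodes = len(children_left)  # one entry per node
    nodes = list(range(n_nodes))  # node ids 0 .. n_nodes - 1
    # A node counts as a leaf when both child pointers are -1.
    leaf_mask = [
        left == -1 and right == -1
        for left, right in zip(children_left, children_right)
    ]
    # Assumption: negative entries (e.g. -2) mark leaves and are skipped.
    feature_ids = {f for f in features if f >= 0}
    return {
        "root_node_id": 0,
        "n_nodes": n_nodes,
        "nodes": nodes,
        "leaf_mask": leaf_mask,
        "feature_ids": feature_ids,
        "n_features_in_tree": len(feature_ids),
        "max_feature_id": max(feature_ids) if feature_ids else None,
    }


# A depth-1 stump: node 0 splits on feature 3, nodes 1 and 2 are leaves.
defaults = derive_defaults(
    children_left=[1, -1, -1],
    children_right=[2, -1, -1],
    features=[3, -2, -2],
)
```

For this stump, only nodes 1 and 2 are flagged as leaves and feature 3 is the single feature in use.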

compute_empty_prediction()

Compute the empty prediction of the tree model.

The method computes the empty prediction of the tree model by taking the weighted average of the leaf node values. The method modifies the tree model in place.

Return type: None
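As a rough illustration of the weighted average described above, the following sketch averages the leaf values using the node sample weights of the leaves. The function name `weighted_leaf_average` is hypothetical, plain lists stand in for NumPy arrays, and normalising by the total leaf weight is an assumption about how the average is formed.

```python
def weighted_leaf_average(values, node_sample_weight, leaf_mask):
    """Weighted average of leaf values (hypothetical helper, not the shapiq method)."""
    weighted_sum = 0.0
    total_weight = 0.0
    for value, weight, is_leaf in zip(values, node_sample_weight, leaf_mask):
        if is_leaf:  # decision nodes do not contribute
            weighted_sum += value * weight
            total_weight += weight
    return weighted_sum / total_weight


# Stump with 10 training samples: 6 reach the left leaf, 4 the right leaf.
values = [0.0, 1.0, 3.0]
node_sample_weight = [10.0, 6.0, 4.0]
leaf_mask = [False, True, True]

# (1.0 * 6 + 3.0 * 4) / (6 + 4) = 1.8
empty_prediction = weighted_leaf_average(values, node_sample_weight, leaf_mask)
```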

reduce_feature_complexity()

Reduces the feature complexity of the tree model.

The method reduces the feature complexity of the tree model by removing unused features and reindexing the feature indices of the decision nodes in the tree. The method modifies the tree model in place. To see the original feature mappings, use the feature_map_original_internal and feature_map_internal_original attributes.

Return type: None

For example, consider a tree model with the following feature indices:

[0, 1, 8]

The method will remove the unused feature indices and reindex the feature indices of the decision nodes in the tree to the following:

[0, 1, 2]

Feature ‘8’ is ‘renamed’ to ‘2’ so that, in the internal representation, one-hot vectors (and matrices) of length 3 suffice to represent the feature indices.
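The reindexing in this example can be sketched as follows. This is a minimal illustration rather than the library's code: `reindex_features` is a hypothetical helper, plain lists replace NumPy arrays, and leaving negative leaf markers (-2) untouched is an assumption of the sketch.

```python
def reindex_features(features):
    """Map the used feature indices onto 0..k-1 (hypothetical helper)."""
    # Collect the feature indices that actually occur in decision nodes.
    used = sorted({f for f in features if f >= 0})
    original_to_internal = {orig: new for new, orig in enumerate(used)}
    internal_to_original = {new: orig for orig, new in original_to_internal.items()}
    # Leaf markers (negative entries) are passed through unchanged.
    reindexed = [original_to_internal[f] if f >= 0 else f for f in features]
    return reindexed, original_to_internal, internal_to_original


# Three decision nodes using features 0, 1 and 8, followed by four leaves.
features = [0, 1, 8, -2, -2, -2, -2]
reindexed, original_to_internal, internal_to_original = reindex_features(features)
```

Here `original_to_internal` plays the role of feature_map_original_internal (8 maps to 2) and `internal_to_original` the role of feature_map_internal_original (2 maps back to 8).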

children_left: ndarray[int]

children_right: ndarray[int]

empty_prediction: Optional[float] = None

feature_ids: Optional[set] = None

feature_map_internal_original: Optional[dict[int, int]] = None

feature_map_original_internal: Optional[dict[int, int]] = None

features: ndarray[int]

leaf_mask: Optional[ndarray[bool]] = None

max_feature_id: Optional[int] = None

n_features_in_tree: Optional[int] = None

n_nodes: Optional[int] = None

node_sample_weight: ndarray[float]

nodes: Optional[ndarray[int]] = None

original_output_type: str = 'raw'

root_node_id: Optional[int] = None

thresholds: ndarray[float]

values: ndarray[float]

shapiq.explainer.tree.base.convert_tree_output_type(tree_model, output_type)

Convert the output type of the tree model.

Parameters:

tree_model (TreeModel) – The tree model to convert.
output_type (str) – The output type to convert the tree model to. Can be “raw”, “probability”, or “logit”.

Return type:

tuple[TreeModel, bool]

Returns:

The converted tree model and a warning flag indicating whether invalid probability values were adjusted in the logit transformation.
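To make the role of the warning flag concrete, here is a small sketch of a probability-to-logit transformation that adjusts invalid probabilities and reports whether it did so. It is not the library's implementation: the helper `probabilities_to_logits`, the clipping bound `eps`, and the exact adjustment rule are all assumptions chosen for illustration.

```python
import math


def probabilities_to_logits(values, eps=1e-7):
    """Convert probabilities to logits, clipping invalid values (hypothetical helper)."""
    adjusted = False  # becomes True if any value had to be clipped
    logits = []
    for p in values:
        # The logit is undefined at 0 and 1 (and outside [0, 1]), so clip first.
        if p <= 0.0 or p >= 1.0:
            p = min(max(p, eps), 1.0 - eps)
            adjusted = True
        logits.append(math.log(p / (1.0 - p)))
    return logits, adjusted


# Leaf values of a classifier tree; 1.0 has no finite logit and gets clipped.
logits, warning_flag = probabilities_to_logits([0.5, 1.0])
```

A caller could use the returned flag to emit a single warning after conversion instead of warning once per adjusted leaf.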