Tree-based algorithms¶
The module nnlearn.tree includes all tree-based models along with the data structures on which they depend.
- class nnlearn.tree.DecisionTree(criterion_name='gini', min_samples_split=2, max_features=None, random_state=42)¶
Bases: object
Decision Tree data structure.
- Parameters
criterion_name (str, optional) – Name of the metric used to measure the purity of tree nodes.
min_samples_split (int, optional) – Minimum number of samples a node must hold in order to become an internal node.
max_features (int, optional) – Maximum number of features to take into account when deciding how to split a node.
random_state (int, optional) – Seed for the random selection of features. It ensures reproducibility when only a random subset of the features is used to split a node.
Notes
This implementation uses node objects as its underlying data structure. Each node that is the root or an internal node has a left and a right child.
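The node-based layout described in the note can be sketched as follows. This is only an illustration, not nnlearn's actual code: the class name `SketchNode`, the `label` field, the helper `predict_one`, and the "go left when the value is <= threshold" convention are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SketchNode:
    """Hypothetical node for illustration; not nnlearn's Node class."""
    feature: Optional[int] = None       # index of the feature tested (internal nodes)
    threshold: Optional[float] = None   # split value (internal nodes)
    left: Optional["SketchNode"] = None
    right: Optional["SketchNode"] = None
    label: Optional[int] = None         # predicted class (leaf nodes)

    def is_leaf_node(self) -> bool:
        # A leaf is a node without children.
        return self.left is None and self.right is None


def predict_one(node: SketchNode, record) -> int:
    # Walk from the root down to a leaf. Going left when the feature
    # value is <= threshold is an assumed convention.
    while not node.is_leaf_node():
        if record[node.feature] <= node.threshold:
            node = node.left
        else:
            node = node.right
    return node.label


# A root with two leaf children: split on feature 0 at 2.5.
root = SketchNode(
    feature=0,
    threshold=2.5,
    left=SketchNode(label=0),
    right=SketchNode(label=1),
)
```

Calling `predict_one(root, [1.0])` follows the left branch and `predict_one(root, [3.0])` follows the right one.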
- class nnlearn.tree.DecisionTreeClassifier(criterion_name='gini', min_samples_split=2, max_features=None, random_state=42)¶
Bases: object
The DecisionTreeClassifier is a tree-based ML model used for classification.
- Parameters
criterion_name (str, optional) – Name of the metric used to measure the purity of tree nodes. Options: {‘gini’, ‘entropy’}
min_samples_split (int, optional) – Minimum number of samples a node must hold in order to become an internal node.
min_samples_leaf (int, optional) – Minimum number of samples to be present within a leaf.
max_features (int, optional) – Maximum number of features to take into account when deciding how to split a node.
random_state (int, optional) – Seed for the random selection of features. It ensures reproducibility when only a random subset of the features is used to split a node.
- fit(X, y)¶
Train the model.
- Parameters
X (2d array) – Training data.
y (1d array) – Ground truth values.
- predict(X)¶
Predicts labels for given records.
- Parameters
X (2d array) – Data for which to predict labels.
- Returns
Predicted labels.
- Return type
1d array
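For reference, the two `criterion_name` options correspond to two common impurity measures. The sketch below uses the textbook definitions of Gini impurity and Shannon entropy. The function names `gini` and `entropy` are chosen for the example, and nnlearn's internal implementation may differ (for instance in the logarithm base).

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 - sum(p_k ** 2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())
```

Both measures are zero for a pure node (all labels equal) and reach their maximum when the classes are evenly mixed.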
- class nnlearn.tree.Node(X, y, tree, impurity=None, **kwargs)¶
Bases: object
The Node object serves as the core element of the decision tree data structure.
- Parameters
X (2d array) – Records that the given node holds.
y (1d array) – Labels for the records.
tree (DecisionTree) – Decision tree object.
impurity (float, optional) – Impurity of this node.
- threshold¶
Value at which to make the split.
- Type
float
- feature¶
Index of the feature within X on which to split.
- Type
int
- is_leaf_node()¶
Return whether the current node is a leaf node.
- split()¶
Split the node if it is possible.
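How a node might choose its `feature` and `threshold` can be sketched as an exhaustive search that minimizes the weighted Gini impurity of the two children. This is an assumption about the general technique, not a description of what `split()` does in nnlearn: the function `best_split`, the midpoint thresholds, the `<=` convention, and the rule that a split must strictly reduce impurity are all illustrative choices.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(X, y, max_features=None, random_state=42, min_samples_split=2):
    """Return (feature, threshold) with the lowest weighted Gini, or None."""
    n_samples, n_features = len(X), len(X[0])
    if n_samples < min_samples_split or len(set(y)) == 1:
        return None  # too few samples or already pure: the node stays a leaf

    # Optionally consider only a reproducible random subset of the features.
    features = list(range(n_features))
    if max_features is not None and max_features < n_features:
        features = random.Random(random_state).sample(features, max_features)

    best, best_score = None, gini(y)
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n_samples
            if score < best_score:  # accept only strict improvements
                best, best_score = (f, t), score
    return best
```

With `X = [[1.0, 5.0], [2.0, 5.0], [3.0, 5.0], [4.0, 5.0]]` and `y = [0, 0, 1, 1]`, the search settles on feature 0 with threshold 2.5, which separates the two classes perfectly. The second feature is constant and therefore offers no candidate thresholds.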