metrics¶

The nnlearn.metrics module includes functions to measure the performance of the implemented algorithms.

nnlearn.metrics.absolute_error(y, p)¶

Absolute error.

Absolute error can be defined as follows:

\[\sum_{i=1}^{n} |y_i - p_i|\]

where \(n\) is the number of provided records.

Parameters

y (ndarray) – One dimensional array with ground truth values.
p (ndarray) – One dimensional array with predicted values.

Returns

Absolute error as described above.

Return type

float
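The sum above can be illustrated with a minimal sketch in plain Python. The name absolute_error_sketch is hypothetical and this is not necessarily how nnlearn implements the function; it only mirrors the formula.

```python
def absolute_error_sketch(y, p):
    """Sum of absolute differences between ground truth and predictions."""
    # Hypothetical helper, not part of nnlearn.
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    return float(sum(abs(yi - pi) for yi, pi in zip(y, p)))


# |3.0 - 2.5| + |5.0 - 5.0| + |2.0 - 4.0| = 0.5 + 0.0 + 2.0
total = absolute_error_sketch([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(total)  # 2.5
```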

nnlearn.metrics.accuracy_score(y_true, y_hat, **kwargs)¶

Number of correctly classified records over the number of all records.

Parameters

y_true (iterable object) – Ground truth values.
y_hat (iterable object) – Predicted values.
**kwargs – Arbitrary keyword arguments.

Returns

Accuracy score.

Return type

float

Notes

Accuracy score is a useful metric when your dataset is balanced in terms of labels.
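To see why balance matters, consider the following sketch. It assumes accuracy is simply the fraction of matching labels; accuracy_sketch is a hypothetical stand-in, not the nnlearn function. A model that always predicts the majority class still scores high on an imbalanced dataset.

```python
def accuracy_sketch(y_true, y_hat):
    """Fraction of records whose predicted label equals the true label."""
    # Hypothetical helper, not part of nnlearn.
    y_true, y_hat = list(y_true), list(y_hat)
    if len(y_true) != len(y_hat):
        raise ValueError("y_true and y_hat must have the same length")
    if not y_true:
        raise ValueError("at least one record is required")
    correct = sum(1 for t, h in zip(y_true, y_hat) if t == h)
    return correct / len(y_true)


# Imbalanced data: nine negatives and one positive.
y_true = [0] * 9 + [1]
# A trivial model that always predicts the majority class.
y_hat = [0] * 10

score = accuracy_sketch(y_true, y_hat)
print(score)  # 0.9, although the single positive record is never detected
```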

nnlearn.metrics.cross_entropy_score(p)¶

Computes the cross entropy from the predicted probability of the true class.

Parameters: p (float) – Probability of true class.

Notes

Read about details here.
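As a rough illustration, the cross entropy of a single record is commonly taken to be the negative logarithm of the probability assigned to its true class. The sketch below assumes the natural logarithm and clamps p away from zero; both choices, as well as the name cross_entropy_sketch and the eps parameter, are assumptions rather than documented nnlearn behaviour.

```python
import math


def cross_entropy_sketch(p, eps=1e-15):
    """Negative natural log of the probability assigned to the true class."""
    # Hypothetical helper, not part of nnlearn.
    # Clamp p into [eps, 1] so that log(0) is never evaluated.
    p = min(max(p, eps), 1.0)
    return -math.log(p)


confident = cross_entropy_sketch(0.99)  # small loss for a confident, correct prediction
unsure = cross_entropy_sketch(0.5)      # about 0.693
wrong = cross_entropy_sketch(0.01)      # large loss for a confident, wrong prediction
```

The loss grows without bound as the probability of the true class approaches zero, which is why the clamp is needed in practice.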

nnlearn.metrics.entropy_score(y)¶

Measure of the entropy of a random variable Y.

Parameters: y (1d array) – Labels of classes.
Returns: Value between 0 and +inf, depending on the number of classes.
Return type: float

Notes

Entropy is a measure of disorder: the higher the entropy, the more disorder is present. For example, with binary classes where 50% of the samples are positive and the rest negative, the entropy is 1 (high); if all samples are positive, the entropy is 0 (low). The formula for entropy is as follows:

\[E(S) = \sum_{i=1}^{c} -p_i \log_2 p_i\]

where \(c\) is the number of classes and \(p_i\) is the proportion of samples that belong to class \(i\).
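The two cases from the note can be checked with a short sketch that estimates each class proportion from label counts. entropy_sketch is a hypothetical name and the code follows the formula only; it is not taken from nnlearn.

```python
import math
from collections import Counter


def entropy_sketch(y):
    """Shannon entropy (base 2) of the class labels in y."""
    # Hypothetical helper, not part of nnlearn.
    n = len(y)
    # Each term is -p_i * log2(p_i), with p_i estimated as count / n.
    return sum(-(c / n) * math.log2(c / n) for c in Counter(y).values())


balanced = entropy_sketch([1, 0, 1, 0])            # 50% positive, 50% negative
pure = entropy_sketch([1, 1, 1, 1])                # only positive samples
four_classes = entropy_sketch(["a", "b", "c", "d"])

print(balanced, pure, four_classes)  # 1.0 0.0 2.0
```

With four equally likely classes the entropy reaches 2, which shows how the upper bound depends on the number of classes.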

nnlearn.metrics.error_rate(y, p)¶

Number of incorrectly classified records.

Parameters

y (ndarray) – One dimensional array with ground truth values.
p (ndarray) – One dimensional array with predicted values.

Returns

Number of misclassified records.

Return type

int
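A minimal sketch of the count described above, assuming a record is misclassified whenever its predicted label differs from the ground truth. misclassified_count_sketch is a hypothetical name, not the nnlearn function.

```python
def misclassified_count_sketch(y, p):
    """Number of positions where the prediction differs from the ground truth."""
    # Hypothetical helper, not part of nnlearn.
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    return sum(1 for yi, pi in zip(y, p) if yi != pi)


# Records 2 and 4 are predicted incorrectly.
errors = misclassified_count_sketch([1, 0, 1, 1], [1, 1, 1, 0])
print(errors)  # 2
```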

nnlearn.metrics.gini_score(y)¶

Measure of Gini impurity.

Parameters: y (1d array) – Labels of classes.
Returns: Float between 0 and 1.
Return type: float

Notes

Gini impurity is usually used in the context of decision trees. The value ranges between 0 and 1. A value of 0 means that your dataset contains only one class. A value above 0 means that there is a certain likelihood of misclassifying a given sample from your dataset.

For more info, I suggest you visit this blog.
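The behaviour described in the note can be sketched with the standard definition of Gini impurity, one minus the sum of squared class proportions. The page does not state which formula nnlearn uses, so treat gini_sketch as a hypothetical illustration.

```python
from collections import Counter


def gini_sketch(y):
    """Gini impurity: 1 - sum(p_i ** 2) over the class proportions p_i."""
    # Hypothetical helper, not part of nnlearn.
    n = len(y)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(y).values())


single_class = gini_sketch([1, 1, 1, 1])  # only one class present
two_classes = gini_sketch([0, 1, 0, 1])   # evenly mixed binary labels

print(single_class, two_classes)  # 0.0 0.5
```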

nnlearn.metrics.information_gain_score(x, y)¶

Measure of Information gain.

Parameters

y1 (1d array) – Labels of classes in the parent node.
y2 (1d array) – Labels of classes after the split of the parent node.

Notes

Information gain tells you how much you can learn about a certain variable given some other variable. The formula for the information gain is as follows:

\[IG(X, Y) = E(Y) - E(Y | X)\]

where E refers to entropy_score(). If the IG is high, it means that given X, we know a lot about Y. In other words, the more we reduce the entropy (disorder) of our target variable Y, the larger the information gain is.
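The formula can be sketched as follows, assuming x holds the values of a categorical variable and E(Y | X) is the size-weighted average entropy of y within each group of equal x values. Both function names are hypothetical, and the argument convention is an assumption that may differ from what nnlearn expects.

```python
import math
from collections import Counter, defaultdict


def entropy_sketch(y):
    """Shannon entropy (base 2) of the class labels in y."""
    n = len(y)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(y).values())


def information_gain_sketch(x, y):
    """IG(X, Y) = E(Y) - E(Y | X) for a categorical variable x."""
    # Hypothetical helper, not part of nnlearn.
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    # Group the labels by the value of x.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    n = len(y)
    # Conditional entropy: entropy of each group weighted by its share of records.
    conditional = sum(len(g) / n * entropy_sketch(g) for g in groups.values())
    return entropy_sketch(y) - conditional


y = [1, 1, 0, 0]
informative = information_gain_sketch(["a", "a", "b", "b"], y)    # x determines y
uninformative = information_gain_sketch(["a", "b", "a", "b"], y)  # x says nothing about y

print(informative, uninformative)  # 1.0 0.0
```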

nnlearn.metrics.mean_absolute_error(y, p)¶

Mean absolute error

Parameters

y (ndarray) – One dimensional array with ground truth values.
p (ndarray) – One dimensional array with predicted values.

Returns

Mean absolute error.

Return type

float
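A minimal sketch, assuming the mean absolute error is the sum of absolute differences divided by the number of records. mean_absolute_error_sketch is a hypothetical name and not necessarily how nnlearn computes the value.

```python
def mean_absolute_error_sketch(y, p):
    """Average of |y_i - p_i| over all records."""
    # Hypothetical helper, not part of nnlearn.
    if len(y) != len(p):
        raise ValueError("y and p must have the same length")
    if len(y) == 0:
        raise ValueError("at least one record is required")
    return sum(abs(yi - pi) for yi, pi in zip(y, p)) / len(y)


# (0.5 + 0.0 + 2.0 + 0.5) / 4
mae = mean_absolute_error_sketch([3.0, 5.0, 2.0, 8.0], [2.5, 5.0, 4.0, 7.5])
print(mae)  # 0.75
```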

nnlearn.metrics.mean_cross_entropy_score(Y, P)¶

Computes the mean cross entropy over an array of probability vectors.

Parameters

Y (ndarray) – 1D array with target values.
P (ndarray) – 2D array where each row is an array with predicted probabilities for each class.

Notes

Read about details here.
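One plausible reading of this function is sketched below: pick, for each row of P, the probability assigned to the target class, take its negative natural logarithm, and average over all rows. The sketch assumes the targets are integer class indices into each row; that assumption, the clamp parameter eps, and the name mean_cross_entropy_sketch are not confirmed by this page.

```python
import math


def mean_cross_entropy_sketch(Y, P, eps=1e-15):
    """Average of -log(P[i][Y[i]]) over all records."""
    # Hypothetical helper, not part of nnlearn.
    if len(Y) != len(P):
        raise ValueError("Y and P must have the same number of records")
    if len(Y) == 0:
        raise ValueError("at least one record is required")
    total = 0.0
    for label, row in zip(Y, P):
        # Probability predicted for the true class, clamped to avoid log(0).
        p_true = min(max(row[label], eps), 1.0)
        total += -math.log(p_true)
    return total / len(Y)


Y = [0, 1]
perfect = mean_cross_entropy_sketch(Y, [[1.0, 0.0], [0.0, 1.0]])
uniform = mean_cross_entropy_sketch(Y, [[0.5, 0.5], [0.5, 0.5]])  # about 0.693
```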

nnlearn.metrics.mean_squared_error(Y, P, var=True)¶

Mean of squared errors.

Parameters

Y (ndarray) – One dimensional array with ground truth values.
P (ndarray) – One dimensional array with predicted values.

Returns

Mean squared error.

Return type

float
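A minimal sketch of the usual definition, the average of the squared differences between ground truth and predictions. The var argument from the signature is not described on this page, so the sketch leaves it out; mean_squared_error_sketch is a hypothetical name.

```python
def mean_squared_error_sketch(Y, P):
    """Average of (Y_i - P_i) ** 2 over all records."""
    # Hypothetical helper, not part of nnlearn; the documented `var` argument is omitted.
    if len(Y) != len(P):
        raise ValueError("Y and P must have the same length")
    if len(Y) == 0:
        raise ValueError("at least one record is required")
    return sum((yi - pi) ** 2 for yi, pi in zip(Y, P)) / len(Y)


# (0 + 4 + 4 + 0) / 4
mse = mean_squared_error_sketch([1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 1.0, 4.0])
print(mse)  # 2.0
```

Because the differences are squared, a single large error contributes more to the result than several small ones of the same total size.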