fairlearn.metrics package
Functionality for computing metrics, with a particular focus on group metrics.
For our purposes, a metric is a function with signature
f(y_true, y_pred, ...)
where y_true is the set of true values and y_pred is the set of values predicted by a machine learning algorithm. Other arguments may be present (most often sample weights), which will affect how the metric is calculated.
The group metrics in this module have signatures
g(y_true, y_pred, group_membership, ...)
where group_membership is an array of values indicating the group to which each pair of true and predicted values belongs. The metric is evaluated for the entire set of data, and also for each subgroup identified in group_membership.
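The relationship between a plain metric f and a group metric g can be illustrated with a small plain-Python sketch. The names accuracy and grouped are hypothetical helpers written for this illustration; they are not part of the package, and the returned dictionary layout is an assumption.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """A plain metric with signature f(y_true, y_pred)."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def grouped(metric, y_true, y_pred, group_membership):
    """Evaluate `metric` on all data and once per group (hypothetical helper)."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, group_membership):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    by_group = {g: metric(ts, ps) for g, (ts, ps) in buckets.items()}
    return {"overall": metric(y_true, y_pred), "by_group": by_group}


y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
groups = ["a", "a", "a", "b", "b", "b"]
result = grouped(accuracy, y_true, y_pred, groups)
```

Here the overall accuracy is 4/6, and each of the two groups has accuracy 2/3.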

fairlearn.metrics.demographic_parity_difference(y_true, y_pred, *, sensitive_features, sample_weight=None)
Calculate the demographic parity difference.
 Parameters
y_true (1D-array) – Ground truth (correct) labels.
y_pred (1D-array) – Predicted labels \(h(X)\) returned by the classifier.
sensitive_features (1D-array) – Sensitive features.
sample_weight (1D-array) – Sample weights.
 Returns
The difference between the largest and the smallest group-level selection rate, \(E[h(X) | A=a]\), across all values \(a\) of the sensitive feature. A demographic parity difference of 0 means that all groups have the same selection rate.
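As a worked illustration of the definition above, the following plain-Python sketch computes unweighted group-level selection rates and their spread. The helper selection_rates is hypothetical and ignores sample weights; it is not the library implementation.

```python
def selection_rates(y_pred, sensitive_features):
    """Fraction of predictions equal to 1 within each group (unweighted)."""
    totals, selected = {}, {}
    for p, a in zip(y_pred, sensitive_features):
        totals[a] = totals.get(a, 0) + 1
        selected[a] = selected.get(a, 0) + (1 if p == 1 else 0)
    return {a: selected[a] / totals[a] for a in totals}


y_pred = [1, 1, 0, 1, 0, 0, 0, 1]
sensitive = ["x", "x", "x", "x", "y", "y", "y", "y"]

rates = selection_rates(y_pred, sensitive)
# Largest minus smallest group-level selection rate.
dp_difference = max(rates.values()) - min(rates.values())
```

Group x is selected at a rate of 0.75 and group y at 0.25, so the difference is 0.5.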

fairlearn.metrics.demographic_parity_ratio(y_true, y_pred, *, sensitive_features, sample_weight=None)
Calculate the demographic parity ratio.
 Parameters
y_true (1D-array) – Ground truth (correct) labels.
y_pred (1D-array) – Predicted labels \(h(X)\) returned by the classifier.
sensitive_features (1D-array) – Sensitive features.
sample_weight (1D-array) – Sample weights.
 Returns
The ratio between the smallest and the largest group-level selection rate, \(E[h(X) | A=a]\), across all values \(a\) of the sensitive feature. A demographic parity ratio of 1 means that all groups have the same selection rate.
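The ratio form can be sketched from already-computed group-level selection rates. The function parity_ratio is a hypothetical helper, and its handling of the case where no group is selected at all is an assumption made for this sketch, not something stated in the text above.

```python
def parity_ratio(group_rates):
    """Smallest group-level selection rate divided by the largest."""
    largest = max(group_rates.values())
    if largest == 0:
        # Assumption for this sketch only: if no group is ever selected,
        # treat all groups as equal and report a ratio of 1.
        return 1.0
    return min(group_rates.values()) / largest


unequal = parity_ratio({"x": 0.75, "y": 0.25})
equal = parity_ratio({"x": 0.5, "y": 0.5})
```

The first call yields one third; the second yields 1, the value that indicates identical selection rates.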

fairlearn.metrics.difference_from_summary(summary)
Calculate the difference between the maximum and minimum metric value across groups.
 Parameters
summary – A group metric summary
 Returns
The difference between the maximum and the minimum group-level metrics described in summary.
 Return type

fairlearn.metrics.equalized_odds_difference(y_true, y_pred, *, sensitive_features, sample_weight=None)
Calculate the equalized odds difference.
 Parameters
y_true (1D-array) – Ground truth (correct) labels \(Y\).
y_pred (1D-array) – Predicted labels \(h(X)\) returned by the classifier.
sensitive_features (1D-array) – Sensitive features.
sample_weight (1D-array) – Sample weights.
 Returns
The greater of two metrics: true_positive_rate_difference and false_positive_rate_difference. The former is the difference between the largest and smallest of \(P[h(X)=1 | A=a, Y=1]\), across all values \(a\) of the sensitive feature. The latter is defined similarly, but for \(P[h(X)=1 | A=a, Y=0]\). An equalized odds difference of 0 means that all groups have the same true positive, true negative, false positive, and false negative rates.
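The two conditional probabilities in the Returns text, \(P[h(X)=1 | A=a, Y=1]\) and \(P[h(X)=1 | A=a, Y=0]\), can be computed by hand as in the sketch below. The helpers conditional_rate and spread are hypothetical, unweighted, and assume binary 0/1 labels with every group containing both label values.

```python
def conditional_rate(y_true, y_pred, sensitive_features, label):
    """Per-group P[h(X)=1 | A=a, Y=label], unweighted."""
    counts, hits = {}, {}
    for t, p, a in zip(y_true, y_pred, sensitive_features):
        if t != label:
            continue
        counts[a] = counts.get(a, 0) + 1
        hits[a] = hits.get(a, 0) + (1 if p == 1 else 0)
    return {a: hits[a] / counts[a] for a in counts}


def spread(rates):
    """Largest minus smallest value across groups."""
    return max(rates.values()) - min(rates.values())


y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
sensitive = ["x", "x", "x", "x", "y", "y", "y", "y"]

rate_given_pos = conditional_rate(y_true, y_pred, sensitive, label=1)
rate_given_neg = conditional_rate(y_true, y_pred, sensitive, label=0)
# The reported value is the greater of the two spreads.
eo_difference = max(spread(rate_given_pos), spread(rate_given_neg))
```

In this toy data both spreads equal 0.5, so the resulting difference is 0.5.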

fairlearn.metrics.equalized_odds_ratio(y_true, y_pred, *, sensitive_features, sample_weight=None)
Calculate the equalized odds ratio.
 Parameters
y_true (1D-array) – Ground truth (correct) labels \(Y\).
y_pred (1D-array) – Predicted labels \(h(X)\) returned by the classifier.
sensitive_features (1D-array) – Sensitive features.
sample_weight (1D-array) – Sample weights.
 Returns
The smaller of two metrics: true_positive_rate_ratio and false_positive_rate_ratio. The former is the ratio between the smallest and largest of \(P[h(X)=1 | A=a, Y=1]\), across all values \(a\) of the sensitive feature. The latter is defined similarly, but for \(P[h(X)=1 | A=a, Y=0]\). An equalized odds ratio of 1 means that all groups have the same true positive, true negative, false positive, and false negative rates.

fairlearn.metrics.false_negative_rate(y_true, y_pred, sample_weight=None, pos_label=None)
Calculate the false negative rate (also called miss rate).
 Parameters
y_true (array-like) – The list of true values
y_pred (array-like) – The list of predicted values
sample_weight (array-like, optional) – A list of weights to apply to each sample. By default all samples are weighted equally
pos_label (scalar, optional) – The value to treat as the ‘positive’ label in the samples. If None (the default) then the largest unique value of the y arrays will be used.
 Returns
The false negative rate for the data
 Return type

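To make the sample_weight and pos_label parameters of false_negative_rate concrete, here is a minimal plain-Python sketch of a weighted false negative rate. The function false_negative_rate_sketch is hypothetical; it follows the documented default for pos_label, and its behaviour when there are no positive samples (a ZeroDivisionError) is an artefact of the sketch rather than documented behaviour.

```python
def false_negative_rate_sketch(y_true, y_pred, sample_weight=None, pos_label=None):
    """Weighted share of true positives that were predicted as something else."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    if pos_label is None:
        # Documented default: the largest unique value of the y arrays.
        pos_label = max(set(y_true) | set(y_pred))
    positives = 0.0
    misses = 0.0
    for t, p, w in zip(y_true, y_pred, sample_weight):
        if t == pos_label:
            positives += w
            if p != pos_label:
                misses += w
    return misses / positives


y_true = [1, 1, 1, 0]
y_pred = [1, 0, 0, 0]
unweighted = false_negative_rate_sketch(y_true, y_pred)
weighted = false_negative_rate_sketch(y_true, y_pred, sample_weight=[1, 1, 2, 1])
```

Without weights, two of the three positives are missed (2/3). With the weights shown, the missed positives carry weight 3 out of 4, giving 0.75.
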
fairlearn.metrics.false_positive_rate(y_true, y_pred, sample_weight=None, pos_label=None)
Calculate the false positive rate (also called fall-out).
 Parameters
y_true (array-like) – The list of true values
y_pred (array-like) – The list of predicted values
sample_weight (array-like, optional) – A list of weights to apply to each sample. By default all samples are weighted equally
pos_label (scalar, optional) – The value to treat as the ‘positive’ label in the samples. If None (the default) then the largest unique value of the y arrays will be used.
 Returns
The false positive rate for the data
 Return type

fairlearn.metrics.group_max_from_summary(summary)
Retrieve the maximum group-level metric value from group summary.
 Parameters
summary – A group metric summary
 Returns
The maximum group-level metric value across all groups in summary.
 Return type

fairlearn.metrics.group_min_from_summary(summary)
Retrieve the minimum group-level metric value from group summary.
 Parameters
summary – A group metric summary
 Returns
The minimum group-level metric value across all groups in summary.
 Return type

fairlearn.metrics.group_summary(metric_function, y_true, y_pred, *, sensitive_features, indexed_params=None, **metric_params)
Apply a metric to each subgroup of a set of data.
 Parameters
metric_function – Function with signature metric_function(y_true, y_pred, **metric_params)
y_true – Array of ground-truth values
y_pred – Array of predicted values
sensitive_features – Array indicating the group to which each input value belongs
indexed_params – Names of metric_function parameters that should be split according to sensitive_features in addition to y_true and y_pred. Defaults to None, corresponding to {"sample_weight"}.
**metric_params – Optional arguments to be passed to the metric_function
 Returns
Object containing the result of applying metric_function to the entire dataset and to each group identified in sensitive_features
 Return type
sklearn.utils.Bunch with the fields overall and by_group
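
The role of indexed_params can be illustrated with a plain-Python sketch of the splitting logic. The function group_summary_sketch and the metric weighted_accuracy are hypothetical stand-ins, and the sketch returns an ordinary dict rather than a sklearn.utils.Bunch; it is an illustration of the described behaviour, not the library code.

```python
def group_summary_sketch(metric_function, y_true, y_pred, *, sensitive_features,
                         indexed_params=None, **metric_params):
    """Apply metric_function to all data and to each group's slice."""
    if indexed_params is None:
        indexed_params = {"sample_weight"}
    by_group = {}
    for group in sorted(set(sensitive_features)):
        idx = [i for i, a in enumerate(sensitive_features) if a == group]
        params = {}
        for name, value in metric_params.items():
            if name in indexed_params and value is not None:
                # Per-sample parameters are sliced like y_true and y_pred.
                params[name] = [value[i] for i in idx]
            else:
                # Everything else is passed through unchanged.
                params[name] = value
        by_group[group] = metric_function(
            [y_true[i] for i in idx], [y_pred[i] for i in idx], **params)
    overall = metric_function(y_true, y_pred, **metric_params)
    return {"overall": overall, "by_group": by_group}


def weighted_accuracy(y_true, y_pred, sample_weight=None):
    """Hypothetical metric: weighted share of correct predictions."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


summary = group_summary_sketch(
    weighted_accuracy, [1, 0, 1, 0], [1, 1, 1, 0],
    sensitive_features=["a", "a", "b", "b"], sample_weight=[1, 3, 1, 1])
```

Because sample_weight is split along with y_true and y_pred, group a is scored with weights [1, 3] only (0.25), group b with [1, 1] (1.0), and the overall value uses all four weights (0.5).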

fairlearn.metrics.make_derived_metric(transformation_function, summary_function, name=None)
Make a callable that calculates a derived metric from the group summary.
 Parameters
transformation_function (func) – A transformation function with the signature transformation_function(summary)
summary_function (func) – A metric group summary function with the signature summary_function(y_true, y_pred, *, sensitive_features, **metric_params)
 Returns
A callable object with the signature derived_metric(y_true, y_pred, *, sensitive_features, **metric_params)
 Return type
func
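
The composition described for make_derived_metric, a summary function followed by a transformation of its result, can be sketched as follows. All names here (make_derived_metric_sketch, selection_rate_summary, difference) are hypothetical, the name argument is omitted, and the summary is assumed to be a dict with a by_group entry.

```python
def make_derived_metric_sketch(transformation_function, summary_function):
    """Return a callable that summarises by group, then transforms the summary."""
    def derived_metric(y_true, y_pred, *, sensitive_features, **metric_params):
        summary = summary_function(
            y_true, y_pred, sensitive_features=sensitive_features, **metric_params)
        return transformation_function(summary)
    return derived_metric


def selection_rate_summary(y_true, y_pred, *, sensitive_features):
    """Hypothetical summary function: mean 0/1 prediction overall and per group."""
    by_group = {}
    for group in set(sensitive_features):
        preds = [p for p, a in zip(y_pred, sensitive_features) if a == group]
        by_group[group] = sum(preds) / len(preds)
    return {"overall": sum(y_pred) / len(y_pred), "by_group": by_group}


def difference(summary):
    """Hypothetical transformation: max minus min across groups."""
    values = summary["by_group"].values()
    return max(values) - min(values)


parity_difference = make_derived_metric_sketch(difference, selection_rate_summary)
value = parity_difference([1, 0, 1, 0], [1, 1, 0, 0],
                          sensitive_features=["a", "a", "b", "b"])
```

Group a is always selected and group b never, so the derived metric evaluates to 1.0.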

fairlearn.metrics.make_metric_group_summary(metric_function, indexed_params=None, name=None)
Make a callable that calculates the group summary of a metric.
 Parameters
metric_function (func) – A metric function with the signature metric_function(y_true, y_pred, **metric_params)
indexed_params – The names of parameters of metric_function that should be split according to sensitive_features in addition to y_true and y_pred. Defaults to None, corresponding to ['sample_weight'].
 Returns
A callable object with the signature metric_group_summary(y_true, y_pred, *, sensitive_features, **metric_params)
 Return type
func

fairlearn.metrics.mean_prediction(y_true, y_pred, sample_weight=None)
Calculate the (weighted) mean prediction.
The true values are ignored, but required as an argument in order to maintain a consistent interface.
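
A weighted mean of predictions can be sketched in plain Python as below. The function mean_prediction_sketch is a hypothetical stand-in that keeps the y_true argument only to mirror the documented signature.

```python
def mean_prediction_sketch(y_true, y_pred, sample_weight=None):
    """Weighted mean of y_pred; y_true is accepted but not used."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_pred)
    weighted_sum = sum(p * w for p, w in zip(y_pred, sample_weight))
    return weighted_sum / sum(sample_weight)


y_true = [0, 1, 1]  # ignored by the calculation
y_pred = [0.2, 0.4, 0.9]
unweighted = mean_prediction_sketch(y_true, y_pred)
weighted = mean_prediction_sketch(y_true, y_pred, sample_weight=[1, 1, 2])
```

The unweighted mean is approximately 0.5; doubling the weight of the last sample raises it to approximately 0.6.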

fairlearn.metrics.ratio_from_summary(summary)
Calculate the ratio between the maximum and minimum metric value across groups.
 Parameters
summary – A group metric summary
 Returns
The ratio between the maximum and the minimum group-level metrics described in summary.
 Return type

fairlearn.metrics.selection_rate(y_true, y_pred, *, pos_label=1, sample_weight=None)
Calculate the fraction of predicted labels matching the ‘good’ outcome.
The argument pos_label specifies the ‘good’ outcome.
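
The effect of pos_label can be shown with a short plain-Python sketch. The function selection_rate_sketch is hypothetical, and its treatment of sample_weight (a weighted fraction) is an assumption, since the text above does not spell it out.

```python
def selection_rate_sketch(y_true, y_pred, *, pos_label=1, sample_weight=None):
    """(Weighted) fraction of predictions equal to pos_label; y_true is unused."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_pred)
    selected = sum(w for p, w in zip(y_pred, sample_weight) if p == pos_label)
    return selected / sum(sample_weight)


y_pred = ["approve", "deny", "approve", "deny", "deny"]
rate = selection_rate_sketch(None, y_pred, pos_label="approve")
default_rate = selection_rate_sketch(None, [1, 0, 0, 1])
```

Two of the five predictions match the 'good' outcome "approve", so rate is 0.4; with the default pos_label=1, the second call gives 0.5.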

fairlearn.metrics.true_negative_rate(y_true, y_pred, sample_weight=None, pos_label=None)
Calculate the true negative rate (also called specificity or selectivity).
 Parameters
y_true (array-like) – The list of true values
y_pred (array-like) – The list of predicted values
sample_weight (array-like, optional) – A list of weights to apply to each sample. By default all samples are weighted equally
pos_label (scalar, optional) – The value to treat as the ‘positive’ label in the samples. If None (the default) then the largest unique value of the y arrays will be used.
 Returns
The true negative rate for the data
 Return type

fairlearn.metrics.true_positive_rate(y_true, y_pred, sample_weight=None, pos_label=None)
Calculate the true positive rate (also called sensitivity, recall, or hit rate).
 Parameters
y_true (array-like) – The list of true values
y_pred (array-like) – The list of predicted values
sample_weight (array-like, optional) – A list of weights to apply to each sample. By default all samples are weighted equally
pos_label (scalar, optional) – The value to treat as the ‘positive’ label in the samples. If None (the default) then the largest unique value of the y arrays will be used.
 Returns
The true positive rate for the data
 Return type