validmind.ProtectedClassesThresholdOptimizer

calculate_fairness_metrics

def calculate_fairness_metrics(test_df, target, y_pred_opt, protected_classes):

calculate_group_metrics

def calculate_group_metrics(test_df, target, y_pred_opt, protected_classes):

get_thresholds_by_group

def get_thresholds_by_group(threshold_optimizer):

initialize_and_fit_optimizer

def initialize_and_fit_optimizer(pipeline, X_train, y_train, protected_classes_df):

make_predictions

def make_predictions(threshold_optimizer, test_df, protected_classes):

plot_thresholds

def plot_thresholds(threshold_optimizer):

ProtectedClassesThresholdOptimizer

@tags('bias_and_fairness')

@tasks('classification', 'regression')

def ProtectedClassesThresholdOptimizer(dataset: validmind.vm_models.VMDataset, pipeline=None, protected_classes=None, X_train=None, y_train=None) → Tuple[Dict[str, Any], matplotlib.figure.Figure, validmind.vm_models.RawData]:

Obtains a classifier by applying group-specific thresholds to the provided estimator.

Purpose

This test aims to improve the fairness of a machine learning model by applying a separate classification threshold to each protected group. This helps mitigate bias and produce more equitable outcomes across demographic groups.

Test Mechanism

The test uses Fairlearn's ThresholdOptimizer to:

Fit an optimizer on the training data, considering protected classes.
Apply optimized thresholds to make predictions on the test data.
Calculate and report various fairness metrics.
Visualize the optimized thresholds.
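
The core idea behind the steps above, one cutoff per protected group instead of a single global cutoff, can be illustrated with a small standard-library sketch. This is not Fairlearn's algorithm and not the code this test runs: the helper names (fit_group_thresholds, predict_with_thresholds), the target_rate parameter, and the toy data are all hypothetical, and the selection rule (match a target selection rate in each group) is a deliberately simplified stand-in for a demographic-parity-style constraint.

```python
from collections import defaultdict


def fit_group_thresholds(scores, groups, target_rate):
    """Hypothetical helper: for each group, pick the score cutoff whose
    selection rate (share of the group scored at or above the cutoff)
    is closest to target_rate."""
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)

    thresholds = {}
    for group, vals in by_group.items():
        candidates = sorted(set(vals))  # only observed scores are tried
        thresholds[group] = min(
            candidates,
            key=lambda c: abs(sum(v >= c for v in vals) / len(vals) - target_rate),
        )
    return thresholds


def predict_with_thresholds(scores, groups, thresholds):
    """Hypothetical helper: label 1 when a score reaches its group's cutoff."""
    return [int(s >= thresholds[g]) for s, g in zip(scores, groups)]


# Toy data (assumed for illustration): model scores and a protected attribute.
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

thresholds = fit_group_thresholds(scores, groups, target_rate=0.5)
y_pred = predict_with_thresholds(scores, groups, thresholds)

print(thresholds)  # one cutoff per group
print(y_pred)      # predictions made with group-specific cutoffs
```

Note that predict_with_thresholds needs the group label for every row, which is the same requirement the test places on protected attribute information at prediction time.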

Signs of High Risk

Large disparities in fairness metrics (e.g., Demographic Parity Ratio, Equalized Odds Ratio) across different protected groups.
Significant differences in False Positive Rates (FPR) or True Positive Rates (TPR) between groups.
Thresholds that vary widely across different protected groups.
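
To make these signals concrete, the following standard-library sketch computes per-group selection rate, TPR, and FPR, then summarizes them as min/max ratios. The ratio definitions follow the usual convention (demographic parity ratio as the smallest group selection rate divided by the largest; equalized odds ratio as the smaller of the TPR ratio and the FPR ratio), which is assumed here to match what the test reports. The helper names and toy data are hypothetical, and the sketch assumes every group contains both positive and negative true labels.

```python
def group_rates(y_true, y_pred, groups):
    """Hypothetical helper: selection rate, TPR, and FPR for each group.
    Assumes each group has at least one positive and one negative true label."""
    stats = {}
    for g in sorted(set(groups)):
        rows = [(t, p) for t, p, gg in zip(y_true, y_pred, groups) if gg == g]
        pos = [p for t, p in rows if t == 1]  # predictions for true positives
        neg = [p for t, p in rows if t == 0]  # predictions for true negatives
        stats[g] = {
            "selection_rate": sum(p for _, p in rows) / len(rows),
            "tpr": sum(pos) / len(pos),
            "fpr": sum(neg) / len(neg),
        }
    return stats


def min_max_ratio(values):
    """Smallest value divided by the largest; NaN when the largest is 0."""
    values = list(values)
    return min(values) / max(values) if max(values) > 0 else float("nan")


# Toy data (assumed for illustration).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

stats = group_rates(y_true, y_pred, groups)
dp_ratio = min_max_ratio(s["selection_rate"] for s in stats.values())
tpr_ratio = min_max_ratio(s["tpr"] for s in stats.values())
fpr_ratio = min_max_ratio(s["fpr"] for s in stats.values())
eo_ratio = min(tpr_ratio, fpr_ratio)

print(stats)
print(dp_ratio, eo_ratio)
```

A ratio of 1.0 means the groups are treated identically on that metric; values well below 1.0, as in this toy data, correspond to the disparities described above.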

Strengths

Provides a post-processing method to improve model fairness without modifying the original model.
Allows for balancing multiple fairness criteria simultaneously.
Offers visual insights into the threshold optimization process.

Limitations

May lead to a decrease in overall model performance while improving fairness.
Requires access to protected attribute information at prediction time.
Effectiveness depends on the chosen fairness constraint and optimization objective.