January 31, 2025
This release includes our new unified versioning scheme for our software, support for thresholds in unit metrics and custom context for test descriptions within the ValidMind Library, and many more enhancements.
Release highlights — 25.01
ValidMind Library (v2.7.7)
Threshold lines in unit metric plots
When logging metrics using log_metric(), you can now include a thresholds dictionary. For example, use thresholds={"target": 0.8, "minimum": 0.6} to define multiple reference levels.
These thresholds automatically appear as horizontal reference lines when you add a Metric Over Time block to the documentation.
The visualization uses a distinct color palette to differentiate between thresholds. It displays only the most recent threshold configuration and includes threshold information in both the chart legend and data table.
Usage example:
from datetime import datetime

log_metric(
    key="AUC Score",
    value=auc,  # a previously computed AUC value
    recorded_at=datetime(2024, 1, 1),
    thresholds={
        "high_risk": 0.6,
        "medium_risk": 0.7,
        "low_risk": 0.8,
    },
)
This enhancement provides immediate visual context for metric values. It helps track metric performance against multiple defined thresholds over time.
Add context to enhance LLM-based text generation for model test results
You can now include contextual information to enhance LLM-based generation of test result descriptions and interpretations. The additional context is specified through environment variables and is incorporated when descriptions are generated, as sketched below.
A new notebook demonstrates adding context to LLM-based descriptions with examples of:
- Setting up the environment
- Initializing datasets and models
- Running tests with and without context
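A minimal sketch of running a test with added context follows. The environment variable names here are assumptions for illustration; the notebook lists the exact variables the library reads.
import os
import validmind as vm

# Assumed variable names -- check the notebook for the exact ones
os.environ["VALIDMIND_LLM_DESCRIPTIONS_CONTEXT_ENABLED"] = "1"
os.environ["VALIDMIND_LLM_DESCRIPTIONS_CONTEXT"] = (
    "This is a retail credit risk scorecard model; interpret results "
    "with a focus on regulatory model risk management."
)

# Run a test as usual; the generated description now incorporates the context
vm.tests.run_test(
    "validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_dataset},  # a previously initialized ValidMind dataset
)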
Document credit risk scorecard models using XGBoost
We’ve introduced enhancements to the ValidMind Library that focus on documenting credit risk scorecard models:
New notebooks: Learn how to document application scorecard models using the library. These notebooks provide a step-by-step guide for loading a demo dataset, preprocessing data, training models, and documenting the model.
You can choose from three different approaches: running individual tests, running a full test suite, or using a single function to document a model.
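As a rough sketch of the single-function approach (the lending_club helper names and exact signatures follow the demo notebooks and may differ slightly in your version):
import validmind as vm
from validmind.datasets.credit_risk import lending_club

# Load, preprocess, and split the demo dataset
df = lending_club.load_data()
df = lending_club.preprocess(df)
df = lending_club.feature_engineering(df)
train_df, test_df = lending_club.split(df)

# ... train an XGBoost application scorecard model on train_df ...

# Wrap the dataset and model for ValidMind
vm_test_ds = vm.init_dataset(dataset=test_df, target_column=lending_club.target_column)
vm_model = vm.init_model(model, input_id="xgb_scorecard")

# Single-function approach: document the model with one call
vm.run_documentation_tests(inputs={"dataset": vm_test_ds, "model": vm_model})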
New tests:
- MutualInformation: Evaluates feature relevance by calculating mutual information scores between features and the target variable.
- ScoreBandDefaultRates: Analyzes default rates and population distribution across credit score bands.
- CalibrationCurve: Assesses calibration by comparing predicted probabilities against observed frequencies.
- ClassifierThresholdOptimization: Visualizes threshold optimization methods for binary classification models.
- ModelParameters: Extracts and displays model parameters for transparency and reproducibility.
- ScoreProbabilityAlignment: Evaluates alignment between credit scores and predicted probabilities.
Modifications have also been made to existing tests to improve functionality and accuracy. The TooManyZeroValues test now includes a row count and applies a percentage threshold for zero values. The split function in lending_club.py has been enhanced to support an optional validation set, allowing for more flexible dataset splitting.
A new utility function, get_demo_test_config, has been added to generate a default test configuration for demo purposes.
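For example, a minimal sketch of using this utility to drive a demo documentation run (the import path is an assumption; it may live in a demo or dataset module instead):
import validmind as vm
from validmind.utils import get_demo_test_config  # import path is an assumption

# Build a default test configuration for the demo scorecard model
test_config = get_demo_test_config()

# Use it to run the documentation tests with demo-friendly settings
# (vm_test_ds and vm_model as initialized in the earlier sketch)
vm.run_documentation_tests(
    inputs={"dataset": vm_test_ds, "model": vm_model},
    config=test_config,
)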
Ongoing monitoring notebook for application scorecard model
Several enhancements to the ValidMind Library focus on ongoing monitoring capabilities:
New notebook: Learn how to use ongoing monitoring with credit risk datasets in this step-by-step guide for the ValidMind Library.
- Use our new metrics for data and model drift, and populate the ongoing monitoring documentation for a scorecard model.
Custom tests: Define and run your own tests using the library:
- ScoreBandDiscriminationMetrics.py: Evaluates discrimination metrics across different score bands.
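A minimal sketch of registering and running a custom test like this one (the folder path and namespace are assumptions; adjust them to wherever you keep the .py file):
import validmind as vm
from validmind.tests import LocalTestProvider

# Point a test provider at the folder containing ScoreBandDiscriminationMetrics.py
provider = LocalTestProvider("custom_tests")
vm.tests.register_test_provider("my_tests", provider)

# Run the custom test through its namespaced test ID
vm.tests.run_test(
    "my_tests.ScoreBandDiscriminationMetrics",
    inputs={"dataset": vm_test_ds, "model": vm_model},
)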
New tests:
- CalibrationCurveDrift: Evaluates changes in probability calibration.
- ClassDiscriminationDrift: Compares classification discrimination metrics.
- ClassImbalanceDrift: Evaluates drift in class distribution.
- ClassificationAccuracyDrift: Compares classification accuracy metrics.
- ConfusionMatrixDrift: Compares confusion matrix metrics.
- CumulativePredictionProbabilitiesDrift: Compares cumulative prediction probability distributions.
- FeatureDrift: Evaluates changes in feature distribution.
- PredictionAcrossEachFeature: Assesses prediction distributions across features.
- PredictionCorrelation: Assesses correlation changes between predictions and features.
- PredictionProbabilitiesHistogramDrift: Compares prediction probability distributions.
- PredictionQuantilesAcrossFeatures: Assesses prediction distributions across features using quantiles.
- ROCCurveDrift: Compares ROC curves.
- ScoreBandsDrift: Analyzes drift in score bands.
- ScorecardHistogramDrift: Compares score distributions.
- TargetPredictionDistributionPlot: Assesses differences in prediction distributions.
We also improved dataset loading, preprocessing, and feature engineering functions with verbosity control for cleaner output.
Jupyter Notebook templates
Want to create your own code samples using ValidMind's templates? We've now made it easier for contributors to submit custom code samples.
Our template generation notebook creates a new file with all the bits and pieces of a standard ValidMind notebook to get you started.
The same functionality is also accessible from our Makefile:
make notebook
Mini-templates
The template generation notebook draws on a number of mini-templates, which you can revise or pull information from manually as needed:
- about-validmind.ipynb: Conceptual overview of ValidMind & prerequisites.
- install-initialize-validmind.ipynb: ValidMind Library installation & initialization instructions.
- next-steps.ipynb: Directions to review the generated documentation within the ValidMind Platform & additional learning resources.
- upgrade-validmind.ipynb: Instructions for comparing & upgrading versions of the ValidMind Library.
ValidMind Platform (v1.29.10)
Edit your dashboards
We’ve streamlined dashboard configuration with dedicated view and edit modes. Click Edit Mode to make changes, then click Done Editing to save and return to view mode:
To prevent any confusion when multiple people are working on the same dashboard, we’ve added some helpful safeguards:
- If someone else makes changes while you’re editing, you’ll get a friendly notification to reload the page
- The system automatically detects if you’re looking at an older version of the dashboard and prompts you to get the latest updates
Optional prompt for risk assessments
Risk assessment generation has been enhanced to allow you to provide an optional prompt before starting text generation. This feature lets you guide the output, ensuring that the generated text aligns more closely with your specific requirements.
Enhancements
ValidMind Library (v2.7.7)
Static descriptions in test results
The TestResult class now exposes pre-populated test descriptions through the doc property, separating them from dynamically generated GenAI descriptions:
- result.doc: contains the original docstring of the test.
- result.description: contains the dynamically generated description.
This enhancement makes it easier to distinguish between ValidMind’s standard test documentation and the dynamic, context-aware descriptions generated for your specific test results.
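For example, after running any test you can compare the two properties (the test ID below is just an example):
import validmind as vm

result = vm.tests.run_test(
    "validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_dataset},  # a previously initialized ValidMind dataset
)

print(result.doc)          # the static docstring published with the test
print(result.description)  # the dynamically generated, context-aware description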
You can browse the full catalog of official test descriptions in our test documentation:
Raw data storage for tests
We added raw data storage across all ValidMind Library tests. Every test now returns a RawData object, allowing post-processing functions to recreate any test output. This feature enhances flexibility and customizability.
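As a hedged sketch of how a post-processing function might use this (the post_process_fn hook and the contents of raw_data vary by test; inspect result.raw_data to see what a given test actually stores):
import validmind as vm

def my_post_process(result):
    # Access the raw data captured during the test run and use it to
    # rebuild or restyle the test's tables and figures as needed
    raw = result.raw_data
    # ... custom post-processing based on `raw` ...
    return result

result = vm.tests.run_test(
    "validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_dataset},
    post_process_fn=my_post_process,  # assumption: hook name as in recent library versions
)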
New print_env function
We've added a new diagnostic print_env() utility function that displays comprehensive information about your running environment. This function is particularly useful when:
- Troubleshooting issues in your code
- Seeking support from the ValidMind team
- Verifying your environment configuration
Usage example:
import validmind
validmind.print_env()
This function outputs key details, such as Python version, installed package versions, and relevant environment variables, making it easier to diagnose issues and share your setup with others.
ValidMind Platform (v1.29.10)
Simplified workflow nodes
Workflows are now easier to read when zoomed out, helped by a larger modal window and simplified nodes:
Zooming in reveals more details:
Hovering over a node highlights all in and out connections, making relationships clearer:
New editor for mathematical formulas
We've replaced the editor plugin used for mathematical equations and formulas. The new plugin provides an improved interface for adding and editing LaTeX expressions in your documentation.
The new editor also includes a real-time preview and common mathematical symbols for easier equation creation.
How to upgrade
ValidMind Platform
To access the latest version of the ValidMind Platform, hard refresh your browser tab:
- Windows: Ctrl + Shift + R, or Ctrl + F5
- MacOS: ⌘ Cmd + Shift + R, or hold down ⌘ Cmd and click the Reload button
ValidMind Library
To upgrade the ValidMind Library:
Within a code cell in your Jupyter Notebook, or in your terminal, run:
%pip install --upgrade validmind
You may need to restart your kernel after upgrading the package for the changes to take effect.