Intro to Assign Scores

The assign_scores() method lets you compute scorer results and add them as new columns in your dataset. It takes a model and one or more metrics as input, computes the specified metrics from the ValidMind scorer library, and appends the results as new columns. Because the computed metrics are per-row values, they give you granular insight into model performance at the individual prediction level.

In this interactive notebook, we demonstrate how to use the assign_scores() method effectively. We'll walk through a complete example using a customer churn dataset, showing how to compute and assign row-level metrics (like Brier Score and Log Loss) that provide detailed performance insights for each prediction. You'll learn how to work with single and multiple scorers, pass custom parameters, and handle different metric types - all while maintaining a clean, organized dataset structure. Currently, assign_scores() supports all metrics available in the validmind.scorer module.

The Power of Row-Level Scoring

Traditional model evaluation workflows often focus on aggregate metrics that provide overall performance summaries. The assign_scores() method complements this by providing granular, row-level insights that help you:

  • Identify Problematic Predictions: Spot individual cases where your model performs poorly
  • Understand Model Behavior: Analyze how model performance varies across different types of inputs
  • Enable Detailed Analysis: Perform targeted investigations on specific subsets of your data
  • Support Model Debugging: Pinpoint exactly where and why your model makes errors

Understanding assign_scores()

The assign_scores() method computes row metrics for a given model-dataset combination and adds the results as new columns to your dataset. Each new column follows the naming convention {model.input_id}_{metric_name}, ensuring clear identification of which model and metric combination generated each score. For example, assigning the Brier Score for a model with input_id "xgboost_model" produces a column named xgboost_model_BrierScore.

Key features:

  • Row-Level Focus: Computes per-prediction metrics rather than aggregate scores
  • Flexible Input: Accepts single metrics or lists of metrics
  • Parameter Support: Allows passing additional parameters to underlying metric implementations
  • Multi-Model Support: Can assign scores from multiple models to the same dataset
  • Type Agnostic: Works with classification, regression, and other model types

This approach provides detailed insights into your model's performance at the individual prediction level, enabling more sophisticated analysis and debugging workflows.

Contents

  • About ValidMind
    • Before you begin
    • New to ValidMind?
  • Install the ValidMind Library
  • Initialize the ValidMind Library
    • Get your code snippet
  • Load the demo dataset
  • Train models for testing
  • Initialize ValidMind objects
  • Assign predictions
  • Using assign_scores()
    • Basic Usage
    • Single Scorer Assignment
    • When a Scorer Returns a Complex Object
    • Multiple Scorers Assignment
    • Passing Parameters to Scorer
  • Advanced assign_scores() Usage
    • Multi-Model scorers
    • Scorer Metrics
    • Custom Scorer
  • Next steps
    • Work with your model documentation
    • Discover more learning resources
  • Upgrade ValidMind

About ValidMind

ValidMind is a suite of tools for managing model risk, including risk associated with AI and statistical models.

You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on model documentation. Together, these products simplify model risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between yourself and model validators.

Before you begin

This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.

If you encounter errors due to missing modules in your Python environment, install the modules with pip install, and then re-run the notebook. For more help, refer to Installing Python Modules.

New to ValidMind?

If you haven't already seen our documentation on the ValidMind Library, we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting models and running tests, as well as find code samples and our Python Library API reference.

To use all the features available in this notebook, you'll need a ValidMind account.

Register with ValidMind

Install the ValidMind Library

To install the library:

%pip install -q validmind

Initialize the ValidMind Library

ValidMind generates a unique code snippet for each registered model to connect with your developer environment. You initialize the ValidMind Library with this code snippet, which ensures that your documentation and tests are uploaded to the correct model when you run the notebook.

Get your code snippet

  1. In a browser, log in to ValidMind.

  2. In the left sidebar, navigate to Model Inventory and click + Register Model.

  3. Enter the model details and click Continue. (Need more help?)

    For example, to register a model for use with this notebook, select:

    • Documentation template: Binary classification
    • Use case: Marketing/Sales - Analytics

    You can fill in other options according to your preference.

  4. Go to Getting Started and click Copy snippet to clipboard.

Next, load your model identifier credentials from an .env file or replace the placeholder with your own code snippet:

# Load your model identifier credentials from an `.env` file

%load_ext dotenv
%dotenv .env

# Or replace with your code snippet

import validmind as vm

vm.init(
    api_host="...",
    api_key="...",
    api_secret="...",
    model="...",
)

Load the demo dataset

In this example, we load a demo dataset to demonstrate the assign_scores functionality with customer churn prediction models.

from validmind.datasets.classification import customer_churn as demo_dataset

print(
    f"Loaded demo dataset with: \n\n\t• Target column: '{demo_dataset.target_column}' \n\t• Class labels: {demo_dataset.class_labels}"
)

raw_df = demo_dataset.load_data()
raw_df.head()

Train models for testing

We'll train two different customer churn models to demonstrate the assign_scores functionality with multiple models.

import xgboost as xgb
from sklearn.ensemble import RandomForestClassifier

# Preprocess the data
train_df, validation_df, test_df = demo_dataset.preprocess(raw_df)

# Prepare training data
x_train = train_df.drop(demo_dataset.target_column, axis=1)
y_train = train_df[demo_dataset.target_column]
x_val = validation_df.drop(demo_dataset.target_column, axis=1)
y_val = validation_df[demo_dataset.target_column]

# Train XGBoost model
xgb_model = xgb.XGBClassifier(early_stopping_rounds=10, random_state=42)
xgb_model.set_params(
    eval_metric=["error", "logloss", "auc"],
)
xgb_model.fit(
    x_train,
    y_train,
    eval_set=[(x_val, y_val)],
    verbose=False,
)

# Train Random Forest model
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(x_train, y_train)

print("Models trained successfully!")
print(f"XGBoost training accuracy: {xgb_model.score(x_train, y_train):.3f}")
print(f"Random Forest training accuracy: {rf_model.score(x_train, y_train):.3f}")

Initialize ValidMind objects

We initialize ValidMind dataset and model objects. The input_id parameter is crucial for the assign_scores functionality as it determines the column naming convention for assigned scores.

# Initialize datasets
vm_train_ds = vm.init_dataset(
    input_id="train_dataset",
    dataset=train_df,
    target_column=demo_dataset.target_column,
)
vm_test_ds = vm.init_dataset(
    input_id="test_dataset",
    dataset=test_df,
    target_column=demo_dataset.target_column,
)

# Initialize models with descriptive input_ids
vm_xgb_model = vm.init_model(model=xgb_model, input_id="xgboost_model")
vm_rf_model = vm.init_model(model=rf_model, input_id="random_forest_model")

print("ValidMind objects initialized successfully!")
print(f"XGBoost model ID: {vm_xgb_model.input_id}")
print(f"Random Forest model ID: {vm_rf_model.input_id}")

Assign predictions

Before we can use assign_scores(), we need to assign predictions to our datasets. This step is essential because many scorers require both actual and predicted values.

# Assign predictions for both models to both datasets
vm_train_ds.assign_predictions(model=vm_xgb_model)
vm_train_ds.assign_predictions(model=vm_rf_model)

vm_test_ds.assign_predictions(model=vm_xgb_model)
vm_test_ds.assign_predictions(model=vm_rf_model)

print("Predictions assigned successfully!")
print(f"Test dataset now has {len(vm_test_ds.df.columns)} columns")

Using assign_scores()

Now we'll explore the various ways to use the assign_scores() method to integrate performance metrics directly into your dataset.

Basic Usage

The assign_scores() method has a simple interface:

dataset.assign_scores(model, metrics, **kwargs)
  • model: A ValidMind model object
  • metrics: Single metric ID or list of metric IDs (can use short names or full IDs)
  • kwargs: Additional parameters passed to the underlying metric implementations

Let's first check what columns we currently have in our test dataset:

print("Current columns in test dataset:")
for i, col in enumerate(vm_test_ds.df.columns, 1):
    print(f"{i:2d}. {col}")

print(f"\nDataset shape: {vm_test_ds.df.shape}")

Single Scorer Assignment

Let's start by assigning a single Scorer - the Brier Score - for our XGBoost model on the test dataset.

# Assign Brier Score for XGBoost model
vm_test_ds.assign_scores(metrics="validmind.scorer.classification.BrierScore", model=vm_xgb_model)

print("After assigning Brier Score:")
print(f"New column added: {vm_test_ds.df.columns}")
# Display the metric values
vm_test_ds.df.head()

When a Scorer Returns a Complex Object

The OutlierScore scorer demonstrates how scorers can return complex objects. It returns a dictionary containing per-row outlier detection results. For each row, it includes:

  • is_outlier: Boolean indicating whether the row is an outlier
  • anomaly_score: Numerical score indicating the degree of outlierness
  • isolation_path: Length of the isolation path in the tree

When assigned to a dataset, these dictionary values are automatically unpacked into separate columns with appropriate prefixes.

# Assign OutlierScore for the XGBoost model
vm_test_ds.assign_scores(metrics="validmind.scorer.classification.OutlierScore", model=vm_xgb_model)

print("After assigning OutlierScore:")
print(f"New columns added: {vm_test_ds.df.columns}")
# Display the metric values
vm_test_ds.df.head()

# OutlierScore can also be assigned without specifying a model
vm_test_ds.assign_scores("validmind.scorer.classification.OutlierScore")

print("After assigning OutlierScore without a model:")
print(f"New columns added: {vm_test_ds.df.columns}")
# Display the metric values
vm_test_ds.df.head()
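
These dictionary values are easy to inspect once assigned. Below is a minimal sketch, assuming the unpacked column names contain the result keys listed above (is_outlier, anomaly_score, isolation_path):

# Sketch: locate and summarize the unpacked OutlierScore columns (column names assumed to contain the result keys)
outlier_cols = [
    col for col in vm_test_ds.df.columns
    if any(key in col for key in ("is_outlier", "anomaly_score", "isolation_path"))
]
print(f"Outlier-related columns: {outlier_cols}")
vm_test_ds.df[outlier_cols].describe()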

Multiple Scorers Assignment

We can assign multiple metrics at once by passing a list of Scorer names. This is more efficient than calling assign_scores() multiple times.

# Assign multiple classification metrics for the Random Forest model
scorers = [
    "validmind.scorer.classification.BrierScore",
    "validmind.scorer.classification.LogLoss",
    "validmind.scorer.classification.Confidence"
]

vm_test_ds.assign_scores(metrics=scorers, model=vm_rf_model)

print("After assigning multiple row metrics for Random Forest:")
rf_columns = [col for col in vm_test_ds.df.columns if 'random_forest_model' in col]
print(f"Random Forest columns: {rf_columns}")

# Display the metric values
vm_test_ds.df[rf_columns].head()

Passing Parameters to Scorer

Many row metrics accept additional parameters that are passed through to the underlying implementations. Let's demonstrate this with the LogLoss metric.

# Assign LogLoss with a custom eps parameter forwarded to the underlying implementation
vm_test_ds.assign_scores(metrics="validmind.scorer.classification.LogLoss", model=vm_xgb_model, eps=1e-16)

# We can also assign with different parameters by calling assign_scores again
# Note: This will overwrite the previous column with the same name
print("LogLoss assigned successfully")

# Let's also assign BrierScore and Confidence
vm_test_ds.assign_scores(metrics=["validmind.scorer.classification.BrierScore", "validmind.scorer.classification.Confidence"], model=vm_xgb_model)

print("BrierScore and Confidence assigned successfully")

# Display current XGBoost metric columns
xgb_columns = [col for col in vm_test_ds.df.columns if 'xgboost_model' in col]
print(f"\nXGBoost model columns: {xgb_columns}")

vm_test_ds.df[xgb_columns].head()

Advanced assign_scores() Usage

Multi-Model scorers

One of the powerful features of assign_scores() is the ability to assign scores from multiple models to the same dataset, enabling detailed model comparison at the prediction level.

# Let's assign a comprehensive set of metrics for both models
comprehensive_metrics = [
    "validmind.scorer.classification.BrierScore",
    "validmind.scorer.classification.LogLoss",
    "validmind.scorer.classification.Confidence",
    "validmind.scorer.classification.Correctness"
]

# Assign for XGBoost model
vm_test_ds.assign_scores(metrics=comprehensive_metrics, model=vm_xgb_model)

# Assign for Random Forest model
vm_test_ds.assign_scores(metrics=comprehensive_metrics, model=vm_rf_model)

print("Row-level metrics assigned for both models!")

Scorer Metrics

This section demonstrates how to assign individual metrics that compute scores per row, rather than aggregate metrics. We'll use several important row metrics:

  • Brier Score: Measures how well calibrated the model's probability predictions are for each individual prediction
  • Log Loss: Evaluates how well the predicted probabilities match the true labels on a per-prediction basis
  • Confidence: Measures the model's confidence in its predictions for each row
  • Correctness: Indicates whether each prediction is correct (1) or incorrect (0)

All these metrics provide granular insights into model performance at the individual prediction level.

# Let's add some individual metrics that compute per-row scores
print("Adding individual metrics...")

# Add Brier Score - measures accuracy of probabilistic predictions per row
vm_test_ds.assign_scores(metrics="validmind.scorer.classification.BrierScore", model=vm_xgb_model)
print("Added Brier Score - lower values indicate better calibrated probabilities")

# Add Log Loss - measures how well the predicted probabilities match true labels per row
vm_test_ds.assign_scores(metrics="validmind.scorer.classification.LogLoss", model=vm_xgb_model)
print("Added Log Loss - lower values indicate better probability estimates")

# Create a comparison summary showing first few rows of individual metrics
print("\nFirst few rows of individual metrics:")
individual_metrics = [col for col in vm_test_ds.df.columns if any(m in col for m in ['BrierScore', 'LogLoss', 'Confidence', 'Correctness'])]
print(vm_test_ds.df[individual_metrics].head())
vm_test_ds.df.head()
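
Because Correctness is a per-row 0/1 indicator, averaging it recovers overall test accuracy for each model. A small sketch, again assuming the {model.input_id}_{metric_name} column naming convention:

# Sketch: aggregate the per-row Correctness column back into overall accuracy per model
for model_id in ("xgboost_model", "random_forest_model"):
    col = f"{model_id}_Correctness"
    if col in vm_test_ds.df.columns:
        print(f"{model_id} test accuracy (mean Correctness): {vm_test_ds.df[col].mean():.3f}")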

Custom Scorer

Let's see how to create your own custom scorers using the @scorer decorator.

The example below demonstrates a scorer that looks at the class balance in the neighborhood around each data point. For each row, it produces a score from 0 to 1, where a score closer to 1 means the classes are evenly balanced in that area of your data. This can help you identify regions where your classes are well mixed versus regions dominated by a single class.

from validmind.scorer import scorer
import numpy as np

@scorer("my_scorers.TestScorer") 
def test_scorer(model, dataset):
    """Custom scorer that calculates class balance ratio.
    
    Args:
        model: Not used in this scorer
        dataset: The dataset to analyze
        
    Returns:
        numpy.ndarray: Array of class balance ratios between 0 and 1,
        where values closer to 1 indicate better class balance in the local neighborhood
    """
    # Get target values
    y = dataset.df[dataset.target_column].values
    
    # Calculate local class balance in sliding windows
    window_size = 100
    balance_scores = []
    
    for i in range(len(y)):
        start_idx = max(0, i - window_size//2)
        end_idx = min(len(y), i + window_size//2)
        window = y[start_idx:end_idx]
        
        # Fraction of positive-class (label 1) values in the window
        class_ratio = np.mean(window)
        # Adjust to be symmetric around 0.5
        balance_score = 1 - abs(0.5 - class_ratio) * 2
        
        balance_scores.append(balance_score)
        
    return np.array(balance_scores)

# Assign the class balance scores to the dataset
vm_test_ds.assign_scores(metrics = "my_scorers.TestScorer", model = vm_xgb_model)
    

Next steps

You can explore the assigned scores right in the notebook as demonstrated above. However, there's even more value in using the ValidMind Platform to work with your model documentation and monitoring.

Work with your model documentation

  1. From the Model Inventory in the ValidMind Platform, go to the model you registered earlier. (Need more help?)

  2. Click and expand the Model Development section.

The scores you've assigned using assign_scores() become part of your model's documentation and can be used in ongoing monitoring workflows. You can view these metrics over time, set up alerts for performance drift, and compare models systematically.

Discover more learning resources

We offer many interactive notebooks to help you work with model scoring and evaluation:

  • Run unit metrics
  • Assign predictions
  • Model comparison workflows

Or, visit our documentation to learn more about ValidMind.

Upgrade ValidMind

After installing ValidMind, you'll want to periodically make sure you are on the latest version to access any new features and other enhancements.

Retrieve the information for the currently installed version of ValidMind:

%pip show validmind

If the version returned is lower than the version indicated in our production open-source code, restart your notebook and run:

%pip install --upgrade validmind

You may need to restart your kernel after upgrading the package for the changes to take effect.
