ValidMind Library

Published October 22, 2025

The ValidMind Library streamlines model development and validation by automating testing. Run tests, log those test results to the ValidMind Platform, and have fully supported drafts of documentation or reporting ready for you to fine-tune.

What is the ValidMind Library?

The ValidMind Library provides a collection of documentation tools and test suites, ranging from tests that describe your dataset to validation tests that probe your models for weak spots and overfit areas.

ValidMind offers two primary methods for automating model documentation:

  • Generate documentation (see "ValidMind for model development"): Through automation, the library extracts metadata from associated datasets and models for you and generates model documentation based on a template. You can also add more documentation and tests manually using the documentation editing capabilities in the ValidMind Platform.

  • Run validation tests (see "ValidMind for model validation"): The library provides a suite of validation tests for common financial services use cases. Where these tests do not cover everything you need, you can extend existing test suites with your own proprietary tests or testing providers.

The ValidMind Library is designed to be model agnostic. If your model is built in Python, the library provides all the standard functionality you may need without requiring you to rewrite any functions.

Key ValidMind concepts
model documentation
A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses.

Within the realm of model risk management, this documentation serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model's application.

validation report
A formal document produced after a model validation process, outlining the findings, assessments, and recommendations related to a specific model's performance, appropriateness, and limitations. Provides a comprehensive review of the model's conceptual framework, data sources and integrity, calibration methods, and performance outcomes.

Within model risk management, the validation report is crucial for ensuring transparency, demonstrating regulatory compliance, and offering actionable insights for model refinement or adjustments.

documentation template
Lays out the structure of model documentation, segmented into sections and sub-sections, and functions as a test suite: it specifies the tests that should be run and how the results should be displayed.

validation report template
Serves as a standardized framework for conducting and documenting model validation activities. It outlines the required sections, recommended analyses, and expected validation tests, ensuring consistency and completeness across validation reports. The template helps guide validators through a systematic review process while promoting comparability and traceability of validation outcomes.

ValidMind templates come with pre-defined sections, similar to test placeholders, including boilerplates and spaces designated for test results, evidence, or findings.

test
A function contained in the library, designed to run a specific quantitative test on the dataset or model. Test results are sent to the ValidMind Platform to generate the model documentation according to the template that is associated with the documentation.

Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.

metrics, custom metrics
Metrics are a subset of tests that do not have thresholds. Custom metrics are functions that you define to evaluate your model or dataset. These functions can be registered via the ValidMind Library to be used with the ValidMind Platform.

In the context of ValidMind's Jupyter Notebooks, metrics and tests can be thought of as interchangeable concepts.
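To make the distinction concrete, here is a minimal sketch in plain Python. It is not the library's implementation: `accuracy` behaves like a metric (it only returns a value), while `minimum_accuracy_check` wraps the same value with a pass/fail threshold. Both function names and the 0.7 default threshold are hypothetical choices made for illustration.

```python
def accuracy(y_true, y_pred):
    """Metric-style function: returns a number, with no pass/fail judgment."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def minimum_accuracy_check(y_true, y_pred, min_threshold=0.7):
    """Threshold-style test: the same value, plus a pass/fail result."""
    score = accuracy(y_true, y_pred)
    return {"score": score, "threshold": min_threshold, "passed": score >= min_threshold}


# Hypothetical labels and predictions: 4 of 5 predictions match.
labels = [1, 0, 1, 1, 0]
predictions = [1, 0, 0, 1, 0]

metric_value = accuracy(labels, predictions)
check_result = minimum_accuracy_check(labels, predictions)
```

The metric leaves interpretation to the reader, whereas the threshold-style check records whether the value met an agreed minimum.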

inputs
Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:
  • model: A single model that has been initialized in ValidMind with vm.init_model(). See the Model Documentation for more information.
  • dataset: A single dataset that has been initialized in ValidMind with vm.init_dataset(). See the Dataset Documentation for more information.
  • models: A list of ValidMind models, usually used when you want to compare multiple models in your custom tests.
  • datasets: A list of ValidMind datasets, usually used when you want to compare multiple datasets in your custom tests. Learn more: Run tests with multiple datasets (/notebooks/how_to/run_tests_that_require_multiple_datasets.ipynb).
parameters
Additional arguments passed when running a ValidMind test to customize its behavior or provide additional context.
outputs
Custom tests can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.
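As a rough illustration of the table output described above, the sketch below returns a list of dictionaries, one per table row. It is written as a plain Python function over plain rows so that it runs on its own. In the library, a custom test would instead receive ValidMind inputs and be registered through the library, which is not shown here. The function name, the `min_count` parameter, and the sample data are all hypothetical.

```python
from typing import Any


def missing_value_table(rows: list[dict[str, Any]], min_count: int = 1) -> list[dict[str, Any]]:
    """Return one table row per column with at least `min_count` missing values."""
    columns = sorted({key for row in rows for key in row})
    table = []
    for column in columns:
        # A value counts as missing if the key is absent or the value is None.
        missing = sum(1 for row in rows if row.get(column) is None)
        if missing >= min_count:
            table.append({"column": column, "missing": missing, "total": len(rows)})
    return table


# Hypothetical sample data: "age" is missing twice, "income" once.
sample_rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 29, "income": None},
    {"age": None, "income": 48000},
]

result = missing_value_table(sample_rows, min_count=2)
```

Here `min_count` plays the role of a test parameter: changing it alters which rows appear in the table without changing the input data.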
test suite
A collection of tests which are run together to generate model documentation end-to-end for specific use cases.

For example, the classifier_full_suite test suite runs tests from the tabular_dataset and classifier test suites to fully document the data and model sections for binary classification model use cases.
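The way one suite can draw on others can be sketched as a small lookup structure. This is a conceptual illustration only, not how the library stores suites. The suite names come from the example above, but the test lists assigned to them here are shortened, illustrative assumptions rather than the real suite contents.

```python
# Illustrative suite definitions only; the real suites contain different, longer test lists.
SUITES = {
    "tabular_dataset": ["DatasetDescription", "MissingValues"],
    "classifier": ["ConfusionMatrix", "ROCCurve"],
    # A composite suite refers to other suites by name instead of listing tests.
    "classifier_full_suite": ["tabular_dataset", "classifier"],
}


def expand_suite(name, suites=SUITES):
    """Flatten a suite into an ordered list of test names, following nested suites."""
    tests = []
    for entry in suites[name]:
        if entry in suites:
            # The entry names another suite, so expand it in place.
            tests.extend(expand_suite(entry, suites))
        else:
            tests.append(entry)
    return tests


full_suite_tests = expand_suite("classifier_full_suite")
```

Expanding the composite suite yields the data tests followed by the model tests, which mirrors how a full suite documents both the data and model sections.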

Quickstart

After you sign up for ValidMind to get access, try our quickstarts for model documentation or validation:

Quickstart for model documentation
Learn the basics of using ValidMind to document models as part of a model development workflow. Set up the ValidMind Library in your environment, and generate a draft of documentation using ValidMind tests for a binary classification model.
Quickstart for model validation
Learn the basics of using ValidMind to validate models as part of a model validation workflow. Set up the ValidMind Library in your environment, and generate a draft of a validation report using ValidMind tests for a binary classification model.

ValidMind for model development

Learn how to use ValidMind for your end-to-end model documentation process based on common model development scenarios with our ValidMind for model development series of four introductory notebooks:

1 — Set up the ValidMind Library

Get to know ValidMind by setting up the ValidMind Library in your own environment, and registering a sample binary classification model in the ValidMind Platform for use with this series of notebooks.

2 — Start the model development process

Learn to run and log tests with a variety of methods and in different situations with the ValidMind Library, then add the results or evidence to your documentation for the sample model you registered.

3 — Integrate custom tests

After you become familiar with the basics of the ValidMind Library, learn how to supplement ValidMind tests with your own and include them as additional evidence in your documentation.

4 — Finalize testing and documentation

Wrap up by learning how to ensure that custom tests are included in your model's documentation template. By the end of this series, you will have a fully documented sample model ready for review.


ValidMind for model validation

Learn how to use ValidMind for your end-to-end model validation process based on common scenarios with our ValidMind for model validation series of four introductory notebooks:

1 — Set up the ValidMind Library for validation

Get to know ValidMind by setting up the ValidMind Library in your own environment, and gaining access as a validator to a sample model in the ValidMind Platform for use with this series of notebooks.

2 — Start the model validation process

Independently verify the data quality tests performed on datasets used to train the dummy champion model using tests from the ValidMind Library, then add the results or evidence to your validation report.

3 — Developing a potential challenger model

After you become familiar with the basics of the ValidMind Library, use it to develop a potential challenger model and run thorough model comparison tests, such as performance, diagnostic, and feature importance tests.

4 — Finalize validation and reporting

Wrap up by learning how to include custom tests and verifying that all tests conducted during model development were run and reported accurately. By the end of this series, you will have a validation report complete with findings ready for review.


Learn how to run tests

The ValidMind Library provides many built-in tests and test suites which make it easy for developers to automate their model documentation. Start by running a pre-made test, then modify it, and finally create your own test:

Run tests & test suites

Run dataset based tests
Use the ValidMind Library's run_test function to run built-in or custom tests that take any dataset or model as input. These tests generate outputs in the form of text, tables, and images that get populated in model documentation.
Implement custom tests
Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.
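The run_test workflow mentioned above might look like the following sketch. It assumes the validmind package is installed, that vm.init(...) has already been called with your own credentials, and that `df` is a pandas DataFrame. The wrapper function name and the "demo_dataset" input ID are hypothetical. The import sits inside the function so that the snippet can be loaded even where the package is not available.

```python
def run_class_imbalance(df, target_column):
    """Wrap a pandas DataFrame as a ValidMind dataset and run one built-in test on it."""
    # Imported inside the function so this sketch loads without the package installed.
    import validmind as vm

    # Assumption: vm.init(...) was called earlier with your own platform credentials.
    vm_dataset = vm.init_dataset(
        dataset=df,
        input_id="demo_dataset",  # hypothetical identifier for this input
        target_column=target_column,
    )

    # Built-in tests are addressed by ID; inputs map input names to ValidMind objects.
    result = vm.tests.run_test(
        "validmind.data_validation.ClassImbalance",
        inputs={"dataset": vm_dataset},
    )
    return result
```

In a notebook, you would typically inspect the returned result and then call its log() method to send it to the ValidMind Platform, assuming a connection has been set up.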

Try the code samples

Our code samples showcase the capabilities of the ValidMind Library. Examples that you can build on and adapt for your own use cases include:

All code samples

Integrate external test providers
Register a custom test provider with the ValidMind Library to run your own tests.
Prompt validation for large language models (LLMs)
Run and document prompt validation tests for a large language model (LLM) specialized in sentiment analysis for financial news.
Document a time series forecasting model
Use the FRED sample dataset to train a simple time series model and document that model with the ValidMind Library.

Work with model documentation

After you have tried out the ValidMind Library, continue working with your model documentation in the ValidMind Platform:

Working with model documentation

Work with test results
Once generated via the ValidMind Library, view and add the test results to your model documentation in the ValidMind Platform.
Work with content blocks
Make edits to your model documents by adding or removing content blocks directly in the online editor.
© Copyright 2025 ValidMind Inc. All Rights Reserved.