How to use ValidMind Library features

Published: February 13, 2026

Browse our range of Jupyter Notebooks demonstrating how to use the core features of the ValidMind Library. Use these how-to notebooks to get familiar with the library's capabilities and apply them to your own use cases.

How-to by topic

  • Testing
  • Data and datasets
  • Metrics
  • Scoring
Customize test result descriptions
15 min
When you run ValidMind tests, test descriptions are automatically generated by an LLM from the test results, the test name, and the static test definitions in the test's docstring. While this metadata offers a valuable high-level overview of each test, the LLM-generated descriptions may not always align with your specific use…
Dataset Column Filters when Running Tests
1 min
To run a test on a dataset while including only certain columns from that dataset, you can use the columns option: pass a dictionary as the test's dataset input instead of passing the dataset object or its input ID directly. This dictionary should have the following keys:
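A minimal sketch of that dictionary is below. The key names `input_id` and `columns`, the test ID, and the input ID are assumptions based on the description above, so check the linked notebook for the exact schema; the `run_test` call itself needs an initialized ValidMind session, so it is shown as a comment.

```python
# Hypothetical sketch of the columns option described above.
# Key names are assumptions; see the notebook for the exact schema.
dataset_input = {
    "input_id": "raw_dataset",             # the dataset's registered input ID
    "columns": ["loan_amount", "income"],  # restrict the test to these columns
}

# With a configured ValidMind session, the dictionary replaces the dataset
# object in the test's inputs:
# import validmind as vm
# vm.tests.run_test(
#     "validmind.data_validation.DescriptiveStatistics",
#     inputs={"dataset": dataset_input},
# )
print(sorted(dataset_input))
```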
Document multiple results for the same test
11 min
Documentation templates facilitate the presentation of multiple unique test results for a single test.
Enable PII detection in tests
9 min
Learn how to enable and configure Personally Identifiable Information (PII) detection when running tests with the ValidMind Library. Choose whether to include PII in generated test descriptions and whether to include it in test results logged to the ValidMind Platform.
Explore test suites
6 min
Explore ValidMind test suites, pre-built collections of related tests used to evaluate specific aspects of your model. Retrieve available test suites and details for tests within a suite to understand their functionality, allowing you to select the appropriate test suites for your use cases.
Explore tests
7 min
Explore the individual out-of-the-box tests available in the ValidMind Library, and identify which tests to run to evaluate different aspects of your model. Browse available tests, view their descriptions, and filter by tags or task type to find tests relevant to your use case.
Implement custom tests
16 min
Custom tests extend the functionality of ValidMind, allowing you to document any model or use case with added flexibility.
Integrate external test providers
15 min
Register a custom test provider with the ValidMind Library to run your own tests.
Run comparison tests
13 min
Use the ValidMind Library's run_test function to run built-in or custom tests that take any combination of datasets or models as inputs. Comparison tests let you run an existing test over different groups of inputs, producing a single consolidated list of outputs (text, tables, and images) that populate your model documentation.
Run dataset based tests
11 min
Use the ValidMind Library's run_test function to run built-in or custom tests that take any dataset or model as input. These tests generate outputs in the form of text, tables, and images that populate your model documentation.
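As a rough sketch of the pattern described above: the test ID and input ID below are illustrative only, and the commented-out calls require an initialized ValidMind session.

```python
# Illustrative test ID; inputs can name a registered input by its string ID
# or pass a VMDataset object returned by vm.init_dataset.
test_id = "validmind.data_validation.ClassImbalance"
inputs = {"dataset": "raw_dataset"}

# With a configured session, the test runs and its result can be logged
# to the matching section of your model documentation:
# import validmind as vm
# result = vm.tests.run_test(test_id, inputs=inputs)
# result.log()

print(test_id.rsplit(".", 1)[-1])
```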
Run documentation tests with custom configurations
11 min
When running documentation tests, you can configure inputs and parameters for individual tests by passing a config as a parameter.
Run individual documentation sections
9 min
For targeted testing, you can run tests on individual sections or specific groups of sections in your model documentation.
Run tests with multiple datasets
10 min
To support tests that require more than one dataset, ValidMind provides a mechanism for passing multiple datasets as inputs.
Understand and utilize RawData in ValidMind tests
6 min
Test functions in ValidMind can return a special object called RawData, which holds intermediate or unprocessed data produced somewhere in the test logic but not returned as part of the test's visible output, such as in tables or figures.
Configure dataset features
8 min
When initializing a ValidMind dataset object, you can pass in a list of features to use instead of utilizing all dataset columns when running tests.
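The idea can be sketched as follows. The column names are toy values, and the `feature_columns` and `target_column` parameter names are assumptions drawn from the notebook's description, so the initialization call is shown only as a comment.

```python
# Toy column inventory; in practice these come from a pandas DataFrame.
all_columns = ["age", "income", "zip_code", "default"]
feature_columns = ["age", "income"]   # features the tests should use
target_column = "default"

# Columns such as zip_code are simply ignored by tests when a feature list
# is supplied at initialization (requires a configured session):
# vm_dataset = vm.init_dataset(
#     dataset=df, input_id="train_dataset",
#     feature_columns=feature_columns, target_column=target_column,
# )
ignored = [c for c in all_columns if c not in feature_columns + [target_column]]
print(ignored)
```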
Introduction to ValidMind Dataset and Model Objects
14 min
When writing custom tests, it is essential to be aware of the interfaces of the ValidMind Dataset and ValidMind Model, which are used as input arguments.
Load dataset predictions
13 min
To enable tests to make use of predictions, you can load predictions in ValidMind dataset objects in multiple different ways.
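Two common patterns suggested by the description above are sketched here as comments, since both need initialized ValidMind dataset and model objects; the parameter names are assumptions, so verify them against the notebook.

```python
# 1) Compute predictions from the model at assignment time:
# vm_dataset.assign_predictions(model=vm_model)
#
# 2) Attach precomputed predictions (parameter name assumed):
# vm_dataset.assign_predictions(model=vm_model, prediction_values=preds)

# Whichever route you take, a precomputed prediction vector must align
# row-for-row with the dataset it is attached to:
rows, preds = 4, [0, 1, 1, 0]
print(len(preds) == rows)
```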
Intro to Unit Metrics
12 min
To turn complex evidence into actionable insights, you can run a unit metric as a single-value measure to quantify and monitor risks throughout a model's lifecycle.
Log metrics over time
13 min
Learn how to track and visualize the temporal evolution of key model performance metrics with ValidMind.
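A minimal sketch of logging a metric series over time: the metric key, values, and the `vm.log_metric` signature are assumptions based on the notebook's description, so the logging call is left as a comment.

```python
from datetime import datetime, timezone

# Illustrative series of monthly AUC readings. With a configured session,
# each point would be sent individually (signature assumed):
# vm.log_metric(key="auc", value=point["value"], recorded_at=point["at"])
history = [
    {"at": datetime(2025, 1, 31, tzinfo=timezone.utc), "value": 0.82},
    {"at": datetime(2025, 2, 28, tzinfo=timezone.utc), "value": 0.79},
]

# The most recent reading is what a monitoring dashboard would surface:
latest = max(history, key=lambda p: p["at"])
print(latest["value"])
```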
Intro to Assign Scores
10 min
The assign_scores() method computes metric scores and adds them as new columns in your dataset: it takes a model and one or more metrics as input, computes the specified metrics from the ValidMind scorer library, and appends the results. The computed metrics provide per-row values, giving you granular…
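The per-row idea can be illustrated with a hand-rolled Brier score. The assign_scores() call itself needs ValidMind dataset and model objects, and the metric name shown is an assumption, so it appears only as a comment; the executable part just demonstrates "one value per row".

```python
# Conceptual counterpart of assign_scores(): with a configured session,
# something like the following adds a per-row score column (metric name
# is an assumption):
# vm_dataset.assign_scores(vm_model, ["brier_score"])

# Each metric yields one value per row, stored as a new column. Here,
# the per-row squared error between probability and label:
y_true = [0, 1, 1]
y_prob = [0.2, 0.7, 0.4]
brier_per_row = [(p - t) ** 2 for t, p in zip(y_true, y_prob)]
print(brier_per_row)
```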