Enable PII detection in tests

Learn how to enable and configure Personally Identifiable Information (PII) detection when running tests with the ValidMind Library. Control whether PII is included in generated test descriptions and whether it is included in test results logged to the ValidMind Platform.

About ValidMind

ValidMind is a suite of tools for managing model risk, including risk associated with AI and statistical models.

You use the ValidMind Library to automate documentation and validation tests, and then use the ValidMind Platform to collaborate on model documentation. Together, these products simplify model risk management, facilitate compliance with regulations and institutional standards, and enhance collaboration between you and model validators.

Before you begin

This notebook assumes you have basic familiarity with Python, including an understanding of how functions work. If you are new to Python, you can still run the notebook but we recommend further familiarizing yourself with the language.

If you encounter errors due to missing modules in your Python environment, install the modules with pip install, and then re-run the notebook. For more help, refer to Installing Python Modules.

New to ValidMind?

If you haven't already seen our documentation on the ValidMind Library, we recommend you begin by exploring the available resources in this section. There, you can learn more about documenting models and running tests, as well as find code samples and our Python Library API reference.

For access to all features available in this notebook, you'll need access to a ValidMind account.

Register with ValidMind

Key concepts

Model documentation: A structured and detailed record pertaining to a model, encompassing key components such as its underlying assumptions, methodologies, data sources, inputs, performance metrics, evaluations, limitations, and intended uses. It serves to ensure transparency, adherence to regulatory requirements, and a clear understanding of potential risks associated with the model’s application.

Documentation template: Functions as a test suite and lays out the structure of model documentation, segmented into various sections and sub-sections. Documentation templates define the structure of your model documentation, specifying the tests that should be run, and how the results should be displayed.

Tests: A function contained in the ValidMind Library, designed to run a specific quantitative test on the dataset or model. Tests are the building blocks of ValidMind, used to evaluate and document models and datasets, and can be run individually or as part of a suite defined by your model documentation template.

Metrics: A subset of tests that do not have thresholds. In the context of this notebook, metrics and tests can be thought of as interchangeable concepts.

Custom metrics: Functions that you define to evaluate your model or dataset. These functions can be registered with the ValidMind Library for use in the ValidMind Platform.

Inputs: Objects to be evaluated and documented in the ValidMind Library. They can be any of the following:

  • model: A single model that has been initialized in ValidMind with vm.init_model().
  • dataset: Single dataset that has been initialized in ValidMind with vm.init_dataset().
  • models: A list of ValidMind models - usually this is used when you want to compare multiple models in your custom metric.
  • datasets: A list of ValidMind datasets - usually this is used when you want to compare multiple datasets in your custom metric. (Learn more: Run tests with multiple datasets)

Parameters: Additional arguments that can be passed when running a ValidMind test, used to pass additional information to a metric, customize its behavior, or provide additional context.

Outputs: Custom metrics can return elements like tables or plots. Tables may be a list of dictionaries (each representing a row) or a pandas DataFrame. Plots may be matplotlib or plotly figures.
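As a minimal sketch of one of the output shapes described above (the function name here is illustrative, not part of the ValidMind API), a custom metric might return a table as a list of dictionaries, one per row:

```python
# Illustrative only: a custom metric returning a table as a list of
# dictionaries, one dictionary per row. A pandas DataFrame is the other
# accepted table shape; plots may be matplotlib or plotly figures.
def example_metric():
    return [
        {"metric": "accuracy", "value": 0.91},
        {"metric": "f1_score", "value": 0.88},
    ]

rows = example_metric()
print(rows[0]["metric"])  # each row is a plain dict
```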

Test suites: Collections of tests designed to run together to automate and generate model documentation end-to-end for specific use-cases.

Example: the classifier_full_suite test suite runs tests from the tabular_dataset and classifier test suites to fully document the data and model sections for binary classification model use-cases.

Setting up

Install the ValidMind Library with PII detection

Recommended Python versions

Python 3.8 <= x <= 3.11
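A quick way to confirm your interpreter falls within that range (a generic check, not something the ValidMind Library requires you to run):

```python
import sys

# Check that the running interpreter is within the recommended range (3.8-3.11).
version = sys.version_info[:2]
supported = (3, 8) <= version <= (3, 11)
print(f"Python {version[0]}.{version[1]} supported: {supported}")
```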

To use PII detection powered by Microsoft Presidio, install the library with the explicit [pii-detection] extra specifier:

%pip install -q "validmind[pii-detection]"

Initialize the ValidMind Library

ValidMind generates a unique code snippet for each registered model to connect with your developer environment. You initialize the ValidMind Library with this code snippet, which ensures that your documentation and tests are uploaded to the correct model when you run the notebook.

Get your code snippet

  1. In a browser, log in to ValidMind.

  2. In the left sidebar, navigate to Inventory and click + Register Model.

  3. Enter the model details and click Continue. (Need more help?)

  4. Go to Getting Started and click Copy snippet to clipboard.

Next, load your model identifier credentials from an .env file or replace the placeholder with your own code snippet:

# Load your model identifier credentials from an `.env` file

%load_ext dotenv
%dotenv .env

# Or replace with your code snippet

import validmind as vm

vm.init(
    # api_host="...",
    # api_key="...",
    # api_secret="...",
    # model="...",
)
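The %load_ext dotenv and %dotenv magics above only work inside IPython/Jupyter. If you run this code outside a notebook, a minimal stdlib-only substitute could look like the sketch below (illustrative only; the python-dotenv package provides a more robust load_dotenv() function):

```python
import os

def load_env_file(path=".env"):
    """Minimal KEY=VALUE loader for .env-style files (illustrative only)."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            os.environ[key.strip()] = value.strip()
```

After loading the variables, call vm.init() as shown above.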

Using PII detection

Create a custom test that outputs PII

To demonstrate the feature, we'll need a test that outputs PII. First we'll create a custom test that returns a small table whose columns contain PII (names, email addresses, phone numbers); the LLM-generated description derived from that table would therefore also contain PII.

This output mirrors the structure used in other custom test notebooks and exercises both the table and description PII detection paths. If structured detection is unavailable, the library falls back to token-level text scans when possible.

import pandas as pd

from validmind import test

@test("pii_demo.PIIDetection")
def pii_custom_test():
    """A custom test that returns demo PII.
    This default test description will display when PII is not sent to the LLM to generate test descriptions based on test result data."""
    return pd.DataFrame(
        {
            "name": ["Jane Smith", "John Doe", "Alice Johnson"],
            "email": [
                "jane.smith@bank.example",
                "john.doe@company.example",
                "alice.johnson@service.example",
            ],
            "phone": ["(212) 555-9876", "(415) 555-1234", "(646) 555-5678"],
        }
    )
Want to learn more about custom tests?

Check out our extended introduction to custom tests — Implement custom tests

Run test under different PII detection modes

Next, let's import the run_test function from the validmind.tests module and wrap our custom test in a helper, run_pii_test(), that catches exceptions so we can observe blocking behavior when PII is present:

import os
from validmind.tests import run_test

# Run the test, tagging the result with a unique `result_id`
def run_pii_test(result_id=""):
    try:
        test_name = f"pii_demo.PIIDetection:{result_id}"
        result = run_test(test_name)

        # Check whether the test description was generated by the LLM
        if not result._was_description_generated:
            print("PII detected: LLM-generated test description skipped")
        else:
            print("No PII detected or detection disabled: Test description generated by LLM")

        # Try logging test results to the ValidMind Platform
        result.log()
        print("No PII detected or detection disabled: Test results logged to the ValidMind Platform")
    except Exception:
        print("PII detected: Test results not logged to the ValidMind Platform")

We'll then switch the VALIDMIND_PII_DETECTION environment variable across modes in the examples below.

Note that because this custom test does not exist in your model's default documentation template, you'll see output indicating that no test-driven block currently exists in your model's documentation for that test ID.

That's expected: results from custom tests must be manually added to your documentation within the ValidMind Platform, or added to your documentation template.

disabled

When detection is set to disabled, tests run and generate test descriptions. Logging tests with .log() will also send test descriptions and test results to the ValidMind Platform as usual:

print("\n=== Mode: disabled ===")
os.environ["VALIDMIND_PII_DETECTION"] = "disabled"

# Run test and tag result with unique ID `disabled`
run_pii_test("disabled")

test_results

When detection is set for test_results, tests run and generate test descriptions for review in your environment, but logging tests will not send descriptions or test results to the ValidMind Platform:

print("\n=== Mode: test_results ===")
os.environ["VALIDMIND_PII_DETECTION"] = "test_results"

# Run test and tag result with unique ID `results_blocked`
run_pii_test("results_blocked")

test_descriptions

When detection is set for test_descriptions, tests run but will not generate test descriptions, and logging tests will not send descriptions but will send test results to the ValidMind Platform:

print("\n=== Mode: test_descriptions ===")
os.environ["VALIDMIND_PII_DETECTION"] = "test_descriptions"

# Run test and tag result with unique ID `desc_blocked`
run_pii_test("desc_blocked")

all

When detection is set to all, tests run but will not generate test descriptions, and logging tests will not send descriptions or test results to the ValidMind Platform:

print("\n=== Mode: all ===")
os.environ["VALIDMIND_PII_DETECTION"] = "all"

# Run test and tag result with unique ID `all_blocked`
run_pii_test("all_blocked")
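To recap, the four modes differ in which actions are blocked when PII is detected. The mapping below is a summary for reference only, not an object the ValidMind Library exposes:

```python
# Summary of VALIDMIND_PII_DETECTION behavior when PII is detected.
# Illustrative reference only; not part of the ValidMind API.
# Each mode maps to: (LLM description generated?, results logged to platform?)
PII_MODES = {
    "disabled": (True, True),
    "test_results": (True, False),
    "test_descriptions": (False, True),
    "all": (False, False),
}

for mode, (desc_ok, log_ok) in PII_MODES.items():
    print(f"{mode:17s} descriptions={desc_ok} logging={log_ok}")
```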

Override detection

You can override blocking by passing unsafe=True to result.log(), but this is not recommended outside controlled workflows.

To demonstrate, let's rerun our custom test with some override scenarios.

Override test result logging

First, let's rerun our custom test with detection set to all, which will send the test results but not the test descriptions to the ValidMind Platform:

print("\n=== Mode: all & unsafe=True ===")
os.environ["VALIDMIND_PII_DETECTION"] = "all"

# Run test and tag result with unique ID `override_results`
try:
    result = run_test("pii_demo.PIIDetection:override_results")

    # Check if the test description was generated by LLM
    if not result._was_description_generated:
        print("PII detected: LLM-generated test description skipped")
    else:
        print("No PII detected or detection disabled: Test description generated by LLM")

    # Try logging test results to the ValidMind Platform
    result.log(unsafe=True)
    print("No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform")
except Exception as e:
    print("PII detected: Test results not logged to the ValidMind Platform")

Override test descriptions and test result logging

To send both the test descriptions and test results via override, set the VALIDMIND_PII_DETECTION environment variable to test_results while including the override flag:

print("\n=== Mode: test_results & unsafe=True ===")
os.environ["VALIDMIND_PII_DETECTION"] = "test_results"

# Run test and tag result with unique ID `override_both`
try:
    result = run_test("pii_demo.PIIDetection:override_both")

    # Check if the test description was generated by LLM
    if not result._was_description_generated:
        print("PII detected: LLM-generated test description skipped")
    else:
        print("No PII detected, detection disabled, or override set: Test description generated by LLM")

    # Try logging test results to the ValidMind Platform
    result.log(unsafe=True)
    print("No PII detected, detection disabled, or override set: Test results logged to the ValidMind Platform")
except Exception as e:
    print("PII detected: Test results not logged to the ValidMind Platform")

Review logged test results

Now let's take a look at the results that were logged to the ValidMind Platform:

  1. From the Inventory in the ValidMind Platform, go to the model you registered earlier.

  2. In the left sidebar that appears for your model, click Documentation under Documents.

  3. Click on any section heading to expand that section to add a new test-driven block (Need more help?).

  4. Under TEST-DRIVEN in the sidebar, click Custom.

  5. Confirm that you're able to insert the following logged results:

    • pii_demo.PIIDetection:disabled
    • pii_demo.PIIDetection:desc_blocked
    • pii_demo.PIIDetection:override_results
    • pii_demo.PIIDetection:override_both

Troubleshooting

Learn more

We offer many interactive notebooks to help you document models:

  • Run tests & test suites
  • Code samples

Or, visit our documentation to learn more about ValidMind.
