• Documentation
    • About ​ValidMind
    • Get Started
    • Guides
    • Support
    • Releases

    • Python Library
    • ValidMind Library

    • ValidMind Academy
    • Training Courses
  • Log In
    • Public Internet
    • ValidMind Platform · US1
    • ValidMind Platform · CA1

    • Private Link
    • Virtual Private ValidMind (VPV)

    • Which login should I use?
  1. tests
  2. data_validation
  3. TimeSeriesMissingValues
  • ValidMind Library Python API

  • Python API
  • 2.8.25
  • init
  • init_dataset
  • init_model
  • init_r_model
  • get_test_suite
  • log_metric
  • preview_template
  • print_env
  • reload
  • run_documentation_tests
  • run_test_suite
  • tags
  • tasks
  • test
  • log_text
  • experimental_agent
  • RawData
    • RawData
    • inspect
    • serialize

  • Submodules
  • __version__
  • datasets
    • classification
      • customer_churn
      • taiwan_credit
    • credit_risk
      • lending_club
      • lending_club_bias
    • nlp
      • cnn_dailymail
      • twitter_covid_19
    • regression
      • fred
      • lending_club
  • errors
  • test_suites
    • classifier
    • cluster
    • embeddings
    • llm
    • nlp
    • parameters_optimization
    • regression
    • statsmodels_timeseries
    • summarization
    • tabular_datasets
    • text_data
    • time_series
  • tests
    • data_validation
      • ACFandPACFPlot
      • ADF
      • AutoAR
      • AutoMA
      • AutoStationarity
      • BivariateScatterPlots
      • BoxPierce
      • ChiSquaredFeaturesTable
      • ClassImbalance
      • CommonWords
      • DatasetDescription
      • DatasetSplit
      • DescriptiveStatistics
      • DickeyFullerGLS
      • Duplicates
      • EngleGrangerCoint
      • FeatureTargetCorrelationPlot
      • Hashtags
      • HighCardinality
      • HighPearsonCorrelation
      • IQROutliersBarPlot
      • IQROutliersTable
      • IsolationForestOutliers
      • JarqueBera
      • KPSS
      • LJungBox
      • LaggedCorrelationHeatmap
      • LanguageDetection
      • Mentions
      • MissingValues
      • MissingValuesBarPlot
      • MutualInformation
      • PearsonCorrelationMatrix
      • PhillipsPerronArch
      • PolarityAndSubjectivity
      • ProtectedClassesCombination
      • ProtectedClassesDescription
      • ProtectedClassesDisparity
      • ProtectedClassesThresholdOptimizer
      • Punctuations
      • RollingStatsPlot
      • RunsTest
      • ScatterPlot
      • ScoreBandDefaultRates
      • SeasonalDecompose
      • Sentiment
      • ShapiroWilk
      • Skewness
      • SpreadPlot
      • StopWords
      • TabularCategoricalBarPlots
      • TabularDateTimeHistograms
      • TabularDescriptionTables
      • TabularNumericalHistograms
      • TargetRateBarPlots
      • TextDescription
      • TimeSeriesDescription
      • TimeSeriesDescriptiveStatistics
      • TimeSeriesFrequency
      • TimeSeriesHistogram
      • TimeSeriesLinePlot
      • TimeSeriesMissingValues
      • TimeSeriesOutliers
      • TooManyZeroValues
      • Toxicity
      • UniqueRows
      • WOEBinPlots
      • WOEBinTable
      • ZivotAndrewsArch
      • nlp
    • model_validation
      • AdjustedMutualInformation
      • AdjustedRandIndex
      • AutoARIMA
      • BertScore
      • BleuScore
      • CalibrationCurve
      • ClassifierPerformance
      • ClassifierThresholdOptimization
      • ClusterCosineSimilarity
      • ClusterPerformanceMetrics
      • ClusterSizeDistribution
      • CompletenessScore
      • ConfusionMatrix
      • ContextualRecall
      • CumulativePredictionProbabilities
      • DurbinWatsonTest
      • FeatureImportance
      • FeaturesAUC
      • FowlkesMallowsScore
      • GINITable
      • HomogeneityScore
      • HyperParametersTuning
      • KMeansClustersOptimization
      • KolmogorovSmirnov
      • Lilliefors
      • MeteorScore
      • MinimumAccuracy
      • MinimumF1Score
      • MinimumROCAUCScore
      • ModelMetadata
      • ModelParameters
      • ModelPredictionResiduals
      • ModelsPerformanceComparison
      • OverfitDiagnosis
      • PermutationFeatureImportance
      • PopulationStabilityIndex
      • PrecisionRecallCurve
      • PredictionProbabilitiesHistogram
      • ROCCurve
      • RegardScore
      • RegressionCoeffs
      • RegressionErrors
      • RegressionErrorsComparison
      • RegressionFeatureSignificance
      • RegressionModelForecastPlot
      • RegressionModelForecastPlotLevels
      • RegressionModelSensitivityPlot
      • RegressionModelSummary
      • RegressionPerformance
      • RegressionPermutationFeatureImportance
      • RegressionR2Square
      • RegressionR2SquareComparison
      • RegressionResidualsPlot
      • RobustnessDiagnosis
      • RougeScore
      • SHAPGlobalImportance
      • ScoreProbabilityAlignment
      • ScorecardHistogram
      • SilhouettePlot
      • TimeSeriesPredictionWithCI
      • TimeSeriesPredictionsPlot
      • TimeSeriesR2SquareBySegments
      • TokenDisparity
      • ToxicityScore
      • TrainingTestDegradation
      • VMeasure
      • WeakspotsDiagnosis
      • sklearn
      • statsmodels
      • statsutils
    • prompt_validation
      • Bias
      • Clarity
      • Conciseness
      • Delimitation
      • NegativeInstruction
      • Robustness
      • Specificity
      • ai_powered_test
  • unit_metrics
  • vm_models

On this page

  • TimeSeriesMissingValues
    • Purpose
    • Test Mechanism
    • Signs of High Risk
    • Strengths
    • Limitations
  • Edit this page
  • Report an issue
  1. tests
  2. data_validation
  3. TimeSeriesMissingValues

validmind.TimeSeriesMissingValues

TimeSeriesMissingValues

@tags('time_series_data')

@tasks('regression')

defTimeSeriesMissingValues(dataset:validmind.vm_models.VMDataset,min_threshold:int=1):

Validates time-series data quality by confirming the count of missing values is below a certain threshold.

Purpose

This test is designed to validate the quality of a historical time-series dataset by verifying that the number of missing values is below a specified threshold. As time-series models greatly depend on the continuity and temporality of data points, missing values could compromise the model's performance. Consequently, this test aims to ensure data quality and readiness for the machine learning model, safeguarding its predictive capacity.

Test Mechanism

The test method commences by validating if the dataset has a datetime index; if not, an error is raised. It establishes a lower limit threshold for missing values and performs a missing values check on each column of the dataset. An object for the test result is created stating whether the number of missing values is within the specified threshold. Additionally, the test calculates the percentage of missing values alongside the raw count.

Signs of High Risk

  • The number of missing values in any column of the dataset surpasses the threshold, marking a failure and a high-risk scenario. The reasons could range from incomplete data collection, faulty sensors to data preprocessing errors.

Strengths

  • Effectively identifies missing values which could adversely affect the model’s performance.
  • Applicable and customizable through the threshold parameter across different data sets.
  • Goes beyond raw numbers by calculating the percentage of missing values, offering a more relative understanding of data scarcity.

Limitations

  • Although it identifies missing values, the test does not provide solutions to handle them.
  • The test demands that the dataset should have a datetime index, hence limiting its use only to time series analysis.
  • The test's sensitivity to the 'min_threshold' parameter may raise false alarms if set too strictly or may overlook problematic data if set too loosely.
  • Solely focuses on the 'missingness' of the data and might fall short in addressing other aspects of data quality.
TimeSeriesLinePlot
TimeSeriesOutliers
  • ValidMind Logo
    ©
    Copyright 2025 ValidMind Inc.
    All Rights Reserved.
    Cookie preferences
    Legal
  • Get started
    • Model development
    • Model validation
    • Setup & admin
  • Guides
    • Access
    • Configuration
    • Model inventory
    • Model documentation
    • Model validation
    • Model workflows
    • Reporting
    • Monitoring
    • Attestation
  • Library
    • For developers
    • For validators
    • Code samples
    • API Reference
  • Training
    • Learning paths
    • Courses
    • Videos
  • Support
    • Troubleshooting
    • FAQ
    • Get help
  • Community
    • Slack
    • GitHub
    • Blog
  • Edit this page
  • Report an issue