• Documentation
    • About ​ValidMind
    • Get Started
    • Guides
    • Support
    • Releases

    • ValidMind Library
    • Python API
    • Public REST API

    • Training Courses
  • Log In
  1. Test descriptions
  2. Data Validation
  3. IQROutliersBarPlot
  • ValidMind Library
  • Supported models

  • Quickstart
  • Quickstart for model documentation
  • Quickstart for model validation
  • Install and initialize ValidMind Library
  • Store model credentials in .env files

  • Model Development
  • 1 — Set up ValidMind Library
  • 2 — Start model development process
  • 3 — Integrate custom tests
  • 4 — Finalize testing & documentation

  • Model Validation
  • 1 — Set up ValidMind Library for validation
  • 2 — Start model validation process
  • 3 — Developing a challenger model
  • 4 — Finalize validation & reporting

  • Model Testing
  • Run tests & test suites
    • Intro to Assign Scores
    • Configure dataset features
    • Customize test result descriptions
    • Document multiple results for the same test
    • Enable PII detection in tests
    • Explore test suites
    • Explore tests
    • Dataset Column Filters when Running Tests
    • Load dataset predictions
    • Log metrics over time
    • Run individual documentation sections
    • Run documentation tests with custom configurations
    • Run tests with multiple datasets
    • Intro to Unit Metrics
    • Understand and utilize RawData in ValidMind tests
    • Introduction to ValidMind Dataset and Model Objects
    • Run Tests
      • Run dataset based tests
      • Run comparison tests
  • Test descriptions
    • Data Validation
      • ACFandPACFPlot
      • ADF
      • AutoAR
      • AutoMA
      • AutoStationarity
      • BivariateScatterPlots
      • BoxPierce
      • ChiSquaredFeaturesTable
      • ClassImbalance
      • DatasetDescription
      • DatasetSplit
      • DescriptiveStatistics
      • DickeyFullerGLS
      • Duplicates
      • EngleGrangerCoint
      • FeatureTargetCorrelationPlot
      • HighCardinality
      • HighPearsonCorrelation
      • IQROutliersBarPlot
      • IQROutliersTable
      • IsolationForestOutliers
      • JarqueBera
      • KPSS
      • LaggedCorrelationHeatmap
      • LJungBox
      • MissingValues
      • MissingValuesBarPlot
      • MutualInformation
      • PearsonCorrelationMatrix
      • PhillipsPerronArch
      • ProtectedClassesCombination
      • ProtectedClassesDescription
      • ProtectedClassesDisparity
      • ProtectedClassesThresholdOptimizer
      • RollingStatsPlot
      • RunsTest
      • ScatterPlot
      • ScoreBandDefaultRates
      • SeasonalDecompose
      • ShapiroWilk
      • Skewness
      • SpreadPlot
      • TabularCategoricalBarPlots
      • TabularDateTimeHistograms
      • TabularDescriptionTables
      • TabularNumericalHistograms
      • TargetRateBarPlots
      • TimeSeriesDescription
      • TimeSeriesDescriptiveStatistics
      • TimeSeriesFrequency
      • TimeSeriesHistogram
      • TimeSeriesLinePlot
      • TimeSeriesMissingValues
      • TimeSeriesOutliers
      • TooManyZeroValues
      • UniqueRows
      • WOEBinPlots
      • WOEBinTable
      • ZivotAndrewsArch
      • Nlp
        • CommonWords
        • Hashtags
        • LanguageDetection
        • Mentions
        • PolarityAndSubjectivity
        • Punctuations
        • Sentiment
        • StopWords
        • TextDescription
        • Toxicity
    • Model Validation
      • BertScore
      • BleuScore
      • ClusterSizeDistribution
      • ContextualRecall
      • FeaturesAUC
      • MeteorScore
      • ModelMetadata
      • ModelPredictionResiduals
      • RegardScore
      • RegressionResidualsPlot
      • RougeScore
      • TimeSeriesPredictionsPlot
      • TimeSeriesPredictionWithCI
      • TimeSeriesR2SquareBySegments
      • TokenDisparity
      • ToxicityScore
      • Embeddings
        • ClusterDistribution
        • CosineSimilarityComparison
        • CosineSimilarityDistribution
        • CosineSimilarityHeatmap
        • DescriptiveAnalytics
        • EmbeddingsVisualization2D
        • EuclideanDistanceComparison
        • EuclideanDistanceHeatmap
        • PCAComponentsPairwisePlots
        • StabilityAnalysisKeyword
        • StabilityAnalysisRandomNoise
        • StabilityAnalysisSynonyms
        • StabilityAnalysisTranslation
        • TSNEComponentsPairwisePlots
      • Ragas
        • AnswerCorrectness
        • AspectCritic
        • ContextEntityRecall
        • ContextPrecision
        • ContextPrecisionWithoutReference
        • ContextRecall
        • Faithfulness
        • NoiseSensitivity
        • ResponseRelevancy
        • SemanticSimilarity
      • Sklearn
        • AdjustedMutualInformation
        • AdjustedRandIndex
        • CalibrationCurve
        • ClassifierPerformance
        • ClassifierThresholdOptimization
        • ClusterCosineSimilarity
        • ClusterPerformanceMetrics
        • CompletenessScore
        • ConfusionMatrix
        • FeatureImportance
        • FowlkesMallowsScore
        • HomogeneityScore
        • HyperParametersTuning
        • KMeansClustersOptimization
        • MinimumAccuracy
        • MinimumF1Score
        • MinimumROCAUCScore
        • ModelParameters
        • ModelsPerformanceComparison
        • OverfitDiagnosis
        • PermutationFeatureImportance
        • PopulationStabilityIndex
        • PrecisionRecallCurve
        • RegressionErrors
        • RegressionErrorsComparison
        • RegressionPerformance
        • RegressionR2Square
        • RegressionR2SquareComparison
        • RobustnessDiagnosis
        • ROCCurve
        • ScoreProbabilityAlignment
        • SHAPGlobalImportance
        • SilhouettePlot
        • TrainingTestDegradation
        • VMeasure
        • WeakspotsDiagnosis
      • Statsmodels
        • AutoARIMA
        • CumulativePredictionProbabilities
        • DurbinWatsonTest
        • GINITable
        • KolmogorovSmirnov
        • Lilliefors
        • PredictionProbabilitiesHistogram
        • RegressionCoeffs
        • RegressionFeatureSignificance
        • RegressionModelForecastPlot
        • RegressionModelForecastPlotLevels
        • RegressionModelSensitivityPlot
        • RegressionModelSummary
        • RegressionPermutationFeatureImportance
        • ScorecardHistogram
    • Ongoing Monitoring
      • CalibrationCurveDrift
      • ClassDiscriminationDrift
      • ClassificationAccuracyDrift
      • ClassImbalanceDrift
      • ConfusionMatrixDrift
      • CumulativePredictionProbabilitiesDrift
      • FeatureDrift
      • PredictionAcrossEachFeature
      • PredictionCorrelation
      • PredictionProbabilitiesHistogramDrift
      • PredictionQuantilesAcrossFeatures
      • ROCCurveDrift
      • ScoreBandsDrift
      • ScorecardHistogramDrift
      • TargetPredictionDistributionPlot
    • Plots
      • BoxPlot
      • CorrelationHeatmap
      • HistogramPlot
      • ViolinPlot
    • Prompt Validation
      • Bias
      • Clarity
      • Conciseness
      • Delimitation
      • NegativeInstruction
      • Robustness
      • Specificity
    • Stats
      • CorrelationAnalysis
      • DescriptiveStats
      • NormalityTests
      • OutlierDetection
  • Test sandbox beta

  • Notebooks
  • Code samples
    • Agents
      • AI Agent Validation with ValidMind - Banking Demo
    • Capital Markets
      • Quickstart for knockout option pricing model documentation
      • Quickstart for Heston option pricing model using QuantLib
    • Code Explainer
      • Quickstart for model code documentation
    • Credit Risk
      • Document an application scorecard model
      • Document an application scorecard model
      • Document a credit risk model
      • Document an application scorecard model
      • Document an Excel-based application scorecard model
    • Custom Tests
      • Implement custom tests
      • Integrate external test providers
    • Model Validation
      • Validate an application scorecard model
    • Nlp and Llm
      • Sentiment analysis of financial data using a large language model (LLM)
      • Summarization of financial data using a large language model (LLM)
      • Sentiment analysis of financial data using Hugging Face NLP models
      • Summarization of financial data using Hugging Face NLP models
      • Automate news summarization using LLMs
      • Prompt validation for large language models (LLMs)
      • RAG Model Benchmarking Demo
      • RAG Model Documentation Demo
    • Ongoing Monitoring
      • Ongoing Monitoring for Application Scorecard
      • Quickstart for ongoing monitoring of models with ValidMind
    • Regression
      • Document a California Housing Price Prediction regression model
    • Time Series
      • Document a time series forecasting model
      • Document a time series forecasting model

  • Reference
  • ValidMind Library Python API
  • ​ValidMind Public REST API

On this page

  • IQROutliersBarPlot
    • Purpose
    • Test Mechanism
    • Signs of High Risk
    • Strengths
    • Limitations
  • Edit this page
  • Report an issue
  1. Test descriptions
  2. Data Validation
  3. IQROutliersBarPlot

IQROutliersBarPlot

Visualizes outlier distribution across percentiles in numerical data using the Interquartile Range (IQR) method.

Purpose

The InterQuartile Range Outliers Bar Plot (IQROutliersBarPlot) metric aims to visually analyze and evaluate the extent of outliers in numeric variables based on percentiles. Its primary purpose is to clarify the dataset's distribution, flag possible abnormalities in it, and gauge potential risks associated with processing potentially skewed data, which can affect the machine learning model's predictive prowess.

Test Mechanism

The examination invokes a series of steps:

  1. For every numeric feature in the dataset, the 25th percentile (Q1) and 75th percentile (Q3) are calculated before deriving the Interquartile Range (IQR), the difference between Q1 and Q3.
  2. Subsequently, the metric calculates the lower and upper thresholds by subtracting Q1 from the threshold times IQR and adding Q3 to threshold times IQR, respectively. The default threshold is set at 1.5.
  3. Any value in the feature that falls below the lower threshold or exceeds the upper threshold is labeled as an outlier.
  4. The number of outliers are tallied for different percentiles, such as [0-25], [25-50], [50-75], and [75-100].
  5. These counts are employed to construct a bar plot for the feature, showcasing the distribution of outliers across different percentiles.

Signs of High Risk

  • A prevalence of outliers in the data, potentially skewing its distribution.
  • Outliers dominating higher percentiles (75-100) which implies the presence of extreme values, capable of severely influencing the model's performance.
  • Certain features harboring most of their values as outliers, which signifies that these features might not contribute positively to the model's forecasting ability.

Strengths

  • Effectively identifies outliers in the data through visual means, facilitating easier comprehension and offering insights into the outliers' possible impact on the model.
  • Provides flexibility by accommodating all numeric features or a chosen subset.
  • Task-agnostic in nature; it is viable for both classification and regression tasks.
  • Can handle large datasets as its operation does not hinge on computationally heavy operations.

Limitations

  • Its application is limited to numerical variables and does not extend to categorical ones.
  • Only reveals the presence and distribution of outliers and does not provide insights into how these outliers might affect the model's predictive performance.
  • The assumption that data is unimodal and symmetric may not always hold true. In cases with non-normal distributions, the results can be misleading.
HighPearsonCorrelation
IQROutliersTable
  • ValidMind Logo
    ©
    Copyright 2025 ValidMind Inc.
    All Rights Reserved.
    Cookie preferences
    Legal
  • Get started
    • Model development
    • Model validation
    • Setup & admin
  • Guides
    • Access
    • Configuration
    • Model inventory
    • Model documentation
    • Model validation
    • Model workflows
    • Reporting
    • Monitoring
    • Attestation
  • Library
    • For developers
    • For validators
    • Code samples
    • Python API
    • Public REST API
  • Training
    • Learning paths
    • Courses
    • Videos
  • Support
    • Troubleshooting
    • FAQ
    • Get help
  • Community
    • Slack
    • GitHub
    • Blog
  • Edit this page
  • Report an issue