Understand and utilize RawData in ValidMind tests
Test functions in ValidMind can return a special object called RawData, which holds intermediate or unprocessed data produced somewhere in the test logic but not returned as part of the test's visible output, such as in tables or figures.
The RawData feature allows you to customize the output of tests, making it a powerful tool for creating custom tests and post-processing functions. RawData is especially useful in post-processing functions run with tests, where it lets you recompute tabular outputs, redraw figures, or even create new outputs entirely.
In this notebook, you'll learn how to access, inspect, and utilize RawData from ValidMind tests.
Setup
Before we can run our examples, we'll need to set the stage to enable running tests with the ValidMind Library. Since the focus of this notebook is on the RawData object, this section only summarizes the setup steps rather than going into greater detail.
To learn more about running tests with ValidMind: Run tests and test suites
Installation and initialization
First, let's make sure that the ValidMind Library is installed and ready to go, and that our Python environment is set up for data analysis:
# Install the ValidMind Library
%pip install -q validmind
# Initialize the ValidMind Library
import validmind as vm
# Import the `xgboost` library with an alias
import xgboost as xgb
Load the sample dataset
Then, we'll import a sample ValidMind dataset and preprocess it:
# Import the `customer_churn` sample dataset
from validmind.datasets.classification import customer_churn
raw_df = customer_churn.load_data()
# Preprocess the raw dataset
train_df, validation_df, test_df = customer_churn.preprocess(raw_df)
# Separate features and targets
x_train = train_df.drop(customer_churn.target_column, axis=1)
y_train = train_df[customer_churn.target_column]
x_val = validation_df.drop(customer_churn.target_column, axis=1)
y_val = validation_df[customer_churn.target_column]
# Create an `XGBClassifier` object
model = xgb.XGBClassifier(early_stopping_rounds=10)
model.set_params(
eval_metric=["error", "logloss", "auc"],
)
# Train the model, using the validation set for early stopping
model.fit(
x_train,
y_train,
eval_set=[(x_val, y_val)],
verbose=False,
)
Initialize the ValidMind objects
Before you can run tests, you'll need to initialize a ValidMind dataset object, as well as a ValidMind model object that can be passed to other functions for analysis and tests on the data:
# Initialize the dataset object
vm_raw_dataset = vm.init_dataset(
dataset=raw_df,
input_id="raw_dataset",
target_column=customer_churn.target_column,
class_labels=customer_churn.class_labels,
__log=False,
)
# Initialize the datasets into their own dataset objects
vm_train_ds = vm.init_dataset(
dataset=train_df,
input_id="train_dataset",
target_column=customer_churn.target_column,
__log=False,
)
vm_test_ds = vm.init_dataset(
dataset=test_df,
input_id="test_dataset",
target_column=customer_churn.target_column,
__log=False,
)
# Initialize a model object
vm_model = vm.init_model(
model,
input_id="model",
__log=False,
)
# Assign predictions to the datasets
vm_train_ds.assign_predictions(
model=vm_model,
)
vm_test_ds.assign_predictions(
model=vm_model,
)
RawData usage examples
Once you're set up to run tests, you can then try out the following examples:
- Using RawData from the ROC Curve Test
- Pearson Correlation Matrix
- Precision-Recall Curve
- Using RawData in custom tests
- Using RawData in comparison tests
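Before diving into the examples, it helps to see the RawData container on its own. Below is a minimal, self-contained sketch (the coefficients DataFrame and the n_observations field are made-up values, purely for illustration): a RawData object is constructed from keyword arguments, and each argument becomes an attribute that you can read back directly or pretty-print with inspect().
import pandas as pd
from validmind import RawData
# Construct a RawData object; each keyword argument becomes an attribute
raw = RawData(
    coefficients=pd.DataFrame({"feature": ["a", "b"], "beta": [0.4, -1.2]}),
    n_observations=1000,
)
# Read an attribute back directly, then pretty-print everything
print(raw.n_observations)
raw.inspect()
In the examples that follow, ValidMind tests build these objects for you and attach them to the test result as result.raw_data.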
Using RawData from the ROC Curve Test
In this introductory example, we run the ROC Curve test, inspect its RawData output, and then create a custom ROC curve using the raw data values.
First, let's run the default ROC Curve test for comparison with the customized versions later on:
from validmind.tests import run_test
# Run the ROC Curve test normally
result_roc = run_test(
"validmind.model_validation.sklearn.ROCCurve",
inputs={"dataset": vm_test_ds, "model": vm_model},
generate_description=False,
)
Now let's assume we want to create a custom version of the above figure. First, let's inspect the raw data that this test produces so we can see what we have to work with.
RawData objects have an inspect() method that pretty-prints the object's attributes so you can quickly see the data and its types:
# Inspect the RawData output from the ROC test
print("RawData from ROC Curve Test:")
result_roc.raw_data.inspect()
As we can see, the ROC Curve test returns a RawData object with the following attributes:
- fpr: A list of false positive rates
- tpr: A list of true positive rates
- auc: The area under the curve
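You can also work with these values directly in the notebook, outside of any post-processing function. For example, here is a quick sanity check (a minimal sketch; it assumes scikit-learn is installed, which the sklearn-namespaced ValidMind tests already rely on) that recomputes the AUC from the stored arrays and compares it with the value the test saved:
from sklearn.metrics import auc as sk_auc
# Read the raw arrays straight off the test result
fpr = result_roc.raw_data.fpr
tpr = result_roc.raw_data.tpr
# Recompute the area under the curve and compare it with the stored value
print(f"AUC recomputed from raw data: {sk_auc(fpr, tpr):.4f}")
print(f"AUC stored in RawData: {result_roc.raw_data.auc:.4f}")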
These raw values are enough to create our own custom ROC curve via a post-processing function, without writing a whole new test from scratch and without recomputing any of the data:
import matplotlib.pyplot as plt
from validmind.vm_models.result import TestResult
def custom_roc_curve(result: TestResult):
# Extract raw data from the test result
fpr = result.raw_data.fpr
tpr = result.raw_data.tpr
auc = result.raw_data.auc
# Create a custom ROC curve plot
fig = plt.figure()
plt.plot(fpr, tpr, label=f"Custom ROC (AUC = {auc:.2f})", color="blue")
plt.plot([0, 1], [0, 1], linestyle="--", color="gray", label="Random Guess")
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("Custom ROC Curve from RawData")
plt.legend()
# close the plot to avoid it automatically being shown in the notebook
plt.close()
# remove existing figure
result.remove_figure(0)
# add new figure
result.add_figure(fig)
return result
# test it on the existing result
modified_result = custom_roc_curve(result_roc)
# show the modified result
modified_result.show()
Now that we have created a post-processing function and verified that it works on our existing test result, we can use it directly in run_test() from now on:
result = run_test(
"validmind.model_validation.sklearn.ROCCurve",
inputs={"dataset": vm_test_ds, "model": vm_model},
post_process_fn=custom_roc_curve,
generate_description=False,
)
Pearson Correlation Matrix
In this next example, try commenting out the post_process_fn argument in the following cell and compare how the output changes between runs:
import plotly.graph_objects as go
def custom_heatmap(result: TestResult):
corr_matrix = result.raw_data.correlation_matrix
heatmap = go.Heatmap(
z=corr_matrix.values,
x=list(corr_matrix.columns),
y=list(corr_matrix.index),
colorscale="Viridis",
)
fig = go.Figure(data=[heatmap])
fig.update_layout(title="Custom Heatmap from RawData")
result.remove_figure(0)
result.add_figure(fig)
return result
result_corr = run_test(
"validmind.data_validation.PearsonCorrelationMatrix",
inputs={"dataset": vm_test_ds},
generate_description=False,
# COMMENT OUT `post_process_fn`
post_process_fn=custom_heatmap,
)
Precision-Recall Curve
Then, let's try the same thing with the Precision-Recall Curve test:
def custom_pr_curve(result: TestResult):
precision = result.raw_data.precision
recall = result.raw_data.recall
fig = plt.figure()
plt.plot(recall, precision, label="Precision-Recall Curve")
plt.xlabel("Recall")
plt.ylabel("Precision")
plt.title("Custom Precision-Recall Curve from RawData")
plt.legend()
plt.close()
result.remove_figure(0)
result.add_figure(fig)
return result
result_pr = run_test(
"validmind.model_validation.sklearn.PrecisionRecallCurve",
inputs={"dataset": vm_test_ds, "model": vm_model},
generate_description=False,
# COMMENT OUT `post_process_fn`
post_process_fn=custom_pr_curve,
)
Using RawData in custom tests
These examples demonstrate some very simple ways to use the RawData feature of ValidMind tests. The majority of ValidMind-developed tests return some form of raw data that can be used to customize the output of the test, but you can also create your own tests that return RawData objects and use them in the same way.
Let's take a look at how this can be done in custom tests. To start, define and run your custom test:
import pandas as pd
from validmind import test, RawData
from validmind.vm_models import VMDataset, VMModel
@test("custom.MyCustomTest")
def MyCustomTest(dataset: VMDataset, model: VMModel) -> tuple[go.Figure, RawData]:
"""Custom test that produces a figure and a RawData object"""
# pretend we are using the dataset and model to compute some data
# ...
# create some fake data that will be used to generate a figure
data = pd.DataFrame({"x": [10, 20, 30, 40, 50], "y": [10, 20, 30, 40, 50]})
# create the figure (scatter plot)
fig = go.Figure(data=go.Scatter(x=data["x"], y=data["y"]))
# now let's create a RawData object that holds the "computed" data
raw_data = RawData(scatter_data_df=data)
# finally, return both the figure and the raw data
return fig, raw_data
my_result = run_test(
"custom.MyCustomTest",
inputs={"dataset": vm_test_ds, "model": vm_model},
generate_description=False,
)
We can see that the test result shows the figure. But since we returned a RawData object, we can also inspect the contents and see how we could use it to customize or regenerate the figure in the post-processing function:
my_result.raw_data.inspect()
We can see that we get a nicely-formatted preview of the dataframe we stored in the raw data object. Let's go ahead and use it to re-plot our data:
def custom_plot(result: TestResult):
data = result.raw_data.scatter_data_df
# use something other than a scatter plot
fig = go.Figure(data=go.Bar(x=data["x"], y=data["y"]))
fig.update_layout(title="Custom Bar Chart from RawData")
fig.update_xaxes(title="X Axis")
fig.update_yaxes(title="Y Axis")
result.remove_figure(0)
result.add_figure(fig)
return result
result = run_test(
"custom.MyCustomTest",
inputs={"dataset": vm_test_ds, "model": vm_model},
post_process_fn=custom_plot,
generate_description=False,
)
Using RawData in comparison tests
When running comparison tests, the RawData object contains the raw data for each individual test result as well as the comparison results between them. To support this, the RawData object also records the dataset and model input_ids for each dataset-model combination in the test, so that a post-processing function can use them to customize the output. The example below uses RawData to add tables to a comparison test result: one showing the confusion matrix for each individual result, and another showing the drift between them.
When designing post-processing functions that need to handle both individual and comparison test results, you can check the structure of the raw data to determine which case you're dealing with. In the example below, we check if confusion_matrix is a list (comparison test with multiple matrices) or a single matrix (individual test). For comparison tests, the function creates two tables: one showing the confusion matrices for each test case, and another showing the percentage drift between them. For individual tests, it creates a single table with the confusion matrix values. This pattern of checking the raw data structure can be applied to other tests to create versatile post-processing functions that work in both scenarios.
def cm_table(result: TestResult):
# For individual results
if not isinstance(result.raw_data.confusion_matrix, list):
# Extract values from single confusion matrix
cm = result.raw_data.confusion_matrix
tn, fp = cm[0, 0], cm[0, 1]
fn, tp = cm[1, 0], cm[1, 1]
# Create DataFrame for individual matrix
cm_df = pd.DataFrame({
'TN': [tn],
'FP': [fp],
'FN': [fn],
'TP': [tp]
})
# Add individual table
result.add_table(cm_df, title="Confusion Matrix")
# For comparison results
else:
cms = result.raw_data.confusion_matrix
cm1, cm2 = cms[0], cms[1]
# Create individual results table
rows = []
for i, cm in enumerate(cms):
rows.append({
'dataset': result.raw_data.dataset[i],
'model': result.raw_data.model[i],
'TN': cm[0, 0],
'FP': cm[0, 1],
'FN': cm[1, 0],
'TP': cm[1, 1]
})
individual_df = pd.DataFrame(rows)
# Calculate percentage differences
diff_df = pd.DataFrame({
'TN_drift (%)': [(cm2[0, 0] - cm1[0, 0]) / cm1[0, 0] * 100],
'FP_drift (%)': [(cm2[0, 1] - cm1[0, 1]) / cm1[0, 1] * 100],
'FN_drift (%)': [(cm2[1, 0] - cm1[1, 0]) / cm1[1, 0] * 100],
'TP_drift (%)': [(cm2[1, 1] - cm1[1, 1]) / cm1[1, 1] * 100]
}).round(2)
# Add both tables
result.add_table(individual_df, title="Individual Confusion Matrices")
result.add_table(diff_df, title="Confusion Matrix Drift")
    return result
Let's first run the confusion matrix test on a single dataset-model pair to see how our post-processing function handles individual results:
from validmind.tests import run_test
result_cm = run_test(
"validmind.model_validation.sklearn.ConfusionMatrix",
inputs={
"dataset": vm_test_ds,
"model": vm_model,
},
post_process_fn=cm_table,
generate_description=False,
)
Now let's run a comparison test between test and train datasets to see how the function handles multiple results:
result_cm = run_test(
"validmind.model_validation.sklearn.ConfusionMatrix",
input_grid={
"dataset": [vm_test_ds, vm_train_ds],
"model": [vm_model]
},
post_process_fn=cm_table,
generate_description=False,
)
Let's inspect the raw data to see how comparison tests structure their data. Notice how the RawData object contains not just the confusion matrices for both datasets, but also tracks which dataset and model each result came from:
result_cm.raw_data.inspect()
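For example, building on the attributes that cm_table used above, here is a minimal sketch that pairs each stored confusion matrix with the dataset and model input_ids it came from:
# Pair each stored confusion matrix with the inputs that produced it
for ds_id, model_id, cm in zip(
    result_cm.raw_data.dataset,
    result_cm.raw_data.model,
    result_cm.raw_data.confusion_matrix,
):
    print(f"{ds_id} / {model_id}:")
    print(cm)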