validmind.lending_club
compute_scores
def compute_scores(probabilities: np.ndarray) → np.ndarray:
feature_engineering
def feature_engineering(df: pd.DataFrame, verbose: bool = True) → pd.DataFrame:
get_demo_test_config
def get_demo_test_config(x_test: Optional[np.ndarray] = None, y_test: Optional[np.ndarray] = None) → Dict[str, Any]:
Get demo test configuration.
Arguments
x_test
: Test features DataFrame
y_test
: Test target Series
Returns
- Test configuration dictionary
init_vm_objects
def init_vm_objects(scorecard):
load_data
def load_data(source: str = 'online', verbose: bool = True) → pd.DataFrame:
Load data from either an online source or offline files, automatically dropping specified columns for offline data.
Arguments
source
: 'online' for online data, 'offline' for offline files. Defaults to 'online'.
Returns
- DataFrame containing the loaded data.
load_scorecard
def load_scorecard():
load_test_config
def load_test_config(scorecard):
preprocess
def preprocess(df: pd.DataFrame, verbose: bool = True) → pd.DataFrame:
split
def split(df: pd.DataFrame, validation_split: Optional[float] = None, test_size: float = 0.2, add_constant: bool = False, verbose: bool = True) → Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
Split dataset into train, validation (optional), and test sets.
Arguments
df
: Input DataFrame
validation_split
: If None, returns train/test split. If float, returns train/val/test split
test_size
: Proportion of data for test set (default: 0.2)
add_constant
: Whether to add constant column for statsmodels (default: False)
Returns
- If validation_split is None: train_df, test_df
- If validation_split is a float: train_df, validation_df, test_df
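The two return shapes described above can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: the helper name `three_way_split`, the `seed` parameter, the use of plain lists instead of DataFrames, and the reading of `validation_split` as a fraction of the full dataset are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple, TypeVar, Union

T = TypeVar("T")


def three_way_split(
    rows: Sequence[T],
    validation_split: Optional[float] = None,
    test_size: float = 0.2,
    seed: int = 42,
) -> Union[Tuple[List[T], List[T]], Tuple[List[T], List[T], List[T]]]:
    """Hypothetical helper: split rows into train/test or train/validation/test."""
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")
    if validation_split is not None and not 0.0 < validation_split < 1.0 - test_size:
        raise ValueError("validation_split must leave room for a training set")

    # Shuffle a copy so the caller's data is left untouched.
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)

    # Carve off the test set first.
    n_test = round(len(shuffled) * test_size)
    test = shuffled[:n_test]
    remainder = shuffled[n_test:]

    if validation_split is None:
        # Two-way result: (train, test)
        return remainder, test

    # Assumption: validation_split is a fraction of the full dataset,
    # not of the rows remaining after the test set was removed.
    n_val = round(len(shuffled) * validation_split)
    validation = remainder[:n_val]
    train = remainder[n_val:]
    # Three-way result: (train, validation, test)
    return train, validation, test


train, validation, test = three_way_split(range(100), validation_split=0.1)
print(len(train), len(validation), len(test))
```

The number of returned objects depends on whether `validation_split` is set, so callers need to unpack two or three values accordingly.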
woe_encoding
def woe_encoding(df: pd.DataFrame, verbose: bool = True) → pd.DataFrame: