PhillipsPerronArch
Assesses the stationarity of time series data in each feature of the ML model using the Phillips-Perron test.
Purpose
The Phillips-Perron (PP) test is used to determine the stationarity of time series data for each feature in a dataset, which is crucial for forecasting tasks. Its null hypothesis is that the series has a unit root and is therefore non-stationary; rejecting the null is evidence that the series is stationary. This is vital for understanding the stochastic behavior of the data and for ensuring the robustness and validity of predictions generated by regression analysis models.
Test Mechanism
The PP test is conducted for each feature in the dataset as follows:
- A data frame is created from the dataset.
- For each column, the Phillips-Perron method calculates the test statistic, p-value, lags used, and number of observations.
- The results are then stored for each feature, providing a metric that indicates the stationarity of the time series data.
Signs of High Risk
- A high p-value (conventionally above 0.05), meaning the unit-root null hypothesis cannot be rejected and the series may be non-stationary.
- A test statistic that is greater (less negative) than the critical values, so the null is not rejected, suggesting non-stationarity.
- A high ‘usedlag’ value, pointing towards autocorrelation issues that may degrade model performance.
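As a rough illustration of how these signs might be checked on stored results, the sketch below flags each condition in turn. The `PPResult` container, the `risk_flags` function, the thresholds `alpha` and `max_lag`, and the sample numbers are all hypothetical; appropriate cut-offs depend on the data and the use case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PPResult:
    """Hypothetical container for one feature's Phillips-Perron output."""
    stat: float
    pvalue: float
    usedlag: int
    critical_5pct: float  # 5% critical value reported alongside the statistic


def risk_flags(result: PPResult, alpha: float = 0.05, max_lag: int = 10) -> List[str]:
    """Return the high-risk signs present in a result.

    alpha and max_lag are illustrative defaults, not recommended values.
    """
    flags = []
    if result.pvalue > alpha:
        flags.append("high p-value: unit-root null not rejected")
    if result.stat > result.critical_5pct:
        flags.append("statistic above critical value: null not rejected")
    if result.usedlag > max_lag:
        flags.append("high usedlag: possible autocorrelation issues")
    return flags


# Illustrative values only (not from a real test run).
risky = PPResult(stat=-1.2, pvalue=0.67, usedlag=14, critical_5pct=-2.87)
stable = PPResult(stat=-21.5, pvalue=0.0001, usedlag=3, critical_5pct=-2.87)

print(risk_flags(risky))
print(risk_flags(stable))
```

The comparison `stat > critical_5pct` reflects that the PP statistic rejects the unit-root null only when it is more negative than the critical value.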
Strengths
- Resilience against heteroskedasticity in the error term.
- Effective for long time series data.
- Helps in determining whether the time series is stationary, aiding in the selection of suitable forecasting models.
Limitations
- Applicable only within a univariate time series framework.
- Relies on asymptotic theory, which may reduce the test’s power for small sample sizes.
- Series found to be non-stationary typically need to be made stationary through differencing before modeling, which drops one observation per differencing step and can discard information about the long-run level of the series.
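The cost of differencing can be seen in a small sketch. The helper `difference` and the sample values are hypothetical and use plain Python lists for clarity; in practice a data frame method would typically do the same job.

```python
def difference(values, order=1):
    """Apply first differencing `order` times; each pass drops one observation."""
    out = list(values)
    for _ in range(order):
        out = [later - earlier for earlier, later in zip(out, out[1:])]
    return out


# Illustrative series with an accelerating trend (made-up values).
series = [10.0, 12.0, 15.0, 19.0, 24.0]

first = difference(series)            # [2.0, 3.0, 4.0, 5.0]
second = difference(series, order=2)  # [1.0, 1.0, 1.0]

print(len(series), len(first), len(second))  # 5 4 3
```

Five observations shrink to four after one difference and to three after two, and the original level of the series (starting at 10.0) is no longer recoverable from the differenced values alone.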