Decoded – A Primer on ML-Based Forecasting
Machine learning (ML) has revolutionised numerous fields, and forecasting is no exception. Several forecasting competitions, such as those hosted on Kaggle, have demonstrated the strong empirical performance of ML methods. Despite this success, understanding the underlying mechanisms that contribute to their effectiveness remains a complex challenge. Here, we decode some key steps.
The rapid advancement of ML has permeated the field of forecasting, driving the development of sophisticated methods such as autoregressive neural networks and gradient-boosting models. These methods have consistently secured top positions in forecasting competitions, delivering substantial improvements over traditional benchmarks such as exponential smoothing.
A key factor in the success of these methods is cross-learning, where a single model learns patterns across multiple time series, enhancing its predictive capabilities. However, the complexity of ML methods, characterised by numerous components such as preprocessing, feature engineering, and hyperparameter tuning, poses a significant barrier to understanding why and how these methods excel in certain contexts but falter in others. To address this, Danish researcher Casper Solheim Bojer has proposed a structured framework for regression-based ML forecasting methods, outlined below.
A Basic Framework for Regression-Based ML Forecasting Methods
The proposed framework decomposes regression-based ML methods into five key areas, each encompassing various components that contribute to the forecasting process. This decomposition not only provides a common language for researchers but also facilitates the systematic study of ML methods. The five areas are:
- Preprocessing involves cleaning and preparing the data for modelling. Two critical tasks in this phase, both sketched in the code example after this item, are:
  - Handling outliers: unusually high or low values in the data, often due to errors or special events. While outliers can distort forecasts if not addressed, removing them may lead to underestimating future uncertainty if such events could recur.
  - Imputation of missing values: replacing missing values in the time series. Methods range from simple techniques like the last observation carried forward to complex approaches like matrix factorisation.
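As a minimal illustration of these two tasks, here is a short pandas sketch. The IQR-based clipping rule and forward-fill imputation are illustrative choices for the example, not prescriptions from the paper.

```python
import numpy as np
import pandas as pd

def preprocess(series: pd.Series) -> pd.Series:
    """Clip outliers with a simple IQR rule, then impute missing values."""
    q1, q3 = series.quantile([0.25, 0.75])
    iqr = q3 - q1
    # Winsorise values outside 1.5 * IQR -- one common, simple outlier rule.
    clipped = series.clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)
    # Last observation carried forward; backfill covers any leading gaps.
    return clipped.ffill().bfill()

demand = pd.Series(
    [10, 12, np.nan, 11, 250, 13],
    index=pd.date_range("2024-01-01", periods=6, freq="D"),
)
print(preprocess(demand))
```

Note the trade-off flagged above: clipping the 250 tames the training data but also erases the possibility of such a spike recurring in the forecast distribution.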
- Dataset Construction involves transforming raw data into a format suitable for the ML model (see the sketch after this item). It includes:
  - Feature engineering: transforming time series data into informative features that improve model performance. Common features include lagged values, rolling statistics, and external factors.
  - Target engineering: transforming the prediction target to enhance model accuracy. Techniques include stationarising the series, dealing with non-constant variance, and scaling the series.
  - Training set construction: deciding what data to use for training and how to weight it. This includes sub-setting, data augmentation, and weighting data points.
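The following sketch shows how these three components might look for a daily series. The specific lags, the seven-day rolling window, and the log transform are assumptions made for the example:

```python
import numpy as np
import pandas as pd

def make_training_table(series: pd.Series) -> pd.DataFrame:
    """Turn a daily series into a feature/target table for a regressor."""
    # Target engineering: a log transform to stabilise the variance.
    df = pd.DataFrame({"y": np.log1p(series)})
    # Feature engineering: lagged values and a rolling statistic.
    for lag in (1, 7):
        df[f"lag_{lag}"] = df["y"].shift(lag)
    # Shift before rolling so the window never sees the current target.
    df["roll_mean_7"] = df["y"].shift(1).rolling(7).mean()
    df["dayofweek"] = df.index.dayofweek  # a simple calendar feature
    # Training set construction: here, simply drop rows without full features.
    return df.dropna()

idx = pd.date_range("2024-01-01", periods=60, freq="D")
table = make_training_table(pd.Series(np.random.poisson(20, 60), index=idx))
print(table.head())
```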
- Model Training and Validation covers the core modelling decisions, combined in the sketch after this item:
  - Model selection: any supervised ML model, such as LightGBM or neural networks, optimised using loss functions like mean squared error or quantile loss.
  - Multi-step strategy: strategies for producing multi-step forecasts, such as recursive, direct, hybrid, and multiple-output strategies.
  - Pooling/cross-learning: strategies for assigning time series to models, ranging from local models (one per series) to global models (one for all series).
  - Model evaluation: techniques to prevent overfitting and ensure robust model selection and performance evaluation, including cross-validation strategies.
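These components can be seen together in a single sketch: a global model pooled across three synthetic series, evaluated with expanding-window splits, and rolled forward recursively over a 14-step horizon. scikit-learn's HistGradientBoostingRegressor stands in for LightGBM here, and the lag set and horizon are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import HistGradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(1)

# Pooling/cross-learning: one global training table built from three related
# series, with a series identifier so the model can still tell them apart.
frames = []
for sid in range(3):
    y = pd.Series(rng.poisson(15 + 5 * sid, 120))
    df = pd.DataFrame({"y": y, "series_id": sid, "t": np.arange(120)})
    for lag in (1, 7):
        df[f"lag_{lag}"] = y.shift(lag)
    frames.append(df.dropna())
# Sort chronologically so the expanding-window splits below respect time.
pooled = pd.concat(frames).sort_values("t").reset_index(drop=True)
features = ["series_id", "lag_1", "lag_7"]

# Model evaluation: expanding-window (rolling-origin) splits keep every test
# fold strictly after its training window.
for train_idx, test_idx in TimeSeriesSplit(n_splits=3).split(pooled):
    model = HistGradientBoostingRegressor().fit(
        pooled.loc[train_idx, features], pooled.loc[train_idx, "y"])
    mae = np.mean(np.abs(model.predict(pooled.loc[test_idx, features])
                         - pooled.loc[test_idx, "y"]))
    print(f"fold MAE: {mae:.2f}")

# Multi-step strategy (recursive): refit on everything, then feed each
# one-step prediction back in as the next lag_1 input for series 0.
model = HistGradientBoostingRegressor().fit(pooled[features], pooled["y"])
history = list(pooled.loc[pooled["series_id"] == 0, "y"])
for _ in range(14):
    x = pd.DataFrame([{"series_id": 0, "lag_1": history[-1], "lag_7": history[-7]}])
    history.append(model.predict(x)[0])
print(np.round(history[-14:], 1))
```

A direct strategy would instead fit one model per horizon step, trading extra training cost for freedom from the error feedback inherent in the recursive loop.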
- Ensembling combines the outputs of multiple models to improve performance. It involves creating diverse models through variations in components like features and hyperparameters, and combining their forecasts using methods like stacking or simple averaging (see the sketch below).
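A minimal sketch of both combination styles, using invented base-model forecasts over a short validation window:

```python
import numpy as np
from sklearn.linear_model import Ridge

# Validation-window forecasts from two hypothetical, deliberately diverse
# base models, plus the actuals observed over the same window.
f1 = np.array([102.0, 98.0, 110.0, 105.0])
f2 = np.array([96.0, 99.0, 104.0, 101.0])
actuals = np.array([100.0, 97.0, 108.0, 103.0])

simple_avg = (f1 + f2) / 2  # equal-weight combination

# Stacking: a meta-model learns combination weights from held-out forecasts.
# In practice it is fit on out-of-fold predictions to avoid leakage.
meta = Ridge(alpha=1.0).fit(np.column_stack([f1, f2]), actuals)
print(simple_avg, meta.coef_)
```

Equal weights are often a strong baseline; stacking tends to pay off mainly when the base models have persistent, learnable biases.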
- Postprocessing adjusts the model outputs based on expert judgment or additional data-driven approaches. This includes trend adjustments, clipping forecasts, and hierarchical reconciliation to ensure forecasts are consistent across aggregation levels (see the sketch below).
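Two of these adjustments, clipping and bottom-up reconciliation, fit in a few lines of NumPy; the forecast values are invented for the example:

```python
import numpy as np

# Item-level forecasts for two periods; the -0.4 is the kind of artefact a
# raw regression model can produce for a quantity that cannot be negative.
item_forecasts = np.array([[12.3, -0.4, 5.1],
                           [11.8, 0.2, 4.9]])

# Clipping: impossible negative quantities are floored at zero.
item_forecasts = np.clip(item_forecasts, 0.0, None)

# Bottom-up reconciliation: the total-level forecast is defined as the sum of
# the item-level forecasts, so the hierarchy is consistent by construction.
total_forecast = item_forecasts.sum(axis=1)
print(item_forecasts, total_forecast)
```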
Application of the Framework
To illustrate the framework’s utility, it was applied to analyse the winning and third-place solutions from the 2020 M5 Uncertainty competition organised by the University of Nicosia. Despite using similar input data, these solutions employed different strategies across all components, highlighting the vast design space and complexity of ML methods.
To advance ML forecasting, it is crucial to identify which components contribute to a method’s success. Ablation testing, a common technique in ML, can be adapted to decompose and systematically evaluate the impact of each component. By turning off or varying specific components, as sketched below, researchers can isolate their effects on performance, providing insights into which elements are essential and which add unnecessary complexity.
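In code, an ablation study amounts to looping over component on/off configurations and scoring each one. `evaluate_pipeline` below is a hypothetical stand-in for a full train-and-evaluate run under a given configuration:

```python
from itertools import product

# A hypothetical ablation grid: every combination of components switched on
# or off. evaluate_pipeline is an assumed stand-in for whatever
# cross-validated evaluation the method under study uses.
components = {
    "outlier_clipping": [True, False],
    "rolling_features": [True, False],
    "ensembling": [True, False],
}
for combo in product(*components.values()):
    config = dict(zip(components, combo))
    # score = evaluate_pipeline(config)  # hypothetical evaluation call
    print(config)
```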
Research Opportunities
The framework reveals several underexplored areas ripe for further research. One critical area is the effectiveness of different pooling strategies in cross-learning. Understanding when and where these strategies outperform local models could significantly enhance forecasting accuracy. Additionally, the interdependencies between feature and target engineering in cross-learning contexts warrant further investigation.
Another promising area is cross-validation. Although substantial research exists, the rapid evolution of ML methods suggests that revisiting and refining cross-validation strategies could yield valuable improvements. Similarly, the design and application of new ensembling techniques and postprocessing methods present fertile ground for advancing ML-based forecasting.
—
The proposed framework for regression-based ML forecasting methods provides a structured approach to understanding and improving these complex systems. By decomposing methods into their constituent components, the framework facilitates systematic study and comparison, paving the way for more effective and transparent forecasting models. As researchers continue to explore and refine these methods, the potential of ML to revolutionise forecasting will be increasingly realised.
Read “Understanding machine learning-based forecasting methods: A decomposition framework and research opportunities” (Casper Solheim Bojer, Aalborg University, Denmark) here.