Forecasting Daily Turnover Using XGBoost Algorithm – A Case Study

Abstract

The goal of this paper was to investigate the use of the Extreme Gradient Boosting (XGBoost) algorithm as a forecasting tool. Data provided by the Rossmann company, together with a request to design an innovative prediction method, served as the basis for this case study. The data contain details about the micro- and macro-environment, as well as the turnover of 1115 stores. The algorithm's performance was compared with the classical forecasting models SARIMAX and Holt–Winters, using time-series cross-validation and tests for the statistical significance of differences in prediction quality. The metrics analyzed were root mean squared percentage error (RMSPE), Theil's coefficient and the adjusted correlation coefficient. The results were then passed to Rossmann for verification on a separate validation set, via the Kaggle.com platform. The study confirmed that XGBoost, given proper data preparation and training, achieves better results than the classical models.
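
For readers unfamiliar with RMSPE, the metric can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function name `rmspe` and the sample values are hypothetical, and skipping days with zero actual turnover is an assumption made here to avoid division by zero.

```python
import math


def rmspe(actual, predicted):
    """Root mean squared percentage error.

    Assumption (not taken from the paper): days with zero actual
    turnover are skipped, since the percentage error is undefined there.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("need at least one non-zero actual value")
    # Squared relative error for each remaining day.
    squared_pct_errors = [((a - p) / a) ** 2 for a, p in pairs]
    return math.sqrt(sum(squared_pct_errors) / len(squared_pct_errors))


# Hypothetical daily turnover for one store (illustrative values only).
actual = [100.0, 200.0, 0.0, 400.0]
predicted = [110.0, 180.0, 5.0, 400.0]
score = rmspe(actual, predicted)
```

Because each error is divided by the actual value, a miss of 10 on a day with turnover 100 weighs the same as a miss of 20 on a day with turnover 200, which makes the metric comparable across stores of different sizes.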

Filip Wójcik
Senior Data Scientist and PhD candidate

Data scientist and university researcher, passionate about machine learning and statistical analysis. Also a seasoned software developer with experience in a range of technologies (from .NET to open source).