Fresh Food Demand Forecasting

Predicting demand for short-shelf-life products is critical because the penalty for error is immediate spoilage or lost sales.

Why Perishable Demand is Harder

Unlike durable goods, fresh food demand is highly volatile due to:

Classical vs. Machine Learning Methods

  1. Classical: SARIMAX (Seasonal ARIMA with eXogenous variables) and Holt-Winters provide solid baselines that capture weekly seasonality well. In backshift notation, with B the backshift operator, s the seasonal period (s = 7 for weekly seasonality in daily data), and X_t the exogenous regressors, SARIMAX is commonly written as:
\Phi(B)\Phi_s(B^s)(1-B)^d(1-B^s)^D y_t = \Theta(B)\Theta_s(B^s) \epsilon_t + \beta X_t
  2. Machine Learning: Tree ensembles, both gradient-boosted trees (LightGBM, XGBoost) and bagged trees (Random Forest), excel at capturing nonlinear interactions (e.g., a promotion on a Tuesday during a rainstorm). Despite the rise of deep learning, tree-based models often still outperform Transformers for short-horizon, daily retail fresh food orders, because the inputs are modest, structured (tabular) feature sets on which tree ensembles are robust.
  3. Deep Learning and Probabilistic Forecasting: Temporal Fusion Transformers (TFT) and other advanced architectures are increasingly used for multi-horizon forecasting. Many modern models produce probabilistic rather than point forecasts, so that prediction intervals can be tied directly to operational risk and inventory decisions.
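
As a rough illustration of the classical baselines listed above, the following is a minimal pure-Python sketch of additive Holt-Winters with a 7-day season. The function name, the smoothing parameters (alpha, beta, gamma), the simple initialization, and the sample history are all illustrative assumptions, not tuned or real values; in practice a library implementation with fitted parameters would normally be preferred.

```python
def holt_winters_additive(y, season=7, alpha=0.3, beta=0.05, gamma=0.2, horizon=7):
    """Additive Holt-Winters sketch; returns `horizon` point forecasts.

    y       : list of daily demand observations (oldest first)
    season  : seasonal period (7 = weekly pattern in daily data)
    alpha, beta, gamma : smoothing weights for level, trend, seasonality
                         (illustrative defaults, not tuned)
    """
    if len(y) < 2 * season:
        raise ValueError("need at least two full seasons of history")

    first, second = y[:season], y[season:2 * season]
    # Initial level: mean of the first season.
    level = sum(first) / season
    # Initial trend: average per-day change between the first two seasons.
    trend = (sum(second) - sum(first)) / (season * season)
    # Initial seasonal offsets: deviation of each day from the initial level.
    seasonal = [v - level for v in first]

    for t in range(season, len(y)):
        idx = t % season
        prev_level = level
        # Level: deseasonalized observation blended with the previous projection.
        level = alpha * (y[t] - seasonal[idx]) + (1 - alpha) * (prev_level + trend)
        # Trend: change in level blended with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        # Seasonal offset for this position in the week.
        seasonal[idx] = gamma * (y[t] - level) + (1 - gamma) * seasonal[idx]

    n = len(y)
    return [level + (h + 1) * trend + seasonal[(n + h) % season]
            for h in range(horizon)]


if __name__ == "__main__":
    # Hypothetical weekly pattern (Mon..Sun), repeated for four weeks.
    weekly_pattern = [100, 90, 95, 110, 150, 200, 170]
    history = weekly_pattern * 4
    print(holt_winters_additive(history, horizon=7))
```

A baseline like this is mainly useful as a yardstick: a more complex model should beat it on held-out days before it is trusted for ordering.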

Feature Engineering for Fresh Food

Crucial features for an ML model predicting daily SKU-level demand:

Forecast Reconciliation

Predictions are often generated at the store-SKU level, but purchasing decisions happen at the distribution center (DC)-category level. Hierarchical reconciliation makes the forecasts coherent, so that the store forecasts sum to the regional forecast, which helps limit bullwhip effects across the network.
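
The coherence requirement described above can be sketched with the two simplest reconciliation schemes: bottom-up (the regional forecast is defined as the sum of store forecasts) and proportional top-down (store forecasts are rescaled to match a separately produced regional forecast). The function names and store labels are hypothetical, and this is not an optimal reconciliation method; it only shows what "the store forecasts sum to the regional forecast" means in code.

```python
def bottom_up(store_forecasts):
    """Regional forecast as the sum of store forecasts (coherent by construction)."""
    return sum(store_forecasts.values())


def top_down_proportional(store_forecasts, regional_forecast):
    """Rescale store forecasts so they sum to a given regional forecast.

    Each store keeps its share of the original store-level total.
    """
    total = sum(store_forecasts.values())
    if total == 0:
        # No signal for the split: fall back to equal shares.
        share = regional_forecast / len(store_forecasts)
        return {store: share for store in store_forecasts}
    return {store: regional_forecast * f / total
            for store, f in store_forecasts.items()}


if __name__ == "__main__":
    # Hypothetical store-level forecasts for one category and day.
    stores = {"store_A": 40.0, "store_B": 60.0}
    print(bottom_up(stores))
    print(top_down_proportional(stores, regional_forecast=120.0))
```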

The Asymmetric Cost of Over- vs Under-Forecasting

As discussed in PerishableInventoryTheory, the cost of over-forecasting (c_o, the cost of a wasted unit) is often higher than the cost of under-forecasting (c_u, the lost margin on a missed sale). Models are therefore often trained with an asymmetric loss function, such as the pinball loss for quantile regression, where y is the actual demand, \hat{y} the forecast, and \tau the target quantile:

L_\tau(y, \hat{y}) = \begin{cases} \tau (y - \hat{y}) & \text{if } y \geq \hat{y} \\ (1 - \tau) (\hat{y} - y) & \text{if } y < \hat{y} \end{cases}

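Setting \tau < 0.5 penalizes over-forecasts more heavily than under-forecasts, so the fitted model deliberately under-forecasts, explicitly trading off out-of-stocks against food waste (see FreshFoodWasteScience). In the newsvendor setting the cost-minimizing quantile is \tau = c_u / (c_u + c_o), which is below 0.5 exactly when c_o > c_u.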
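
The pinball loss above translates directly into code. The sketch below uses hypothetical function names and a brute-force search (an assumption made for clarity, not how a model would be trained) to show that the constant forecast minimizing the mean pinball loss moves down as \tau decreases.

```python
def pinball_loss(y, y_hat, tau):
    """Pinball (quantile) loss for a single actual y and forecast y_hat."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    diff = y - y_hat
    # Under-forecast (y >= y_hat) is weighted by tau,
    # over-forecast (y < y_hat) by (1 - tau).
    return tau * diff if diff >= 0 else (1 - tau) * (-diff)


def mean_pinball_loss(actuals, forecast, tau):
    """Average pinball loss of one constant forecast over a list of actuals."""
    return sum(pinball_loss(y, forecast, tau) for y in actuals) / len(actuals)


def best_constant_forecast(actuals, tau):
    """Brute-force search over observed values for the loss-minimizing constant."""
    candidates = sorted(set(actuals))
    return min(candidates, key=lambda c: mean_pinball_loss(actuals, c, tau))


if __name__ == "__main__":
    demand = list(range(1, 11))  # hypothetical daily demand values
    for tau in (0.25, 0.75):
        print(tau, best_constant_forecast(demand, tau))
```

With a low \tau the minimizer sits in the lower part of the demand distribution, which is the under-forecasting bias described above.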

Worked Example: A supermarket sells fresh baked croissants.
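
One way such an example might be worked through is sketched below with entirely hypothetical numbers: the price, unit cost, and demand history are assumptions chosen for illustration, not figures from any retailer. The sketch derives c_u and c_o, computes the newsvendor critical ratio \tau = c_u / (c_u + c_o), and reads the order quantity off the empirical \tau-quantile of past demand.

```python
import math


def critical_ratio(c_u, c_o):
    """Newsvendor critical ratio: the demand quantile that minimizes expected cost."""
    return c_u / (c_u + c_o)


def empirical_quantile(samples, tau):
    """Smallest observed value with at least a tau share of samples at or below it."""
    ordered = sorted(samples)
    k = max(1, math.ceil(tau * len(ordered)))
    return ordered[k - 1]


# Hypothetical economics, in cents per croissant (assumed, not sourced).
PRICE_CENTS = 120
UNIT_COST_CENTS = 90

c_u = PRICE_CENTS - UNIT_COST_CENTS  # margin lost on a missed sale
c_o = UNIT_COST_CENTS                # cost of a croissant thrown away
tau = critical_ratio(c_u, c_o)

# Hypothetical demand on 20 comparable past days.
daily_demand = [120, 135, 128, 150, 142, 160, 155, 131, 138, 147,
                125, 158, 144, 133, 139, 152, 129, 146, 141, 136]

order_quantity = empirical_quantile(daily_demand, tau)

if __name__ == "__main__":
    print(f"tau = {tau:.2f}, order quantity = {order_quantity}")
```

Because c_o exceeds c_u under these assumed costs, the resulting order quantity lies below the median of past demand.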

Industry Comparisons

References