LightGBM V7
Dual gradient boosting model for temperature mean and variance prediction
Dual Model Architecture
The prediction engine runs two separate LightGBM models trained on historical weather data. The first model predicts the mean temperature for each station on a given date. The second model predicts the variance, capturing uncertainty in the forecast. Together, they define a Gaussian distribution over possible temperatures.
38 Engineered Features
Each prediction uses 38 carefully engineered features derived from ensemble weather forecasts, historical patterns, and temporal signals. Features include the ensemble mean, spread, and min/max across the GFS, ECMWF, ICON, and GEM models; day-of-year cyclical encodings; station latitude/longitude; and rolling historical accuracy metrics.
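A few of these features can be sketched as below. The function name, column layout, and exact formulas are illustrative, covering only a handful of the 38 features (ensemble statistics, cyclical day-of-year encoding, station coordinates).

```python
import numpy as np
import pandas as pd

def build_features(ens: pd.DataFrame, day_of_year: int,
                   lat: float, lon: float) -> dict:
    """Sketch of a feature builder; `ens` holds one column per
    ensemble model (GFS, ECMWF, ICON, GEM) for a single station/date."""
    row = ens.iloc[0]
    return {
        "ens_mean": row.mean(),
        "ens_spread": row.std(),
        "ens_min": row.min(),
        "ens_max": row.max(),
        # Cyclical encoding keeps Dec 31 adjacent to Jan 1.
        "doy_sin": np.sin(2 * np.pi * day_of_year / 365.25),
        "doy_cos": np.cos(2 * np.pi * day_of_year / 365.25),
        "lat": lat,
        "lon": lon,
    }

# Example: one station/date with four ensemble forecasts.
feats = build_features(
    pd.DataFrame({"GFS": [10.0], "ECMWF": [12.0], "ICON": [11.0], "GEM": [9.0]}),
    day_of_year=180, lat=52.5, lon=13.4,
)
```

The sin/cos pair is the standard way to feed a periodic signal to a tree model without a discontinuity at the year boundary.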
Optuna Hyperparameter Tuning
Hyperparameters are optimized using Optuna with a Bayesian TPE sampler over 200 trials. The objective minimizes out-of-sample Brier score on calibrated bracket probabilities, ensuring the model is tuned end-to-end for the actual trading task rather than raw temperature accuracy.