PyTorch is a scientific computing library developed by Facebook and first released in 2016. It is a Python package that harnesses the power of GPUs (graphics processing units), and it is one of the most popular deep learning frameworks, used daily by machine learning engineers and data scientists. It is a flexible and fast framework that provides tensors with GPU acceleration and builds neural networks on a tape-based autograd system.
This library has had a major impact on deep learning and AI work, and because it is completely Python-based, it is easy to create a neural network with it.
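For example, here is a minimal sketch of what working with PyTorch looks like: creating a tensor, moving it to a GPU when one is available, and running a tiny network through a forward and backward pass.
import torch
import torch.nn as nn

# move computation to the GPU if one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.randn(32, 10, device=device)  # a batch of 32 samples with 10 features

# a tiny fully-connected network
model = nn.Sequential(
    nn.Linear(10, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
).to(device)

y = model(x)        # forward pass
y.sum().backward()  # autograd computes the gradients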
PyTorch is capable of handling a full deep learning and AI pipeline, but some tasks, such as time series forecasting, can get pretty messy in plain PyTorch. For this, Jan Beitner introduced a third-party package called PyTorch Forecasting.
Time series forecasting is important, and with the help of this third-party framework built on top of PyTorch, we can do it as easily as with TensorFlow. Forecasting has been used in industry for a very long time: many businesses rely on it to gain extra profit by predicting future outcomes and staying on the safe side. In the last three years, deep learning methods have surpassed traditional forecasting methods.
PyTorch Forecasting is a framework built on top of PyTorch Lightning that eases time series forecasting with neural networks for real-world use cases. It provides state-of-the-art time series forecasting architectures that can easily be trained on your input data. The library groups the different requirements into dedicated classes, for example:
- TimeSeriesDataSet prepares the dataset for training with PyTorch; this class takes care of variable transformation, random sampling, filling of missing values, etc.
- The BaseModel class provides data visualization, such as plotting predicted values against actual values.
Some of the other features we get with PyTorch Forecasting are:
- Faster model training, as it is built on PyTorch Lightning, which allows you to train the model on a CPU as well as on multiple GPUs.
- Temporal Fusion Transformer: an architecture developed by Oxford University and Google for interpretable multi-horizon time series forecasting that beat Amazon’s DeepAR by 39-69% in benchmarks.
- N-BEATS model
- DeepAR model: the most popular baseline model for time series forecasting.
- Ranger optimizer for faster model training.
- Hyperparameter tuning with Optuna.
Installation
First, install PyTorch, since the forecasting library is built on top of it. Install PyTorch with this command:
pip install torch -f https://download.pytorch.org/whl/torch_stable.html
Install PyTorch Forecasting using pip:
pip install pytorch-forecasting
Or you can install it from the conda-forge channel:
conda install -c conda-forge pytorch-forecasting
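After installation, a quick way to confirm that both packages are importable is to print their versions (assuming, as is standard, that both expose a __version__ attribute):
import torch
import pytorch_forecasting

print(torch.__version__)
print(pytorch_forecasting.__version__)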
Getting Started with Pytorch Forecasting
We are going to use the Stallion dataset from Kaggle. This dataset contains the following files:
- pricesalespromotion.csv: holds the price, sales & promotion in dollars.
- historicalvolume.csv: contains sales data.
- weather.csv: the average maximum temperature at each agency, monthly.
- industrysodasales.csv: holds industry-level soda sales.
- eventcalendar.csv: holds event details (sports, carnivals, etc.).
- industry_volume.csv: industry actual beer volume.
- demographics.csv: demographic details.
This dataset contains the sales of various beverages. Our goal is to forecast six months of sold volume by stock-keeping unit (SKU).
This dataset is already included in the PyTorch Forecasting library, so you can import it using the commands below:
from pytorch_forecasting.data.examples import get_stallion_data
data = get_stallion_data() # load data into pandas dataframe
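Since get_stallion_data returns a regular pandas DataFrame, you can take a first look at it with the usual pandas tools:
print(data.shape)       # number of rows and columns
print(data.head())      # first few rows
print(data.describe())  # summary statistics of the numeric columns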
Data Cleaning
The commands below perform the required data cleaning: creating a time index, adding features, and reversing the one-hot encoding of the special-day columns.
data["date"].dt.monthdata["time_idx"] -= data["time_idx"].min() data["time_idx"] = data["date"].dt.year * 12 + data["date"].dt.monthdata["time_idx"] # adding features # categories must be string data["month"] = data.date.dt.month.astype(str).astype("category") data["log_volume"] = np.log(data.volume + 1e-8) data["avg_volume_by_sku"] = ( data .groupby(["time_idx", "sku"], observed=True) .volume.transform("mean") ) data["avg_volume_by_agency"] = ( data .groupby(["time_idx", "agency"], observed=True) .volume.transform("mean") ) # we will encode special days into one variable and apply Reverse one-hot encoding special_days = [ "easter_day", "good_friday", "new_year", "christmas", "labor_day", "independence_day", "revolution_day_memorial", "regional_games", "fifa_u_17_world_cup", "football_gold_cup", "beer_capital", "music_fest" ] data[special_days] = ( data[special_days] .apply(lambda x: x.map({0: "-", 1: x.name})) .astype("category") ) # sample data data.sample(10, random_state=521)
Let’s convert the cleaned dataset into the PyTorch Forecasting format.
from pytorch_forecasting.data import (
    TimeSeriesDataSet,
    GroupNormalizer,
)

max_prediction_length = 6  # forecast 6 months
max_encoder_length = 24  # use a history of 24 months
training_cutoff = data["time_idx"].max() - max_prediction_length

training = TimeSeriesDataSet(
    data[lambda x: x.time_idx <= training_cutoff],
    time_idx="time_idx",
    target="volume",
    group_ids=["agency", "sku"],
    min_encoder_length=0,  # allow predictions without history
    max_encoder_length=max_encoder_length,
    min_prediction_length=1,
    max_prediction_length=max_prediction_length,
    static_categoricals=["agency", "sku"],
    static_reals=[
        "avg_population_2017",
        "avg_yearly_household_income_2017",
    ],
    time_varying_known_categoricals=["special_days", "month"],
    # a group of categorical variables can be treated as one variable
    variable_groups={"special_days": special_days},
    time_varying_known_reals=[
        "time_idx", "price_regular", "discount_in_percent"
    ],
    time_varying_unknown_categoricals=[],
    time_varying_unknown_reals=[
        "volume",
        "log_volume",
        "industry_volume",
        "soda_volume",
        "avg_max_temp",
        "avg_volume_by_agency",
        "avg_volume_by_sku",
    ],
    # use softplus with beta=1.0 and normalize by group
    target_normalizer=GroupNormalizer(
        groups=["agency", "sku"], coerce_positive=1.0
    ),
    add_relative_time_idx=True,  # add as feature
    add_target_scales=True,  # add as feature
    add_encoder_length=True,  # add as feature
)

# create a validation set (predict=True), which means predicting the
# last max_prediction_length points in time for each series
validation = TimeSeriesDataSet.from_dataset(
    training, data, predict=True, stop_randomization=True
)

# create dataloaders for the model
batch_size = 128
train_dataloader = training.to_dataloader(
    train=True, batch_size=batch_size, num_workers=0
)
val_dataloader = validation.to_dataloader(
    train=False, batch_size=batch_size * 10, num_workers=0
)
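As a sanity check, you can pull a single batch from the training dataloader. Each batch is a pair of model inputs and targets, where the inputs are a dictionary of encoder and decoder tensors (the exact keys may vary between library versions):
x, y = next(iter(train_dataloader))
print(x.keys())                 # e.g. encoder_cat, encoder_cont, decoder_cat, ...
print(x["encoder_cont"].shape)  # (batch size, encoder length, number of features)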
Train the Temporal Fusion Transformer
import pytorch_lightning as pl
from pytorch_lightning.callbacks import (
    EarlyStopping,
    LearningRateLogger,
)
from pytorch_lightning.loggers import TensorBoardLogger
from pytorch_forecasting.metrics import QuantileLoss
from pytorch_forecasting.models import TemporalFusionTransformer

# training stops when the validation loss does not improve
early_stop_callback = EarlyStopping(
    monitor="val_loss",
    min_delta=1e-4,
    patience=10,
    verbose=False,
    mode="min",
)
lr_logger = LearningRateLogger()  # log the learning rate
logger = TensorBoardLogger("lightning_logs")  # log results to tensorboard

# creating the trainer
trainer = pl.Trainer(
    max_epochs=30,
    gpus=0,  # train on CPU
    gradient_clip_val=0.1,
    early_stop_callback=early_stop_callback,
    limit_train_batches=30,  # 30 training batches per epoch, so validation runs every 30 batches
    callbacks=[lr_logger],
    logger=logger,
)

# initialise the model
tft = TemporalFusionTransformer.from_dataset(
    training,
    learning_rate=0.03,
    hidden_size=16,  # biggest influence on network size
    attention_head_size=1,
    dropout=0.2,
    hidden_continuous_size=8,
    output_size=7,  # by default, QuantileLoss has 7 quantiles
    loss=QuantileLoss(),
    log_interval=10,  # log an example every 10 batches
    reduce_on_plateau_patience=4,  # reduce the learning rate automatically
)
tft.size()  # 29.6k parameters in the model

# fit the network
trainer.fit(
    tft,
    train_dataloader=train_dataloader,
    val_dataloaders=val_dataloader,
)
During training, you can also view prediction visualizations in TensorBoard by running tensorboard --logdir=lightning_logs.
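The library also ships a helper for the Optuna-based hyperparameter tuning mentioned in the feature list above. A minimal sketch, where the search ranges and trial counts are illustrative choices rather than recommendations, and the exact signature may differ between versions:
from pytorch_forecasting.models.temporal_fusion_transformer.tuning import (
    optimize_hyperparameters,
)

study = optimize_hyperparameters(
    train_dataloader,
    val_dataloader,
    model_path="optuna_test",  # where trial checkpoints are saved
    n_trials=30,
    max_epochs=20,
    hidden_size_range=(8, 64),
    learning_rate_range=(1e-3, 1e-1),
)
print(study.best_trial.params)  # the best hyperparameters found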
Code of this tutorial is available here.
Conclusion
As you have seen, it is easy to train and analyze time series data using the PyTorch Forecasting framework. You can also evaluate the trained model using metrics such as MAE (mean absolute error). Another feature of this framework is interpretation of trained models: you can inspect variable importances, which the network computes by design. Read more about PyTorch Forecasting here.
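As a sketch of what evaluation and interpretation can look like, written against the same library versions as the code above (the structure of the dataloader output may differ in newer releases):
import torch

# load the best checkpoint saved during training
best_model_path = trainer.checkpoint_callback.best_model_path
best_tft = TemporalFusionTransformer.load_from_checkpoint(best_model_path)

# mean absolute error on the validation set
actuals = torch.cat([y for x, y in iter(val_dataloader)])
predictions = best_tft.predict(val_dataloader)
print((actuals - predictions).abs().mean())

# variable importances and attention, plotted by the model itself
raw_predictions, x = best_tft.predict(val_dataloader, mode="raw", return_x=True)
interpretation = best_tft.interpret_output(raw_predictions, reduction="sum")
best_tft.plot_interpretation(interpretation)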
PyTorch Forecasting is open-sourced on GitHub here; if you want to contribute or submit an issue, the community support for this library is very welcoming. To learn more about the individual models, refer to the PyTorch Forecasting documentation.