Darts - A magical Python library for time series forecasting

Introduction

Time series forecasting is the process of building models from historical, time-stamped data in order to make data-driven projections that guide future strategic decisions. It is useful across many industries, including:

  • Forecasting consumer demand for each product

  • Healthcare planning, diagnosis, and forecasting of pandemic spread

  • Cybersecurity, anomaly detection, and preventive maintenance

  • Capacity planning, that is, determining whether infrastructure can handle current and future traffic

Time series forecasting differs from standard machine learning use cases because the data is ordered in time, and that ordering must be respected during feature engineering and modelling.
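One practical consequence of temporal ordering is that training and validation data should be split by time rather than shuffled at random. The following is a minimal plain-Python sketch of that idea; the helper name `chronological_split` and the sample values are made up for illustration and are not part of any library.

```python
def chronological_split(observations, train_fraction):
    """Split time-ordered observations without shuffling them."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    cut = int(len(observations) * train_fraction)
    # Everything before the cut is the past (training),
    # everything after it is the future (validation).
    return observations[:cut], observations[cut:]


# Ten (month, value) pairs, already sorted by time; the values are made up.
observations = [(month, 100 + 5 * month) for month in range(1, 11)]
train, validation = chronological_split(observations, 0.8)

print(train[-1])       # last training point
print(validation[0])   # first validation point
```

Because the split never reorders the data, every validation point lies strictly after every training point, which mirrors how a forecast is used in practice.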

To train a time series forecasting model, you typically end up using pandas for pre-processing, statsmodels for statistical tests and seasonality analysis, scikit-learn or Facebook Prophet for forecasting, and custom code for backtesting and model selection.

Because these libraries have different APIs and data formats, data scientists find it difficult to build a forecasting pipeline end to end. For standard machine learning use cases, by contrast, the scikit-learn package already offers a uniform API for end-to-end modelling.

The main objective of Darts, which aims to be a scikit-learn for time series, is to streamline the whole time series forecasting workflow. This article covers the Darts package and how to use it.

The usage of the DARTS library

Darts is a Python library for simple time series manipulation and forecasting. It provides a range of models, from well-known ones like ARIMA to deep neural networks, that are all used the same way as scikit-learn models, through fit() and predict() functions.

The following are some characteristics of the Darts package:

  • It is based on the immutable TimeSeries class.

  • Its fit() and predict() APIs, which are similar to those in scikit-learn, are unified and user-friendly.

  • It provides a range of models, including both traditional models and cutting-edge ML/DL techniques.

  • It offers APIs for end-to-end time series forecasting use cases, including data exploration, data preprocessing, forecasting, and model selection and evaluation.

Code illustration

Let's walk through the core features of the open-source Darts package using the Monthly Air Passengers dataset, which covers 1949 to 1960.

Installation

Installing the Darts library is easy. It takes a single line, just like installing any other Python library.

!pip install darts

Load the Timeseries

Load the Air Passengers dataset, split it into training and validation sets, and then plot the two sets together on the same axes.

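from darts.datasets import AirPassengersDataset

# Load the monthly air passengers series (1949 to 1960)
series = AirPassengersDataset().load()
# Keep the first 80% of the timeline for training and the rest for validation
training, validation = series.split_before(0.80)
# Draw both parts on the same axes
training.plot(label='Training')
validation.plot(label='Validation')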

Forecasting

Darts implements many traditional and cutting-edge time series models, such as ARIMA, Theta, Exponential Smoothing, N-BEATS, and Facebook Prophet. The example below fits a Theta model on the training set and forecasts the validation period.

from darts.models import Theta

# Fit the theta model
model_theta = Theta()
model_theta.fit(training)

# Validation prediction
pred_theta = model_theta.predict(len(validation))

# Visualize
training.plot(label='Training')
validation.plot(label='True')
pred_theta.plot(label='Prediction')
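The plot gives a visual check, but a numeric score makes forecasts comparable. Darts exposes error metrics such as `mape` in `darts.metrics`; the plain-Python function below is only a stand-in that shows the calculation on made-up numbers, and its name is hypothetical.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return MAPE in percent; actual values must be non-zero."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Relative error of each point, averaged and expressed as a percentage.
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Made-up values for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
score = mean_absolute_percentage_error(actual, predicted)
print(round(score, 2))
```

With the objects from the forecasting example, the corresponding Darts call would be `mape(validation, pred_theta)` after `from darts.metrics import mape`.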

Model Evaluation and Hyperparameter tuning

Darts provides tools to measure your model's performance and to optimise it by searching over the estimator's hyperparameters. The example below runs a grid search over two Theta hyperparameters.

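from darts.models import Theta
from darts.utils.utils import SeasonalityMode

# Candidate values to try for each Theta hyperparameter
parameters = {'theta': [0.5, 1., 1.5, 2., 2.5],
              'season_mode': [SeasonalityMode.MULTIPLICATIVE, SeasonalityMode.ADDITIVE]}

# Backtest every combination on the training set: forecasts start halfway
# through the series (start=0.5) and each one looks 12 months ahead
best_model, best_params, best_score = Theta.gridsearch(parameters=parameters, series=training, start=0.5, forecast_horizon=12)
print(best_model)   # untrained model instance built with the best parameters
print(best_params)  # dictionary of the best parameter values
print(best_score)   # error of the best combination (lower is better)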
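Passing `forecast_horizon` means each parameter combination is scored by backtesting rather than on a single split. The sketch below illustrates the general expanding-window idea in plain Python, assuming a seasonal-naive forecaster and MAPE as the error measure. It is not Darts' internal code, and the helper names and data are hypothetical.

```python
def seasonal_naive(history, horizon, period):
    """Forecast by repeating the last full seasonal cycle."""
    last_cycle = history[-period:]
    return [last_cycle[i % period] for i in range(horizon)]


def expanding_window_backtest(values, start_fraction, horizon, period):
    """Average the MAPE of forecasts made from ever-longer training windows."""
    first_cut = int(len(values) * start_fraction)
    scores = []
    for cut in range(first_cut, len(values) - horizon + 1):
        # Train on everything before the cut, score on the next `horizon` points.
        history, future = values[:cut], values[cut:cut + horizon]
        forecast = seasonal_naive(history, horizon, period)
        errors = [abs((a - f) / a) for a, f in zip(future, forecast)]
        scores.append(100.0 * sum(errors) / len(errors))
    if not scores:
        raise ValueError("series too short for the requested start and horizon")
    return sum(scores) / len(scores)


# Made-up series: a 4-step seasonal pattern that grows by 10 every cycle.
pattern = [10.0, 20.0, 30.0, 20.0]
values = [p + 10.0 * cycle for cycle in range(6) for p in pattern]

score = expanding_window_backtest(values, start_fraction=0.5, horizon=4, period=4)
print(round(score, 2))
```

A grid search repeats this kind of evaluation for every parameter combination and keeps the one with the lowest average error.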

Conclusion

Darts is a practical tool that provides up-to-date, user-friendly ML implementations tailored to time series use cases, from loading and splitting data to forecasting, evaluation, and hyperparameter tuning.

Depending on the version you install, some features, such as static covariates, AutoML, anomaly detection, and pre-trained models, may not be available in the Darts package, so check the documentation for your release.

Written by

Sanjay Nandakumar

Data scientist | ML Engineer | Statistician