DARTS - A magical Python library for time series forecasting
Introduction
Time series forecasting is the process of building models from past time-stamped data in order to project future values and guide strategic decision-making. It is useful across many industries, including:
Forecasting demand from consumers for each product
Healthcare planning, diagnosis, and forecasting of pandemic spread
Cybersecurity, anomaly detection, and preventive maintenance
Capacity planning, to determine whether infrastructure can handle current and future traffic
Due to the temporal ordering of the data, which must be taken into account during feature engineering and modelling, time series forecasting differs slightly from standard machine learning use cases.
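To make the ordering constraint concrete, here is a minimal sketch in plain Python. The toy values and the helper names `make_lag_features` and `chronological_split` are illustrative assumptions, not part of any library. Features for a given time step are built only from earlier observations, and the hold-out set is the most recent slice of the data rather than a random sample.

```python
def make_lag_features(values, n_lags):
    """Build (features, target) rows; features are the n_lags values before the target."""
    rows = []
    for t in range(n_lags, len(values)):
        rows.append((values[t - n_lags:t], values[t]))
    return rows


def chronological_split(rows, train_fraction):
    """Split without shuffling: the earliest rows train, the latest rows validate."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Toy monthly values, used only for illustration.
monthly_demand = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]

rows = make_lag_features(monthly_demand, n_lags=3)
train_rows, valid_rows = chronological_split(rows, train_fraction=0.8)

print(train_rows[0])
print(len(train_rows), len(valid_rows))
```

A random train/test split, as commonly used for tabular data, would let the model see observations that come after the ones it is asked to predict, which is why the split here is purely positional.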
To train a time series forecasting model, you typically end up using pandas for pre-processing, statsmodels for statistical tests and seasonality analysis, scikit-learn or Facebook Prophet for forecasting, and custom code for backtesting and model selection.
Because these libraries have different APIs and data formats, building a time series forecasting workflow from beginning to end is cumbersome for data scientists. For standard machine learning use cases, by contrast, the scikit-learn package offers a uniform API for end-to-end modelling.
The main objective of Darts, which aims to be a scikit-learn for time series, is to streamline the whole time series forecasting workflow. This article covers the Darts package and how to use it.
Using the Darts library
Darts is a Python library for simple time series manipulation and forecasting. It provides a range of models, from well-known ones like ARIMA to deep neural networks, all of which are used in the same way as scikit-learn models (through fit and predict APIs).
The following are some characteristics of the Darts package:
It is based on the immutable TimeSeries class.
Its fit() and predict() APIs are unified, user-friendly, and similar to those in scikit-learn.
It provides a range of models, including both traditional models and cutting-edge ML/DL techniques.
It offers APIs for end-to-end time series forecasting use cases, including data discovery, data pre-processing, forecasting, and model selection and evaluation.
Code illustration
Let's explore and put into practice the core features of the open-source Darts package, using the Monthly Air Passengers dataset, which covers 1949 to 1960.
Installation
Installing the Darts library is straightforward. As with any other Python library, a single line is enough (the leading ! runs the command from a notebook cell).
!pip install darts
Load the Timeseries
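Load the Air Passengers dataset, split it into training and validation sets, and then plot the two together.
from darts.datasets import AirPassengersDataset
# Load the monthly Air Passengers series as a Darts TimeSeries
series = AirPassengersDataset().load()
# Split chronologically at the 80% point of the series
training, validation = series.split_before(0.80)
# Plot both parts on the same axes
training.plot(label='Training')
validation.plot(label='Validation')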
Forecasting
Darts implements many traditional and cutting-edge time series modelling methods, such as ARIMA, Theta, Exponential Smoothing, N-BEATS, and Facebook Prophet. The example below fits a Theta model on the training set and forecasts over the validation period.
from darts.models import Theta
# Fit the theta model
model_theta = Theta()
model_theta.fit(training)
# Validation prediction
pred_theta = model_theta.predict(len(validation))
# Visualize
training.plot(label='Training')
validation.plot(label='True')
pred_theta.plot(label='Prediction')
Model Evaluation and Hyperparameter tuning
Darts provides tools to measure your model's performance and to optimise the model by tuning the estimator's hyperparameters.
from darts.utils.utils import SeasonalityMode
parameters = {'theta': [0.5, 1., 1.5, 2., 2.5],
'season_mode': [SeasonalityMode.MULTIPLICATIVE, SeasonalityMode.ADDITIVE]}
best_model, best_params, best_score = Theta.gridsearch(parameters=parameters, series=training, start=0.5, forecast_horizon=12)
print(best_model)
print(best_params)
print(best_score)
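As a rough illustration of what a grid search with `start=0.5` and `forecast_horizon=12` evaluates, the sketch below runs an expanding-window backtest in plain Python. It is not Darts' implementation: the `seasonal_naive` model, its `lag` parameter, the `backtest` helper, and the synthetic series are all assumptions chosen to keep the example self-contained, and MAPE is used as the score.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def seasonal_naive(history, horizon, lag):
    """Forecast by repeating the value observed `lag` steps earlier."""
    forecast = []
    for _ in range(horizon):
        source = history + forecast
        forecast.append(source[len(source) - lag])
    return forecast


def backtest(series, lag, start, horizon):
    """Average MAPE over expanding training windows, beginning at fraction `start`."""
    first_cut = int(len(series) * start)
    scores = []
    for cut in range(first_cut, len(series) - horizon + 1):
        history, future = series[:cut], series[cut:cut + horizon]
        scores.append(mape(future, seasonal_naive(history, horizon, lag)))
    return sum(scores) / len(scores)


# Synthetic series with a period of 12 and a mild upward trend.
pattern = [10, 12, 15, 13, 11, 14, 18, 20, 16, 12, 11, 13]
series = [pattern[i % 12] + 0.5 * (i // 12) for i in range(72)]

# Try every candidate value and keep the one with the lowest average error.
grid = {"lag": [1, 6, 12]}
results = {lag: backtest(series, lag, start=0.5, horizon=12) for lag in grid["lag"]}
best_lag = min(results, key=results.get)

print(best_lag, results[best_lag])
```

The key design point is that every candidate is scored only on data that comes after its training window, so the selected hyperparameters reflect genuine out-of-sample forecasting performance.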
Conclusion
Darts is a practical tool that provides modern, user-friendly ML implementations tailored to time series use cases.
Some features, such as static covariates, AutoML, anomaly detection, and pre-trained models, may not be available in the Darts package, depending on the version you install.
Written by
Sanjay Nandakumar
Data scientist | ML Engineer | Statistician