Concept Drift and Data Drift in Machine Learning


Machine Learning models are great when they achieve high accuracy. However, once deployed to the real world, the accuracy often turns out to be lower than it was in the training environment.
This type of scenario is usually associated with "Data Drift" and "Concept Drift".
Model Staleness
As the world changes, model performance deteriorates. This is known as Model Staleness or Model Decay. The model quality metric is the final yardstick: accuracy, mean error rate, or a downstream business KPI such as click-through rate (CTR) can all be considered.
Although no model lasts forever, the rate of deterioration varies.
A model could go for years without an update, for instance certain language or computer vision models, or a decision-making system in a stable, isolated setting.
Others may require daily retraining on the most recent data.
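To make staleness monitoring concrete, here is a minimal Python sketch, assuming ground-truth labels eventually arrive so a live quality metric can be computed. The baseline and tolerance values are placeholders, not recommendations.

```python
# A minimal staleness check: compare the live quality metric against the
# level measured at deployment, and flag the model once it decays too far.
from sklearn.metrics import accuracy_score

BASELINE_ACCURACY = 0.92   # accuracy measured at deployment time (assumed)
TOLERATED_DROP = 0.05      # how much decay we accept before retraining (assumed)

def is_stale(y_true, y_pred) -> bool:
    """Return True when the live metric has decayed past the tolerance."""
    live_accuracy = accuracy_score(y_true, y_pred)
    return live_accuracy < BASELINE_ACCURACY - TOLERATED_DROP
```

The same pattern works for any yardstick: swap accuracy for error rate or a business KPI such as CTR.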
Data Drift
Data drift happens when the statistical properties of the incoming data change over time. Consider, for instance, a model built for an e-commerce platform to forecast user behavior. Its performance may deteriorate as new customer cohorts exhibit characteristics that differ from those in the training data. Data drift can be detected by monitoring feature distributions and summary statistics.
Suppose a scenario: Daraz builds a model to predict the likelihood that a person will buy a product, so that Daraz can send them personalized offers. When the model was developed, products were advertised only through Facebook. Recently, however, Daraz started advertising through Twitter as well. As a result, data unlike anything seen during training is being fed to the model, and its accuracy drops.
This type of scenario is known as Data Drift.
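One simple way to keep an eye on feature distributions is a two-sample statistical test between a reference sample saved at training time and a window of production data. The sketch below uses SciPy's Kolmogorov-Smirnov test; the significance threshold is an assumption, and in practice you would run a check like this per feature and per time window.

```python
# Flag a numeric feature whose production distribution no longer matches
# the distribution it had in the training data.
from scipy.stats import ks_2samp

def feature_has_drifted(train_values, production_values, alpha=0.05) -> bool:
    """Two-sample KS test: a small p-value means the distributions likely differ."""
    result = ks_2samp(train_values, production_values)
    return result.pvalue < alpha
```

In the Daraz scenario above, a check like this on the user features would start firing once Twitter-driven traffic began to arrive.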
Concept Drift
Concept drift happens when previously observed patterns stop holding.
Unlike data drift, the input distributions (such as user demographics, word frequency, etc.) may stay exactly the same. Instead, the relationships between the model's inputs and outputs change.
In essence, the very meaning of what we are trying to predict evolves. Depending on the scale of the change, this can make the model less accurate or even obsolete.
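A tiny synthetic example makes the distinction visible. Below, the input distribution is identical before and after the change, so a data-drift test sees nothing, yet accuracy collapses because the input-output relationship has flipped. This is an illustrative sketch, not a production recipe.

```python
# Concept drift in miniature: same inputs, flipped relationship.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Training period: the label is 1 whenever the feature is positive.
X_train = rng.normal(size=(1000, 1))
y_train = (X_train[:, 0] > 0).astype(int)
model = LogisticRegression().fit(X_train, y_train)

# Later in production: the feature distribution is unchanged,
# but the concept has flipped.
X_prod = rng.normal(size=(1000, 1))
y_prod = (X_prod[:, 0] <= 0).astype(int)

print("KS p-value on inputs:", ks_2samp(X_train[:, 0], X_prod[:, 0]).pvalue)  # large: no data drift
print("Accuracy on new data:", model.score(X_prod, y_prod))                   # near zero
```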
Concept drift comes in a variety of forms.
Gradual Concept Drift
Gradual or incremental drift is the one we expect.
The world changes, and the model grows old. Its quality declines along with the gradual changes in external factors.
For example, suppose you are tasked with building a live facial recognition system for the security of a newly opened company building. Initially, only a few people enter and leave the building, so the system works fine. Over time, however, more and more people start coming, and the model may no longer work as well as it did before the volume and variety of faces (the images of people coming in and out) increased. This is an example of gradual concept drift, and it is expected.
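The usual response to gradual drift is scheduled retraining on a sliding window of recent data, so the model keeps pace as the world slowly changes. A rough sketch of the idea follows; `load_labeled_data` and the 90-day window are hypothetical stand-ins for your own data-access layer and retraining cadence.

```python
# Refit the model on only the most recent labeled data.
from datetime import datetime, timedelta
from sklearn.linear_model import LogisticRegression

def retrain_on_recent_window(window_days=90):
    """Retrain on a sliding window so gradual drift stays absorbed."""
    since = datetime.utcnow() - timedelta(days=window_days)
    X, y = load_labeled_data(since=since)  # hypothetical data-access helper
    return LogisticRegression().fit(X, y)
```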
Sudden Concept Drift
External changes can also be more abrupt or significant, and they are hard to miss. As we mentioned in our most recent post, a prime example is the COVID-19 pandemic.
The patterns of commerce and mobility changed essentially overnight. Even models that were deemed to be "stable" in other ways were affected.
Demand-forecasting models could not have predicted that sales of yoga pants would increase by 350%, as happened to Stitch Fix, or that the majority of flights would be canceled due to border closures.
Recurring Concept Drift
Recurring concept drift is essentially "seasonality". Seasonality is a well-known phenomenon in machine learning on time-series data, so if we anticipate this kind of drift, such as a distinct pattern on weekends or around specific holidays, we just need to make sure the model is trained on data that reflects this seasonality. This type of drift typically becomes an issue in production only when a brand-new pattern emerges that the model is unfamiliar with.
From a model-monitoring standpoint, this recurring "drift" has no importance. Weekends happen every week, and we don't need an alert, unless we see a new pattern, of course.
How to treat this not-exactly-drift in production?
Teach your model the seasonality (a minimal sketch follows after this list).
If the model encounters a special event or season for the first time in production, you can use other similar events as examples. For instance, if a new bank holiday is introduced, you can assume it behaves similarly to a known one.
If needed, domain experts can help by adding manual post-processing rules or corrective coefficients on top of the model output.
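As a concrete example of teaching the model the seasonality, calendar information can be turned into explicit features, so weekend and holiday patterns become inputs rather than surprises. This is a minimal pandas sketch; the holiday list is an assumed stand-in for whatever calendar applies to your data.

```python
# Derive calendar features so the model can learn recurring patterns.
import pandas as pd

KNOWN_HOLIDAYS = pd.to_datetime(["2023-01-01", "2023-12-25"])  # assumed example dates

def add_seasonality_features(df: pd.DataFrame, date_col: str = "date") -> pd.DataFrame:
    """Attach day-of-week, weekend, and holiday flags as model features."""
    out = df.copy()
    dates = pd.to_datetime(out[date_col])
    out["day_of_week"] = dates.dt.dayofweek
    out["is_weekend"] = dates.dt.dayofweek >= 5
    out["is_holiday"] = dates.dt.normalize().isin(KNOWN_HOLIDAYS)
    return out
```

For a brand-new event with no history, mapping it to the most similar known holiday, as suggested above, is a pragmatic starting point.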
Written by

Abdullah Muhammad Moosa
I am currently working as a Machine Learning Engineer at ReliSource Technologies Ltd. I am passionate about computer vision and NLP. I have experience with problem solving, React, Flutter, Flask, and Django.