A Dataiku Technical White Paper

A Primer on Data Drift & Drift Detection Techniques

Monitoring model performance drift is a crucial step in production machine learning (ML); in practice, however, it proves challenging for many reasons. In this white paper, you will:

  • Take a brief but deep dive into the underlying definitions and logic behind drift.
  • Explore a real-life use case presenting different scenarios that can cause models and data to drift.
  • Delve into two techniques (the domain classifier and the black-box shift detector) used to assess whether dataset drift is occurring.

Get a Copy of the Tech White Paper from the Experts at Dataiku


About the Authors

Simona Maggio & Du Phan

After obtaining a PhD in Biomedical Image Processing in 2011, Simona Maggio worked at several companies (CEA, Thales, Rakuten) as a Research Engineer in Computer Vision and Natural Language Processing, on applications ranging from video surveillance to document digitization and e-commerce. She is now a Senior Research Scientist at Dataiku, exploring MLOps topics such as model debugging, robustness, and interpretability.

Du Phan is a machine learning engineer at Dataiku, where he works on democratizing data science. Over the past few years, he has worked on a variety of data problems, from geospatial analysis to deep learning. His work now focuses on the different facets and challenges of MLOps.