Dataiku Ebook

Active Learning at Work

Active Learning: Benefits, Challenges, & Use Cases

Bad data, such as missing data, errors, and unlabeled data, can skew the results of machine learning models and harm overall AI efforts.

This ebook discusses the role of active learning, a process that automates data labeling through machine learning algorithms, in addressing data quality. 



What Is Active Learning in Machine Learning?

Read on to Discover:

Before you run any cutting-edge machine learning algorithm or deploy any model, the data involved needs to be high quality and labeled. However, data collection, including cleaning and wrangling, is a tedious, time-consuming, and iterative process that typically involves data labeling and model training.

Active learning is a framework that reduces the cost of the data labeling a model needs to reach a required accuracy. It can be used:

  • When not all data can be annotated because it is too costly or complicated.
  • To speed up the labeling procedure by leveraging previously labeled data.
  • To optimize the order in which unlabeled data is processed.
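
To make the last point concrete, here is a minimal sketch of one common active learning strategy, uncertainty sampling: unlabeled samples are ranked so that the ones the current model is least sure about are labeled first. This is an illustration, not the method described in the ebook. The function names and the probability values are hypothetical, and the sketch assumes a model that outputs class probabilities.

```python
import math


def entropy(probs):
    """Shannon entropy of one predicted class distribution.

    Higher entropy means the model is less certain about the sample.
    """
    return -sum(p * math.log(p) for p in probs if p > 0)


def rank_for_labeling(predicted_probs):
    """Return indices of unlabeled samples, most uncertain first."""
    return sorted(
        range(len(predicted_probs)),
        key=lambda i: entropy(predicted_probs[i]),
        reverse=True,
    )


# Hypothetical outputs of a binary classifier on four unlabeled samples.
predicted_probs = [
    [0.98, 0.02],  # confident prediction: labeling it adds little
    [0.55, 0.45],  # near the decision boundary
    [0.80, 0.20],
    [0.50, 0.50],  # maximally uncertain: label this one first
]

order = rank_for_labeling(predicted_probs)
print(order)  # indices in the order an annotator would see them
```

In a full loop, the top-ranked samples would be sent to an annotator, added to the labeled set, and the model retrained before ranking again.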