Dataiku Participates in the Creation of the First Scikit-learn Consortium

Scikit-learn, the most widely used and only community-driven general machine learning library in the world, announces the creation of an industrial consortium, supported in part by enterprise AI software maker Dataiku.

NEW YORK - September 25th - Launched in 2007 by members of the Python scientific community, the Scikit-learn project was accelerated as part of Inria's research into functional imaging of the brain. Today, ten years later, Inria announces the creation of a consortium of corporate sponsors to speed its development, including enterprise AI development platform maker Dataiku, alongside Microsoft, NVIDIA, Intel, AXA, Boston Consulting Group, and BNP Paribas Cardif.


Mostly unknown to the general public, Scikit-learn is one of the flagship libraries in the field of advanced machine learning. The consortium hopes that, beyond financial support, the initial group of global innovators will help to promote the project to a broader audience and create greater visibility among institutions.


"Today, more than 500,000 data scientists use Scikit-learn daily around the world. It's easy to imagine that the combined salary and the value created by these users of Scikit-learn is over $100 billion a year. The benefits of this project are extraordinary. By becoming a sponsor of this consortium, we are making a fantastic investment for the future of data science in the world," said Florian Douetteau, CEO of Dataiku.


"In addition, at Dataiku, we have been integrating Scikit-learn into our offering since 2013. Dataiku provides us with a clickable version of Scikit-learn to enable everyone to use it, from the business analyst to the most advanced data scientist. This corresponds to a trend of technology innovation, where big scientific advances are often accelerated in communities by open source, then made known to everyone else through the work of software companies," he added.


A software library developed in Python, Scikit-learn is dedicated to machine learning. Its simple and powerful predictive models make it possible to extract insights from data using many different approaches, from efficient linear models on text to random forests, which are well adapted to heterogeneous databases. Scikit-learn's competitors are TensorFlow, supported by Google, and Spark MLlib, supported by Databricks, both of which are also integrated with Dataiku. Compared to TensorFlow, which focuses on deep learning, Scikit-learn provides an unparalleled diversity of algorithms.


Today, Scikit-learn is used by the biggest players in technology: Airbnb for fraud detection, Uber for demand prediction, and Spotify for music recommendation.


To learn more visit:

About Dataiku

Dataiku is the centralized data platform that moves businesses along their data journey from analytics at scale to enterprise AI. More than 200 customers across retail, e-commerce, health care, finance, transportation, the public sector, manufacturing, pharmaceuticals, and more use Dataiku to power self-service analytics while also ensuring the operationalization of machine learning models in production.

Dataiku was founded in 2013 and raised a seed round of €3 million, followed by a $14 million Series A round led by FirstMark Capital in October 2016. In 2017, Dataiku doubled in size and tripled its revenue, culminating in a September 2017 announcement of its $28 million Series B funding round led by Battery Ventures along with FirstMark Capital, Alven Capital, and Serena Capital. The company currently employs more than 175 people between its headquarters in New York and offices in Paris, London, and Munich.


Addison Huegel, Media Relations
1 (1 415) 315-9629

