MLOps: how we do it

At a time when ML seems to be everywhere, MLOps practices are still at an early stage of development. In this talk we’ll look at the practical challenges we faced at Prima, how we solved them and what we learnt from them.

Abstract

Description

Deploying ML models to production and implementing MLOps principles can be challenging, but the benefits are worth the effort. In this talk I will present what to look at if you want to start developing your MLOps practices, by describing what we have done and how we do MLOps at Prima.

0 - Consistency

There is no way to keep your model results consistent if you change language between development and production. The first lesson we learnt is that the people who develop the model should be the same people who release it. The challenge then is: how do we make this process as smooth as possible? These people need tools that allow them to productionize their code, fast.

1 - The transformers

If you have ever tried running pandas processing on a single row, you know how inefficient that is. Deploying pandas code to production guarantees slow code if you serve one prediction at a time rather than in batch. This is often unacceptable for real-time applications. On the other hand, translating a preprocessing pipeline from pandas to “vanilla” Python is costly, inefficient and error-prone.
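To make the single-row overhead concrete, here is a minimal sketch that times the same preprocessing step done with a one-row DataFrame and with plain Python. The field names (premium, age) and the ratio feature are hypothetical and only serve the illustration; actual timings depend on your machine and pandas version.

```python
import timeit

import pandas as pd


def preprocess_pandas(row: dict) -> float:
    # Build a one-row DataFrame per request, as a naive deployment would.
    df = pd.DataFrame([row])
    df["ratio"] = df["premium"] / df["age"]
    return float(df["ratio"].iloc[0])


def preprocess_plain(row: dict) -> float:
    # The same computation on plain Python values.
    return row["premium"] / row["age"]


# Hypothetical request payload.
row = {"premium": 420.0, "age": 35}

pandas_time = timeit.timeit(lambda: preprocess_pandas(row), number=200)
plain_time = timeit.timeit(lambda: preprocess_plain(row), number=200)

print(f"pandas, one row at a time: {pandas_time:.4f}s for 200 calls")
print(f"plain Python:              {plain_time:.4f}s for 200 calls")
```

Both functions return the same value; the difference is the fixed cost of constructing and indexing a DataFrame for every request.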

I will show you how we managed to develop an abstraction layer that allows us to get the best of both worlds, with the happy side effect of making our data scientist notebooks much more readable and maintainable.
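One possible shape for such an abstraction layer is sketched below. It is not Prima's implementation: the ColumnTransformer class, its method names and the example feature are all assumptions made for illustration. The idea is to define each transformation once, as a function that works on both scalars and pandas Series, and to expose a batch path for notebooks and a row path for real-time serving.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Sequence

import pandas as pd


@dataclass(frozen=True)
class ColumnTransformer:
    """Hypothetical transformer: one function, two execution paths."""

    inputs: Sequence[str]
    output: str
    # Assumption: func works on scalars and on pandas Series alike
    # (true for simple arithmetic, not for arbitrary Python code).
    func: Callable[..., Any]

    def transform_row(self, row: Dict[str, Any]) -> Dict[str, Any]:
        # Real-time path: plain dict in, plain dict out, no DataFrame.
        out = dict(row)
        out[self.output] = self.func(*(row[name] for name in self.inputs))
        return out

    def transform_frame(self, df: pd.DataFrame) -> pd.DataFrame:
        # Batch path: the same function applied to whole columns.
        columns = (df[name] for name in self.inputs)
        return df.assign(**{self.output: self.func(*columns)})


# Hypothetical feature used only for illustration.
premium_per_year = ColumnTransformer(
    inputs=("premium", "age"),
    output="premium_per_year",
    func=lambda premium, age: premium / age,
)

batch = pd.DataFrame({"premium": [420.0, 300.0], "age": [35, 30]})
batch_result = premium_per_year.transform_frame(batch)
row_result = premium_per_year.transform_row({"premium": 420.0, "age": 35})
```

Because both paths call the same function, the notebook and the service cannot drift apart on the logic of a feature, which is the consistency property the talk argues for.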

2 - It’s all about features

A hard-learned lesson is that once your models ingest a feature, you must take it to production. The problem is that doing so may be as hard as productionizing the model itself.

I will show how feature stores can help data scientists and ML engineers reuse and share features and transformations. While discussing their benefits, I will present how we first started developing our own feature library a year ago and why we are now moving to a more comprehensive third-party feature store.
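The core idea of sharing features by name can be sketched with a tiny in-memory registry. This is a toy under stated assumptions, not the feature library described in the talk and not the API of any feature store product: FeatureRegistry, its methods and the example features are all hypothetical.

```python
from typing import Any, Callable, Dict, Iterable

RawRecord = Dict[str, Any]


class FeatureRegistry:
    """Hypothetical registry mapping feature names to their definitions."""

    def __init__(self) -> None:
        self._features: Dict[str, Callable[[RawRecord], Any]] = {}

    def register(self, name: str) -> Callable:
        def decorator(func: Callable[[RawRecord], Any]) -> Callable:
            # Refuse duplicates so a name always means one definition.
            if name in self._features:
                raise ValueError(f"feature {name!r} is already registered")
            self._features[name] = func
            return func

        return decorator

    def compute(self, names: Iterable[str], raw: RawRecord) -> Dict[str, Any]:
        # Training and serving both request features by name.
        return {name: self._features[name](raw) for name in names}


registry = FeatureRegistry()


@registry.register("driver_age_band")
def driver_age_band(raw: RawRecord) -> str:
    # Hypothetical feature: the threshold is illustrative only.
    return "young" if raw["age"] < 25 else "adult"


@registry.register("premium_per_year")
def premium_per_year(raw: RawRecord) -> float:
    return raw["premium"] / raw["age"]


features = registry.compute(
    ["driver_age_band", "premium_per_year"],
    {"age": 35, "premium": 420.0},
)
```

A real feature store adds what this sketch leaves out, such as storage, versioning and point-in-time correct retrieval of historical values.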

Slides: https://docs.google.com/presentation/d/1sD0JIC4RoyfJsdc55JdwdThD97GCsT4BegUdA2SjvfI/edit?usp=sharing

Speaker
Lorenzo Gelmi
Topic
PyData
Audience level
Beginner
Language
English
Duration
30 minutes