
AutoML Pipeline

Learn how to build an automated machine learning pipeline.

You can use Pachyderm to build an automated machine learning pipeline that trains a model on a CSV file.

Before You Start

Tutorial

The user code for this tutorial runs in a Docker image built on the python:3.7-slim-buster base image. The code uses the mljar-supervised package to automate feature engineering, model selection, and hyperparameter tuning, so you can train a model on structured data without writing that logic yourself.
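As a rough illustration of what such user code might look like, the sketch below trains a model on one CSV file with mljar-supervised. The function names, the `target_col` parameter, and the default values are assumptions made for this example and are not taken from the tutorial's image. `AutoML` and its `results_path`, `mode`, and `random_state` arguments come from the mljar-supervised package.

```python
import csv


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="") as f:
        return next(csv.reader(f), [])


def train_on_csv(csv_path, target_col, results_path, mode="Explain", random_state=42):
    """Train an AutoML model on one CSV file (hypothetical helper)."""
    # Fail early with a clear message if the target column is missing.
    if target_col not in read_header(csv_path):
        raise ValueError(f"column {target_col!r} not found in {csv_path}")

    # Imported here so read_header() works without these packages installed.
    import pandas as pd
    from supervised.automl import AutoML

    df = pd.read_csv(csv_path)
    X = df.drop(columns=[target_col])  # every column except the target
    y = df[target_col]

    # AutoML handles feature engineering, model selection, and tuning;
    # models and reports are saved under results_path.
    automl = AutoML(results_path=str(results_path), mode=mode, random_state=random_state)
    automl.fit(X, y)
    return automl
```

Checking the header before importing the heavier packages keeps the error for a misspelled target column fast and readable.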

1. Create a Project & Input Repo


2. Create a Jsonnet Pipeline


Training starts automatically. When it completes, the trained model and evaluation metrics are written to the AutoML output repo.
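A Jsonnet pipeline is a template that takes arguments and renders a pipeline spec. The Python sketch below mimics that idea to show the general shape of a spec such a template might produce. The image name, script path, command-line flags, and argument values are all hypothetical; only the `pipeline`, `input.pfs`, and `transform` fields and the `/pfs` mount paths follow the Pachyderm pipeline specification.

```python
import json


def automl_pipeline_spec(name, input_repo, target_col, args="", image="example/automl:latest"):
    """Render a Pachyderm pipeline spec as a dict (illustrative only)."""
    cmd = [
        "python", "/workdir/automl.py",   # hypothetical entry point
        "--input", f"/pfs/{input_repo}",  # Pachyderm mounts the input repo here
        "--output", "/pfs/out",           # files written here land in the output repo
        "--target-col", target_col,       # hypothetical flag
    ] + args.split()                      # extra flags passed through unchanged
    return {
        "pipeline": {"name": name},
        # glob "/" treats the whole repo as a single datum.
        "input": {"pfs": {"repo": input_repo, "glob": "/"}},
        "transform": {"image": image, "cmd": cmd},
    }


if __name__ == "__main__":
    spec = automl_pipeline_spec("automl", "csv_data", "price", "--mode regression")
    print(json.dumps(spec, indent=2))
```

Because the output repo shares the pipeline's name, the `name` argument also determines where the trained model ends up.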

3. Upload the Dataset


Repeat the previous step as many times as you want. Each time, Pachyderm automatically retrains the model and outputs the new model and evaluation metrics to the AutoML output repo.
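From the user code's point of view, retraining could look like the sketch below: each job scans the mounted input directory for CSV files and writes results for each one to the output directory. The function names and the one-folder-per-dataset layout are assumptions for illustration, not the tutorial's actual code.

```python
from pathlib import Path


def find_csv_files(input_dir):
    """Return every CSV file under input_dir, sorted for stable ordering."""
    return sorted(Path(input_dir).rglob("*.csv"))


def results_dir_for(csv_path, output_root):
    """Create and return <output_root>/<file stem> so each dataset gets its own folder."""
    out = Path(output_root) / Path(csv_path).stem
    out.mkdir(parents=True, exist_ok=True)
    return out


def run(input_dir, output_root, train):
    """Call train(csv_path, results_dir) for each CSV; train is supplied by the caller."""
    results = []
    for csv_path in find_csv_files(input_dir):
        results_dir = results_dir_for(csv_path, output_root)
        train(csv_path, results_dir)
        results.append(results_dir)
    return results
```

In a Pachyderm pipeline, `input_dir` would be `/pfs/<input repo>` and `output_root` would be `/pfs/out`, so anything written under `output_root` is committed to the pipeline's output repo when the job finishes.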

User Code Assets

The Docker image used in this tutorial was built with the following assets: