Archived Pachyderm Docs
2.6.x
Build Pipelines & DAGs
Build pipelines & DAGs for every use case.