Shashank Srikanth, Romain Cledat
Metaflow, a framework we started and open-sourced in 2019, now powers a wide range of ML and AI systems across Netflix and at many other companies. It is well loved by users for helping them take their ML/AI workflows from prototype to production, allowing them to focus on building cutting-edge systems that bring joy and entertainment to audiences worldwide.
Metaflow allows users to:
- Iterate and ship quickly by minimizing friction.
- Operate systems reliably in production with minimal overhead, at Netflix scale.
Metaflow works with many battle-hardened tools to address the second point, among them Maestro, our newly open-sourced workflow orchestrator that powers nearly every ML and AI system at Netflix and serves as a backbone for Metaflow itself.
In this post, we focus on the first point and introduce a new Metaflow functionality, Spin, that helps users accelerate their iterative development process. By the end, you’ll have a solid understanding of Spin’s capabilities and learn how to try it out yourself with Metaflow 2.19.
Iterative development in ML and AI workflows
To understand our approach to improving the ML and AI development experience, it helps to consider how these workflows differ from traditional software engineering.
ML and AI development revolves not just around code but also around data and models, which are large, mutable, and computationally expensive to process. Iteration cycles can involve long-running data transformations, model training, and stochastic processes that yield slightly different results from run to run. These characteristics make fast, stateful iteration a critical part of productive development.
This is where notebooks, such as Jupyter, Observable, or Marimo, shine. Their ability to preserve state in memory allows developers to load a dataset once and iteratively explore, transform, and visualize it without reloading or recomputing from scratch. This persistent, interactive environment turns what would otherwise be a slow, rigid loop into a fluid, exploratory workflow, perfectly suited to the needs of ML and AI practitioners.
Because ML and AI development is computationally intensive, stochastic, and data- and model-centric, tools that optimize iteration speed must treat state management as a first-class design concern. Any system aiming to improve the development experience in this domain must therefore enable quick, incremental experimentation without losing continuity between iterations.
New: rapid, iterative development with spin
At first glance, Metaflow code looks like a workflow, similar to Airflow, but there’s another way to look at it: each Metaflow @step serves as a checkpoint boundary. At the end of every step, Metaflow automatically persists all instance variables as artifacts, allowing the execution to resume seamlessly from that point onward. The below animation shows this behavior in action:
In a sense, we can consider a @step similar to a notebook cell: it is the smallest unit of execution that updates state upon completion. It does have a few differences that address the issues with notebook cells:
- The execution order is explicit and deterministic: no surprises caused by out-of-order cell execution;
- The state is not hidden: state is explicitly stored in self. variables as shared state, which can be discovered and inspected;
- The state is versioned and persisted, making results more reproducible.
While Metaflow’s resume feature can approximate the incremental and iterative development approach of notebooks, it restarts execution from the selected step onward, introducing more latency between iterations. In contrast, a notebook allows near-instant feedback by letting users tweak and rerun individual cells while seamlessly reusing data from earlier cells held in memory.
The new spin command in Metaflow 2.19 addresses this gap. Similar to executing a single notebook cell, it quickly executes a single Metaflow @step, with all the state carried over from the parent step. As a result, users can develop and debug Metaflow steps as easily as a cell in a notebook.
The effect becomes clear when considering the three complementary execution modes (run, resume, and spin) side by side, mapping them to the corresponding notebook behavior:
Another major difference isn’t just what gets executed, but what gets recorded. Both run and resume create a full, versioned run with complete metadata and artifacts, whereas spin skips tracking altogether. It’s built for fast, throw-away iterations during development.
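To make the distinction between the three modes concrete, here is a small toy model in plain Python. It is not Metaflow code and says nothing about how Metaflow is implemented: the `ToyFlow` class, its single in-memory `datastore`, and the `load`/`train`/`report` steps are all hypothetical names invented for this sketch. It only mirrors the semantics described above: run executes and records every step, resume re-executes from a chosen step onward and records the results, and spin executes one step from its parent's recorded state without recording anything.

```python
import copy


class ToyFlow:
    """Toy linear flow: NOT Metaflow code, only a model of run/resume/spin."""

    def __init__(self, steps):
        self.steps = steps      # ordered list of (name, function) pairs
        self.datastore = {}     # step name -> state recorded after that step

    def _index(self, name):
        return [n for n, _ in self.steps].index(name)

    def _parent_state(self, index):
        # State recorded by the previous step (empty for the first step).
        if index == 0:
            return {}
        return self.datastore[self.steps[index - 1][0]]

    def _execute(self, index, state, record):
        name, func = self.steps[index]
        new_state = func(copy.deepcopy(state))  # never mutate recorded state
        if record:
            self.datastore[name] = new_state
        return new_state

    def run(self):
        # Execute every step in order and record each result.
        state = {}
        for i in range(len(self.steps)):
            state = self._execute(i, state, record=True)
        return state

    def resume(self, name):
        # Re-execute from the chosen step onward, recording each result.
        start = self._index(name)
        state = self._parent_state(start)
        for i in range(start, len(self.steps)):
            state = self._execute(i, state, record=True)
        return state

    def spin(self, name):
        # Execute a single step from its parent's state; record nothing.
        i = self._index(name)
        return self._execute(i, self._parent_state(i), record=False)


def load(state):
    state["rows"] = [1, 2, 3, 4]
    return state


def train(state):
    state["model"] = sum(state["rows"]) / len(state["rows"])
    return state


def report(state):
    state["summary"] = f"mean={state['model']}"
    return state


flow = ToyFlow([("load", load), ("train", train), ("report", report)])
flow.run()                  # executes and records every step
spun = flow.spin("train")   # executes only train, from load's recorded state
```

One simplification to keep in mind: in this toy, run and resume overwrite a single datastore, whereas in Metaflow each of them creates a new versioned run.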
The one-minute clip below illustrates a typical iterative development workflow that alternates between run and spin. In this example, we are building a flow that reads a dataset from a Parquet file and trains a separate model for each product category, focusing on computer-related categories.
As shown in the video, we start by creating a flow from scratch and running a minimal version of it to persist test artifacts, in this case a Parquet dataset. From there, we can use spin to iterate on one step at a time, incrementally building out the flow, for example by adding the parallel training steps demonstrated in the clip.
Once the flow has been iterated on locally, it can be seamlessly deployed to production orchestrators like Maestro or Argo, and scaled up on compute platforms such as AWS Batch, Titus, Kubernetes and more. Thus, the experience is as smooth as developing in a notebook, but the outcome is a production-ready, scalable workflow, implemented as an idiomatic Python project!
Spin up smooth development in VSCode/Cursor
Instead of typing run and spin manually in the terminal, we can bind them to keyboard shortcuts. For example, the simple metaflow-dev VS Code extension (works with Cursor as well) maps Ctrl+Opt+R to run and Ctrl+Opt+S to spin. Just hack away, hit Ctrl+Opt+S, and the extension will save your file and spin the step you are currently editing.
One area where spin really shines is in creating mini-dashboards and reports with Metaflow Cards. Visualization is another strong point of notebooks, but the combination of spin and cards makes Metaflow a very compelling alternative for developing real-time and post-execution visualizations. Developing cards is inherently iterative and visual (much like building web pages): you want to tweak code and see the results instantly. This workflow is readily available with the combination of VSCode/Cursor, which includes a built-in web view, the local card viewer, and spin.
To see the trio of tools, along with the VS Code extension, in action, in this short clip we add observability to the train step that we built in the earlier example:
A major benefit of Metaflow Cards is that we don’t need to deploy any extra services, data streams, and databases for observability. Just develop visual outputs as above, deploy the flow, and we have a complete system in production with reporting and visualizations included.
Spin to the next level: injecting inputs, inspecting outputs
Spin does more than just run code: it also lets us take full control of a spun @step’s inputs and outputs, enabling a range of advanced patterns.
In contrast to notebooks, we can spin any arbitrary @step in a flow using state from any past run, making it easy to test functions with different inputs. For example, if we have multiple models produced by separate runs, we could spin an inference step, supplying a different model run each time.
We can also override artifact values or inject arbitrary Python objects, similar to a notebook cell, for spin. Simply specify a Python module with an ARTIFACTS dictionary:
ARTIFACTS = {
    "model": "kmeans",
    "k": 15
}
and point spin at the module:
spin train --artifacts-module artifacts.py
By default spin does not persist artifacts, but we can easily change this by adding --persist. Even in this case, artifacts are not persisted in the usual Metaflow datastore but to a directory-specific location which you can easily clean up after testing. We can access the results with the Client API as usual: just specify the directory you want to inspect with inspect_spin:
from metaflow import Flow, inspect_spin

inspect_spin(".")
Flow("TrainingFlow").latest_run["train"].task["model"].data
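Because the artifacts module is ordinary Python, its values do not have to be literals. The following is a minimal sketch of such a module, assuming (as the text above suggests) that spin only needs a module-level `ARTIFACTS` dictionary; the `sample_rows` key and the way it is generated are hypothetical and exist only for illustration.

```python
# artifacts.py: hypothetical module to pass to spin via --artifacts-module
import random

# Build a small, reproducible stand-in dataset instead of loading the real one.
# The "sample_rows" artifact name is an assumption made for this sketch.
_rng = random.Random(0)

ARTIFACTS = {
    "model": "kmeans",
    "k": 15,
    "sample_rows": [[_rng.random(), _rng.random()] for _ in range(5)],
}
```

Presumably the spun step would then see these values in place of the corresponding artifacts from the parent step, which makes it easy to try a step against small or edge-case inputs.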
Being able to inspect and modify a step’s inputs and outputs on the fly unlocks a powerful use case: unit testing individual steps. We can use spin programmatically through the Runner API and assert the results:
from metaflow import Runner

with Runner("flow.py").spin("train", persist=True) as spin:
    assert spin.task["model"].data == "kmeans"
Making AI agents spin
In addition to speeding up development for humans, spin turns out to be surprisingly handy for coding agents too. There are two major advantages to teaching AI how to spin:
- It accelerates the development loop. Agents don’t naturally understand what’s slow, or why speed matters, so they need to be nudged to favor faster tools over slower ones.
- It helps surface errors faster and contextualizes them to a specific piece of code, increasing the chance that the agent is able to fix errors by itself.
Metaflow users are already using Claude Code; spin makes this even easier. In the example below, we added the following section in a CLAUDE.md file:
## Developing Metaflow code
Follow this incremental development workflow that ensures quick iterations
and correct results. You must create a flow incrementally, step by step
following this process:
1. Create a flow skeleton with empty `@step`s.
2. Add a data loading step.
3. `run` the flow.
4. Populate the next step and use `spin` to test it with the correct inputs.
5. `run` the flow to record outputs from the new step.
6. Iterate on (4–5) until all steps have been implemented and work correctly.
7. `run` the whole flow to ensure final correctness.

To test a flow, run the flow as follows
```
python flow.py --environment=pypi run
```
Do this once before running `spin`.
As you are building the flow, you `spin` to test steps quickly.
For instance
```
python flow.py --environment=pypi spin train
```
Just based on these quick instructions, the agent is able to use spin effectively. Take a look at the following inspirational example that one-shots Claude to create a flow, along the lines of our earlier examples, which trains a classifier to predict product categories:
In the video, we can see Claude using spin around the 45-second mark to test a preprocess step. The step initially fails due to a classic data science pitfall: during testing, Claude samples only a small subset of data, causing some classes to be underrepresented. The first spin surfaces the issue, which Claude then fixes by switching to stratified sampling, and finally does another spin to confirm the fix, before proceeding to complete the task.
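For readers unfamiliar with the pitfall, here is a generic sketch of the difference between naive and stratified sampling, written with the standard library only. It is not the code Claude produced in the video; the function names, the product categories, and the class counts are all made up for illustration.

```python
import random
from collections import Counter, defaultdict


def naive_sample(rows, n, seed=0):
    """Draw n rows uniformly at random; rare classes can easily be missed."""
    return random.Random(seed).sample(rows, n)


def stratified_sample(rows, label_of, per_class, seed=0):
    """Draw up to per_class rows from every class, so no class is dropped."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[label_of(row)].append(row)
    sample = []
    for label in sorted(by_class):  # sorted for a deterministic class order
        members = by_class[label]
        sample.extend(rng.sample(members, min(per_class, len(members))))
    return sample


# Hypothetical, heavily imbalanced dataset of (category, row_id) pairs.
rows = (
    [("laptop", i) for i in range(95)]
    + [("monitor", i) for i in range(4)]
    + [("keyboard", 0)]
)

small = naive_sample(rows, 5)  # may well contain only "laptop" rows
strat = stratified_sample(rows, label_of=lambda r: r[0], per_class=2)
counts = Counter(label for label, _ in strat)
```

With the stratified version every category appears in the test subset, capped at the number of rows that actually exist for that category, so a downstream step that expects all classes no longer fails on the sampled data.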
The inner loop of end-to-end ML/AI
To circle back to where we started, our motivation for adding spin (and for creating Metaflow in the first place) is to accelerate development cycles so we can deliver more joy to our subscribers, faster. Ultimately, we believe there’s no single magic feature that makes this possible. It takes all parts of an ML/AI platform working together coherently, spin included.
From this perspective, it’s useful to place spin in the context of other Metaflow features. It’s designed for the innermost loop of model and business-logic development, with the added benefit of supporting unit testing during deployment, as shown in the overall blueprint of the Metaflow toolchain below.
In this diagram, the solid blue boxes represent different Metaflow commands, while the blue text denotes decorators and other features. In particular, note the Shared Functionality box (another key focus area for us over the past year), which includes configuration management and custom decorators. These capabilities let domain-specific teams and platform providers tailor Metaflow to their own use cases. Following our ethos of composability, all of these features integrate seamlessly with spin as well.
Another key design philosophy of Metaflow is to let projects start small and simple, adding complexity only when it becomes necessary. So don’t be overwhelmed by the diagram above. To get started, install Metaflow easily with
pip install metaflow
and take your first baby @steps for a spin! Check out the docs, and for questions, support, and feedback, join the friendly Metaflow Community Slack.
Acknowledgments
We would like to thank our partners at Outerbounds, and particularly Ville Tuulos, Savin Goyal, and Madhur Tandon, for their collaboration on this feature, from initial ideation to review, testing and documentation. We would also like to acknowledge the rest of the Model Development and Management team (Maria Alder, David J. Berg, Shaojing Li, Rui Lin, Nissan Pow, Chaoying Wang, Regina Wang, Seth Yang, Darin Yu) for their input and comments.
