Introducing Configurable Metaflow | by Netflix Technology Blog | Dec, 2024

By Team Entertainer | December 20, 2024 | Updated: December 20, 2024 | 14 Mins Read


Netflix Technology Blog · Netflix TechBlog · 13 min read · 15 hours ago

David J. Berg*, David Casler^, Romain Cledat*, Qian Huang*, Rui Lin*, Nissan Pow*, Nurcan Sonmez*, Shashank Srikanth*, Chaoying Wang*, Regina Wang*, Darin Yu*
*: Model Development Team, Machine Learning Platform
^: Content Demand Modeling Team

A month ago at QConSF, we showcased how Netflix uses Metaflow to power a diverse set of ML and AI use cases, managing thousands of unique Metaflow flows. This followed a previous blog post on the same topic. Many of these projects are under constant development by dedicated teams with their own business goals and development best practices, such as the system that supports our content decision makers, or the system that ranks which language subtitles are most valuable for a specific piece of content.

As a central ML and AI platform team, our role is to empower our partner teams with tools that maximize their productivity and effectiveness, while adapting to their specific needs (not the other way around). This has been a guiding design principle with Metaflow since its inception.

Metaflow infrastructure stack

Standing on the shoulders of our extensive cloud infrastructure, Metaflow facilitates easy access to data, compute, and production-grade workflow orchestration, as well as built-in best practices for common concerns such as collaboration, versioning, dependency management, and observability, which teams use to set up ML/AI experiments and systems that work for them. As a result, Metaflow users at Netflix have been able to run millions of experiments over the past few years without wasting time on low-level concerns.

While Metaflow aims to be un-opinionated about some of the upper levels of the stack, some teams within Netflix have developed their own opinionated tooling. As part of Metaflow’s adaptation to their specific needs, we constantly try to understand what has been developed and, more importantly, what gaps these solutions are filling.

In some cases, we determine that the gap being addressed is very team specific, or too opinionated at too high a level in the stack, and we therefore decide not to develop it within Metaflow. In other cases, however, we realize that we can develop an underlying construct that aids in filling that gap. Note that even in that case, we do not always aim to completely fill the gap and instead focus on extracting a more general lower-level concept that can be leveraged by that particular user but also by others. One such recurring pattern we noticed at Netflix is the need to deploy sets of closely related flows, often as part of a larger pipeline involving table creations, ETLs, and deployment jobs. Frequently, practitioners want to experiment with variants of these flows, testing new data, new parameterizations, or new algorithms, while keeping the overall structure of the flow or flows intact.

A natural solution is to make flows configurable using configuration files, so variants can be defined without changing the code. Thus far, there hasn’t been a built-in solution for configuring flows, so teams have built their own bespoke solutions leveraging Metaflow’s JSON-typed Parameters, IncludeFile, and deploy-time Parameters, or deployed their own home-grown solution (often with great pain). However, none of these solutions make it easy to configure all aspects of the flow’s behavior, decorators in particular.

Requests for a feature like Metaflow Config

Outside Netflix, we have seen similar frequently asked questions on the Metaflow community Slack, as shown in the user quotes above:

Today, to answer the FAQ, we introduce a new feature in Metaflow, small but mighty: a Config object. Configs complement the existing Metaflow constructs of artifacts and Parameters by allowing you to configure all aspects of the flow, decorators in particular, prior to any run starting. At the end of the day, artifacts, Parameters and Configs are all stored as artifacts by Metaflow, but they differ in when they are persisted, as shown in the diagram below:

Different data artifacts in Metaflow

Said another way:

  • An artifact is resolved and persisted to the datastore at the end of each task.
  • A parameter is resolved and persisted at the start of a run; it can therefore be modified up to that point. One common use case is to use triggers to pass values to a run right before executing. Parameters can only be used within your step code.
  • A config is resolved and persisted when the flow is deployed. When using a scheduler such as Argo Workflows, deployment happens when create’ing the flow. In the case of a local run, “deployment” happens just prior to the execution of the run; think of “deployment” as gathering all that is needed to run the flow. Unlike parameters, configs can be used more widely in your flow code: in particular, they can be used in step or flow level decorators as well as to set defaults for parameters. Configs can of course also be used within your flow.

As an example, you can specify a Config that reads a pleasantly human-readable configuration file, formatted as TOML. The Config specifies a triggering ‘@schedule’ and ‘@resources’ requirements, as well as application-specific parameters for this specific deployment:

[schedule]
cron = "0 * * * *"

[model]
optimizer = "adam"
learning_rate = 0.5

[resources]
cpu = 1

Using the newly released Metaflow 2.13, you can configure a flow with a Config like the one above, as demonstrated by this flow:

import pprint
from metaflow import FlowSpec, step, Config, resources, config_expr, schedule

# config_expr defers evaluation, so the Config can be referenced
# before it is defined in the class body below
@schedule(cron=config_expr("config.schedule.cron"))
class ConfigurableFlow(FlowSpec):
    # the parser is given as a string, so it is resolved by name
    config = Config("config", default="myconfig.toml", parser="tomllib.loads")

    # config values can be used directly in step-level decorators
    @resources(cpu=config.resources.cpu)
    @step
    def start(self):
        print("Config loaded:")
        pprint.pp(self.config)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    ConfigurableFlow()

There is a lot going on in the code above; a few highlights:

  • you can refer to configs before they have been defined using ‘config_expr’.
  • you can define arbitrary parsers; using a string means the parser doesn’t even have to be present remotely!
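
As a rough illustration of the second highlight, a parser can be thought of as a callable that takes the configuration file's contents as a string and returns a dictionary. The sketch below is a hypothetical parser for a simple `section.key = value` format; the function name `parse_key_value` and the file format are assumptions made for this example and are not part of Metaflow.

```python
def parse_key_value(text: str) -> dict:
    """Parse 'section.key = value' lines into a nested dictionary.

    Hypothetical format for illustration only: blank lines and lines
    starting with '#' are skipped; values are kept as strings.
    """
    result: dict = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        dotted_key, _, value = line.partition("=")
        *parents, leaf = dotted_key.strip().split(".")
        node = result
        for name in parents:
            # Create intermediate dictionaries on demand.
            node = node.setdefault(name, {})
        node[leaf] = value.strip()
    return result


sample = """
# hypothetical config file
schedule.cron = 0 * * * *
model.optimizer = adam
resources.cpu = 1
"""
config = parse_key_value(sample)
print(config)
```

A function of this shape could presumably be supplied as the `parser` of a Config, either directly or by its import path as a string, in the same way the flow above refers to `tomllib.loads`.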

From the developer’s point of view, Configs behave like dictionary-like artifacts. For convenience, they support the dot-syntax (when possible) for accessing keys, making it easy to access values in a nested configuration. You can also unpack the whole Config (or a subtree of it) with Python’s standard dictionary unpacking syntax, ‘**config’. The standard dictionary subscript notation is also available.
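
To make the three access styles concrete, here is a minimal sketch using a plain dictionary subclass. `DotDict` and `make_optimizer` are hypothetical names invented for this example; this is not how Metaflow implements Configs, only an approximation of the behavior described above.

```python
class DotDict(dict):
    """Hypothetical dict subclass with attribute-style key access."""

    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dictionaries so that chained dot access keeps working.
        return DotDict(value) if isinstance(value, dict) else value


def make_optimizer(optimizer, learning_rate):
    """Stand-in for a function that accepts keyword arguments."""
    return f"{optimizer}(lr={learning_rate})"


config = DotDict(
    {
        "model": {"optimizer": "adam", "learning_rate": 0.5},
        "resources": {"cpu": 1},
    }
)

cpu_via_dot = config.resources.cpu              # dot syntax
cpu_via_subscript = config["resources"]["cpu"]  # subscript notation
description = make_optimizer(**config.model)    # unpack a subtree
print(cpu_via_dot, cpu_via_subscript, description)
```

In this sketch, dot access only works for keys that are valid Python identifiers and do not collide with dictionary method names such as `items`; a restriction of this kind is one plausible reading of the "when possible" caveat, and subscript notation remains available for every key.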

Since Configs turn into dictionary artifacts, they get versioned and persisted automatically as artifacts. You can access Configs of any past runs easily through the Client API. As a result, your data, models, code, Parameters, Configs, and execution environments are all stored as a consistent bundle, neatly organized in Metaflow namespaces, paving the way for easily reproducible, consistent, low-boilerplate, and now easily configurable experiments and robust production deployments.

While you can get far by accompanying your flow with a simple config file (stored in your favorite format, thanks to user-definable parsers), Configs unlock a number of advanced use cases. Consider these examples from the updated documentation:

A major benefit of Config over previous, more hacky solutions for configuring flows is that Configs work seamlessly with other features of Metaflow: you can run steps remotely and deploy flows to production, even when relying on custom parsers, without having to worry about packaging Configs or parsers manually or keeping Configs consistent across tasks. Configs also work with the Runner and Deployer.

When used in conjunction with a configuration manager like Hydra, Configs enable a pattern that is highly relevant for ML and AI use cases: orchestrating experiments over multiple configurations or sweeping over parameter spaces. While Metaflow has always supported sweeping over parameter grids easily using foreaches, it hasn’t been easily possible to alter the flow itself, e.g. to change @resources or @pypi/@conda dependencies for every experiment.

In a typical case, you trigger a Metaflow flow that consumes a configuration file, changing how a run behaves. With Hydra, you can invert the control: it is Hydra that decides what gets run based on a configuration file. Thanks to Metaflow’s new Runner and Deployer APIs, you can create a Hydra app that operates Metaflow programmatically, for instance to deploy and execute hundreds of variants of a flow in a large-scale experiment.
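
The inverted-control idea can be sketched without either library: a driver enumerates configurations and launches one flow variant per configuration. In the sketch below, `itertools.product` stands in for a configuration manager's sweep, and `launch` is a placeholder where a real driver would call into Metaflow's Runner or Deployer. The names `SWEEP`, `expand_grid`, and `launch` are hypothetical and chosen only for this illustration.

```python
import itertools
import json


# Hypothetical search space: every combination becomes one flow variant.
SWEEP = {
    "resources.cpu": [1, 2, 4],
    "model.learning_rate": [0.1, 0.5],
}


def expand_grid(sweep):
    """Yield one flat override dictionary per point in the grid."""
    keys = list(sweep)
    for values in itertools.product(*(sweep[k] for k in keys)):
        yield dict(zip(keys, values))


def launch(overrides):
    """Placeholder: a real driver would start or deploy a flow here."""
    return json.dumps(overrides, sort_keys=True)


variants = [launch(overrides) for overrides in expand_grid(SWEEP)]
print(len(variants), "variants")
print(variants[0])
```

The point of the design is that the driver, not the flow, owns the loop over configurations, so each variant can differ in things a running flow cannot change about itself, such as its resource requests.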

Take a look at two interesting examples of this pattern in the documentation. As a teaser, this video shows Hydra orchestrating deployment of tens of Metaflow flows, each of which benchmarks PyTorch using a varying number of CPU cores and tensor sizes, updating a visualization of the results in real-time as the experiment progresses:

Example using Hydra with Metaflow

To give a motivating example of what configurations look like at Netflix in practice, let’s consider Metaboost, an internal Netflix CLI tool that helps ML practitioners manage, develop and execute their cross-platform projects, somewhat similar to the open-source Hydra discussed above but with specific integrations to the Netflix ecosystem. Metaboost is an example of an opinionated framework developed by a team already using Metaflow. In fact, a part of the inspiration for introducing Configs in Metaflow came from this very use case.

Metaboost serves as a single interface to three different internal platforms at Netflix that manage ETL/Workflows (Maestro), Machine Learning Pipelines (Metaflow) and Data Warehouse Tables (Kragle). In this context, having a single configuration system to manage an ML project holistically gives users increased project coherence and decreased project risk.

Configuration in Metaboost

Ease of configuration and templatizing are core values of Metaboost. Templatizing in Metaboost is achieved through the concept of bindings, wherein we can bind a Metaflow pipeline to an arbitrary label, and then create a corresponding bespoke configuration for that label. The binding-connected configuration is then merged into a global set of configurations containing such information as Git repository, branch, etc. Binding a Metaflow will also signal to Metaboost that it should instantiate the Metaflow flow once per binding into our orchestration cluster.

Imagine an ML practitioner on the Netflix Content ML team, sourcing features from hundreds of columns in our data warehouse, and creating a multitude of models against a growing suite of metrics. When a brand new content metric comes along, with Metaboost, the first version of the metric’s predictive model can easily be created by simply swapping the target column against which the model is trained.

Subsequent versions of the model will result from experimenting with hyperparameters, tweaking feature engineering, or conducting feature diets. Metaboost’s bindings, and their integration with Metaflow Configs, can be leveraged to scale the number of experiments as fast as a scientist can create experiment-based configurations.

Scaling experiments with Metaboost bindings — backed by Metaflow Config

Consider a Metaboost ML project named `demo` that creates and loads data to custom tables (ETL managed by Maestro), and then trains a simple model on this data (ML Pipeline managed by Metaflow). The project structure of this repository might look like the following:

├── metaflows
│   ├── custom                              -> custom python code, used by Metaflow
│   │   ├── data.py
│   │   └── model.py
│   └── training.py                         -> defines our Metaflow pipeline
├── schemas
│   ├── demo_features_f.tbl.yaml            -> table DDL, stores our ETL output, Metaflow input
│   └── demo_predictions_f.tbl.yaml         -> table DDL, stores our Metaflow output
├── settings
│   ├── settings.configuration.EXP_01.yaml  -> defines the additive config for Experiment 1
│   ├── settings.configuration.EXP_02.yaml  -> defines the additive config for Experiment 2
│   ├── settings.configuration.yaml         -> defines our global configuration
│   └── settings.environment.yaml           -> defines parameters based on git branch (e.g. READ_DB)
├── tests
├── workflows
│   ├── sql
│   ├── demo.demo_features_f.sch.yaml       -> Maestro workflow, defines ETL
│   └── demo.main.sch.yaml                  -> Maestro workflow, orchestrates ETLs and Metaflow
└── metaboost.yaml                          -> defines our project for Metaboost

The settings directory above contains the following YAML configuration files:

# settings.configuration.yaml (global configuration)
model:
  fit_intercept: True
conda:
  numpy: '1.22.4'
  "scikit-learn": '1.4.0'

# settings.configuration.EXP_01.yaml
target_column: metricA
features:
  - runtime
  - content_type
  - top_billed_talent

# settings.configuration.EXP_02.yaml
target_column: metricA
features:
  - runtime
  - director
  - box_office

Metaboost will merge each experiment configuration (*.EXP*.yaml) into the global configuration (settings.configuration.yaml) individually at Metaboost command initialization. Let’s take a look at how Metaboost combines these configurations with a Metaboost command:

(venv-demo) ~/projects/metaboost-demo [branch=demoX] 
$ metaboost metaflow settings show --yaml-path=configuration

binding=EXP_01:
  model:                     -> defined in setting.configuration.yaml (global)
    fit_intercept: true
  conda:                     -> defined in setting.configuration.yaml (global)
    numpy: 1.22.4
    "scikit-learn": 1.4.0
  target_column: metricA     -> defined in setting.configuration.EXP_01.yaml
  features:                  -> defined in setting.configuration.EXP_01.yaml
    - runtime
    - content_type
    - top_billed_talent

binding=EXP_02:
  model:                     -> defined in setting.configuration.yaml (global)
    fit_intercept: true
  conda:                     -> defined in setting.configuration.yaml (global)
    numpy: 1.22.4
    "scikit-learn": 1.4.0
  target_column: metricA     -> defined in setting.configuration.EXP_02.yaml
  features:                  -> defined in setting.configuration.EXP_02.yaml
    - runtime
    - director
    - box_office
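
The per-binding merge shown in the output above can be approximated with a recursive dictionary merge. This is a minimal sketch under the assumption that experiment values are layered on top of the global configuration; `deep_merge` is a hypothetical helper written for this example, and Metaboost's actual merge rules are not described in this post.

```python
import copy


def deep_merge(base, overlay):
    """Return a new dict with `overlay` merged on top of `base`.

    Nested dictionaries are merged key by key; any other value in
    `overlay` replaces the value from `base`.
    """
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


global_config = {
    "model": {"fit_intercept": True},
    "conda": {"numpy": "1.22.4", "scikit-learn": "1.4.0"},
}
experiments = {
    "EXP_01": {
        "target_column": "metricA",
        "features": ["runtime", "content_type", "top_billed_talent"],
    },
    "EXP_02": {
        "target_column": "metricA",
        "features": ["runtime", "director", "box_office"],
    },
}

# Each binding gets its own merged view; the global config is untouched.
per_binding = {
    name: deep_merge(global_config, overlay)
    for name, overlay in experiments.items()
}
print(per_binding["EXP_01"]["features"])
print(per_binding["EXP_02"]["conda"]["numpy"])
```

Merging into a copy rather than in place keeps the bindings independent of one another, which matters when each binding is deployed as its own flow instance.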

Metaboost understands it should deploy/run two independent instances of training.py: one for the EXP_01 binding and one for the EXP_02 binding. You can also see that Metaboost is aware that the tables and ETL workflows are not bound, and should only be deployed once. These details of which artifacts to bind and which to leave unbound are encoded in the project’s top-level metaboost.yaml file.

(venv-demo) ~/projects/metaboost-demo [branch=demoX] 
$ metaboost project list

Tables (metaboost table list):
  schemas/demo_predictions_f.tbl.yaml (binding=default):
    table_path=prodhive/demo_db/demo_predictions_f
  schemas/demo_features_f.tbl.yaml (binding=default):
    table_path=prodhive/demo_db/demo_features_f

Workflows (metaboost workflow list):
  workflows/demo.demo_features_f.sch.yaml (binding=default):
    cluster=sandbox, workflow.id=demo.branch_demox.demo_features_f
  workflows/demo.main.sch.yaml (binding=default):
    cluster=sandbox, workflow.id=demo.branch_demox.main

Metaflows (metaboost metaflow list):
  metaflows/training.py (binding=EXP_01): -> EXP_01 instance of training.py
    cluster=sandbox, workflow.id=demo.branch_demox.EXP_01.training
  metaflows/training.py (binding=EXP_02): -> EXP_02 instance of training.py
    cluster=sandbox, workflow.id=demo.branch_demox.EXP_02.training

Below is a simple Metaflow pipeline that fetches data, executes feature engineering, and trains a LinearRegression model. The work to integrate Metaboost Settings into a user’s Metaflow pipeline (implemented using Metaflow Configs) is as easy as adding a single mix-in to the FlowSpec definition:

from metaflow import FlowSpec, Parameter, conda_base, step
from custom.data import feature_engineer, get_data
from metaflow.metaboost import MetaboostSettings

@conda_base(
    libraries=MetaboostSettings.get_deploy_time_settings("configuration.conda")
)
class DemoTraining(FlowSpec, MetaboostSettings):
    prediction_date = Parameter("prediction_date", type=int, default=-1)

    @step
    def start(self):
        # get show_settings() for free with the mixin
        # and get convenient debugging info
        self.show_settings(exclude_patterns=["artifact*", "system*"])

        self.next(self.get_features)

    @step
    def get_features(self):
        # feature engineers on our extracted data
        self.fe_df = feature_engineer(
            # loads data from our ETL pipeline
            data=get_data(prediction_date=self.prediction_date),
            features=self.settings.configuration.features +
                [self.settings.configuration.target_column]
        )

        self.next(self.train)

    @step
    def train(self):
        from sklearn.linear_model import LinearRegression

        # trains our model
        self.model = LinearRegression(
            fit_intercept=self.settings.configuration.model.fit_intercept
        ).fit(
            X=self.fe_df[self.settings.configuration.features],
            y=self.fe_df[self.settings.configuration.target_column]
        )
        print(f"Fit slope: {self.model.coef_[0]}")
        print(f"Fit intercept: {self.model.intercept_}")

        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    DemoTraining()

The Metaflow Config is added to the FlowSpec by mixing in the MetaboostSettings class. Referencing a configuration value is as easy as using the dot syntax to drill into whichever parameter you’d like.

Finally, let’s take a look at the output from our sample Metaflow above. We execute experiment EXP_01 with

metaboost metaflow run --binding=EXP_01

which upon execution will merge the configurations into a single settings file (shown previously) and serialize it as a YAML file to the .metaboost/settings/compiled/ directory.

You can see the actual command and args that were sub-processed in the Metaboost Execution section below. Please note the --config argument pointing to the serialized YAML file, which is subsequently accessible via self.settings. Also note the convenient printing of configuration values to stdout during the start step using a mixed-in function named show_settings().

(venv-demo) ~/projects/metaboost-demo [branch=demoX] 
$ metaboost metaflow run --binding=EXP_01

Metaboost Execution:
 - python3.10 /root/repos/cdm-metaboost-irl/metaflows/training.py
   --no-pylint --package-suffixes=.py --environment=conda
   --config settings
   .metaboost/settings/compiled/settings.branch_demox.EXP_01.training.mP4eIStG.yaml
   run --prediction_date20241006

Metaflow 2.12.39+nflxfastdata(2.13.5);nflx(2.13.5);metaboost(0.0.27)
executing DemoTraining for user:dcasler
Validating your flow...
The graph looks good!
Bootstrapping Conda environment... (this could take a few minutes)
All packages already cached in s3.
All environments already cached in s3.

Workflow starting (run-id 50), see it in the UI at
https://metaflowui.prod.netflix.net/DemoTraining/50

[50/start/251640833] Task is starting.
[50/start/251640833] Configuration Values:
[50/start/251640833] settings.configuration.conda.numpy = 1.22.4
[50/start/251640833] settings.configuration.features.0 = runtime
[50/start/251640833] settings.configuration.features.1 = content_type
[50/start/251640833] settings.configuration.features.2 = top_billed_talent
[50/start/251640833] settings.configuration.model.fit_intercept = True
[50/start/251640833] settings.configuration.target_column = metricA
[50/start/251640833] settings.environment.READ_DATABASE = data_warehouse_prod
[50/start/251640833] settings.environment.TARGET_DATABASE = demo_dev
[50/start/251640833] Task finished successfully.

[50/get_features/251640840] Task is starting.
[50/get_features/251640840] Task finished successfully.

[50/train/251640854] Task is starting.
[50/train/251640854] Fit slope: 0.4702672504331096
[50/train/251640854] Fit intercept: -6.247919678070083
[50/train/251640854] Task finished successfully.

[50/end/251640868] Task is starting.
[50/end/251640868] Task finished successfully.

Done! See the run in the UI at
https://metaflowui.prod.netflix.net/DemoTraining/50
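
The dotted `settings.… = value` lines printed in the start step suggest a flattening of the nested settings. The sketch below reproduces that style of output for a plain dictionary. It is not Metaboost's `show_settings()` implementation: the helper names `flatten` and `format_settings`, the sample values, and the assumption that `exclude_patterns` are shell-style globs matched against the key without its `settings.` prefix are all made up for this illustration.

```python
from fnmatch import fnmatch


def flatten(value, prefix):
    """Yield (dotted_key, leaf_value) pairs for nested dicts and lists."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{prefix}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from flatten(child, f"{prefix}.{index}")
    else:
        yield prefix, value


def format_settings(settings, exclude_patterns=()):
    """Return 'key = value' lines, skipping keys that match a pattern."""
    lines = []
    for key, leaf in sorted(flatten(settings, "settings")):
        # Match patterns against the key without the 'settings.' prefix.
        relative_key = key.split(".", 1)[1]
        if any(fnmatch(relative_key, pattern) for pattern in exclude_patterns):
            continue
        lines.append(f"{key} = {leaf}")
    return lines


# Hypothetical settings tree, loosely modeled on the output above.
settings = {
    "configuration": {
        "features": ["runtime", "content_type"],
        "target_column": "metricA",
    },
    "environment": {"READ_DATABASE": "data_warehouse_prod"},
    "system": {"user": "someone"},
}
lines = format_settings(settings, exclude_patterns=["artifact*", "system*"])
print("\n".join(lines))
```

Flattening to dotted keys keeps the printed form aligned with how values are referenced in step code, for example `self.settings.configuration.target_column`.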

Takeaways

Metaboost is an integration tool that aims to ease the project development, management and execution burden of ML projects at Netflix. It employs a configuration system that combines git-based parameters, global configurations and arbitrarily bound configuration files for use during execution against internal Netflix platforms.

Integrating this configuration system with the new Config in Metaflow is incredibly simple (by design), only requiring users to add a mix-in class to their FlowSpec (similar to this example in the Metaflow documentation) and then reference the configuration values in steps or decorators. The example above templatizes a training Metaflow for the sake of experimentation, but users could just as easily use bindings/configs to templatize their flows across target metrics, business initiatives or any other arbitrary lines of work.

It couldn’t be easier to get started with Configs! Just

pip install -U metaflow

to get the latest version and head to the updated documentation for examples. If you are impatient, you can find and execute all config-related examples in this repository as well.

If you have any questions or feedback about Config (or other Metaflow features), you can reach out to us on the Metaflow community Slack.

We would like to thank Outerbounds for their collaboration on this feature; for rigorously testing it and developing a repository of examples to showcase some of the possibilities offered by this feature.


