Scaling Muse: How Netflix Powers Data-Driven Creative Insights at Trillion-Row Scale | by Netflix Technology Blog | Sep, 2025

By Team Entertainer | September 22, 2025 | Updated: September 23, 2025 | 10 Mins Read


Netflix Technology Blog

By Andrew Pierce, Chris Thrailkill, Victor Chiapaikeo

At Netflix, we prioritize getting timely data and insights into the hands of the people who can act on them. One of our key internal applications for this purpose is Muse. Muse's ultimate goal is to help Netflix members discover content they'll love by ensuring our promotional media is as effective and authentic as possible. It achieves this by equipping creative strategists and launch managers with data-driven insights showing which artwork or video clips resonate best with global or regional audiences, and by flagging outliers such as potentially misleading (clickbait-y) assets. These kinds of applications fall under Online Analytical Processing (OLAP), a category of systems designed for complex querying and data exploration. However, enabling Muse to support new, more advanced filtering and grouping capabilities while maintaining high performance and data accuracy has been a challenge. Earlier posts have touched on artwork personalization and our impressions architecture. In this post, we'll discuss some steps we've taken to evolve the Muse data serving layer to enable new capabilities while maintaining high performance and data accuracy.

[Figure: The Muse application]

An Evolving Architecture

Like many early analytics applications, Muse began as a simple dashboard powered by batch data pipelines (Spark¹) and a modest Druid² cluster. As the application evolved, so did user demands. Users wanted new features like outlier detection and notification delivery, media comparison and playback, and advanced filtering, all while requiring lower latency and supporting ever-growing datasets (on the order of trillions of rows a year). One of the most challenging requirements was enabling dynamic analysis of promotional media performance by "audience" affinities: internally defined, algorithmically inferred labels representing collections of viewers with similar tastes. Answering questions like "Does specific promotional media resonate more with Character Drama fans or Pop Culture enthusiasts?" required augmenting already voluminous impression and playback data. Supporting filtering and grouping by these many-to-many audience relationships led to a combinatorial explosion in data volume, pushing the boundaries of our original architecture.

To handle these complexities and support the evolving needs of our users, we undertook a significant evolution of Muse's architecture. Today's Muse is a React app that queries a GraphQL layer served by a set of Spring Boot gRPC microservices. In the remainder of this post, we'll focus on the steps we took to scale the data microservice, its backing ETL, and our Druid cluster. Specifically, we've changed the data model to rely on HyperLogLog (HLL) sketches, used Hollow for access to in-memory, precomputed aggregates, and taken a series of steps to tune Druid. To ensure the accuracy of these changes, we relied heavily on internal debugging tools to validate behavior before and after each change.

[Figure: Muse's current architecture]

Moving to HyperLogLog (HLL) Sketches for Distinct Counts

Some of the most important metrics we track are impressions, the number of times an asset is shown to a user within a time window, and qualified plays, which links a playback event with a minimum duration back to a specific impression. Calculating these metrics requires counting distinct users. However, performing distinct counts in distributed systems is resource-intensive and difficult. For instance, to determine how many unique profiles have ever seen a particular asset, we need to compare each new set of profile ids with those from all days before it, potentially spanning months or even years.

For performance, we can trade away some accuracy. The Apache DataSketches library allows us to get distinct count estimates that are within 1–2% error. This is tunable with a precision parameter called logK (0.8% in our case, with a logK of 17). We build sketches in two places:

  1. During Druid ingest: we use the HLLSketchBuild aggregator with Druid rollup set to true to reduce our data in preparation for fast distinct counting
  2. During our Spark ETL: we persist precomputed aggregates like all-time impressions per asset in the form of HLL sketches. Each day, we merge a new HLL sketch into the existing one using a combination of hll_union and hll_union_agg (functions added by our very own Ryan Berti)
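The merge-of-daily-sketches flow above is easiest to see with a toy implementation. The following is a minimal, self-contained sketch of the HyperLogLog idea; the production system uses Apache DataSketches, which adds bias correction and compact serialization, and the class below is purely illustrative:

```java
/**
 * A minimal HyperLogLog, for illustration only. Production code should use
 * Apache DataSketches' HllSketch/Union instead.
 */
public class TinyHll {
    private final int lgK;         // precision: 2^lgK registers
    private final byte[] registers;

    public TinyHll(int lgK) {
        this.lgK = lgK;
        this.registers = new byte[1 << lgK];
    }

    /** Record one item (e.g. a profile id). */
    public void update(long item) {
        long h = mix(item);
        int bucket = (int) (h >>> (64 - lgK));   // top lgK bits pick a register
        long rest = h << lgK;                    // remaining bits
        int rank = rest == 0 ? 64 - lgK + 1 : Long.numberOfLeadingZeros(rest) + 1;
        if (rank > registers[bucket]) registers[bucket] = (byte) rank;
    }

    /** Union is a register-wise max; this is what makes daily merges cheap. */
    public void merge(TinyHll other) {
        for (int i = 0; i < registers.length; i++) {
            registers[i] = (byte) Math.max(registers[i], other.registers[i]);
        }
    }

    /** Harmonic-mean estimator with a linear-counting fallback for small counts. */
    public double estimate() {
        int m = registers.length;
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) zeros++;
        }
        double alpha = 0.7213 / (1 + 1.079 / m);
        double est = alpha * m * m / sum;
        if (est <= 2.5 * m && zeros > 0) est = m * Math.log((double) m / zeros);
        return est;
    }

    private static long mix(long x) {            // splitmix64 finalizer as a stand-in hash
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        return x ^ (x >>> 31);
    }

    public static void main(String[] args) {
        TinyHll day1 = new TinyHll(12), day2 = new TinyHll(12);
        for (long id = 0; id < 100_000; id++) day1.update(id);
        for (long id = 50_000; id < 150_000; id++) day2.update(id);
        day1.merge(day2);  // all-time count = union of daily sketches
        System.out.printf("estimated distinct profiles: %.0f (true: 150000)%n",
                day1.estimate());
    }
}
```

The key property is that merging never reprocesses raw ids: each day contributes a fixed-size register array, and the union of any number of days is just an element-wise max, which is why all-time distinct counts stay cheap as history grows.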
[Figure: We use DataSketches in our ETL and serving systems]

HLL has been a huge performance boost for us in both the serving and ETL layers. Across our most common OLAP query patterns, we've seen latencies drop by approximately 50%. However, running APPROX_COUNT_DISTINCT over large date ranges on the Druid cluster for very large titles exhausts limited threads, especially in high-concurrency situations. To further offload Druid query volume and preserve cluster threads, we've also relied extensively on the Hollow library.
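For context, a sketch-backed distinct count in Druid SQL looks something like the query below. The table and column names here are hypothetical stand-ins, not Muse's actual schema; APPROX_COUNT_DISTINCT_DS_HLL is Druid's DataSketches-based aggregator, which can consume an HLL sketch column built at ingest time:

```sql
-- Hypothetical datasource and columns, for illustration.
SELECT
  asset_id,
  APPROX_COUNT_DISTINCT_DS_HLL(distinct_profiles, 17) AS est_profiles
FROM muse_impressions
WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '90' DAY
GROUP BY asset_id
```

Because the sketch column is already pre-aggregated at ingest, each historical thread merges small fixed-size sketches rather than scanning raw profile ids; even so, very wide date ranges on high-volume titles can still tie up threads, which motivated the Hollow work below.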

Hollow as a Read-Only Key-Value Store for Precomputed Aggregates

Our in-house Hollow³ infrastructure allows us to easily create Hollow feeds — essentially highly compressed and performant in-memory key/value stores — from Iceberg⁴ tables. In this setup, dedicated producer servers listen for changes to Iceberg tables, and when updates occur, they push the latest data to downstream consumers. On the consumer side, our Spring Boot applications listen to announcements from these producers and automatically refresh in-memory caches with the latest dataset.

This architecture has enabled us to migrate several data access patterns from Druid to Hollow, especially ones with a limited number of parameter combinations per title. One of these was fetching distinct filter dimensions. For example, while most Netflix-branded titles are released globally, licensed titles often have rights restrictions that limit their availability to specific countries and time windows. As a result, a particular licensed title might only be available to members in Germany and Luxembourg.

[Figure: Distinct countries queried from a Hollow feed for the assets for Manta Manta]

Previously, retrieving these distinct country values per asset required issuing a SELECT DISTINCT query to our Druid cluster. With Hollow, we maintain a feed of distinct dimension values, allowing us to perform stream operations like the one below directly on a cached dataset.

/**
 * Returns the possible filter values for a dimension such as countries
 */
public List<Dimension> getDimensions(long movieId, String dimensionId) {
    // Access in-memory Hollow feed with near-instant query time
    Map<String, List<Dimension>> dimensions = dimensionsHollowConsumer.lookup(movieId);
    return dimensions.getOrDefault(dimensionId, List.of()).stream()
        .sorted(Comparator.comparing(Dimension::getName))
        .toList();
}

Although this adds complexity to our service by requiring more intricate request routing and a larger memory footprint, precomputed aggregates have given us greater stability and performance. In the case of fetching distinct dimensions, we've observed query times drop from hundreds of milliseconds to just tens of milliseconds. More importantly, this shift has offloaded high-concurrency demands from our Druid cluster, resulting in more consistent query performance. Beyond this use case, cached precomputed aggregates also power features such as retrieving recently launched titles, accessing all-time asset metrics, and serving various pieces of title metadata.

Tuning Druid

Even with the efficiencies gained from HLL sketches and Hollow feeds, keeping our Druid cluster performant has been an ongoing challenge. Fortunately, at Netflix we're in the company of several Apache Druid PMC members like Maytas Monsereenusorn and Jesse Tuğlu, who have helped us wring out every ounce of performance. Some of the key optimizations we've implemented include:

  • Increasing broker count relative to historical nodes: We aim for a broker-to-historical ratio close to the recommended 1:15, which helps improve query throughput.
  • Tuning segment sizes: By targeting the 300–700 MB "sweet spot" for segment sizes — primarily using the tuningConfig.targetRowsPerSegment parameter during ingestion — we ensure that no segment scanned by a single historical thread is overly large.
  • Leveraging Druid lookups for data enrichment: Since joins can be prohibitively expensive in Druid, we use lookups at query time for any key column enrichment.
  • Optimizing search predicates: We ensure that all search predicates operate on physical columns rather than virtual ones, creating the necessary columns during ingestion with transformSpec.transforms.
  • Filtering and slimming data sources at ingest: By applying filters within transformSpec.filter and removing all unused columns from dimensionsSpec.dimensions, we keep our data sources lean and improve the likelihood of higher rollup yield.
  • Use of multi-value dimensions: Exploiting Druid's multi-value dimension feature was key to overcoming the "many-to-many" combinatorial quandary when integrating the audience filtering and grouping functionality mentioned in the "An Evolving Architecture" section above.
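Several of the ingest-side optimizations above live in the Druid ingestion spec. The fragment below is a hedged illustration of how they might fit together — the datasource, column names, and row target are hypothetical, but the knobs shown (rollup, transformSpec.filter, transformSpec.transforms, dimensionsSpec.dimensions with a multi-value column, an HLLSketchBuild metric, and tuningConfig.targetRowsPerSegment) are the ones discussed in this post:

```json
{
  "spec": {
    "dataSchema": {
      "dataSource": "muse_impressions",
      "granularitySpec": { "segmentGranularity": "DAY", "rollup": true },
      "transformSpec": {
        "filter": { "type": "selector", "dimension": "event_type", "value": "impression" },
        "transforms": [
          { "type": "expression", "name": "country", "expression": "upper(country_iso)" }
        ]
      },
      "dimensionsSpec": {
        "dimensions": [
          "asset_id",
          "country",
          { "type": "string", "name": "audiences", "multiValueHandling": "SORTED_ARRAY" }
        ]
      },
      "metricsSpec": [
        { "type": "HLLSketchBuild", "name": "distinct_profiles", "fieldName": "profile_id", "lgK": 17 }
      ]
    },
    "tuningConfig": { "type": "index_parallel", "targetRowsPerSegment": 5000000 }
  }
}
```

Filtering rows and dropping unused columns before rollup, as shown, raises the odds that many raw rows collapse into one aggregated row, which is where the rollup yield gains come from.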

Together, these optimizations, combined with earlier ones, have decreased our p99 Druid latencies by roughly 50%.

Validation & Rollout

Rolling out these changes to our metrics system required a thorough validation and launch strategy. Our approach prioritized both data integrity and user trust, leveraging a combination of automation, targeted tooling, and incremental exposure to production traffic. At the core of our strategy was a parallel stack deployment: both the legacy and new metric stacks operated side by side within the Muse Data microservice. This setup allowed us to validate data quality, monitor real-world performance, and mitigate risk by enabling seamless fallback at any stage.


We followed a two-pronged validation process:

  • Automated Offline Validation: Using Jupyter notebooks, we automated the sampling and comparison of key metrics across both the legacy and new stacks. Our sampling set included a representative mix: recently accessed titles, high-profile launches, and edge-case titles with unique handling requirements. This allowed us to catch subtle discrepancies in metrics early in the process. Iterative testing on this set guided fixes, such as tuning the HLL logK parameter, and benchmarked end-to-end latency improvements.
  • In-App Data Comparison Tooling: To facilitate rapid triage, we built a developer-facing comparison feature within our application that displays data from both the legacy and new metric stacks side by side. The tool automatically highlights any significant differences, making it easy to quickly spot and investigate discrepancies identified during offline validation or reported by users.
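Since the new stack's distinct counts are HLL estimates, exact equality checks against the legacy stack would produce false alarms. A comparison along these lines — sketched here with hypothetical metric names and a tolerance chosen to match the 1–2% HLL error discussed earlier, not Muse's actual tooling — flags only differences beyond the expected approximation error:

```java
import java.util.Map;

/** Illustrative only: flag metrics whose legacy/new values diverge beyond a relative tolerance. */
public class MetricDiff {

    /** True when the two values agree within relTol (relative to the larger magnitude). */
    static boolean withinTolerance(double legacy, double fresh, double relTol) {
        if (legacy == 0.0 && fresh == 0.0) return true;
        return Math.abs(legacy - fresh)
                / Math.max(Math.abs(legacy), Math.abs(fresh)) <= relTol;
    }

    public static void main(String[] args) {
        // Hypothetical sampled metrics for one title, from both stacks
        Map<String, double[]> samples = Map.of(
                "impressions", new double[]{1_204_331, 1_198_874},    // within HLL error
                "qualifiedPlays", new double[]{88_210, 97_450});      // suspicious gap
        samples.forEach((metric, v) -> {
            String status = withinTolerance(v[0], v[1], 0.02) ? "OK" : "FLAG";
            System.out.printf("%s: legacy=%.0f new=%.0f -> %s%n", metric, v[0], v[1], status);
        });
    }
}
```

The same tolerance logic can drive both the offline notebook comparisons and the in-app highlighting, so the two validation prongs agree on what counts as a discrepancy.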

We implemented several launch best practices to mitigate risk and maintain stability:

  • Staggered Implementation by Application Segment: We developed and deployed the new metric stack in phases, focusing on specific application segments. This meant building out support for asset types like artwork and video separately, and then further dividing by CEE section (Explore, Exploit). By implementing changes segment by segment, we were able to isolate issues early, validate each piece independently, and reduce overall risk across the migration.
  • Shadow Testing ("Dark Launch"): Prior to exposing the new stack to end users, we mirrored production traffic asynchronously to the new implementation. This allowed us to validate real-world latency and catch potential faults in a live setting, without impacting the actual user experience.
  • Granular Feature Flagging: We implemented fine-grained feature flags to control exposure within each segment. This allowed us to target specific user groups or titles and instantly roll back or adjust the rollout scope if any issues were detected, ensuring rapid mitigation with minimal disruption.

Learnings and Next Steps

Our journey with Muse tested the limits of several parts of the stack: the ETL layer, the Druid layer, and the data serving layer. While some choices, like leveraging Netflix's in-house Hollow infrastructure, were influenced by available resources, simple principles like offloading query volume, pre-filtering rows and columns before Druid rollup, and optimizing search predicates (along with a bit of HLL magic) went a long way in allowing us to support new capabilities while maintaining performance. Additionally, engineering best practices like producing side-by-side implementations and backwards-compatible changes enabled us to roll out revisions gradually while maintaining rigorous validation standards. Looking ahead, we'll continue to build on this foundation by supporting a wider range of content types like Live and Games, incorporating synopsis data, deepening our understanding of how assets work together to influence member choice, and introducing new metrics to distinguish between "effective" and "authentic" promotional assets, in service of helping members find content that truly resonates with them.


