How Netflix Accurately Attributes eBPF Flow Logs | by Netflix Technology Blog | Apr, 2025

By Team Entertainer | April 8, 2025 | Updated: April 9, 2025 | 12 min read


Netflix Technology Blog
Netflix TechBlog

By Cheng Xie, Bryan Shultz, and Christine Xu

In a previous blog post, we described how Netflix uses eBPF to capture TCP flow logs at scale for enhanced cloud network insights. In this post, we delve deeper into how Netflix solved a core problem: accurately attributing flow IP addresses to workload identities.

FlowExporter is a sidecar that runs alongside all Netflix workloads in the AWS Cloud. It uses eBPF and TCP tracepoints to monitor TCP socket state changes. When a TCP socket closes, FlowExporter generates a flow log record that includes the IP addresses, ports, timestamps, and additional socket statistics. On average, 5 million records are produced per second.

In cloud environments, IP addresses are reassigned to different workloads as workload instances are created and terminated, so IP addresses alone cannot provide insights on which workloads are communicating. To make the flow logs useful, each IP address must be attributed to its corresponding workload identity. FlowCollector, a backend service, collects flow logs from FlowExporter instances across the fleet, attributes the IP addresses, and sends these attributed flows to Netflix’s Data Mesh for subsequent stream and batch processing.

The eBPF flow logs provide a comprehensive view of service topology and network health across Netflix’s extensive microservices fleet, regardless of the programming language, RPC mechanism, or application-layer protocol used by individual workloads.

Accurately attributing flow IP addresses to workload identities has been a significant challenge since our eBPF flow logs were introduced.

As noted in our previous blog post, our initial attribution approach relied on Sonar, an internal IP address tracking service that emits an event whenever an IP address in Netflix’s AWS VPCs is assigned or unassigned to a workload. FlowCollector consumes a stream of IP address change events from Sonar and uses this information to attribute flow IP addresses in real time.

The fundamental drawback of this method is that it can lead to misattribution. Delays and failures are inevitable in distributed systems, and they may delay IP address change events from reaching FlowCollector. For instance, an IP address may initially be assigned to workload X but later reassigned to workload Y. However, if the change event for this reassignment is delayed, FlowCollector will continue to assume that the IP address belongs to workload X, resulting in misattributed flows. Additionally, event timestamps may be inaccurate depending on how they are captured.

Misattribution rendered the flow data unreliable for decision-making. Users often depend on flow logs to validate workload dependencies, but misattribution creates confusion. Without expert knowledge of expected dependencies, users would struggle to identify or confirm misattribution. Moreover, misattribution occurred frequently for critical services with a large footprint due to frequent IP address changes. Overall, misattribution makes fleet-wide dependency analysis impractical.

As a workaround, we made FlowCollector hold received flows for 15 minutes before attribution, allowing time for delayed IP address change events. While this approach reduced misattribution, it did not eliminate it. Moreover, the waiting period made the data less fresh, reducing its utility for real-time analysis.

Fully eliminating misattribution is crucial because it only takes a single misattributed flow to produce an incorrect workload dependency. Solving this problem required a complete rethinking of our approach. Over the past year, Netflix developed a new attribution method that has finally eliminated misattribution, as detailed in the rest of this post.

Each socket has two IP addresses: a local IP address and a remote IP address. Previously, we used the same method to attribute both. However, attributing the local IP address should be a simpler task since the local IP address belongs to the instance where FlowExporter captures the socket. Therefore, FlowExporter should determine the local workload identity from its environment and attribute the local IP address before sending the flow to FlowCollector.

This is straightforward for workloads running directly on EC2 instances, as Netflix’s Metatron provisions workload identity certificates to each EC2 instance at boot time. FlowExporter can simply read these certificates from the local disk to determine the local workload identity.

Attributing local IP addresses for container workloads running on Netflix’s container platform, Titus, is more challenging. FlowExporter runs at the container host level, where each host manages multiple container workloads with different identities. When FlowExporter’s eBPF programs receive a socket event from TCP tracepoints in the kernel, the socket may have been created by one of the container workloads or by the host itself. Therefore, FlowExporter must determine which workload to attribute the socket’s local IP address to. To solve this problem, we leveraged IPMan, Netflix’s container IP address assignment service. IPManAgent, a daemon running on every container host, is responsible for assigning and unassigning IP addresses. As container workloads are launched, IPManAgent writes an IP-address-to-workload-ID mapping to an eBPF map, which FlowExporter’s eBPF programs can then use to look up the workload ID associated with a socket’s local IP address.

Another challenge was to accommodate Netflix’s IPv6-to-IPv4 translation mechanism on Titus. To facilitate IPv6 migration, Netflix developed a mechanism that enables IPv6-only containers to communicate with IPv4 destinations without incurring NAT64 overhead. This mechanism intercepts connect syscalls and replaces the underlying socket with one that uses a shared IPv4 address assigned to the container host. This confuses FlowExporter because the kernel reports the same local IPv4 address for sockets created by different container workloads. To disambiguate, local port information is additionally required. We modified Titus to write a mapping of (local IPv4 address, local port) to the workload ID into an eBPF map whenever a connect syscall is intercepted. FlowExporter’s eBPF programs then use this map to correctly attribute sockets created by the translation mechanism.

With these problems solved, we can now accurately attribute the local IP address of every flow.

Once the local IP address attribution problem is solved, accurately attributing remote IP addresses becomes feasible. Now, each flow reported by FlowExporter includes the local IP address, the local workload identity, and connection start/end timestamps. As FlowCollector receives these flows, it can learn the time ranges during which each workload owns a given IP address. For instance, if FlowCollector sees a flow with local IP address 10.0.0.1 associated with workload X that starts at t1 and ends at t2, it can deduce that 10.0.0.1 belonged to workload X from t1 to t2. Since Netflix uses Amazon Time Sync across its fleet, the timestamps (captured by FlowExporter) are reliable.

The FlowCollector service cluster consists of many nodes. Every node must be capable of attributing arbitrary remote IP addresses and, therefore, requires knowledge of all workload IP addresses and their recent ownership records. To represent this knowledge, each node maintains an in-memory hashmap that maps an IP address to a list of time ranges, as illustrated by the following Go structs:

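import (
	"net/netip"
	"time"
)

// IPAddressTracker records which workload owned each IP address, and when.
type IPAddressTracker struct {
	ipToTimeRanges map[netip.Addr]timeRanges
}

// timeRanges holds the ownership intervals for one IP address.
type timeRanges []timeRange

// timeRange is one interval during which a workload owned the address.
type timeRange struct {
	workloadID string
	start      time.Time
	end        time.Time
}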

To populate the hashmap, FlowCollector extracts the local IP address, local workload identity, start time, and end time from each received flow and creates or extends the corresponding time ranges in the map. The time ranges for each IP address are sorted in ascending order, and they are non-overlapping since an IP address cannot belong to two different workloads simultaneously.

Since each flow is only sent to one FlowCollector node, each node must share the time ranges it learned from received flows with other nodes. We implemented a broadcasting mechanism using Kafka, where each node publishes learned time ranges to all other nodes. Although more efficient broadcasting implementations exist, the Kafka-based approach is simple and has worked well for us.

Now, FlowCollector can attribute remote IP addresses by looking them up in the populated map, which returns a list of time ranges. It then uses the flow’s start timestamp to determine the corresponding time range and associated workload identity. If the start time does not fall within any time range, FlowCollector will retry after a delay, eventually giving up if the retry fails. Such failures may occur when flows are lost or broadcast messages are delayed. For our use cases, it is acceptable to leave a small percentage of flows unattributed, but any misattribution is unacceptable.

This new method achieves accurate attribution thanks to the continuous heartbeats, each associated with a reliable time range of IP address ownership. It handles transient issues gracefully: a few delayed or lost heartbeats do not lead to misattribution. In contrast, the previous method relied solely on discrete IP address assignment and unassignment events. Lacking heartbeats, it had to presume an IP address remained assigned until notified otherwise (which could be hours or days later), making it vulnerable to misattribution when the notifications were delayed.

One detail is that when FlowCollector receives a flow, it cannot attribute its remote IP address right away because it requires the latest observed time ranges for the remote IP address. Since FlowExporter reports flows in batches every minute, FlowCollector must wait until it receives the flow batch from the remote workload’s FlowExporter for the last minute, which may not have arrived yet. To address this, FlowCollector temporarily stores received flows on disk for one minute before attributing their remote IP addresses. This introduces a 1-minute delay, but it is much shorter than the 15-minute delay with the previous approach.

In addition to producing accurate attribution, the new method is also cost-effective thanks to its simplicity and in-memory lookups. Because the in-memory state can be quickly rebuilt when a FlowCollector node starts up, no persistent storage is required. With 30 c7i.2xlarge instances, we can process 5 million flows per second for the entire Netflix fleet.

For simplicity, we have so far glossed over one topic: regionalization. Netflix’s cloud microservices operate across multiple AWS regions. To optimize flow reporting and minimize cross-regional traffic, a FlowCollector cluster runs in each major region, and FlowExporter agents send flows to their corresponding regional FlowCollector. When FlowCollector receives a flow, its local IP address is guaranteed to be within the region.

To minimize cross-region traffic, the broadcasting mechanism is limited to FlowCollector nodes within the same region. Consequently, the IP address time ranges map contains only IP addresses from that region. However, cross-regional flows have a remote IP address in a different region. To attribute these flows, the receiving FlowCollector node forwards them to nodes in the corresponding region. FlowCollector determines the region for a remote IP address by looking up a trie built from all Netflix VPC CIDRs. This approach is more efficient than broadcasting IP address time range updates across all regions, as only 1% of Netflix flows are cross-regional.

So far, FlowCollector can accurately attribute IP addresses belonging to Netflix’s cloud workloads. However, not all flow IP addresses fall into this category. For instance, a significant portion of flows goes through AWS ELBs. For these flows, the remote IP addresses are associated with the ELBs, where we cannot run FlowExporter. Consequently, FlowCollector cannot determine their identities by simply observing the received flows. To attribute these remote IP addresses, we continue to use IP address change events from Sonar, which crawls AWS resources to detect changes in IP address assignments. Although this data stream may contain inaccurate timestamps and be delayed, misattribution is not a main concern since ELB IP address reassignment occurs very infrequently.

Verifying that the new method has eliminated misattribution is challenging due to the lack of a definitive source of truth for workload dependencies to validate flow logs against; the flow logs themselves are meant to serve as this source of truth, after all. To build confidence, we analyzed the flow logs of a large service with well-understood dependencies. A large footprint is necessary, as misattribution is more prevalent in services with numerous instances, and there must be a reliable method to determine the dependencies for this service without relying on flow logs.

Netflix’s cloud gateway, Zuul, served this purpose perfectly due to its extensive footprint (handling all cloud ingress traffic), its large number of downstream dependencies, and our ability to derive its dependencies from its routing configurations as the source of truth for comparison with flow logs. We found no misattribution for flows through Zuul over a two-week window. This provided strong confidence that the new attribution method has eliminated misattribution. In the previous approach, approximately 40% of Zuul’s dependencies reported by the flow logs were misattributed.

With misattribution solved, eBPF flow logs now deliver trustworthy, fleet-wide insights into Netflix’s service topology and network health. This advancement unlocks numerous exciting opportunities in areas such as service dependency auditing, security analysis, and incident triage, while helping Netflix engineers develop a better understanding of our ever-evolving distributed systems.

We would like to thank Martin Dubcovsky, Joanne Koong, Taras Roshko, Nabil Schear, Jacob Meyers, Parsha Pourkhomami, Hechao Li, Donavan Fritz, Rob Gulewich, Amanda Li, John Salem, Hariharan Ananthakrishnan, Keerti Lakshminarayan, and other stunning colleagues for their feedback, inspiration, and contributions to the success of this effort.


