Authors: Sejoon Oh, Moumita Bhattacharya, Yesu Feng, Sudarshan Lamkhede, Ko-Jen Hsiao, and Justin Basilico
Recommender systems have become essential components of digital services across e-commerce, streaming media, and social networks [1, 2]. At Netflix, these systems drive significant product and business impact by connecting members with relevant content at the right time [3, 4]. While our recommendation foundation model (FM) has made substantial progress in understanding user preferences through large-scale learning from interaction histories (please refer to this article about FM @ Netflix), there is an opportunity to further enhance its capabilities. By extending FM to incorporate the prediction of underlying user intents, we aim to enrich its understanding of user sessions beyond next-item prediction, thereby offering a more comprehensive and nuanced recommendation experience.
Recent research has highlighted the importance of understanding user intent in online platforms [5, 6, 7, 8]. As Xia et al. [8] demonstrated at Pinterest, predicting a user's future intent can lead to more accurate and personalized recommendations. However, existing intent prediction approaches typically employ simple multi-task learning that adds intent prediction heads to next-item prediction models without establishing a hierarchical relationship between these tasks.
To address these limitations, we introduce FM-Intent, a novel recommendation model that enhances our foundation model through hierarchical multi-task learning. FM-Intent captures a user's latent session intent using both short-term and long-term implicit signals as proxies, then leverages this intent prediction to improve next-item recommendations. Unlike conventional approaches, FM-Intent establishes a clear hierarchy where intent predictions directly inform item recommendations, creating a more coherent and effective recommendation pipeline.
FM-Intent makes three key contributions:
- A novel recommendation model that captures user intent on the Netflix platform and enhances next-item prediction using this intent information.
- A hierarchical multi-task learning approach that effectively models both short-term and long-term user interests.
- Comprehensive experimental validation showing significant performance improvements over state-of-the-art models, including our foundation model.
In the Netflix ecosystem, user intent manifests through various interaction metadata, as illustrated in Figure 1. FM-Intent leverages these implicit signals to predict both user intent and next-item recommendations.
Figure 1: Overview of user engagement data in Netflix. User intent can be associated with several interaction metadata. We leverage various implicit signals to predict user intent and next-item.
In Netflix, there can be multiple types of user intents. For instance,
Action Type: Categories reflecting what users intend to do on Netflix, such as discovering new content versus continuing previously started content. For example, when a member plays a follow-up episode of something they were already watching, this can be categorized as "continue watching" intent.
Genre Preference: The pre-defined genre labels (e.g., Action, Thriller, Comedy) that indicate a user's content preferences during a session. These preferences can shift significantly between sessions, even for the same user.
Movie/Show Type: Whether a user is looking for a movie (typically a single, longer viewing experience) or a TV show (potentially multiple episodes of shorter duration).
Time-since-release: Whether the user prefers newly released content, recent content (e.g., between a week and a month), or evergreen catalog titles.
These dimensions serve as proxies for the latent user intent, which is often not directly observable but crucial for providing relevant recommendations.
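As a rough illustration of how such proxy labels might be derived from interaction metadata, the sketch below buckets a title's age at play time and tags an action type. The function names, the label strings, and the exact 7-day and 30-day cut-offs are assumptions made for this example (the text only mentions "between a week and a month" as an example of recent content); it is not the production labeling logic.

```python
from datetime import date

def release_bucket(release_date, watch_date):
    """Map a title's age at watch time to a coarse time-since-release label.

    The 7-day and 30-day thresholds are illustrative assumptions.
    """
    age_days = (watch_date - release_date).days
    if age_days < 7:
        return "new"
    if age_days <= 30:
        return "recent"
    return "evergreen"

def action_type(title_id, previously_started):
    """Label a play as continuing known content or discovering something new."""
    return "continue_watching" if title_id in previously_started else "discover"

# Hypothetical interaction: a title released 19 days before the play,
# which the member had already started watching.
watch_day = date(2025, 5, 20)
labels = {
    "time_since_release": release_bucket(date(2025, 5, 1), watch_day),
    "action_type": action_type("show_42", {"show_42", "movie_7"}),
}
print(labels)
```

Labels of this kind are observable per interaction, which is what makes them usable as training targets for an otherwise latent intent.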
FM-Intent employs a hierarchical multi-task learning approach with three major components, as illustrated in Figure 2.
Figure 2: An architectural illustration of our hierarchical multi-task learning model FM-Intent for user intent and item predictions. We use ground-truth intent and item-ID labels to optimize predictions.
1. Input Feature Sequence Formation
The first component constructs rich input features by combining interaction metadata. The input feature for each interaction combines categorical embeddings and numerical features, creating a comprehensive representation of user behavior.
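One simple way to combine categorical embeddings with numerical features is concatenation, sketched below. The field names, the embedding width, the randomly initialized lookup tables, and the two numeric features are all assumptions for illustration; the article does not specify how the features are combined.

```python
import random

EMBED_DIM = 4  # assumed embedding width per categorical field

def make_table(vocab, dim, rng):
    """Create a toy embedding table: one small random vector per category."""
    return {tok: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for tok in vocab}

rng = random.Random(0)
tables = {
    "action_type": make_table(["discover", "continue_watching"], EMBED_DIM, rng),
    "genre": make_table(["action", "comedy", "thriller"], EMBED_DIM, rng),
}

def interaction_feature(interaction):
    """Concatenate the categorical embeddings, then append numeric features."""
    vec = []
    for field, table in tables.items():
        vec.extend(table[interaction[field]])
    vec.extend(interaction["numeric"])
    return vec

# Hypothetical session with two interactions; "numeric" holds two made-up
# numeric features per interaction.
session = [
    {"action_type": "discover", "genre": "comedy", "numeric": [0.35, 1.0]},
    {"action_type": "continue_watching", "genre": "thriller", "numeric": [0.80, 0.0]},
]
sequence = [interaction_feature(x) for x in session]
print(len(sequence), len(sequence[0]))  # 2 interactions, 4 + 4 + 2 = 10 dims each
```

In a trained model the embedding tables would be learned parameters rather than fixed random vectors.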
2. User Intent Prediction
The intent prediction component processes the input feature sequence through a Transformer encoder and generates predictions for multiple intent signals.
The Transformer encoder effectively models the long-term interest of users through multi-head attention mechanisms. For each prediction task, the intent encoding is transformed into prediction scores via fully-connected layers.
A key innovation in FM-Intent is the attention-based aggregation of individual intent predictions. This approach generates a comprehensive intent embedding that captures the relative importance of different intent signals for each user, providing valuable insights for personalization and explanation.
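The aggregation step can be pictured as attention pooling over per-intent vectors. The sketch below assumes each intent head yields a fixed-size vector and that a single query vector scores them with scaled dot products; the query, the vector values, and the intent names are hypothetical, and the actual FM-Intent attention design may differ.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_intents(intent_vectors, query):
    """Attention-pool per-intent vectors into one intent embedding.

    Returns the pooled embedding and the attention weight per intent,
    which can be read as the relative importance of each intent signal.
    """
    names = list(intent_vectors)
    scale = math.sqrt(len(query))
    scores = [
        sum(q * v for q, v in zip(query, intent_vectors[n])) / scale
        for n in names
    ]
    weights = softmax(scores)
    embedding = [
        sum(w * intent_vectors[n][i] for w, n in zip(weights, names))
        for i in range(len(query))
    ]
    return embedding, dict(zip(names, weights))

# Hypothetical 3-dimensional vectors, one per intent prediction head.
intent_vectors = {
    "action_type": [1.0, 0.0, 0.0],
    "genre": [0.0, 1.0, 0.0],
    "movie_or_show": [0.0, 0.0, 1.0],
    "time_since_release": [0.5, 0.5, 0.0],
}
query = [2.0, 0.5, 0.1]  # assumed user-specific query vector
embedding, weights = aggregate_intents(intent_vectors, query)
print(weights)
```

Because the weights sum to one, they double as a per-user explanation of which intent signal dominated the pooled embedding.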
3. Next-Item Prediction with Hierarchical Multi-Task Learning
The final component combines the input features with the user intent embedding to make more accurate next-item recommendations.
FM-Intent employs hierarchical multi-task learning where intent predictions are conducted first, and their results are used as input features for the next-item prediction task. This hierarchical relationship ensures that the next-item recommendations are informed by the predicted user intent, creating a more coherent and effective recommendation model.
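The hierarchy can be sketched as a two-stage forward pass: the intent head runs first, and its output is appended to the input of the item head. The code below is a deliberately tiny stand-in, assuming linear heads, a single intent task whose probabilities are fed forward (in place of the aggregated intent embedding), and a weighted sum of cross-entropy losses; the weight matrices and the `intent_weight` value are made up for illustration.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def linear(x, weights):
    """Apply a bias-free linear layer given as a list of weight rows."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def forward(session_encoding, w_intent, w_item):
    # Stage 1: predict intent from the session encoding.
    intent_probs = softmax(linear(session_encoding, w_intent))
    # Stage 2: the item head sees the encoding plus the intent prediction.
    item_input = session_encoding + intent_probs
    item_probs = softmax(linear(item_input, w_item))
    return intent_probs, item_probs

def joint_loss(intent_probs, item_probs, intent_label, item_label, intent_weight=0.5):
    """Item cross-entropy plus a weighted intent cross-entropy (assumed form)."""
    item_ce = -math.log(item_probs[item_label])
    intent_ce = -math.log(intent_probs[intent_label])
    return item_ce + intent_weight * intent_ce

# Toy setup: 2-dim session encoding, 2 intents, 3 candidate items.
enc = [1.0, -0.5]
w_intent = [[1.0, 0.0], [0.0, 1.0]]
w_item = [
    [0.2, 0.1, 1.0, 0.0],
    [0.0, 0.3, 0.0, 1.0],
    [0.1, 0.1, 0.5, 0.5],
]
intent_probs, item_probs = forward(enc, w_intent, w_item)
loss = joint_loss(intent_probs, item_probs, intent_label=0, item_label=0)
print(intent_probs, item_probs, loss)
```

The contrast with flat multi-task learning is in `forward`: with two independent heads, changing the intent head would leave the item distribution untouched, whereas here it shifts the item scores.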
We conducted comprehensive offline experiments on sampled Netflix user engagement data to evaluate FM-Intent's performance. Note that FM-Intent uses a much smaller dataset for training compared to the FM production model due to its complex hierarchical prediction architecture.
Next-Item and Next-Intent Prediction Accuracy
Table 1 compares FM-Intent with several state-of-the-art sequential recommendation models, including our production model (FM-Intent-V0).
Table 1: Next-item and next-intent prediction results of baselines and our proposed method FM-Intent on the Netflix user engagement dataset.
All metrics are represented as relative % improvements compared to the SOTA baseline: TransAct. N/A indicates that a model is not capable of predicting a certain intent. Note that we added additional fully-connected layers to LSTM, GRU, and Transformer baselines in order to predict user intent, while we used original implementations for other baselines. FM-Intent demonstrates a statistically significant improvement of 7.4% in next-item prediction accuracy compared to the best baseline (TransAct).
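For readers unfamiliar with this reporting convention, a relative improvement is the metric difference divided by the baseline value. The sketch below shows the arithmetic with placeholder numbers that are not taken from Table 1; the model names and the `None`-for-N/A convention are assumptions for this example.

```python
def relative_improvement(metric, baseline):
    """Relative % improvement over a baseline; None marks an N/A entry."""
    if metric is None:
        return None
    return 100.0 * (metric - baseline) / baseline

def fmt(value):
    """Render a table cell, using N/A when a model cannot produce the metric."""
    return "N/A" if value is None else f"{value:+.1f}%"

# Placeholder accuracies for illustration only (not real results).
baseline_acc = 0.40
results = {"baseline": 0.40, "model_a": 0.50, "model_without_intent_head": None}

for name, acc in results.items():
    print(name, fmt(relative_improvement(acc, baseline_acc)))
```

Under this convention the baseline row is always 0%, and a model that cannot predict a given intent shows N/A rather than a number.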
Most baseline models show limited performance as they either cannot predict user intent or cannot incorporate intent predictions into next-item recommendations. Our production model (FM-Intent-V0) performs well but lacks the ability to predict and leverage user intent. Note that FM-Intent-V0 is trained with a smaller dataset for a fair comparison with other models; the actual production model is trained with a much larger dataset.
Figure 3: K-means++ (K=10) clustering of user intent embeddings found by FM-Intent; FM-Intent finds unique clusters of users that share similar intent.
FM-Intent generates meaningful user intent embeddings that can be used for clustering users with similar intents. Figure 3 visualizes 10 distinct clusters identified by K-means++ clustering. These clusters reveal meaningful user segments with distinct viewing patterns:
- Users who primarily discover new content versus those who continue watching recent/favorite content.
- Genre enthusiasts (e.g., anime/kids content viewers).
- Users with specific viewing patterns (e.g., rewatchers versus casual viewers).
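The clustering step can be sketched with a small pure-Python K-means that uses K-means++ seeding. The 2-D points below are made-up stand-ins for intent embeddings, and K=3 is used instead of the K=10 in Figure 3 to keep the example readable; a production pipeline would more likely rely on an optimized library implementation.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_pp_init(points, k, rng):
    """K-means++ seeding: sample each new center with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

def assign_clusters(points, centers):
    return [
        min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for p in points
    ]

def kmeans(points, k, iters=20, seed=0):
    """Lloyd's algorithm started from K-means++ seeds."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(iters):
        labels = assign_clusters(points, centers)
        for j in range(k):
            members = [p for p, a in zip(points, labels) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign_clusters(points, centers), centers

# Toy "intent embeddings": three tight, well-separated groups of users.
offsets = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [[ax + dx, ay + dy] for ax, ay in anchors for dx, dy in offsets]

labels, centers = kmeans(points, k=3)
print(labels)
```

Each resulting cluster can then be profiled (for example, by its dominant action type or genre) to arrive at segment descriptions like the ones listed above.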
FM-Intent has been successfully integrated into Netflix's recommendation ecosystem and can be leveraged for several downstream applications:
Personalized UI Optimization: The predicted user intent could inform the layout and content selection on the Netflix homepage, emphasizing different rows based on whether users are in discovery mode, continue-watching mode, or exploring specific genres.
Analytics and User Understanding: Intent embeddings and clusters provide valuable insights into viewing patterns and preferences, informing content acquisition and production decisions.
Enhanced Recommendation Signals: Intent predictions serve as features for other recommendation models, improving their accuracy and relevance.
Search Optimization: Real-time intent predictions help prioritize search results based on the user's current session intent.
FM-Intent represents an advancement in Netflix's recommendation capabilities by enhancing them with hierarchical multi-task learning for user intent prediction. Our comprehensive experiments demonstrate that FM-Intent significantly outperforms state-of-the-art models, including our prior foundation model that focused solely on next-item prediction. By understanding not just what users might watch next but what underlying intents users have, we can provide more personalized, relevant, and satisfying recommendations.
We thank our stunning colleagues in the Foundation Model team & AIMS org. for their valuable feedback and discussions. We also thank our partner teams for getting this up and running in production.
[1] Amatriain, X., & Basilico, J. (2015). Recommender systems in industry: A Netflix case study. In Recommender Systems Handbook (pp. 385–419). Springer.
[2] Gomez-Uribe, C. A., & Hunt, N. (2015). The Netflix recommender system: Algorithms, business value, and innovation. ACM Transactions on Management Information Systems (TMIS), 6(4), 1–19.
[3] Jannach, D., & Jugovac, M. (2019). Measuring the business value of recommender systems. ACM Transactions on Management Information Systems (TMIS), 10(4), 1–23.
[4] Bhattacharya, M., & Lamkhede, S. (2022). Augmenting Netflix Search with In-Session Adapted Recommendations. In Proceedings of the 16th ACM Conference on Recommender Systems (pp. 542–545).
[5] Chen, Y., Liu, Z., Li, J., McAuley, J., & Xiong, C. (2022). Intent contrastive learning for sequential recommendation. In Proceedings of the ACM Web Conference 2022 (pp. 2172–2182).
[6] Ding, Y., Ma, Y., Wong, W. K., & Chua, T. S. (2021). Modeling instant user intent and content-level transition for sequential fashion recommendation. IEEE Transactions on Multimedia, 24, 2687–2700.
[7] Liu, Z., Chen, H., Sun, F., Xie, X., Gao, J., Ding, B., & Shen, Y. (2021). Intent preference decoupling for user representation on online recommender system. In Proceedings of the Twenty-Ninth International Conference on International Joint Conferences on Artificial Intelligence (pp. 2575–2582).
[8] Xia, X., Eksombatchai, P., Pancha, N., Badani, D. D., Wang, P. W., Gu, N., Joshi, S. V., Farahpour, N., Zhang, Z., & Zhai, A. (2023). TransAct: Transformer-based Realtime User Action Model for Recommendation at Pinterest. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (pp. 5249–5259).