How do Content Recommendation Engines Work?

Recommendation Engines are systems that typically use Machine Learning to predict which movie or video a particular user (or cohort) is likely to enjoy watching, based on their past choices, preferences, and the content provider’s catalog.

Recommendation Engines are important for OTT platforms as they are a tool to help users navigate through a movie catalog efficiently. With the help of Machine Learning, platforms can build a persona for every user based on their interaction with the service, their choice of movies, and extensive movie metadata.

In this article, we take a look at recommendation engines, what data they need, why they are useful, etc., from the context of an OTT service provider.

Recommendation Engines in OTT Streaming

In a recently published paper, Google’s researchers explained how they recommend videos to their users. Here is the abstract:

In this paper, we introduce a large scale multi-objective ranking system for recommending what video to watch next on an industrial video sharing platform. The system faces many real-world challenges, including the presence of multiple competing ranking objectives, as well as implicit selection biases in user feedback. To tackle these challenges, we explored a variety of soft-parameter sharing techniques such as Multi-gate Mixture-of-Experts so as to efficiently optimize for multiple ranking objectives. Additionally, we mitigated the selection biases by adopting a Wide & Deep framework. We demonstrated that our proposed techniques can lead to substantial improvements on recommendation quality on one of the world’s largest video sharing platforms.


Sounds complex? Well, that’s because YouTube needs to get the recommendations right to retain users, reduce churn, and improve its ad-based revenue. A lot is riding on these recommendations.

Of course, when we talk about movie recommendations, we have to mention the famous $1M Netflix Prize that sought to substantially improve the accuracy of predictions about how much someone is going to enjoy a movie based on their movie preferences. In other words, can you predict which movie the user is going to enjoy and consequently suggest it to them? Here is a description of the solution that won the Grand Prize of $1 Million in 2009!

In this article, we won’t be going deep into the ML algorithms (perhaps, another time!), but we’ll concentrate on the data requirements, data collection, and use of the recommendations.

Data for a Content Recommendation Engine

Recommendation engines need lots of data, of the right quantity and quality, to recognize patterns and make recommendations. For example, accurate data is needed to ensure that a movie recommended to a user suits that user’s viewing preferences and patterns. Otherwise, poor recommendations might lead to churn or to poor reviews on app stores and elsewhere online.

Let’s take a quick look at some of the important data sources (or features) for recommendation engines. I’ve divided them into two groups: (1) data mined from the movies and (2) data mined from the user.

Movie Metadata

Movie metadata can be obtained from the studios or content creators, and it is a one-time ingest of data. Content providers can also obtain metadata from sources such as IMDb or similar rating websites/agencies in the absence of such information.

Information about a movie, from Netflix.

Here are some of the important movie metadata needed to make recommendations –

  1. Genre (action, comedy, romance, period-film, or a combination of genres).
  2. Language (English, French, Hindi, Kannada, Italian, Spanish, etc.)
  3. Length (short-form or long-form)
  4. Rating (G, PG, PG-13, R, etc.)
  5. Actors, Directors, Producers, Banner or Production House
  6. Year of Release
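
To make the list above concrete, here is a minimal sketch of how one catalog entry could be represented. The class name, the field names, and the 40-minute short-form cutoff are assumptions made for illustration, not a standard schema, and the movie itself is a made-up example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MovieMetadata:
    """One catalog entry; field names are illustrative, not a standard schema."""
    title: str
    genres: Tuple[str, ...]
    language: str
    runtime_minutes: int
    rating: str
    release_year: int
    people: Tuple[str, ...] = ()  # actors, directors, producers, banner

    def is_short_form(self, cutoff_minutes: int = 40) -> bool:
        # The 40-minute cutoff is an arbitrary assumption.
        return self.runtime_minutes < cutoff_minutes


# A made-up catalog entry.
movie = MovieMetadata(
    title="Example Movie",
    genres=("action", "comedy"),
    language="Hindi",
    runtime_minutes=128,
    rating="PG-13",
    release_year=2019,
    people=("Example Director", "Example Actor"),
)
print(movie.title, "short-form:", movie.is_short_form())
```

Keeping the record immutable (`frozen=True`) reflects the point that movie metadata is usually ingested once and rarely changes.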

Next, let’s look at what information can be gathered about the user that is relevant to a recommendation engine.

Data about the User

Apart from movie metadata, you need to use data that describes a user’s viewing patterns, choices, likes, and dislikes. However, you need to tread carefully while dealing with user information.

Several laws exist that safeguard a user’s right to privacy, and you need to be aware of them before gathering, processing, and storing Personally Identifiable Information (PII). E.g., the European GDPR and the Brazilian LGPD laws are designed to guard a user’s right to privacy and the right to be forgotten.

But, as you’ll see next, information about the user is extremely useful in making recommendations. All you need to do is to ensure that you aren’t violating any data laws.

Here are a few data-points or features about a user that are interesting to recommendation engines.

  1. Location
  2. Language preferences
  3. Watch-time or watch duration across every dimension:
    • What does this mean? If a user watches an action movie starring Brad Pitt, it is important to record how long the user watched it.
    • This is very important because if the user chose the movie, watched it for a couple of minutes, and then stopped the movie, it is a bad signal for that movie.
    • If a movie is watched for less than X minutes (a threshold of your choice), then you shouldn’t take it as a positive sign for that movie or genre. Algorithms can use the watch time in conjunction with every other dimension such as Director, Actor, Movie Rating, etc.
  4. Up/Down Votes: these are powerful indicators of a user’s likes and dislikes. Several platforms (e.g., Netflix) allow you to up/down a movie and use this as a powerful feature in their recommendation engines.
Netflix’s User Interface shows movies that are similar to the one you are looking at.
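
The watch-time rule described in the list above can be sketched as follows. This is only an illustration: the event fields, the function name, and the 10-minute value chosen for the "X minutes" threshold are all assumptions, and a real system would track many more dimensions than genre.

```python
from collections import defaultdict

MIN_WATCH_MINUTES = 10  # the "X minutes" threshold; this value is an assumption


def genre_signals(watch_events, min_watch_minutes=MIN_WATCH_MINUTES):
    """Count positive and negative watch signals per genre."""
    signals = defaultdict(lambda: {"positive": 0, "negative": 0})
    for event in watch_events:
        # A short watch is treated as a bad signal for every genre of the movie.
        long_enough = event["minutes_watched"] >= min_watch_minutes
        kind = "positive" if long_enough else "negative"
        for genre in event["genres"]:
            signals[genre][kind] += 1
    return dict(signals)


# Hypothetical watch events for one user.
events = [
    {"title": "Movie A", "genres": ["action"], "minutes_watched": 95},
    {"title": "Movie B", "genres": ["action", "war"], "minutes_watched": 2},
    {"title": "Movie C", "genres": ["romance"], "minutes_watched": 3},
]
signals = genre_signals(events)
print(signals)
```

The same counting could be repeated for directors, actors, or ratings by swapping the `genres` field for another dimension.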

Great – we have a rough idea about what features or data-points we need to collect in order to make recommendations. At this juncture, we need to pause and reflect on two important aspects of data – quality & quantity.

Quality and Quantity of Data Also Matter

It’s essential to feed a recommendation engine with the right quality and quantity of data to make recommendations. If you have incorrect data or insufficient data, your guesses and recommendations could be way off and lead to a poor user experience.

Imagine a user who likes to watch “war” and “sniper” movies but gets recommended “rom-com” movies. How do you think the user experience will turn out? Not good, right?

The fear of poor-quality recommendations is why creating, curating, collecting, and cleaning the data is vital to a recommendation engine.

  • Content providers need to work with studios or content creators to get the right metadata for each movie.
  • They need to integrate video analytics systems into their infrastructure to obtain information about the users and their viewing patterns.
  • And, of course, build tools that can combine these disparate sources of information, clean them, and make them available in the right format to their ML/AI engines.
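
As a rough sketch of the last point, the snippet below joins watch logs to movie metadata and drops rows that cannot be used. The record layouts and the function name are hypothetical; real pipelines typically do this at scale with dedicated data tooling rather than plain Python dictionaries.

```python
def build_training_rows(movies, watch_logs):
    """Join watch logs to movie metadata, dropping rows that cannot be used."""
    # Keep only movies that actually have a genre recorded.
    by_id = {m["movie_id"]: m for m in movies if m.get("genre")}
    rows = []
    for log in watch_logs:
        movie = by_id.get(log.get("movie_id"))
        minutes = log.get("minutes_watched")
        if movie is None or minutes is None or minutes < 0:
            continue  # unknown movie, or missing/invalid watch time
        rows.append({
            "user_id": log["user_id"],
            "movie_id": movie["movie_id"],
            "genre": movie["genre"].strip().lower(),  # normalize the label
            "minutes_watched": minutes,
        })
    return rows


# Hypothetical inputs: one clean movie, one with missing metadata.
movies = [{"movie_id": 1, "genre": " Action "}, {"movie_id": 2, "genre": ""}]
logs = [
    {"user_id": "u1", "movie_id": 1, "minutes_watched": 42},
    {"user_id": "u1", "movie_id": 2, "minutes_watched": 30},
    {"user_id": "u2", "movie_id": 1, "minutes_watched": None},
    {"user_id": "u2", "movie_id": 3, "minutes_watched": 5},
]
rows = build_training_rows(movies, logs)
print(rows)
```

Only the first log survives here: the others point at a movie without a genre, lack a watch time, or reference a movie that is not in the catalog.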

Beyond this, the engineering teams need to ensure that the recommendations are available via scalable APIs, that the backend can handle the incoming load, and that the system is designed so that it can be consumed by the video platforms (desktop, Android, iOS, Roku, etc.) and by other departments such as marketing and ad-ops.

A combination of information about the movies and the user’s watching patterns can be powerful in making recommendations to drive engagement and discovery.

Okay, now, here’s a question for you.

How do you make a recommendation when you have a lot of information about the movies and nothing about the user? What do you do in this situation? Let’s take a look at this next.

The “Cold-Start” Problem in Content Recommendation

Recommendation engines are typically very good at recommending content to users who’ve been on the platform for a while and for whom they have a lot of data.

But what happens when a new user signs up for the first time? The platform doesn’t have any information about the user or their preferences, so it is quite difficult to recommend content right off the bat.

This is called the “cold start problem” in recommendation engines. How and what do you recommend to a user you know nothing about?

One way to circumvent this problem is to use the user’s IP to geo-locate them and serve popular content in that geography. Or, if your platform collects information about their gender, age, language preferences at the sign-up stage, you can use that to make a general recommendation and learn as the user engages with the platform.

This is something similar to what Netflix does. They have a “Top 10 in India Today” banner, and without too much complexity, this could be a good solution for the cold-start problem.

Netflix’s “Top 10 in India Today” [Credit: Netflix]

But, of course, simply throwing up location-based recommendations is not a foolproof solution for a country like India, where every state has one or more different languages (both spoken & written). So, for example, you could end up showing Tamil movies to a Kashmiri who happens to be passing through Chennai (where Tamil is the predominantly spoken language).

One simple way to avoid the cold-start is to show a menu to the user upon sign-up and ask them to choose their favorite language and genre. Then, you can use that information to show a few movies and improve upon the recommendations based on how they continue to interact with the platform.
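
The two fallbacks discussed in this section (popular content in the user’s region, and preferences picked at sign-up) can be combined in a small sketch like the one below. The catalog layout, the view counts, and the function name are invented for illustration; the ordering rule (language match first, then genre match, then regional popularity) is just one possible design choice.

```python
def cold_start_recommendations(catalog, region=None, language=None, genres=(), limit=3):
    """Rank titles for a new user from sign-up preferences and regional popularity."""
    def score(movie):
        matches_language = language is not None and movie["language"] == language
        matches_genre = bool(set(genres) & set(movie["genres"]))
        popularity = movie["views_by_region"].get(region, 0)
        # Tuples compare element by element, so stated preferences outrank popularity.
        return (matches_language, matches_genre, popularity)

    ranked = sorted(catalog, key=score, reverse=True)
    return [movie["title"] for movie in ranked[:limit]]


# Hypothetical catalog with made-up view counts.
catalog = [
    {"title": "Tamil Hit", "language": "Tamil", "genres": ["drama"],
     "views_by_region": {"Chennai": 900}},
    {"title": "Hindi Action", "language": "Hindi", "genres": ["action"],
     "views_by_region": {"Chennai": 200}},
    {"title": "Hindi Drama", "language": "Hindi", "genres": ["drama"],
     "views_by_region": {"Chennai": 300}},
]

# No preferences known: fall back to what is popular in the region.
by_region = cold_start_recommendations(catalog, region="Chennai")

# Preferences chosen at sign-up take priority over location.
by_preference = cold_start_recommendations(
    catalog, region="Chennai", language="Hindi", genres=("action",)
)
print(by_region, by_preference)
```

With no preferences, the traveler in Chennai sees the regional favorite first; once a language and genre are chosen, matching titles move to the top regardless of location.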

Have you encountered a cold-start problem in your streaming service? How did you solve it? Or, was it not a problem, to begin with?

Feedback System – Monitoring the Recommendations

While designing a recommendation system, it is also important to gather data about the quality of your recommendations. E.g., if you suggest three movies to a user, does the user choose any one of those? And if yes, does the user watch the movie for more than X minutes, or does the user exit within the first couple of minutes?

The Click-Through-Rate (CTR) is a powerful indicator of how good the recommendations are and should be fed back to the AI/ML system as a correction factor.
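
As a minimal sketch of these two checks, the snippet below computes the CTR (clicks divided by recommendations shown) and the share of clicks that led to a watch longer than X minutes. The record fields, the second metric’s name, and the 10-minute threshold are assumptions for illustration.

```python
def recommendation_metrics(impressions, min_watch_minutes=10):
    """Compute CTR and the share of clicks that turned into a real watch."""
    shown = len(impressions)
    clicked = [i for i in impressions if i["clicked"]]
    qualified = [i for i in clicked if i["minutes_watched"] >= min_watch_minutes]
    return {
        # CTR = recommendations clicked / recommendations shown
        "ctr": len(clicked) / shown if shown else 0.0,
        # Of the clicked recommendations, how many were watched long enough?
        "qualified_watch_rate": len(qualified) / len(clicked) if clicked else 0.0,
    }


# Hypothetical log of four recommendations shown to users.
impressions = [
    {"movie_id": 1, "clicked": True, "minutes_watched": 50},
    {"movie_id": 2, "clicked": True, "minutes_watched": 2},
    {"movie_id": 3, "clicked": False, "minutes_watched": 0},
    {"movie_id": 4, "clicked": False, "minutes_watched": 0},
]
metrics = recommendation_metrics(impressions)
print(metrics)
```

Tracking both numbers helps separate recommendations that merely attract a click from those that the user actually goes on to watch.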

Use Cases for Video Content Recommendation Engines

Recommendation Engines are vital to a video platform’s success and help improve content discovery, engagement, marketing campaigns, re-targeting dormant users, reducing churn, etc. Let’s look at a few of these use cases –

Increased Content Consumption

Contrary to popular belief, people don’t always know what to watch next. Instead, they depend on recommendations from friends, social media, movie reviews, etc. A good recommendation engine can gently nudge them into watching movies that they wouldn’t have considered before! If done correctly, this can increase content consumption and drive revenues via rentals, subscriptions, or advertising.

Improved Search and Auto-Completion

A platform’s search engine can also be configured to throw up suggestions based on the user’s preferences. For example, if the user types the word “The,” what would you suggest or auto-complete with?

  • “The Lion King” (cartoon/kids genre), or
  • “The Guns of Navarone” (action)

Such a system should be designed intelligently to ensure that the auto-completion doesn’t take a lot of time because that could lead to further frustration.
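
One way to picture this is the sketch below, which filters titles by prefix and orders the matches by how much the user likes each genre. The `genre_affinity` scores and the function name are hypothetical, and the genre labels follow the example above.

```python
def autocomplete(prefix, titles, genre_affinity, limit=5):
    """Suggest titles starting with `prefix`, favoring genres the user prefers."""
    prefix = prefix.casefold()
    matches = [t for t in titles if t["title"].casefold().startswith(prefix)]
    # Highest affinity first; alphabetical order breaks ties.
    matches.sort(key=lambda t: (-genre_affinity.get(t["genre"], 0.0), t["title"]))
    return [t["title"] for t in matches[:limit]]


titles = [
    {"title": "The Lion King", "genre": "kids"},
    {"title": "The Guns of Navarone", "genre": "action"},
    {"title": "Interstellar", "genre": "sci-fi"},
]

# Hypothetical affinity scores learned from two users' viewing histories.
action_fan = autocomplete("The", titles, {"action": 0.9, "kids": 0.1})
kids_profile = autocomplete("the", titles, {"kids": 0.8})
print(action_fan, kids_profile)
```

This version scans every title, which is fine for a toy list; for a large catalog, a prefix index such as a trie or a sorted list would likely be needed to keep suggestions fast.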

Improved Catalog Discovery

It’s no secret that users don’t always know what they want, and the system can subtly nudge them into watching movies that they wouldn’t have searched for in the first place. This is the “content discovery” process, and by knowing a user’s profile, a content platform can suggest movies and guide the user to discover more of the content library/catalog.

A good example of this is Amazon Prime Video. If you search for the movie Interstellar, Amazon Prime also shows you movies similar to Interstellar based on different signals/features. For example, the suggestions “Inception” and “Tenet” are Christopher Nolan movies – same as Interstellar, our first search query. If done intelligently and tastefully, content recommendations can help your users explore and engage with a large section of your catalog.

Related Movies as seen on Amazon Prime Video.
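
A simple way to illustrate “similar based on shared signals” is to compare the metadata tags of two movies using Jaccard similarity (shared tags divided by all tags). This is a sketch of one possible technique, not a description of how Amazon Prime Video works, and the genre tags below are illustrative assumptions.

```python
def jaccard(a, b):
    """Shared tags divided by all tags across both sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def similar_titles(query, catalog, limit=2):
    """Return the titles whose tags overlap most with the query title."""
    target = catalog[query]
    scored = [
        (jaccard(target, tags), title)
        for title, tags in catalog.items()
        if title != query
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Titles with no shared tags are not worth suggesting.
    return [title for score, title in scored[:limit] if score > 0]


# Illustrative tags only; a real catalog would carry far richer metadata.
catalog = {
    "Interstellar": {"director:Christopher Nolan", "genre:sci-fi"},
    "Inception": {"director:Christopher Nolan", "genre:sci-fi", "genre:thriller"},
    "Tenet": {"director:Christopher Nolan", "genre:action"},
    "The Lion King": {"genre:kids"},
}
related = similar_titles("Interstellar", catalog)
print(related)
```

Here the shared director is enough to surface both “Inception” and “Tenet” for an “Interstellar” query, while a title with no tags in common is left out.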

Ability to Intelligently Re-Target Users

Platforms can also deliver recommendations via push notifications, emails, and social media feeds to target and attract users across different channels. The ability to personalize emails and push notifications can be a game-changer in marketing campaigns and drive engagement.


I hope this introduction to content/movie recommendation engines was useful and gave you a glimpse into this interesting and important part of the OTT toolchain.

Before you leave, I have a question to ask you. Can you recommend movies based on what is inside the movie? Not the genre or the video metadata, but what actually happens inside the movie. It would be very useful for UGC platforms like TikTok, where metadata is scarce or poor.

Please subscribe to OTTVerse to get notified of future articles in this series, where we will be interviewing people from different OTT Platforms and Recommendation Engine providers.

Stay tuned, and happy streaming!

Krishna Rao Vijayanagar

Krishna Rao Vijayanagar, Ph.D., is the Editor-in-Chief of OTTVerse, a news portal covering tech and business news in the OTT industry.

With extensive experience in video encoding, streaming, analytics, monetization, end-to-end streaming, and more, Krishna has held multiple leadership roles in R&D, Engineering, and Product at companies such as Harmonic Inc., MediaMelon, and Airtel Digital. Krishna has published numerous articles and research papers and speaks at industry events to share his insights and perspectives on the fundamentals and the future of OTT streaming.
