As you might have already seen on the Twitter-verse and in Bloomberg, Netflix released its top 10 movies and Extraction topped that list. The numbers in the data represent the number of views that a movie has gathered in the first 4 weeks of its release.
Looks great, right? 99M viewers for Extraction – that’s a huge win for Netflix.
But, wait. Can we trust these numbers right off the bat?
Netflix’s Data and Reporting Methodologies
The problem with Netflix is that they guard their data religiously and have never released their data in its entirety to the public. So, if they tell you that a movie was watched 100M times in Japan, you just have to take their word for it and break out a bottle of champagne if you were the producer or actor in that film.
When Netflix’s data says that “Extraction” starring “Thor” Chris Hemsworth and Bollywood star Randeep Hooda garnered 99 million views within the first 4 weeks of its release, does it mean that the movie was truly exceptional and beat every other release on Netflix in terms of audience numbers?
Well, to be honest, I really can’t say because Netflix’s data is opaque, their methodology is unknown, and their definition of a movie’s popularity using only their “starters” metric is odd to say the least.
Let me explain why.
How Does Netflix Quantify Its Audience?
The first step in decoding the data that Netflix released is understanding the definitions of their metrics.
Here is what Netflix shared with the United Kingdom’s House of Lords’ Communications Committee:
The information we give them mainly consists of “starters” (i.e. households that watch two minutes of a film or one episode) and “completers” (i.e. households that watch 90% of a film or season of a series) for the first seven and 28 days on Netflix. We believe that these two metrics will give our creative partners a broader understanding of how members engage with their title from start to finish. We also selectively share “watchers” (i.e. households that watch 70% of a film or single episode of a series) with both the public and with creators. Depending on how useful our partners find this data, we will consider sharing it in more countries outside Europe and North America.
Distilling this into the main classifications, we get:
- starters: households that watch two minutes of a film or one episode
- watchers: households that watch 70% of a film or single episode of a series
- completers: households that watch 90% of a film or season of a series
Only 2 mins? Which 2 minutes – from the beginning, in the middle, or from the point they left off the previous day?
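To make the three definitions concrete, here is a minimal sketch of how they could be applied to a single household's viewing of one film. The thresholds come from the quote above; the function name, the 120-minute runtime in the usage lines, and the idea of using total minutes watched are my own assumptions, not anything Netflix has published.

```python
# Thresholds taken from Netflix's statement to the House of Lords committee.
STARTER_MINUTES = 2.0       # "two minutes of a film"
WATCHER_FRACTION = 0.70     # "70% of a film"
COMPLETER_FRACTION = 0.90   # "90% of a film"


def classify_film_view(minutes_watched: float, runtime_minutes: float) -> list[str]:
    """Return every label a household earns for one film.

    Hypothetical helper: it assumes we know the household's total minutes
    watched, which is not something Netflix has confirmed it uses.
    """
    if runtime_minutes <= 0:
        raise ValueError("runtime_minutes must be positive")

    labels = []
    if minutes_watched >= STARTER_MINUTES:
        labels.append("starter")

    fraction = minutes_watched / runtime_minutes
    if fraction >= WATCHER_FRACTION:
        labels.append("watcher")
    if fraction >= COMPLETER_FRACTION:
        labels.append("completer")
    return labels


# Usage with an assumed 120-minute film.
print(classify_film_view(3, 120))     # bailed after 3 minutes
print(classify_film_view(90, 120))    # watched three quarters
print(classify_film_view(115, 120))   # skipped only the credits
```

Note that the household that bailed after 3 minutes still earns the "starter" label, which is the only label the Top 10 list relies on.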
In the Q4 2019 letter shared by Netflix to its shareholders, Netflix reveals their definition of a view and its impact on its data reports. I’ve copy-pasted the relevant section here and highlighted the important parts.
As we’ve expanded our original content, we’ve been working on how to best share content highlights that demonstrate popularity. Given that we now have titles with widely varying lengths – from short episodes (e.g. Special at around 15 minutes) to long films (e.g. The Highwaymen at 132 minutes), we believe that reporting households viewing a title based on 70% of a single episode of a series or of an entire film, which we have been doing, makes less sense. We are now reporting on households (accounts) that chose to watch a given title. Our new methodology is similar to the BBC iPlayer in their rankings based on “requests” for the title, “most popular” articles on the New York Times which include those who opened the articles, and YouTube view counts. This way, short and long titles are treated equally, leveling the playing field for all types of our content including interactive content, which has no fixed length. The new metric is about 35% higher on average than the prior metric. For example, 45m member households chose to watch Our Planet under the new metric vs. 33m under the prior metric.
Buried in a footnote on the same page in the Q4 2019 shareholder letter is an explanation of what “chose to watch a given title” means –
Chose to watch and did watch for at least 2 minutes – long enough to indicate the choice was intentional – is the precise definition
Netflix says that this method of reporting is similar to what the BBC and YouTube do, and so, they aren’t doing anything funny here.
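As a quick sanity check on the shareholder letter's own example, the Our Planet figures can be turned into a percentage uplift. This is just arithmetic on the two numbers quoted in the letter; the variable names are mine.

```python
# Figures quoted in Netflix's Q4 2019 shareholder letter for Our Planet.
new_metric_households = 45_000_000    # "chose to watch" (2-minute) metric
prior_metric_households = 33_000_000  # prior 70% metric

# Relative increase of the new metric over the prior one.
uplift = new_metric_households / prior_metric_households - 1

print(f"{uplift:.1%}")  # 36.4%
```

So for this one title the new metric inflates the count by roughly 36%, in line with the "about 35% higher on average" that Netflix reports.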
Well, summarizing everything till now, it appears that Netflix counts how many sessions lasted at least 2 minutes and this metric is used to quantify their audience.
The Issue with Reporting Based on “starters”
Portraying a movie’s popularity based solely on “starters” is where things begin to get weird for me.
Having spent a few years in the video analytics industry, I can say with absolute confidence that a large number of views does not always translate to popularity. There are caveats one has to consider and some amount of data cleansing and outlier removal is needed before drawing broad conclusions.
When I look at such viewership statistics, a few questions come to mind immediately.
- How long was each view (i.e., beyond the 2-minute minimum)?
- When did each view begin and end (i.e., start and end times)?
- How many views or sessions did a person take to completely watch a movie?
Note: “completely” generally means 90% of the playtime of a movie. This is considered an acceptable threshold in the industry because people generally skip the credits.
Now, let’s use a few examples to understand why these questions are important in making sense of an audience’s engagement.
Problem 1: If a person watches only 2 minutes, that should not count towards the movie’s popularity.
Example 1: Suppose a person began to watch Extraction, got past the 2-minute mark, found it utterly boring, and bailed on it after just 3 minutes. Do you count that as a view or a starter? Logically speaking, they did start the movie. But taking this number without context (the viewer dropped out after 3 minutes) and using it to justify a movie’s “popularity” is not right.
Example 2: Let’s assume that Netflix acquires the “Oceans” franchise and decides to produce the “Oceans 21” with 21 A-Listers from Hollywood. And by a cruel twist-of-fate, they produce a movie with the worst possible storyline and direction. However, what they do get right is marketing! Netflix pulls out all the stops and does a brilliant job at marketing the movie and building up excitement for Oceans 21.
On the day of release, Oceans 21 gets millions of hits because people are excited to see George Clooney and Brad Pitt crush it! But, what’s happening?
- 10 minutes into the movie, most of the viewers realize that the movie sucks and stop watching. But, they are added to the movie’s popularity count because they watched 2 mins, right?
- a section of Netflix’s subscribers who haven’t (yet) heard that the movie sucks end up watching at least 10 mins of the movie before bleeding out of their eyes. But, they too are added to the “watch” count because they watched 2 mins, right?
- and, the die-hard fans of the Oceans’ franchise will watch it no matter what the critics say and of course, they are going to be added to the “watch” count.
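The Oceans 21 scenario can be sketched with a handful of made-up households. Every number below is hypothetical (the movie is fictional, after all); the point is only to show how a starter count and a completion rate computed from the same data can tell very different stories.

```python
RUNTIME_MINUTES = 120  # assumed runtime for the fictional "Oceans 21"

# Hypothetical minutes watched, one entry per household.
# Most bail around the 10-minute mark; two die-hard fans finish.
minutes_watched = [3, 10, 10, 12, 10, 8, 118, 120, 1, 10]

# Netflix-style "starters": at least 2 minutes watched.
starters = sum(1 for m in minutes_watched if m >= 2)

# "Completers": at least 90% of the runtime watched.
completers = sum(1 for m in minutes_watched if m / RUNTIME_MINUTES >= 0.90)

# Share of starters who actually finished the movie.
completion_rate = completers / starters

print(f"starters={starters}, completers={completers}, "
      f"completion rate={completion_rate:.0%}")
```

In this toy data set, 9 of 10 households count as starters, yet only 2 of them finished the movie. The headline number looks great while the completion rate sits at about 22%.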
Problem 2: Multiple Viewing Sessions from the Same Subscriber Aren’t Considered
The definition of “starter” also does not consider multiple viewing sessions, and this is an issue. Let’s take a simple example to understand more.
Case A: I might have watched a single movie over a period of 10 days watching 10-15 mins each day, because I need to split my day between work, babysitting my toddlers, yard-work, etc.
Case B: Now consider another situation where I had the entire Saturday night to myself and watched an entire 1.5 hour movie in one sitting.
So, is the movie in Case A more popular than the one in Case B? It would be under Netflix’s algorithm, because A had 10 “starters” and B had only one.
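Here is a minimal sketch of Case A and Case B, assuming (as I do above) that every session of at least 2 minutes is counted. For contrast, it also counts distinct accounts per title, which is closer to the "households (accounts)" wording in the shareholder letter. The session log, account name, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical session log: (account, title, minutes_watched).
# Case A: ten short sessions over ten days. Case B: one sitting.
sessions = [("me", "Movie A", 12)] * 10 + [("me", "Movie B", 90)]


def count_qualifying_sessions(log, min_minutes=2):
    """Count every session that passes the 2-minute threshold."""
    counts = defaultdict(int)
    for _account, title, minutes in log:
        if minutes >= min_minutes:
            counts[title] += 1
    return dict(counts)


def count_distinct_accounts(log, min_minutes=2):
    """Count each account at most once per title."""
    accounts = defaultdict(set)
    for account, title, minutes in log:
        if minutes >= min_minutes:
            accounts[title].add(account)
    return {title: len(seen) for title, seen in accounts.items()}


print(count_qualifying_sessions(sessions))  # {'Movie A': 10, 'Movie B': 1}
print(count_distinct_accounts(sessions))    # {'Movie A': 1, 'Movie B': 1}
```

Which of the two counts Netflix actually reports makes a tenfold difference in this example, and that is exactly why the methodology needs to be public.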
Sounds wrong, doesn’t it? Okay, now let’s look at the problem from a different angle.
Does Netflix’s Data Translate to IMDB and Rotten Tomatoes Ratings?
Take a look at the IMDB and Rotten Tomatoes ratings for Netflix’s Top 10 list, because these ratings play a huge role in a movie’s continued popularity, especially after the marketing hype has died down.
We compiled the IMDB and Rotten Tomatoes ratings (as of July 16th 2020 at 6 am UTC) and here is what the data shows.
At first glance, you might think that IMDB appears to agree with Netflix’s popularity chart, but The Irishman throws a curveball, coming in at 7.9 on IMDB and 96% on Rotten Tomatoes while languishing at #6 on Netflix’s list. The same goes for The Platform, which has great ratings on both IMDB and Rotten Tomatoes but sits in 9th position in Netflix’s data.
Let’s look at this data from a different perspective.
Below is a table where we have ranked the movies based on the audience’s ratings/reviews on IMDB and Rotten Tomatoes. We have highlighted three movies in particular (Extraction, The Irishman, and The Platform) to show how differently they are ranked based on the data from Netflix, IMDB, and Rotten Tomatoes.
Quite different, eh?
The Platform’s ranking goes to show that a movie might not have the same number of views as more “popular” movies, but, can be highly rated by the audience. So, does Netflix’s methodology have flaws?
What Data from Netflix Will Help?
- Viewing Trends for 4 weeks: Instead of providing a single, aggregated number, it would be great if Netflix released a trendline of the number of views. This would help us understand the influence of the first few days of release on the overall aggregated figure and how a movie fares after the initial hype has died down.
- Number of Sessions Required to achieve a “Complete” Watch: How many sessions does the average user take to watch a movie completely? If the average viewer takes 3 attempts to watch a movie fully, and if the movie has 30 million views, it is probably safe to assume that the movie wasn’t watched fully 30 million times. Right?
- Session Durations as a function of the length of the Content: preferably represented as a histogram.
- Internal Ratings: Can Netflix tell us how many people pressed the thumbs-up/down icons, to give a sense of their audience’s ratings?
- Re-watch metrics: Though not a vital measure of popularity, it will be really cool to see how many people watch a movie multiple times and the gap between those sessions.
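Two of the items in this wish list are easy to sketch. The first function works through the 30-million-views example, and the second builds a simple histogram of session durations as a share of the runtime. Both are illustrative helpers of my own, and the session data is made up; nothing here reflects how Netflix computes its numbers.

```python
from collections import Counter


def estimated_full_watches(total_views: int, avg_sessions_per_full_watch: float) -> float:
    """Upper-bound estimate of complete watches.

    Assumes every counted view is a session and that every session
    belongs to someone who eventually finished the movie.
    """
    return total_views / avg_sessions_per_full_watch


def duration_histogram(session_minutes, runtime_minutes, bucket_pct=25):
    """Count sessions by the share of the runtime they covered."""
    buckets = Counter()
    for minutes in session_minutes:
        pct = min(100.0, 100.0 * minutes / runtime_minutes)
        # Fold a full 100% session into the top bucket.
        lower = min(int(pct // bucket_pct) * bucket_pct, 100 - bucket_pct)
        buckets[lower] += 1
    return dict(sorted(buckets.items()))


# 30 million views at 3 sessions per full watch.
print(estimated_full_watches(30_000_000, 3))  # 10000000.0

# Hypothetical session lengths (minutes) for an assumed 120-minute movie.
print(duration_histogram([3, 10, 60, 118, 120], 120))  # {0: 2, 50: 1, 75: 2}
```

In the histogram, each key is the lower edge of a bucket, so `0` means sessions that covered less than 25% of the movie. A pile-up in that first bucket would be a strong hint that the starter count is overstating real engagement.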
I truly believe that Netflix has a great lineup of movies and some of them are insanely good. But when it comes to declaring a movie as a blockbuster, I don’t think measuring the number of people who watch at least 2 mins is a good metric.
I think the people at Observer sum it up perfectly when they say this about Netflix’s ratings and data.
It’s like a high school teacher giving you a B- while your friend gets a smiley face.