What is Per-Title Encoding?

Per-title encoding is the practice of tuning the ABR bitrate ladder for each movie based on its unique spatial and temporal complexity in order to save bitrate, storage space, and ABR transmission bandwidth. In other words, the aim of per-title encoding is to generate a different set of encoding or compression parameters for every movie based on its characteristics (slow-moving, sports, anime, cartoon content, etc.).

In this article, we shall learn about per-title encoding, the process involved, and see what benefits per-title encoding can provide to streaming providers.

What is Per-Title Encoding? Where Did It Start?

One of the early mentions of per-title encoding came from Netflix on their blog and in a follow-up IEEE publication titled “Complexity-based consistent-quality encoding in the cloud”. An interesting passage from this paper’s abstract reads as follows:

To produce the best quality video streams, the system needs to adapt the encoding to each piece of content, in an automated and scalable way. In this paper, we describe two algorithm optimizations for a distributed cloud-based encoding pipeline: (i) per-title complexity analysis for bitrate-resolution selection; and (ii) per-chunk bitrate control for consistent-quality encoding. These improvements result in a number of advantages over a simple “one-size-fits-all” encoding system, including more efficient bandwidth usage and more consistent video quality.

The sentence “To produce the best quality video streams, the system needs to adapt the encoding to each piece of content” sums it up perfectly.

The encoder needs to “understand” each piece of video content and adapt its settings and parameters so that the content is compressed at the best video quality possible.

What happens in Traditional ABR and Compression?

In traditional video streaming approaches using ABR streaming, the general idea is to create a bitrate ladder (or a set of profiles) and use the same for all the movies in the library. Read this article for a quick introduction to ABR video streaming.

For example, the bitrate ladder can have a 6 Mbps 1080p profile that is applied to all genres, be it anime, sports, or a talk show.

However, there is a problem with this approach and it has to do with the nature and complexity of every movie.

All movies are not the same visually, right?

Some have fast action scenes (sports, action genres) and some are more slow-moving (Shawshank Redemption). Some are animated with low-textural features (Simpsons) while others are highly detailed (Toy Story). So, every movie has its own DNA and characteristics that make it different from every other movie ever produced!

So, then why should every movie be compressed in the same way, using the same encoder settings, and using the same bitrate ladder for ABR video delivery?

For example, take a look at the three screenshots below from Simpsons, a soccer game, and the Park Joy test sequence. They all look different, right?

Screenshot from the Simpsons: Easy to compress!

Screenshot from a soccer game: Really hard to compress!

Screenshot from the Park Joy test sequence: Very hard to compress due to the water, grass, and leaves!

Now, these examples were subjective and relied on your capacity to figure out what looks better or worse. Let’s look at some experiments with numbers from Netflix’s blog. The rate-distortion (RD) plot below shows video quality (PSNR) against bitrate for different sequences encoded at different target bitrates.

Look at the diversity in this plot! At 5000 kbps, some sequences have incredibly high PSNR scores of 45 dB and more, while others are at 36 dB. This clearly shows that no two videos are equal and they should all be treated on their individual merit.

In more technical terms, one would say that there is a difference between the spatio-temporal complexities and natures of these videos, and it is a great idea to take advantage of this to compress the videos better.

Figure: Bitrate vs. PSNR for different titles. Source: Netflix’s blog
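
For reference, PSNR (the quality metric used in the plot) is derived from the mean squared error between the source frame and the decoded frame. The sketch below is a minimal illustration for 8-bit samples; the function name and the tiny 2x2 "frames" are made up for this example and are not taken from Netflix's pipeline.

```python
import math


def psnr(reference, decoded, max_value=255):
    """PSNR in dB between two equally sized 2-D frames (lists of rows)."""
    if len(reference) != len(decoded) or any(
        len(r) != len(d) for r, d in zip(reference, decoded)
    ):
        raise ValueError("frames must have the same dimensions")

    squared_error = 0
    count = 0
    for ref_row, dec_row in zip(reference, decoded):
        for r, d in zip(ref_row, dec_row):
            squared_error += (r - d) ** 2
            count += 1

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames: no distortion at all
    return 10 * math.log10(max_value ** 2 / mse)


# Illustrative data only: a flat source block and a slightly distorted decode.
reference = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 102]]
print(round(psnr(reference, decoded), 2))
```

A higher PSNR means the decoded frame is closer to the source, which is why an easy title reaches a high score at a bitrate where a hard title still scores much lower.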

Hence the name “per-title encoding”: encoding that varies or adapts from one video to another.

What Variables Can Be Changed on a Per-Title Basis?

There are many encoding and transmission parameters that can be varied on a per-title basis, such as:

The resolutions chosen in the bitrate ladder: some titles might already look great at 720p, and for such videos you might not have to produce a 1080p rendition to get higher quality. To learn more about bitrates and resolutions, please read this article on OTTVerse.com.
The bitrate chosen for each resolution: this is the most important part of per-title encoding. If you are forced to produce a fixed set of video resolutions (1080p, 720p, etc.), then you can vary the bitrate for each of these resolutions. That is, instead of producing 1080p at 6 Mbps, you might find yourself producing 1080p at 3 Mbps and achieving the same video quality!
The number of profiles in the bitrate ladder: this again is a big advantage of per-title encoding. By playing around with the bitrate-resolution combinations, you might be able to reduce the number of profiles that you need to produce in the bitrate ladder to begin with!
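
To make these three knobs concrete, here is a small sketch that builds a ladder for one title from trial-encode measurements. Everything in it is an assumption for illustration: the quality scores (on a 0 to 100 scale), the per-resolution targets, and the function name are hypothetical, and the selection rule is a simple heuristic rather than any vendor's actual method.

```python
def per_title_ladder(measurements, targets):
    """Pick the cheapest bitrate per resolution that meets its quality target.

    measurements: {height: [(kbps, quality), ...]} from trial encodes.
    targets: {height: minimum acceptable quality}.
    Returns [(height, kbps), ...] from the highest resolution down.
    """
    ladder = []
    for height in sorted(targets, reverse=True):
        passing = [
            kbps
            for kbps, quality in measurements.get(height, [])
            if quality >= targets[height]
        ]
        if not passing:
            continue  # this resolution never reaches its target for this title
        kbps = min(passing)
        if ladder and kbps >= ladder[-1][1]:
            # A higher resolution is already available for the same or fewer
            # bits, so this rung is redundant and the ladder gets shorter.
            continue
        ladder.append((height, kbps))
    return ladder


# Hypothetical measurements for an easy-to-compress animated title.
cartoon = {
    1080: [(1500, 90), (3000, 96), (6000, 99)],
    720: [(1000, 88), (2000, 93), (3000, 95)],
    480: [(500, 78), (1000, 84), (1500, 87)],
}
targets = {1080: 95, 720: 90, 480: 80}
print(per_title_ladder(cartoon, targets))
```

With these made-up numbers the 1080p rung lands at 3 Mbps instead of 6 Mbps, mirroring the example in the list above.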

These parameters operate at the level of the whole ladder. At a more granular level, you can go into the encoder’s settings and tweak:

the strength of the filters,
the GOP length,
whether sub-pel or quarter-pel motion estimation is enabled,
the search range for motion estimation,
the GOP structure (ratio of P to B-frames),

and much more, depending on how your video codec is set up. The prime focus here should be on understanding the complexity of your video and the capabilities of your video codec, and on combining all your data and video intelligence to compress your videos better.
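
As an illustration of tuning such low-level settings per title, the sketch below builds an ffmpeg command line for the libx264 encoder. The choice of ffmpeg and x264 is an assumption (the article does not name an encoder), the content categories and numeric values are placeholders rather than recommendations, and the function only assembles the command without running it. The x264 options used are keyint (GOP length), bframes (number of consecutive B-frames), merange (motion search range), and subme (sub-pixel motion estimation effort).

```python
# Hypothetical per-content tuning table; values are placeholders only.
PROFILES = {
    "animation": {"keyint": 120, "bframes": 5, "merange": 16, "subme": 7},
    "sports": {"keyint": 48, "bframes": 2, "merange": 32, "subme": 9},
}


def x264_command(source, output, content_type, kbps):
    """Return an ffmpeg argument list with per-title x264 settings."""
    settings = PROFILES[content_type]
    # ffmpeg passes this colon-separated key=value string straight to x264.
    params = ":".join(f"{key}={value}" for key, value in settings.items())
    return [
        "ffmpeg", "-i", source,
        "-c:v", "libx264",
        "-b:v", f"{kbps}k",
        "-x264-params", params,
        "-an",  # video only, to keep the example focused
        output,
    ]


print(" ".join(x264_command("simpsons.mp4", "simpsons_720p.mp4", "animation", 2000)))
```

In a real pipeline the table lookup would be replaced by settings derived from the complexity analysis of each title.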

How is Per-Title Encoding Implemented?

The most important aspect of per-title encoding is the ability to “understand” a movie’s complexity, its scenes, the variations in it, etc. The way to do this is to gather information and statistics about the movie and use that data to compress it.

This leads us to the concept of multi-pass encoding, where the first pass (or the first N passes) is used to gather information about the movie. In the final M passes, the video is encoded using this information.
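
The simplest instance of this idea (N = 1, M = 1) is ordinary two-pass encoding. The sketch below assembles the two ffmpeg commands for libx264; using ffmpeg is an assumption for illustration, and the file names and stats prefix are hypothetical. The first pass writes statistics to a log file and discards the video, and the second pass reads those statistics to distribute bits.

```python
import os


def two_pass_commands(source, output, kbps, stats_prefix="pertitle_stats"):
    """Return (first_pass, second_pass) ffmpeg argument lists."""
    common = [
        "ffmpeg", "-y", "-i", source,
        "-c:v", "libx264",
        "-b:v", f"{kbps}k",
        "-passlogfile", stats_prefix,
    ]
    # Pass 1: analysis only. Audio is dropped and the output is thrown away.
    first = common + ["-pass", "1", "-an", "-f", "null", os.devnull]
    # Pass 2: the real encode, guided by the statistics from pass 1.
    second = common + ["-pass", "2", output]
    return first, second


first, second = two_pass_commands("movie.mp4", "movie_1080p.mp4", 3000)
print(" ".join(first))
print(" ".join(second))
```

A per-title system typically gathers richer statistics than a stock first pass, but the split between analysis and final encode is the same.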

What sort of information is helpful in understanding the complexity of a movie? Let’s take a look.

1. Global speed or motion vectors: this tells us how fast the scenes are moving and can be used to distinguish between a talk show where nobody is moving and an NFL game that is filled with fast camera panning.
2. Spatial complexity: are the majority of the frames filled with plain-colored blocks like the Simpsons, or are they filled with complex patterns like period films?
3. Temporal complexity: this relates back to the global motion vectors and speed, and describes how quickly the movie’s content changes from one frame to another.
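
As a rough illustration of items 2 and 3, the sketch below computes two crude proxies on grayscale frames stored as lists of rows: the average difference between neighboring pixels (spatial) and the average difference between co-located pixels in consecutive frames (temporal). These are simplified stand-ins chosen for this example, not the metrics used by Netflix or by any particular encoder, and the sample frames are made up.

```python
def spatial_complexity(frame):
    """Mean absolute difference between horizontally/vertically adjacent pixels."""
    total = 0
    count = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if x + 1 < len(row):  # right-hand neighbor
                total += abs(value - row[x + 1])
                count += 1
            if y + 1 < len(frame):  # neighbor below
                total += abs(value - frame[y + 1][x])
                count += 1
    return total / count if count else 0.0


def temporal_complexity(frames):
    """Mean absolute difference between co-located pixels of consecutive frames."""
    total = 0
    count = 0
    for previous, current in zip(frames, frames[1:]):
        for prev_row, cur_row in zip(previous, current):
            for p, c in zip(prev_row, cur_row):
                total += abs(c - p)
                count += 1
    return total / count if count else 0.0


flat = [[50, 50], [50, 50]]        # plain-colored block
textured = [[0, 255], [255, 0]]    # checkerboard-like detail
print(spatial_complexity(flat), spatial_complexity(textured))
print(temporal_complexity([flat, flat]), temporal_complexity([flat, textured]))
```

Low values on both measures suggest content that compresses easily; high values suggest a title that needs more bits for the same quality.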

These are very important video characteristics that determine how well a video can be compressed given a certain bit budget. In simpler terms, if you know the “nature” of your video, you can tweak the encoding settings to get the best video quality when asked to compress that video to some X Mbps.

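So, after you gather all of this information, you run another encoding pass to compress the video to the bitrates chosen from its convex hull, that is, the upper envelope of the bitrate-quality curves measured across the candidate resolutions, as described in Netflix’s blog.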
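
A minimal sketch of finding that convex hull from trial encodes is shown below. It first drops points that cost more bits without improving quality, then keeps only the points on the upper (concave) envelope. The measurements are invented for illustration, the quality scale is arbitrary, and the function names are hypothetical; a production system would work with many more points and a real quality metric.

```python
def cross(o, a, b):
    """2-D cross product of vectors OA and OB in the (bitrate, quality) plane."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def rd_convex_hull(points):
    """points: iterable of (kbps, quality, height). Returns hull points by bitrate."""
    # Step 1: keep only points that improve quality as bitrate increases.
    efficient = []
    best_quality = float("-inf")
    for point in sorted(points, key=lambda p: (p[0], -p[1])):
        if point[1] > best_quality:
            efficient.append(point)
            best_quality = point[1]

    # Step 2: upper hull. Drop any point lying on or below the line that
    # joins its neighbors.
    hull = []
    for point in efficient:
        while len(hull) >= 2 and cross(hull[-2], hull[-1], point) >= 0:
            hull.pop()
        hull.append(point)
    return hull


# Hypothetical (kbps, quality, height) measurements from trial encodes.
trial_encodes = [
    (500, 70, 480), (1000, 80, 480), (2000, 84, 480),
    (1500, 82, 720), (2000, 88, 720), (4000, 92, 720),
    (2000, 85, 1080), (4000, 94, 1080), (6000, 96, 1080),
]
for kbps, quality, height in rd_convex_hull(trial_encodes):
    print(f"{kbps} kbps -> {height}p (quality {quality})")
```

Each point on the hull pairs a bitrate with the resolution that gives the best quality at that bitrate, which is exactly the information needed to build the title's ladder.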

Advantages of Per-Title Encoding

There are huge advantages to adopting per-title encoding, such as:

Storage Savings: by varying the bitrate and resolutions on a per-title basis, you can compress the video well and this will help you save on storage space.
Transmission Savings: since each title is encoded using a bitrate ladder that is most suited to it, you will immediately see savings on CDN delivery costs. Additionally, the end-user will also download smaller files and should see lower incidences of buffering and smaller start-up delays.
Encoding-time Savings: since the bitrate ladder is tuned for every movie individually, you can also save on encoding time. For example, if we encode the Simpsons clip at 720p instead of 1080p and get the same visual quality, the drop in resolution increases the encoder’s speed. This is primarily because the motion estimation and compensation algorithms have less work to do at the lower resolution.
Improvement in Quality: by fine-tuning the encoder, resolution, bitrate, frame-rate, and other settings on a per-movie or a per-title basis, you can extract the most out of an encoder and get the best video quality out of it. This in turn will translate to a great customer experience!

So, by switching to a per-title encoding scheme, you stand to save a lot on your storage, transmission, and encoding-time costs.
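
To get a feel for the scale of the storage and delivery savings, here is a back-of-the-envelope calculation. The two ladders and the two-hour duration are hypothetical numbers chosen for illustration, and the estimate ignores audio and container overhead.

```python
def stream_size_gb(kbps, duration_seconds):
    """Approximate size in GB (10**9 bytes) of a stream at an average bitrate."""
    return kbps * 1000 * duration_seconds / 8 / 1e9


def ladder_size_gb(ladder_kbps, duration_seconds):
    """Total storage for all renditions of one title."""
    return sum(stream_size_gb(kbps, duration_seconds) for kbps in ladder_kbps)


duration = 2 * 60 * 60                 # a two-hour movie
fixed_ladder = [6000, 3000, 1500]      # hypothetical one-size-fits-all ladder (kbps)
tuned_ladder = [3000, 2000, 1000]      # hypothetical per-title ladder (kbps)

fixed_gb = ladder_size_gb(fixed_ladder, duration)
tuned_gb = ladder_size_gb(tuned_ladder, duration)
savings = 1 - tuned_gb / fixed_gb
print(f"fixed: {fixed_gb:.2f} GB, per-title: {tuned_gb:.2f} GB, saved: {savings:.0%}")
```

The same ratio applies to every viewing session served from the CDN, which is where most of the cost reduction usually comes from.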

Until next time, take care and happy streaming!

Krishna Rao Vijayanagar

Founder at OTTVerse

Krishna Rao Vijayanagar, Ph.D., is the Editor-in-Chief of OTTVerse, a news portal covering tech and business news in the OTT industry.

With extensive experience in video encoding, streaming, analytics, monetization, end-to-end streaming, and more, Krishna has held multiple leadership roles in R&D, Engineering, and Product at companies such as Harmonic Inc., MediaMelon, and Airtel Digital. Krishna has published numerous articles and research papers and speaks at industry events to share his insights and perspectives on the fundamentals and the future of OTT streaming.