What is Video Pre-processing in Encoders?

Video pre-processing is a key step in any commercial encoder, where operations such as de-interlacing, up/down-sampling, and denoising are performed. Though it is not a part of any video codec or video coding standard, it is worth understanding what happens in a pre-processor because of its impact on video compression efficiency.

In this article, let’s take a look at some important video pre-processing steps, shall we?

Interlaced to Progressive Conversion (De-Interlacing)

De-interlacing is a common scenario where the input is interlaced video and the output needs to be in progressive format.

Interlaced video was developed for formats such as NTSC and PAL. Each picture is made up of two separate fields that were captured at slightly different times: one field holds the odd-numbered lines and the other holds the even-numbered lines. The display shows the odd-numbered lines and then the even-numbered lines, and this happens so fast that it gives the impression of a complete image.

Figure: Interlaced image (left) and progressive image (right). Credit [1]

But, if you are given an interlaced video and asked to produce progressive output, you need to do some work. In this case, you’ll need to interleave the top and bottom fields from an interlaced movie, apply some cleanup-filtering to remove any distortions, and then send it to the encoding pipeline.

Or, you could simply duplicate the rows present in the field (also known as “bobbing”).
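To make the two options concrete, here is a minimal sketch in plain Python. It assumes a frame is a list of rows of pixel values and that the top field holds rows 0, 2, 4, ... (both are assumptions for illustration). Interleaving the two fields is commonly called "weaving"; the function names `weave` and `bob` are hypothetical, and no cleanup filtering is applied.

```python
def weave(top_field, bottom_field):
    """Interleave two fields into one full-height progressive frame."""
    frame = []
    for top_row, bottom_row in zip(top_field, bottom_field):
        frame.append(list(top_row))     # line from the top field
        frame.append(list(bottom_row))  # line from the bottom field
    return frame


def bob(field):
    """Line-double a single field so it fills a full-height frame."""
    frame = []
    for row in field:
        frame.append(list(row))
        frame.append(list(row))  # duplicate the row to fill the missing line
    return frame


# Tiny 4-line example: each field carries two lines of two pixels.
top = [[1, 1], [3, 3]]
bottom = [[2, 2], [4, 4]]
woven = weave(top, bottom)  # rows in the order 1, 2, 3, 4
bobbed = bob(top)           # rows in the order 1, 1, 3, 3
```

Weaving keeps full vertical detail but shows combing when there is motion between the two fields, while bobbing avoids combing at the cost of halving the vertical resolution.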

De-interlacing has been studied extensively for the past couple of decades (or more) and there are several good products and algorithms out there for you to choose from. Whichever algorithm you choose for de-interlacing, you are likely to see some combing artifacts, so it is worth investing in a good de-interlacer.

Image Resizing

This is another common pre-processing step in video encoders. For example, if your input video is 1920x1080p @ 60 fps, and you want the output to be 640x480p @ 60 fps, then you need to resize the frames before sending them to the codec pipeline.

Image resizing is exceedingly common in OTT compression workflows, where you have several different resolutions in your bitrate ladder.

Figure: Converting an input video frame to several different resolutions

How do you resize images though? The most naive way of image resizing is to simply throw away unwanted pixels or add new pixels during the resizing process, but this can lead to very annoying visual artifacts.

Modern encoders and video pre-processors use well-researched filters such as the bicubic, bilateral, trilateral, Gaussian, or Lanczos filters in the image resizing process.
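A small sketch shows why throwing pixels away is risky. It compares naive decimation with a simple box (block-average) filter, which stands in here for the more sophisticated filters named above; it is not one of them. The frame layout (a list of rows of luma values), the integer scale factor, and the function names are assumptions for illustration.

```python
def decimate(frame, factor):
    """Naive downscale: keep every factor-th pixel and discard the rest."""
    return [row[::factor] for row in frame[::factor]]


def box_downsample(frame, factor):
    """Downscale by averaging each factor x factor block of pixels."""
    height = len(frame) - len(frame) % factor    # ignore any partial block
    width = len(frame[0]) - len(frame[0]) % factor
    out = []
    for y in range(0, height, factor):
        out_row = []
        for x in range(0, width, factor):
            block = [frame[y + dy][x + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# A 4x4 checkerboard of black (0) and white (255) pixels.
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
naive = decimate(checker, 2)           # only black pixels survive
averaged = box_downsample(checker, 2)  # every block averages to mid-grey
```

Decimation turns the checkerboard into a solid black image, which is exactly the kind of aliasing artifact that filtering before subsampling is meant to avoid.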

Frame-Rate Conversion

Let’s assume your input video is 1920x1080 pixels at 60 fps and you want a 30 fps output. In that case, you’ll have to use an algorithm to convert the frame rate as requested.

Frame-rate conversion works both ways – you might need to discard every nth frame if you are going from a higher frame-rate to a lower one, or you might have to add frames if you want to go from a lower frame rate to a higher one.

When you increase the frame rate by either frame-stuffing or frame-doubling, you need to take a lot of care not to introduce video artifacts, so that the video looks normal and not cartoonish. Frame-rate conversion is a rich and wonderful area of research, actually!
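The simplest form of frame-rate conversion can be sketched in a few lines: for every output frame, pick the source frame that is on screen at that instant. This drops frames when going down in frame rate and repeats frames when going up. The function name and the assumption of integer frame rates are mine, and real converters usually go much further (for example, motion-compensated interpolation).

```python
def convert_frame_rate(frames, src_fps, dst_fps):
    """Resample a frame sequence by dropping or repeating frames.

    Output frame i is shown at time i / dst_fps, so it maps to the
    source frame with index floor(i * src_fps / dst_fps).
    """
    n_out = len(frames) * dst_fps // src_fps  # keep the same duration
    return [frames[i * src_fps // dst_fps] for i in range(n_out)]


# 60 fps -> 30 fps: every second frame is discarded.
halved = convert_frame_rate([0, 1, 2, 3, 4, 5], 60, 30)

# 30 fps -> 60 fps: every frame is shown twice (frame-doubling).
doubled = convert_frame_rate(["a", "b", "c"], 30, 60)
```

Here `halved` keeps frames 0, 2, and 4, and `doubled` repeats each of the three frames once. Repeating frames is cheap, but with non-integer ratios the uneven repeat pattern can show up as judder.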

Noise Removal

It is common for encoders to have their own proprietary noise removal algorithms to clean up the video before compressing it. Generally, these noise removal processes result in softer images because of the Gaussian filters used, but this softening sometimes helps with compression efficiency.
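As a rough illustration of Gaussian smoothing, here is a sketch that applies a 3x3 binomial kernel (a common small approximation of a Gaussian) to a frame stored as a list of rows. The kernel size, the edge handling (border pixels are repeated), and the function name are assumptions; proprietary denoisers are considerably more elaborate.

```python
# 3x3 binomial approximation of a Gaussian; the weights sum to 16.
KERNEL = [[1, 2, 1],
          [2, 4, 2],
          [1, 2, 1]]


def gaussian_blur_3x3(frame):
    """Smooth a frame with the 3x3 kernel, repeating pixels at the borders."""
    height, width = len(frame), len(frame[0])
    out = []
    for y in range(height):
        out_row = []
        for x in range(width):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    # Clamp coordinates so border pixels reuse their neighbours.
                    sy = min(max(y + ky - 1, 0), height - 1)
                    sx = min(max(x + kx - 1, 0), width - 1)
                    acc += KERNEL[ky][kx] * frame[sy][sx]
            out_row.append(acc / 16)
        out.append(out_row)
    return out


# A single noisy pixel in an otherwise black 3x3 frame.
spike = [[0, 0, 0],
         [0, 16, 0],
         [0, 0, 0]]
smoothed = gaussian_blur_3x3(spike)
```

The isolated spike of 16 is spread over its neighbours and its peak drops to 4, which is the softening effect described above: noise is reduced, but so is fine detail.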

Note: In a future article, we’ll tackle two important concepts in video compression (Transform and Quantization) and the effect of filtering on compression efficiency will begin to make sense.

Scene Change Detection

For efficient video compression, it is important to know when the scene changes in the video you are trying to compress.

If you know what prediction is, you’ll realise that it is useless to predict or find commonalities between two very different images. It is like searching something common between a black image and a white image – you won’t find anything.

Hence, the need for detecting where the scene changes in a movie – so that you don’t try and predict across such a scene change.
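One very simple way to detect a scene change is to measure how different each frame is from the previous one and flag a cut when the difference jumps. The sketch below uses the mean absolute pixel difference; the metric, the threshold value of 40, and the function names are assumptions chosen for illustration, not the method of any particular encoder.

```python
def mean_abs_diff(frame_a, frame_b):
    """Average absolute difference between co-located pixels of two frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count


def detect_scene_changes(frames, threshold=40.0):
    """Return the indices of frames that appear to start a new scene."""
    return [i for i in range(1, len(frames))
            if mean_abs_diff(frames[i - 1], frames[i]) > threshold]


# Two nearly identical dark frames followed by two identical bright frames.
dark_1 = [[10, 12], [11, 10]]
dark_2 = [[11, 12], [10, 10]]
bright = [[200, 210], [205, 198]]
cuts = detect_scene_changes([dark_1, dark_2, bright, bright])
```

In this example the only flagged index is 2, the first bright frame. A fixed threshold like this is easily fooled by fast motion, flashes, or fades, which is part of why scene change detection leaves room for better algorithms.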

Note: If you haven’t understood this concept, don’t worry for now. After you get through the articles on Prediction and Motion Estimation (I, P, B pictures), everything will start making sense.

Conclusion

There are obviously more algorithms and functions that fit the video pre-processing bill, but I’ll stop here.

The reason I wanted to talk about video pre-processing is to show you how important pre-processing is and how much innovation can take place here.

Many people assume that it is only the codec that matters, but that is wrong.

Any one of you reading this article can come up with a superior scene change detection algorithm, or a noise removal filter, or a frame-rate converter and take the industry by storm by contributing it back to open-source codecs.

Credits

[1] De-interlacing picture taken from IBM

Krishna Rao Vijayanagar

Founder at OTTVerse

Krishna Rao Vijayanagar, Ph.D., is the Editor-in-Chief of OTTVerse, a news portal covering tech and business news in the OTT industry.

With extensive experience in video encoding, streaming, analytics, monetization, end-to-end streaming, and more, Krishna has held multiple leadership roles in R&D, Engineering, and Product at companies such as Harmonic Inc., MediaMelon, Airtel Digital, and Visionular Inc. Krishna has published numerous articles and research papers and speaks at industry events to share his insights and perspectives on the fundamentals and the future of OTT streaming.
