In this article, we compare the quality (objective & subjective) and speed of LCEVC with H.264/AVC as its base-codec vs. H.264/AVC using FFmpeg. Let’s take a look at the experiments and the results, shall we?
The LCEVC Codec (MPEG-5 Part 2) or “Low Complexity Enhancement Video Coding” is one of the three new codecs being introduced by MPEG (others being VVC and EVC). LCEVC aims at increasing compression efficiency for existing codecs at little to no increase in coding complexity by using a base bitstream and an enhancement bitstream.
As described in our comprehensive guide to LCEVC, the LCEVC codec (Low Complexity Enhancement Video Coding) is “a codec to improve other codecs” with a low complexity-overhead. The LCEVC codec’s output is a combination of a “base bitstream” produced by a video codec such as AVC, HEVC, VP9, AV1, etc. along with an enhancement layer that can be used to improve the quality of the video.
If the decoder/end-device supports LCEVC, the enhancement layers are decoded, else, the base codec alone is used to decode the bitstream and the video is rendered to the user. This fallback mechanism ensures backward-compatibility and encourages roll-out of the LCEVC codec without the fear of breaking the user’s experience.
Background to this article
After writing an article on the LCEVC standard, I wanted to test it and see the results for myself. I got in touch with V-Nova (Fabio Murra, Anthony Concannon) and asked whether I could get my hands on some test software to run experiments. They agreed, and within a week, I was encoding with V-Nova’s LCEVC encoder and analyzing the results.
Before we proceed with the codec evaluation, I’d like to highlight a few points.
- Encoders are complex and come with several tuning parameters designed to help you compress video to your liking and requirements. Consequently, every codec comparison has its own idiosyncrasies.
- I am sure that once the LCEVC standard is released and encoder implementations enter the market, you’ll see more codec comparisons like the one presented here with different sequences, tuning parameters, presets, etc. That is the nature of codec comparisons.
- I used open-source video test sequences so that others can reproduce the results I got. You can download them from Xiph.org.
My goal here is to showcase the difference between the two codecs and give you an understanding of what gains are there to be had if you use LCEVC.
With that short introduction, let’s get down to business and see how LCEVC fares in comparison to H.264/AVC.
Test Video Sequences
I chose two popular test videos to test LCEVC. They are –
1. Park Joy, 50 fps, 1080p
ParkJoy is a favorite among video compression teams, and it is interesting because of how the camera follows the actors. The camera movement gives an illusion of the people running in place with the background moving around them. ParkJoy is also characterized by highly textured grass, water, and trees that appear suddenly and close to the camera. All these elements make it a problematic sequence to compress, making the ParkJoy sequence interesting. You can download the original y4m file here.
2. CrowdRun, 50 fps, 1080p
The CrowdRun sequence (download from xiph) is also popular because it has so many different elements that are hard to compress. There are sections of grass with fine details, a whole lot of people running in one direction (with clearly visible facial expressions), a tree right in the middle of the scene, and a lot of texture in the background (trees, clouds, etc.). Everything that’s needed to trouble an encoder is here!
V-Nova provided me with an FFmpeg build (4.3.1) with LCEVC enabled. It hasn’t been released to the public yet, but if you are interested in getting your hands on an LCEVC encoder and decoder, please reach out to the V-Nova team, and they’ll help you out.
Note: you need both an LCEVC-enabled encoder and decoder to test LCEVC, which is what V-Nova gave me (ffplay with LCEVC enabled).
As with all codec analysis, I ran the chosen sequences through the LCEVC encoder with H.264/AVC (libx264) as its base-codec and then ran the same sequence through H.264/AVC (libx264) using a range of bitrates in CBR mode.
The inputs are 1080p @ 50 fps and the same is retained for the output (1080p50).
I disabled tuning (psnr, etc.) and chose a 1 sec GOP size for the experiments.
- I computed the PSNR and VMAF values at different bitrates and used them to evaluate the codecs objectively.
- I did side-by-side visual comparisons to judge which codec did better. There are screenshots pasted below to show you what I saw.
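For readers who have not computed PSNR by hand, here is a minimal sketch of the metric for 8-bit samples. It is not the tooling used to produce the plots in this article; the function name and the tiny sample lists are hypothetical and only illustrate the formula PSNR = 10 * log10(peak^2 / MSE).

```python
import math


def psnr(reference, distorted, peak=255):
    """PSNR in dB between two equally sized sequences of sample values.

    `reference` and `distorted` are flat lists of pixel values (for example
    the luma plane of one frame). `peak` is 255 for 8-bit video.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error between the original and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 "frame": every decoded sample is off by exactly 1.
original = [100, 120, 140, 160]
decoded = [101, 121, 141, 161]
print(f"{psnr(original, decoded):.2f} dB")
```

In practice the per-frame values are averaged over the whole sequence, which is what gets plotted against bitrate in an RD-plot.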
Ok, let’s look at the tests and results now.
ParkJoy: LCEVC vs. AVC
Expt 1. CBR using an IPPP Structure (no B-pictures)
For the first set of experiments, I chose a simple “IPPP” GOP structure without any B-pictures. With this fixed GOP structure, I was able to take any scene-change detection, or dynamic-GOP-length (or mini-GOP) decision algorithms out of the picture, and keep the focus purely on the way the two codecs are designed.
Here is an example of encoding with LCEVC using FFmpeg. You need to specify a set of parameters called the eil_params that are then transmitted to the base codec, which is H.264/AVC (libx264) in this case.
For example, in the command line below, I instruct the base codec to disable B-pictures, use CBR encoding, disable scenecut detection, and use the veryslow preset. The threads 1 option is there to ensure repeatability between H.264/AVC encodes (LCEVC is deterministic).
Note: For the speed comparisons, the threads 1 option is not used.
ffmpeg.exe -i park_joy_1080p50.y4m -c:v lcevc_h264 -base_encoder x264 -threads 1 -r 50 -g 50 -b:v 1200k -eil_params "rc_pcrf_base_rc_mode=cbr;bframes=0;preset=veryslow;rc_pcrf_ipp_mode=1;scenecut=0" parkjoy_1080p50_1200k_lcevc_ipp.mp4
The equivalent AVC command line is –
ffmpeg.exe -i park_joy_1080p50.y4m -c:v libx264 -threads 1 -r 50 -g 50 -b:v 1200k -bufsize 1200k -maxrate 1200k -sc_threshold 0 -preset veryslow -bf 0 parkjoy_1080p50_1200k_avc_ipp.mp4
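Since the same encode has to be repeated at several bitrates, it can help to generate the command lines rather than type them. The sketch below only builds and prints the libx264 command shown above for a list of bitrates; it does not run anything. The helper name, the output-naming scheme, and the bitrate list are assumptions for illustration (only 1200k and 3600k appear in this article), while the FFmpeg options are copied from the command above.

```python
def build_avc_cbr_command(source, bitrate_kbps, fps=50, gop_seconds=1,
                          use_b_frames=False, ffmpeg_bin="ffmpeg.exe"):
    """Return the libx264 CBR command from the article as an argument list."""
    rate = f"{bitrate_kbps}k"
    gop = str(fps * gop_seconds)  # a 1 sec GOP at 50 fps is 50 frames
    stem = source.rsplit(".", 1)[0]
    suffix = "avc" if use_b_frames else "avc_ipp"
    cmd = [
        ffmpeg_bin, "-i", source, "-c:v", "libx264", "-threads", "1",
        "-r", str(fps), "-g", gop,
        # CBR: target bitrate, buffer size, and max rate all set to the same value.
        "-b:v", rate, "-bufsize", rate, "-maxrate", rate,
        "-sc_threshold", "0", "-preset", "veryslow",
    ]
    if not use_b_frames:
        cmd += ["-bf", "0"]  # IPPP structure: no B-pictures
    cmd.append(f"{stem}_{rate}_{suffix}.mp4")  # hypothetical naming scheme
    return cmd


# Hypothetical bitrate ladder in kbps.
for kbps in [1200, 2400, 3600, 4800]:
    print(" ".join(build_avc_cbr_command("park_joy_1080p50.y4m", kbps)))
```

The argument list can be handed to subprocess.run as-is if you want the script to launch the encodes as well.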
After repeating the encodes over a range of bitrates, I plotted the PSNR and VMAF values and here are the plots.
The RD-Plot above shows a substantial gain for LCEVC vs. vanilla H.264/AVC. At very high bitrates, the PSNR values converge, but, at lower bitrates, the gap is substantial.
If we compute the BD-Rate using the PSNR-Bitrate data, we get a value of -28.07%, which is huge. It means that, in these tests, LCEVC delivered a 28% average bitrate saving over H.264/AVC at equivalent video quality (measured using PSNR).
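If you have not seen how a BD-Rate figure is derived, the sketch below follows the classic Bjøntegaard approach under a few assumptions: exactly four (bitrate, PSNR) points per codec, a cubic through the log-bitrates as a function of PSNR, and an average of the gap between the two curves over the overlapping PSNR range. It is not the script used for the numbers in this article, and the data points are hypothetical, constructed so that the test codec needs exactly 0.8x the bitrate at every PSNR.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate the polynomial through the points (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _mean_log_rate(points, lo, hi):
    """Average of the fitted log-bitrate curve over the PSNR range [lo, hi]."""
    psnrs = [p for _, p in points]
    log_rates = [math.log(r) for r, _ in points]

    def curve(x):
        return _lagrange(psnrs, log_rates, x)

    # Simpson's rule is exact for cubics, so this integral has no approximation error.
    integral = (hi - lo) / 6.0 * (curve(lo) + 4.0 * curve((lo + hi) / 2.0) + curve(hi))
    return integral / (hi - lo)


def bd_rate(anchor, test):
    """Average bitrate difference (%) of `test` vs `anchor` at equal PSNR.

    Each argument is a list of four (bitrate_kbps, psnr_db) tuples with
    distinct PSNR values. Negative results mean `test` needs less bitrate.
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four points per codec")
    lo = max(min(p for _, p in anchor), min(p for _, p in test))
    hi = min(max(p for _, p in anchor), max(p for _, p in test))
    if hi <= lo:
        raise ValueError("the PSNR ranges of the two curves do not overlap")
    diff = _mean_log_rate(test, lo, hi) - _mean_log_rate(anchor, lo, hi)
    return (math.exp(diff) - 1.0) * 100.0


# Hypothetical data: the test codec reaches the same PSNR at 0.8x the bitrate.
avc = [(1200, 30.0), (2400, 33.0), (3600, 35.0), (4800, 36.5)]
lcevc = [(rate * 0.8, quality) for rate, quality in avc]
print(f"BD-Rate: {bd_rate(avc, lcevc):.2f}%")  # prints "BD-Rate: -20.00%"
```

Working in the log-bitrate domain is what turns the area between the two curves into an average percentage difference rather than an average kbps difference.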
Here are the VMAF results plotted against the bitrate values. Again, as you can see, there is an immediate benefit to using LCEVC over AVC alone.
Expt 2. CBR using B-pictures
This command line is similar to the one used for the IPP mode, except that B-frames are enabled. The encoder is free to decide how many B-pictures to use and where to place them.
The LCEVC command line is shown below.
ffmpeg.exe -i park_joy_1080p50.y4m -c:v lcevc_h264 -base_encoder x264 -threads 1 -r 50 -g 50 -b:v 1200k -eil_params "rc_pcrf_base_rc_mode=cbr;preset=veryslow;scenecut=0" parkjoy_1080p50_1200k_lcevc.mp4
The H.264/AVC equivalent command used is –
ffmpeg.exe -i park_joy_1080p50.y4m -c:v libx264 -threads 1 -r 50 -g 50 -b:v 1200k -bufsize 1200k -maxrate 1200k -preset veryslow -sc_threshold 0 parkjoy_1080p50_1200k_avc.mp4
Here are the PSNR and VMAF values plotted against the bitrates.
Again, as you can see, there is an immediate benefit to using LCEVC over H.264/AVC alone. The BD-Rate value of -20% indicates a 20% bitrate saving for LCEVC over AVC at equivalent video quality.
Now, let’s switch over to the CrowdRun sequence and repeat these tests.
CrowdRun – LCEVC vs AVC
I then repeated the experiments using the CrowdRun sequence, 1080p50 with the output being 1080p50.
Expt 1. CBR using an IPPP Structure (no B-pictures)
Here are the results for the CrowdRun sequence with only P-pictures.
Similar to what we saw in the ParkJoy experiments, LCEVC delivers a substantial gain in terms of video quality at a given bitrate. This is backed up by the PSNR and VMAF data.
Expt 2. CBR using B-pictures
Here are the results for the CrowdRun sequence after enabling B-pictures.
Again, we can see that the difference is quite obvious and that underscores the benefit of adopting the LCEVC coding standard.
The PSNR vs Bitrate plot translates to a 24.53% bitrate savings (using BD-Rate calculations) for LCEVC over AVC at equivalent video quality.
Whatever the objective metrics say, I believe that visual comparison is important and has to be performed when you are comparing codecs. You cannot rely only on the objective metrics to derive your conclusions.
So, let’s take a look at screenshots from the ParkJoy and CrowdRun experiments. I took care to ensure that the same picture-types are being compared.
The images below are from the ParkJoy sequence encoded at 1080p50, 3600 kbps. The image on the left is from the LCEVC bitstream and the image on the right is from the AVC bitstream.
If you click on the images above and switch back and forth, it’s quite easy to see the difference in the quality between LCEVC and AVC especially on the grass bank, and the bark of the tree. If you look closely, you can see that there is a clear difference in the edges around the people running on the bank of the canal.
To get a better look, I cropped the images and juxtaposed them. The differences should be quite clear now (click on the image to open it in full-screen mode).
Here is another set of images from around the 4th second of the ParkJoy sequence encoded at 3600 kbps and the difference is stark!
And here is the same set of images juxtaposed. Interesting, isn’t it?
Let’s move on to CrowdRun.
Below are a couple of screengrabs of the same frame from an LCEVC bitstream and an AVC bitstream (both encoded at 3600 kbps). If you look at the tree, the background, the grass, and the people (literally, everywhere!), you’ll see much better preservation of texture in LCEVC than AVC.
And here is the same set of images juxtaposed to highlight the compression artifacts (especially on the tree).
Using PSNR, VMAF tuning
A lot of encoders today come with a PSNR or VMAF tuning option. So does V-Nova’s LCEVC encoder and libx264.
As I said at the outset, tuning an encoder is both an art and a science, and most encoders today come with a battery of tools to help you fine-tune your encoder settings. You might choose to tune for PSNR, VMAF, SSIM, Visual Quality, etc., and all are perfectly valid choices! In fact, certain coding tools result in a drop in PSNR but an increase in perceived visual quality (softening, for example).
I’ll update this article with some results from psnr tuning in a couple of days. In the meantime, you can check out Jan Ozer’s article comparing LCEVC and AVC where he uses vmaf tuning presets.
Okay! Where are we now?
We’ve demonstrated better performance (up to a 28% bitrate saving by PSNR BD-Rate) using PSNR, VMAF, and subjective viewing. But one of the stated objectives of LCEVC is to deliver these results at the same (or lower) computational complexity.
Let’s test this hypothesis next.
Due to the way LCEVC is designed, you are literally guaranteed a speed-up – whichever base-codec you choose to use. Why do I say this?
It’s actually very simple. Anyone familiar with video compression will realize that it takes more time to compress a stream to 1080p than it takes to compress to 540p. Right?
The number of pixels is now much lower, the motion estimation and compensation have fewer searches to perform, and everything in the codec gets scaled down in terms of the number of operations that need to be performed.
For example, if you take a 1080p input and encode it to 1080p and compare that to taking a 1080p input and encoding it to 540p, what do you get? Here are the command lines using FFmpeg and libx264 if you want to give it a try.
Encoding to 1080p –
ffmpeg.exe -i touchdown_pass_1080p30.y4m -c:v libx264 -crf 18 -pix_fmt yuv420p -preset veryslow touchdown_pass_1080p_2997fps.mp4
Encoding to 540p –
ffmpeg.exe -i touchdown_pass_1080p30.y4m -c:v libx264 -crf 18 -pix_fmt yuv420p -vf scale=960x540 -preset veryslow touchdown_pass_540p_2997fps.mp4
On my Thinkpad T490 (i7, 10th gen processor and 16GB RAM), it takes 90 seconds for the 1080p encoding and 41 seconds for the 540p.
This is in essence what LCEVC is doing, isn’t it? It downsamples the video and then sends it to the base encoder for compression. And, this provides a speed-up.
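To put rough numbers on that intuition, here is a small back-of-the-envelope sketch. It compares the pixel counts of the two resolutions with the encode times I measured above (90 and 41 seconds); the helper name is made up for illustration.

```python
def pixel_count(width, height):
    """Number of luma samples in one frame."""
    return width * height


full_hd = pixel_count(1920, 1080)    # 2,073,600 samples per frame
quarter_hd = pixel_count(960, 540)   # 518,400 samples per frame

pixel_ratio = full_hd / quarter_hd
time_ratio = 90 / 41                 # measured encode times in seconds

print(f"pixels per frame: {pixel_ratio:.1f}x fewer at 540p")   # 4.0x
print(f"encode time:      {time_ratio:.1f}x faster at 540p")   # 2.2x
```

The measured speed-up is smaller than the 4x drop in pixel count. One likely contributor is that the 540p run still has to decode and downscale the full 1080p input, and it is reasonable to assume that not every part of the encoder scales linearly with resolution.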
Let’s test this hypothesis.
I took a couple of the tests from earlier using the CrowdRun sequence and re-ran it several times using both LCEVC and H.264/AVC. I recorded the time it took to execute on my machine (a Thinkpad T490, i7 10th gen processor, and 16GB RAM) using FFmpeg’s
Here are the results (averaged over multiple runs).
|Encoder|Encoding time|
|LCEVC with H.264/AVC base (libx264)|25 seconds|
|H.264/AVC (libx264)|80 seconds|
The difference is amazing! I re-ran the tests 5 – 6 times, and each time, I saw the same results. The 3x speed-up is impressive and it is something that video compression teams should look at quite seriously.
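If you want to repeat this kind of measurement, one simple option is to time the whole process from the outside and average over several runs. The sketch below is a generic wall-clock harness and not the method used for the table above; the function names are hypothetical, and the stand-in command only launches the Python interpreter so that the example runs without FFmpeg installed.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=5):
    """Run `cmd` (an argument list) `runs` times and return the mean wall-clock seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations)


def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


# Stand-in command; replace it with your own encoder command line.
demo_cmd = [sys.executable, "-c", "pass"]
print(f"mean wall-clock time: {time_command(demo_cmd, runs=3):.3f} s")

# Applying the ratio to the averaged times reported in the table.
print(f"speed-up: {speedup(80, 25):.1f}x")  # 80 / 25 = 3.2
```

Wall-clock time includes process start-up and file I/O, so it is best suited to encodes that run for tens of seconds, as these do.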
Next Steps for LCEVC?
I asked V-Nova about the next steps for LCEVC. Based on our discussion, the MPEG-5 Part 2 LCEVC standard (ISO/IEC 23094-2) was elevated to Final Draft International Standard (FDIS) status at the 132nd MPEG meeting, held in the second week of October 2020.
That’s good news and it means that the official press release and encoder + decoder implementations are not far away!
Let’s sum up what the objective and subjective tests say. In these tests, LCEVC delivered up to a 28% bitrate saving at equivalent video quality and a 3x speed-up!
This hits all the right notes –
- an increase in compression efficiency (equivalent or better video quality at the same bitrate)
- faster encoding
The fact that LCEVC is much faster than the base-codec that it is optimizing is a welcome departure from what is generally expected from the “next generation of video codecs” (i.e., better video quality at the same bitrate, but at a 30 – 40% increase in complexity).
I am looking forward to LCEVC being ratified as an MPEG standard and seeing it being used commercially.
If you want to get your hands on V-Nova’s LCEVC encoder, please contact the V-Nova team and someone will get in touch with you.
I thank the V-Nova team for entertaining my request to test and report on LCEVC. Special thanks to Fabio Murra, Anthony Concannon, Pierdavide Marcolongo, Rick Clucas, Guendalina Cobianchi, and Matt Hughes for answering my questions and putting up with my requests!
Until next time, take care and good luck.
Krishna Rao Vijayanagar
I’m Dr. Krishna Rao Vijayanagar, founder of OTTVerse. I have a Ph.D. in Video Compression from the Illinois Institute of Technology, and I have worked on Video Compression (AVC, HEVC, MultiView Plus Depth), ABR streaming, and Video Analytics (QoE, Content & Audience, and Ad) for several years.
I hope to use my experience and love for video streaming to bring you information and insights into the OTT universe.