In this article, let’s learn to compute VMAF, PSNR, and SSIM using FFmpeg. While working on videos (compression or post-processing), it is common to compute objective metrics in addition to doing subjective Visual Quality testing. If you are using FFmpeg in your workflow, it is very easy to compute these metrics instead of purchasing expensive tools.
Let’s go ahead and take a look at these metrics and how they are computed, shall we?
What is VMAF?
VMAF from Netflix stands for Video Multi-method Assessment Fusion, and it is a video quality metric that combines human vision modeling with machine learning. It has become very popular because it goes a long way (though not all the way) toward automating subjective quality testing, which usually requires humans to watch and score videos.
FFmpeg and Netflix’s VMAF are now part of every video processing and compression engineer’s toolbox. To understand how to install VMAF, please read our “VMAF Installation” article.
How to Compute VMAF using FFmpeg
Let’s look at a couple of ways of computing VMAF.
If your source and destination videos are of the same dimensions (height and width), then you can directly compute the VMAF value using the command line below.
All you need to do is point FFmpeg to the location of the VMAF model file and it will do the rest.
ffmpeg.exe -i videoToCompare.mp4 -i originalVideo.mp4 -lavfi libvmaf="model_path=vmaf_v0.6.1.pkl":log_path=vmaf_logfile.txt -f null -
However, in case the source and destination videos are not of the same resolution, then you have to ensure that the destination is brought to the same resolution as the source video before computing VMAF.
Here is a simple one-liner that uses filter_complex to do the resolution change and then computes the VMAF value. I have used the bicubic filter to do the up/down scaling.
bin/ffmpeg -i test_720p30.mp4 -i test_1080p30.mp4 -filter_complex "[0:v]scale=1920x1080:flags=bicubic[main]; [1:v]scale=1920x1080:flags=bicubic,format=pix_fmts=yuv420p,fps=fps=30/1[ref]; \
[main][ref]libvmaf=model_path=vmaf_v0.6.1.pkl:log_path=vmaf_logfile.txt" \
-f null -
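If you run this comparison for many resolutions, it can help to generate the filter graph instead of typing it by hand. The following is a minimal sketch, not a definitive implementation: the helper names build_vmaf_filter and build_vmaf_command are hypothetical, and the libvmaf options (model_path, log_path) simply mirror the commands in this article, so they may differ on your libvmaf build.

```python
def build_vmaf_filter(width, height, model_path, log_path, fps=None):
    """Build a -filter_complex graph: scale both inputs, then feed them to libvmaf.

    Input 0 is the video to compare and input 1 is the original, matching
    the order used in the commands in this article.
    """
    scale = f"scale={width}x{height}:flags=bicubic"
    # Mirror the article's command: pixel format and frame rate are
    # normalized on the reference branch only.
    ref_chain = scale + ",format=pix_fmts=yuv420p"
    if fps is not None:
        ref_chain += f",fps=fps={fps}"
    return (
        f"[0:v]{scale}[main];"
        f"[1:v]{ref_chain}[ref];"
        # Option names are an assumption taken from the article's commands.
        f"[main][ref]libvmaf=model_path={model_path}:log_path={log_path}"
    )


def build_vmaf_command(distorted, reference, width, height,
                       model_path, log_path, fps=None, ffmpeg="ffmpeg"):
    """Return the full argument list, ready to hand to subprocess.run()."""
    graph = build_vmaf_filter(width, height, model_path, log_path, fps)
    return [ffmpeg, "-i", distorted, "-i", reference,
            "-filter_complex", graph, "-f", "null", "-"]


cmd = build_vmaf_command("test_720p30.mp4", "test_1080p30.mp4", 1920, 1080,
                         "vmaf_v0.6.1.pkl", "vmaf_logfile.txt", fps="30/1")
print(" ".join(cmd))
```

Passing the argument list to subprocess.run(cmd, check=True) avoids shell quoting problems with the brackets and semicolons in the filter graph.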
The log file that is produced by the VMAF computation is very comprehensive. It provides a frame-wise score of VMAF and its constituent metrics, along with the aggregate score (if that’s all you are looking for).
Here is an example of the VMAF file.
Check out EasyVMAF – a tool for making VMAF calculations simple.
What is PSNR and How Is It Calculated?
Peak signal-to-noise ratio, often abbreviated PSNR, is an engineering term for the ratio between the maximum possible power of a signal and the power of corrupting noise that affects the fidelity of its representation.
So, for a video, you are essentially trying to compute how much noise or pixel corruption has been introduced due to the video compression process which is essentially lossy (mainly due to Quantization).
The first step is to compute the Mean Squared Error or MSE. The formula is shown below, where I and K represent the original and destination images respectively, and m and n are the height and width of the images respectively.
$$\mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ I(i,j) - K(i,j) \right]^2$$
Once you have the MSE, you can compute the PSNR using the formula shown below.
$$\mathrm{PSNR} = 10 \log_{10}\left( \frac{\mathrm{MAX}_I^2}{\mathrm{MSE}} \right)$$
Here, MAX_I is the maximum possible pixel value of the image. When the pixels are represented using 8 bits per sample, this is 255.
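To make the MSE and PSNR definitions above concrete, here is a small worked example in plain Python. It is only a sketch: the function names and the two 2x2 "images" are made up for illustration, and a real implementation would work on full planes of decoded video frames.

```python
import math


def mse(original, distorted):
    """Mean squared error between two equally sized 2-D lists of pixel values."""
    m = len(original)       # height (rows)
    n = len(original[0])    # width (columns)
    total = 0
    for i in range(m):
        for j in range(n):
            diff = original[i][j] - distorted[i][j]
            total += diff * diff
    return total / (m * n)


def psnr(original, distorted, max_value=255):
    """PSNR in dB; max_value is 255 for 8 bits per sample."""
    err = mse(original, distorted)
    if err == 0:
        return math.inf     # identical images: no noise at all
    return 10 * math.log10(max_value ** 2 / err)


# Hypothetical 2x2 luma blocks, purely for illustration.
I = [[52, 55], [61, 59]]
K = [[50, 57], [61, 62]]

print(mse(I, K))             # (4 + 4 + 0 + 9) / 4 = 4.25
print(round(psnr(I, K), 2))  # 10 * log10(65025 / 4.25), about 41.85 dB
```

Note that identical images give an MSE of zero, so the PSNR is infinite; the sketch returns math.inf for that case rather than dividing by zero.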
How to Calculate PSNR using FFmpeg?
You don’t have to go through all those calculations while using FFmpeg. All you need to do is use the lavfi filter option and tell it to compute the PSNR. That’s it.
ffmpeg.exe -i videoToCompare.mp4 -i originalVideo.mp4 -lavfi psnr=stats_file=psnr_logfile.txt -f null -
FFmpeg will print the average PSNR on the console while the log file will contain a frame-wise list of the MSE and the PSNR for the Luma and Chroma planes (y, u, and v). An example is shown below –
n:1 mse_avg:54.96 mse_y:72.51 mse_u:27.98 mse_v:11.74 psnr_avg:30.73 psnr_y:29.53 psnr_u:33.66 psnr_v:37.43
n:2 mse_avg:69.70 mse_y:93.80 mse_u:31.01 mse_v:12.02 psnr_avg:29.70 psnr_y:28.41 psnr_u:33.22 psnr_v:37.33
n:3 mse_avg:72.74 mse_y:98.37 mse_u:31.02 mse_v:11.96 psnr_avg:29.51 psnr_y:28.20 psnr_u:33.21 psnr_v:37.35
n:4 mse_avg:73.11 mse_y:98.94 mse_u:31.14 mse_v:11.76 psnr_avg:29.49 psnr_y:28.18 psnr_u:33.20 psnr_v:37.43
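Since every entry in the log is a key:value pair, the file is easy to post-process. The sketch below (helper names are hypothetical) parses the four sample frames shown above, recomputes each frame's PSNR from its mse_avg assuming 8-bit video (peak value 255), and reports the mean and the worst frame. It assumes one frame per line, which is how the stats file is written.

```python
import math


def parse_stats_line(line):
    """Turn 'n:1 mse_avg:54.96 ...' into a dict of floats keyed by field name."""
    fields = {}
    for token in line.split():
        key, _, value = token.partition(":")
        fields[key] = float(value)   # float() also accepts 'inf'
    return fields


def psnr_from_mse(mse, max_value=255):
    """PSNR in dB for a given MSE; 255 assumes 8 bits per sample."""
    return math.inf if mse == 0 else 10 * math.log10(max_value ** 2 / mse)


# The sample frames from the log above; for a real run, read psnr_logfile.txt.
sample = """\
n:1 mse_avg:54.96 mse_y:72.51 mse_u:27.98 mse_v:11.74 psnr_avg:30.73 psnr_y:29.53 psnr_u:33.66 psnr_v:37.43
n:2 mse_avg:69.70 mse_y:93.80 mse_u:31.01 mse_v:12.02 psnr_avg:29.70 psnr_y:28.41 psnr_u:33.22 psnr_v:37.33
n:3 mse_avg:72.74 mse_y:98.37 mse_u:31.02 mse_v:11.96 psnr_avg:29.51 psnr_y:28.20 psnr_u:33.21 psnr_v:37.35
n:4 mse_avg:73.11 mse_y:98.94 mse_u:31.14 mse_v:11.76 psnr_avg:29.49 psnr_y:28.18 psnr_u:33.20 psnr_v:37.43
"""

frames = [parse_stats_line(line) for line in sample.splitlines() if line.strip()]

for f in frames:
    # The logged psnr_avg should agree with the value recomputed from mse_avg.
    print(int(f["n"]), f["psnr_avg"], round(psnr_from_mse(f["mse_avg"]), 2))

mean_psnr = sum(f["psnr_avg"] for f in frames) / len(frames)
worst = min(frames, key=lambda f: f["psnr_avg"])
print("mean of per-frame psnr_avg:", round(mean_psnr, 2))
print("worst frame:", int(worst["n"]), worst["psnr_avg"])
```

Looking at the worst frame as well as the mean is a cheap way to spot short quality drops that an average hides.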
How to Calculate SSIM using FFmpeg?
Finally, let’s take a look at computing SSIM using FFmpeg. First, here is the definition of SSIM:
The structural similarity index measure (SSIM) is a method for predicting the perceived quality of digital television and cinematic pictures, as well as other kinds of digital images and videos. SSIM is used for measuring the similarity between two images. The SSIM index is a full reference metric; in other words, the measurement or prediction of image quality is based on an initial uncompressed or distortion-free image as reference.
Calculating SSIM using FFmpeg is very similar to calculating PSNR.
ffmpeg.exe -i videoToCompare.mp4 -i originalVideo.mp4 -lavfi ssim=stats_file=ssim_logfile.txt -f null -
The output of this command is going to look like this –
[Parsed_ssim_0 @ 0000029c82894300] SSIM Y:0.926845 (11.357537) U:0.876798 (9.093807) V:0.860658 (8.559193) All:0.907472 (10.337287)
And the log file will give you information on the SSIM values for each of the planes (Y, U, and V) along with the aggregate for each of the frames of the video.
n:1 Y:0.930033 U:0.926453 V:0.913508 All:0.926682 (11.347897)
n:2 Y:0.919140 U:0.915343 V:0.910900 All:0.917134 (10.816226)
n:3 Y:0.922417 U:0.915795 V:0.910959 All:0.919404 (10.936853)
n:4 Y:0.920077 U:0.916443 V:0.912994 All:0.918291 (10.877300)
n:5 Y:0.926438 U:0.927597 V:0.917905 All:0.925209 (11.261518)
n:6 Y:0.920619 U:0.919988 V:0.918780 All:0.920207 (10.980375)
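The number in parentheses is the SSIM expressed in decibels. In the sample values above it matches -10 * log10(1 - SSIM), which the sketch below checks while parsing the log. The helper names are hypothetical, and the parser assumes one frame per line.

```python
import math


def ssim_to_db(ssim):
    """Express an SSIM value in dB: -10 * log10(1 - SSIM)."""
    return math.inf if ssim >= 1.0 else -10 * math.log10(1.0 - ssim)


def parse_ssim_line(line):
    """Parse 'n:1 Y:0.93 U:0.92 V:0.91 All:0.92 (11.34)' into a dict of floats."""
    fields = {}
    for token in line.split():
        if token.startswith("(") and token.endswith(")"):
            fields["All_dB"] = float(token[1:-1])   # parenthesized dB value
        else:
            key, _, value = token.partition(":")
            fields[key] = float(value)
    return fields


# The sample frames from the log above; for a real run, read ssim_logfile.txt.
sample = """\
n:1 Y:0.930033 U:0.926453 V:0.913508 All:0.926682 (11.347897)
n:2 Y:0.919140 U:0.915343 V:0.910900 All:0.917134 (10.816226)
n:3 Y:0.922417 U:0.915795 V:0.910959 All:0.919404 (10.936853)
n:4 Y:0.920077 U:0.916443 V:0.912994 All:0.918291 (10.877300)
n:5 Y:0.926438 U:0.927597 V:0.917905 All:0.925209 (11.261518)
n:6 Y:0.920619 U:0.919988 V:0.918780 All:0.920207 (10.980375)
"""

frames = [parse_ssim_line(line) for line in sample.splitlines() if line.strip()]

for f in frames:
    # Compare the logged dB value with the one recomputed from 'All'.
    print(int(f["n"]), f["All"], f["All_dB"], round(ssim_to_db(f["All"]), 3))

worst = min(frames, key=lambda f: f["All"])
print("lowest-SSIM frame:", int(worst["n"]), worst["All"])
```

The dB form is handy because SSIM values bunch up close to 1.0 for good encodes, and the logarithm spreads those differences out.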
That’s it folks! Now you know how to calculate three very important objective metrics for video quality evaluation: PSNR, VMAF, and SSIM. By computing all three and using them together in your analysis, you are less likely to be thrown off by artifacts that one metric catches and another misses.
And like always, do spend a few minutes doing some visual testing (subjective) after you’re done with the objective scores!
If you enjoyed this post, do check out the rest of OTTVerse’s FFmpeg tutorials to learn more about how to use FFmpeg to improve your video editing, compression, and processing skills!