Enhancing the x265 Open Source HEVC Video Encoder: Novel Techniques for Bitrate Reduction and Scene Change Detection

Demand for lower video transmission bitrates without loss of visual quality has grown as bandwidth requirements rise, especially with the emergence of higher device resolutions. The HEVC video coding standard is well suited to this problem: it delivers high video quality at considerably lower bitrates than its predecessor, H.264/AVC.

Ashok is the Solution Architect for Media and Entertainment BU in MulticoreWare with 15+ years of experience in multimedia-embedded DSP systems and implementing and optimizing video codecs.

Santhoshini Sekar heads the Video Codecs and Video Solutions Engineering team at MulticoreWare, plays a key role in MulticoreWare’s x265 open-source HEVC encoder project, and leads the x266 project.

Shivakumar Narayanan heads the Media and Entertainment BU focusing on AI-Enabled Video Codecs, Media Solutions, and Services.

This blog post explores two novel techniques to enhance the x265 open-source video encoder: a motion-compensated spatio-temporal filtering scheme and a histogram-based scene change detection scheme.

These techniques contribute to improved coding gains and more efficient video encoding.


Technique 1: Motion-Compensated Spatio-Temporal Filtering Scheme

The x265 encoder, compliant with the H.265/MPEG-HEVC video coding standard, has gained significant popularity among open-source frameworks, broadcasters, and streaming service providers. It incorporates nearly all the tools defined in the HEVC standard and features algorithmic optimizations that balance encoder performance and output quality.

The encoder maximizes performance on x86 CPUs by utilizing AVX2 and AVX-512 instructions, further enhancing efficiency.

Recent x265 development efforts have focused on enhancing the motion-compensated spatio-temporal filtering (MCSTF) employed as a pre-processing step to improve coding gains further.

This scheme effectively reduces noise in pictures, particularly in cases where the content contains a high noise level. By utilizing motion vectors obtained from motion estimation across different video content resolutions, the MCSTF identifies the optimal temporal correspondence for low-pass filtering. The hierarchical motion estimation scheme enables efficient motion vector processing, contributing to improved noise reduction capabilities.
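The idea of filtering along motion trajectories can be illustrated with a minimal Python sketch. It is not the x265 implementation: the function names, the 4x4 block size, the exhaustive single-level search (used here instead of a hierarchical one), and the bilateral-style weight controlled by the hypothetical `strength` and `sigma` parameters are all assumptions made for illustration. Frames are plain lists of rows of luma values.

```python
import math


def sad(cur, ref, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def best_motion_vector(cur, ref, x, y, size, search):
    """Full-search block matching: return the (dx, dy) with the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, x, y, x, y, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = x + dx, y + dy
            # Skip candidates whose reference block would leave the frame.
            if 0 <= rx <= width - size and 0 <= ry <= height - size:
                cost = sad(cur, ref, x, y, rx, ry, size)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
    return best


def temporal_filter(cur, refs, size=4, search=2, strength=0.5, sigma=10.0):
    """Low-pass filter `cur` along motion trajectories into each reference frame."""
    height, width = len(cur), len(cur[0])
    out = [[float(p) for p in row] for row in cur]
    for y in range(0, height - size + 1, size):
        for x in range(0, width - size + 1, size):
            mvs = [best_motion_vector(cur, ref, x, y, size, search) for ref in refs]
            for j in range(size):
                for i in range(size):
                    pixel = cur[y + j][x + i]
                    total, weight_sum = float(pixel), 1.0
                    for ref, (dx, dy) in zip(refs, mvs):
                        matched = ref[y + dy + j][x + dx + i]
                        # Down-weight poorly matched samples to limit ghosting.
                        w = strength * math.exp(-((matched - pixel) ** 2) / (2.0 * sigma ** 2))
                        total += w * matched
                        weight_sum += w
                    out[y + j][x + i] = total / weight_sum
    return out
```

Calling `temporal_filter(current, [previous, following])` returns a filtered copy of `current` as floats. Where the motion-compensated match is exact the pixel is unchanged, and where it differs slightly (as with noise) the output is pulled toward the reference.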

Target Audience and Usability:

The motion-compensated spatio-temporal filtering scheme is a powerful tool for reducing noise in videos encoded with the x265 video encoder. It is beneficial in scenarios where the content contains a high noise level. Applications and users who would find this technique valuable include:

(1) Broadcasters and streaming service providers: By leveraging the benefits of the HEVC video coding standard, these providers can significantly reduce the bitrate of their over-the-top (OTT) content without compromising visual quality. The motion-compensated spatio-temporal filtering scheme improves coding gains and makes video transmission more efficient.

(2) Video streaming platforms and content delivery networks (CDNs): With the ever-increasing demand for streaming high-quality video content, these platforms can leverage the motion-compensated spatio-temporal filtering scheme to optimize bandwidth utilization. By reducing video noise, the scheme allows for higher compression ratios while maintaining a visually pleasing experience for viewers.

Technique 2: Histogram-Based Scene Change Detection Scheme

Another area of focus in enhancing the x265 encoder is scene change detection.

Automatic extraction of key video information is crucial for indexing, scene analysis, and improved coding efficiency. The proposed histogram-based scene change detection scheme presents a novel algorithm for detecting gradual and abrupt scene changes with reduced computational complexity.

The algorithm operates by dividing each picture/frame into several regions and computing picture statistics (histogram, variance, mean pixel intensities) separately for each region. Thresholds are determined based on luminance and chrominance variance, and the difference between region histograms and the average histogram of the entire picture is evaluated. By considering intensity contrast and variance, the algorithm effectively detects scene changes.

Figure-2 Histogram-based scene change detection
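The region-based statistics described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the x265 algorithm: it compares each region's luma histogram with the co-located region of the previous frame, and the 2x2 region grid, 16 bins, and the fixed `hist_threshold` and `min_regions` values are hypothetical. Mean and variance are computed per region but, for brevity, are not used to adapt the threshold.

```python
def region_stats(frame, rows=2, cols=2, bins=16, max_value=255):
    """Return (normalized histogram, mean, variance) for each region of a luma frame."""
    height, width = len(frame), len(frame[0])
    stats = []
    for r in range(rows):
        for c in range(cols):
            y0, y1 = r * height // rows, (r + 1) * height // rows
            x0, x1 = c * width // cols, (c + 1) * width // cols
            pixels = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            count = len(pixels)
            hist = [0.0] * bins
            for p in pixels:
                hist[min(p * bins // (max_value + 1), bins - 1)] += 1.0 / count
            mean = sum(pixels) / count
            variance = sum((p - mean) ** 2 for p in pixels) / count
            stats.append((hist, mean, variance))
    return stats


def is_scene_change(prev, cur, hist_threshold=0.5, min_regions=3):
    """Flag a scene change when enough regions show a large histogram shift."""
    changed = 0
    for (h0, _, _), (h1, _, _) in zip(region_stats(prev), region_stats(cur)):
        # Half the L1 distance between normalized histograms lies in [0, 1].
        distance = sum(abs(a - b) for a, b in zip(h0, h1)) / 2.0
        if distance > hist_threshold:
            changed += 1
    return changed >= min_regions
```

Requiring several regions to change, rather than the whole-picture histogram, is one way to avoid flagging a cut when only a local object or flash alters part of the frame.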

Experimental results demonstrate the reliability and efficiency of the proposed scheme, with significant bitrate savings and reduced computational complexity. Because detected scene changes determine where I-frames are placed within the video, accurate detection contributes to bitrate reduction and improved coding efficiency.
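To show how scene change decisions can drive I-frame placement, here is a small Python sketch. It assumes a simplified model with only I and P frames and no lookahead, and the `place_keyframes` function with its `min_keyint` and `max_keyint` parameters is hypothetical: a scene cut triggers an I-frame unless the previous one is too recent, and an I-frame is forced once the maximum interval is reached.

```python
def place_keyframes(scene_cuts, min_keyint=2, max_keyint=8):
    """Return one 'I' or 'P' label per frame from per-frame scene-cut flags."""
    types = []
    last_key = None
    for index, cut in enumerate(scene_cuts):
        distance = None if last_key is None else index - last_key
        if (
            last_key is None  # the first frame is always a keyframe
            or distance >= max_keyint  # forced periodic keyframe
            or (cut and distance >= min_keyint)  # scene cut far enough from the last one
        ):
            types.append("I")
            last_key = index
        else:
            types.append("P")
    return types
```

With a cut flagged at frame 5 of a 10-frame clip, `place_keyframes` returns I-frames at frames 0 and 5 only. A missed cut would instead leave frame 5 to be predicted from unrelated content, while a false detection would spend bits on an unnecessary I-frame.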

Target Audience and Usability:

The histogram-based scene change detection scheme offers an efficient solution for automatically detecting scene changes in videos encoded with the x265 encoder. Applications and users who would find this technique valuable include:

(1) Video indexing and scene analysis systems: Scene change detection is crucial in automatically extracting key information from videos for indexing and scene analysis. The histogram-based scheme enables reliable and efficient scene change detection, enhancing the capabilities of these systems and improving their accuracy.

(2) Video streaming platforms and content recommendation engines: By detecting scene changes, streaming platforms can optimize video segmentation, improve content categorization, and provide more accurate recommendations to viewers. The histogram-based scene change detection scheme enables these platforms to identify and analyze transitions within videos efficiently.

Conclusion

The continuous demand for reducing video transmission bitrate while maintaining visual quality has led to the development of novel techniques for enhancing the x265 open-source video encoder.

The motion-compensated spatio-temporal filtering scheme leverages motion vectors and hierarchical motion estimation to reduce picture noise.
Meanwhile, the histogram-based scene change detection scheme automates the detection of scene changes, leading to improved coding efficiency.

These techniques expand the capabilities of the x265 encoder, making it a popular choice for both on-premises and cloud-based HEVC encoding. By incorporating these advancements, video streaming service providers, open-source frameworks, and other users can achieve significant bitrate reductions, optimize bandwidth requirements, and deliver high-quality video content efficiently.

Overall, the ongoing development efforts for the x265 encoder demonstrate the commitment to continually enhancing video coding gains and addressing the evolving needs of the industry in terms of bandwidth, visual quality, and scene analysis.

What Next? Here are some technical ideas that can be built upon the two techniques presented:

Joint Optimization: Explore the possibility of jointly optimizing the motion-compensated spatio-temporal filtering scheme and the histogram-based scene change detection scheme. Investigate how these two techniques can complement each other to further improve video encoding efficiency and visual quality.

Adaptive Filtering Strategies: Develop adaptive filtering strategies within the motion-compensated spatio-temporal filtering scheme. Explore techniques that dynamically adjust the filtering parameters based on the content characteristics, such as scene complexity, motion intensity, or noise levels. This adaptive approach can optimize noise reduction for different types of videos and further enhance visual quality.
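As one hedged illustration of such an adaptive strategy, the Python sketch below maps a crude noise estimate onto a filter strength. Everything in it is an assumption for illustration: the mean absolute frame difference is only a reasonable noise proxy for static content, and the `low`, `high`, and `max_strength` parameters are hypothetical tuning values.

```python
def estimate_noise(cur, prev):
    """Mean absolute difference between two frames, used as a crude noise proxy."""
    diffs = [
        abs(a - b)
        for row_cur, row_prev in zip(cur, prev)
        for a, b in zip(row_cur, row_prev)
    ]
    return sum(diffs) / len(diffs)


def adaptive_strength(noise, low=1.0, high=8.0, max_strength=1.0):
    """Map the noise level linearly onto [0, max_strength], clamped at both ends."""
    t = (noise - low) / (high - low)
    return max_strength * min(1.0, max(0.0, t))
```

Clean content (noise below `low`) is left unfiltered, while very noisy content receives the full strength. A practical scheme would likely also account for motion so that moving content is not mistaken for noise.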

Motion Estimation Refinement: Investigate advanced techniques for motion estimation refinement within the motion-compensated spatio-temporal filtering scheme. Explore methods to improve the accuracy and efficiency of motion vector estimation, such as hierarchical or predictive algorithms, to enhance temporal correspondence and achieve better noise reduction.
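A coarse-to-fine search of the kind mentioned here can be sketched in Python as below. This is an illustrative assumption rather than the x265 code: the two-level pyramid, 2x2 averaging for downsampling, SAD cost, and the `radius` parameter are all hypothetical choices. The vector found at low resolution is doubled and then refined by one pixel at the next finer level.

```python
def downsample(frame):
    """Halve the resolution by averaging each 2x2 neighbourhood."""
    return [
        [
            (frame[y][x] + frame[y][x + 1] + frame[y + 1][x] + frame[y + 1][x + 1]) // 4
            for x in range(0, len(frame[0]) - 1, 2)
        ]
        for y in range(0, len(frame) - 1, 2)
    ]


def block_cost(cur, ref, x, y, dx, dy, size):
    """SAD for a candidate vector, or None if the reference block leaves the frame."""
    rx, ry = x + dx, y + dy
    if not (0 <= rx <= len(ref[0]) - size and 0 <= ry <= len(ref) - size):
        return None
    return sum(
        abs(cur[y + j][x + i] - ref[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def search(cur, ref, x, y, size, center, radius):
    """Test every vector within `radius` of `center` and keep the cheapest one."""
    best, best_cost = None, None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            cost = block_cost(cur, ref, x, y, dx, dy, size)
            if cost is not None and (best_cost is None or cost < best_cost):
                best, best_cost = (dx, dy), cost
    return best if best is not None else (0, 0)


def hierarchical_mv(cur, ref, x, y, size=8, levels=2, radius=2):
    """Estimate motion at the coarsest level, then refine by +/-1 at each finer level."""
    pyramid = [(cur, ref)]
    for _ in range(levels - 1):
        c, r = pyramid[-1]
        pyramid.append((downsample(c), downsample(r)))
    mv = (0, 0)
    for level in range(levels - 1, -1, -1):
        c, r = pyramid[level]
        scale = 2 ** level
        if level == levels - 1:
            mv = search(c, r, x // scale, y // scale, size // scale, (0, 0), radius)
        else:
            mv = search(c, r, x // scale, y // scale, size // scale, (mv[0] * 2, mv[1] * 2), 1)
    return mv
```

With two levels and `radius=2`, the coarse search covers roughly +/-4 pixels at full resolution while testing 25 candidates on a quarter-size block, followed by 9 refinement candidates, instead of 81 full-resolution candidates for an equivalent exhaustive search.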

Scene Change Detection Optimization: Enhance the histogram-based scene change detection scheme by exploring alternative statistical measures and feature representations for improved accuracy. Consider incorporating machine learning techniques like deep neural networks to learn complex scene change patterns and enhance detection performance.

Multi-Modal Approaches: Combine different modalities, such as audio and visual cues, for more robust and accurate scene change detection. Investigate how audio analysis techniques, such as sudden changes in audio energy or spectral content, can be integrated with the histogram-based scheme to provide a multi-modal scene change detection solution.

Energy Efficiency Optimization: Consider energy-efficient optimizations for the motion-compensated spatio-temporal filtering scheme and the histogram-based scene change detection scheme. Explore techniques that can reduce computational complexity or power consumption while maintaining or even enhancing the coding gains and scene detection accuracy.

Hardware Acceleration: Investigate hardware acceleration techniques, such as leveraging GPUs or specialized hardware architectures, to enhance the performance of both techniques further. Explore how parallel processing and optimized memory access patterns can expedite the execution of these schemes and enable real-time or near-real-time video encoding with reduced power consumption.

The proposed techniques can be further refined and extended by pursuing these technical ideas, leading to advancements in video encoding, noise reduction, scene change detection, and overall video processing.

Ashok Kumar Mishra

Solution Architect

Ashok is the Solution Architect for Media and Entertainment BU in MulticoreWare. He brings 15+ years of experience in the Design and Development of multimedia embedded DSP systems, implementation and optimization of video codecs (H.265, H.264, AVS and MPEG-2), development of streaming media based on RTSP, RTP and WebRTC protocols, Image Signal Processing, and Computer Vision.

Santhoshini Sekar

Head of Video Codecs and Video Solution Engineering

Santhoshini Sekar heads the Video Codecs and Video Solutions Engineering team at MulticoreWare. She played a key role in MulticoreWare’s x265 open-source HEVC encoder project, which some leading broadcasting and OTT solution providers use globally. She currently heads the x266 Project, the open-source initiative for VVC.

Santhoshini has also worked on various customer projects focused on video workflow solutions and their adjacent areas in the Media and Entertainment Industry.

Shivakumar Narayanan

VP & GM, Media and Entertainment

Shivakumar Narayanan has over 20 years of experience in Product Management, Marketing & New Business Development. He heads the Media and Entertainment BU focusing on AI-Enabled Video Codecs, Media Solutions, and Services at MulticoreWare. He holds a Masters from Arizona State University.
