
MainConcept VVC/H.266 Decoder: overview of key features and capabilities
Introduction
VVC (Versatile Video Coding/H.266) is a next-generation video codec that follows on from HEVC (High Efficiency Video Coding/H.265), and is aimed at improving video compression and meeting new content quality requirements. VVC and HEVC differ in the following ways:
- Compression efficiency: VVC can achieve approximately 30–50% better compression than HEVC at comparable video quality, reducing file sizes and network transmission load. Encoding complexity is almost 10 times higher, while the expected increase in decoding complexity is more modest (up to twice as high).
- Support for ultra-high resolutions (4K, 8K, 16K) and HDR: VVC is designed from the ground up to handle ultra-high resolutions like 4K, 8K and beyond, as well as 360-degree VR and AR video. While HEVC supports 4K and 8K, it is less efficient for these formats.
- Versatility: VVC can be configured for various scenarios, from streaming to gaming, VR/AR, video conferencing and IoT. HEVC primarily focuses on streaming and broadcast video.
Parallel processing
VVC retains the same set of parallel decoding features as HEVC, including WPP (Wavefront Parallel Processing), tiles and slices. These features increase the average decoding speed and reduce frame-decoding times. Even when a stream contains none of these in-frame parallelization tools, the decoder can decode multiple frames in parallel, although this is less memory-efficient and does not shorten the decoding time of an individual frame. Fig.1 shows an example of how the MainConcept VVC/H.266 Video Decoder processes a picture that contains slices and tiles and is WPP-enabled.
Fig.1: Picture contains 4 slices and 4 tiles
WPP in the decoder relies on CABAC context saved after decoding the first CTU of the previous row and offset data saved at the start of the current row. VVC restricts the use of data from the upper-right CTU, which enhances parallel processing efficiency compared to HEVC.
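A toy scheduling model illustrates why dropping the upper-right dependency helps. Suppose each CTU takes one time step and a row may start once the row above is `lag` CTUs ahead: HEVC's upper-right dependency makes the lag 2 CTUs, while VVC's restriction reduces it to 1. The function below is purely illustrative, not decoder API:

```python
def wavefront_schedule(rows, cols, lag):
    """Earliest step at which each CTU can start, assuming one step per
    CTU and that row r may begin once row r-1 is `lag` CTUs ahead."""
    return [[r * lag + c for c in range(cols)] for r in range(rows)]

# HEVC also needs the upper-right CTU, so each row trails by 2 CTUs;
# VVC dropped that dependency, so rows trail by only 1 CTU.
hevc = wavefront_schedule(4, 8, lag=2)  # last CTU starts at step 13
vvc = wavefront_schedule(4, 8, lag=1)   # last CTU starts at step 10
```

For a 4x8 CTU grid the wavefront completes noticeably earlier with the smaller lag, which is exactly the efficiency gain the paragraph above describes.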
Tiles divide a frame into fixed rectangular regions that can be decoded in parallel (with known offsets in the bitstream to the start of each tile). If post-processing filters are disabled at tile boundaries, tiles can be decoded independently, though this may reduce quality.
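The entry-point mechanism can be sketched as simple byte-slicing: given the substream sizes derived from the slice header, the payload splits into chunks that worker threads can decode independently. This is a simplified model (real bitstreams signal `entry_point_offset_minus1` values, from which the sizes are derived):

```python
def split_substreams(slice_payload, substream_sizes):
    """Split a slice payload into per-tile substreams using the sizes
    derived from the entry-point offsets in the slice header."""
    substreams, pos = [], 0
    for size in substream_sizes:
        substreams.append(slice_payload[pos:pos + size])
        pos += size
    substreams.append(slice_payload[pos:])  # last substream runs to the end
    return substreams

data = bytes(range(10))
parts = split_substreams(data, [3, 4])
# each element of `parts` can now be handed to a separate decoding thread
```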
Unlike HEVC, VVC slices are designed to store complete tiles or rows. VVC offers two types of slices: raster slices, containing several consecutive tiles, and rectangular slices, containing a rectangular region of tiles or several rows from one tile.
The MainConcept VVC/H.266 Video Decoder prioritizes WPP over tiles and slices, and tiles over slices. Slices will be used for parallel processing only when tiles and WPP are not present, or entry point offsets are missing.
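That priority order can be expressed as a small selection function. This is a hypothetical helper written for illustration, not part of the SDK API:

```python
def pick_parallel_mode(has_wpp, num_tiles, num_slices, has_entry_points):
    """Mirror the stated preference: WPP over tiles, tiles over slices.
    WPP and tiles are only usable for parallel decoding when entry-point
    offsets are present in the bitstream."""
    if has_entry_points:
        if has_wpp:
            return "wpp"
        if num_tiles > 1:
            return "tiles"
    if num_slices > 1:
        return "slices"
    return "frame-level parallelism only"
```

For example, a stream with WPP, tiles and slices but missing entry-point offsets falls back to slice-level parallelism.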
New prediction features and improved post-processing filters
Modern video-coding technologies are continuously evolving, offering new methods to enhance image quality and compression efficiency. One of the key areas of focus is improving motion prediction and data processing, which allows for less information to be encoded while maintaining high accuracy. This section reviews several features that are implemented in the MainConcept VVC/H.266 Video Decoder.
One of the primary challenges in video coding is accurately describing the motion of objects, especially in complex scenes involving rotation, scaling or deformation. To address this, the Affine Motion Model can be used. This model employs two or three control-point motion vectors, enabling it to represent affine transformations such as rotation, resizing or shifting. The approach is particularly effective for large objects or scenes with abrupt changes, significantly enhancing motion processing quality. The Affine Motion Model has high computational complexity; however, efficient optimization techniques for its implementation have been incorporated into the MainConcept VVC/H.266 Video Decoder.
For handling small objects and subtle movements, the SbTMVP (Sub-block Temporal Motion Vector Prediction) method was developed. It predicts motion at the sub-block level, allowing even minor changes to be taken into account. This is particularly useful in highly detailed scenes, resulting in more accurate reproduction of small objects and complex motions.
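For the 6-parameter affine variant with three control-point MVs, the motion vector at any position inside the block follows by linear interpolation between the corners. A floating-point sketch (real decoders derive one MV per 4x4 sub-block in fixed-point arithmetic):

```python
def affine_subblock_mv(cpmv0, cpmv1, cpmv2, x, y, w, h):
    """MV at position (x, y) inside a w-by-h block, given control-point
    MVs at the top-left, top-right and bottom-left corners."""
    dxh = (cpmv1[0] - cpmv0[0]) / w   # horizontal change of the MV x-component
    dxv = (cpmv1[1] - cpmv0[1]) / w   # horizontal change of the MV y-component
    dyh = (cpmv2[0] - cpmv0[0]) / h   # vertical change of the MV x-component
    dyv = (cpmv2[1] - cpmv0[1]) / h   # vertical change of the MV y-component
    return (cpmv0[0] + dxh * x + dyh * y,
            cpmv0[1] + dxv * x + dyv * y)

# a pure zoom: the corner MVs spread apart, and interior MVs interpolate
mv_center = affine_subblock_mv((0, 0), (16, 0), (0, 16), 8, 8, 16, 16)
```

With the corner MVs above, the center of the block gets the MV (8, 8), halfway between the control points, which is exactly the smoothly varying motion field a zoom or rotation needs.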
Additionally, DMVR (Decoder-side Motion Vector Refinement) technology refines motion vectors directly on the decoder side without additional signaling. This improves image quality by minimizing artifacts such as blurring or jitter. However, it is worth noting that this method requires significant computational resources, which may increase the decoder's workload.
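The core idea of DMVR can be sketched as an integer-offset search that minimizes the SAD between the two reference blocks, applying the offset with mirrored signs to the two prediction lists. This is heavily simplified: the real tool also does sub-pel refinement, works on interpolated samples and uses early termination.

```python
def sad(a, b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def extract(plane, x, y, w, h):
    """Cut a w-by-h block out of a 2-D sample plane at (x, y)."""
    return [row[x:x + w] for row in plane[y:y + h]]

def dmvr_refine(plane0, plane1, x0, y0, x1, y1, w, h, r=2):
    """Search offsets in [-r, r] and return the (dx, dy) minimizing the
    SAD between the two reference blocks, mirroring the offset signs."""
    best, best_cost = (0, 0), sad(extract(plane0, x0, y0, w, h),
                                  extract(plane1, x1, y1, w, h))
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            cost = sad(extract(plane0, x0 - dx, y0 - dy, w, h),
                       extract(plane1, x1 + dx, y1 + dy, w, h))
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best

plane = [[r * 16 + c for c in range(16)] for r in range(16)]
# the two starting windows are 2 samples apart horizontally, so the
# refinement splits the error as -1/+1 across the two lists
offset = dmvr_refine(plane, plane, 4, 4, 6, 4, 4, 4)
```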
For more accurate reproduction of object boundaries and complex scenes, methods like GPM (Geometric Partitioning Mode) are used. This approach combines two prediction sources based on geometric features, which is particularly beneficial for scenes with sharp object boundaries. As a result, the accuracy of reproducing such areas is significantly enhanced.
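A minimal sketch of the blending idea, using one fixed diagonal split with a short transition band. Actual GPM signals one of 64 angle/offset partitions and derives the weight from each sample's distance to the split line:

```python
def gpm_blend(pred0, pred1):
    """Blend two prediction blocks along a fixed diagonal split with a
    4-sample transition band (toy version of one GPM partition)."""
    h, w = len(pred0), len(pred0[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = x - y                                 # signed distance to the diagonal
            weight = min(max((d + 2) / 4, 0.0), 1.0)  # ramps 0 -> 1 across the band
            out[y][x] = weight * pred0[y][x] + (1 - weight) * pred1[y][x]
    return out

pred0 = [[100] * 8 for _ in range(8)]  # e.g. a foreground prediction
pred1 = [[0] * 8 for _ in range(8)]    # e.g. a background prediction
blended = gpm_blend(pred0, pred1)
```

Samples far above the diagonal come purely from one source, samples far below from the other, and the narrow band in between mixes them, which is what avoids a hard, blocky edge along the object boundary.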
Another important tool is CIIP (Combined Intra Inter Prediction), which merges intra-frame and inter-frame prediction. This allows for efficient processing of areas where both methods complement each other, reducing artifacts and improving detail.
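The combination itself is a simple weighted average; VVC selects the intra weight (1, 2 or 3 out of 4) from the intra/inter status of the neighboring blocks, which is abstracted into parameters in this sketch:

```python
def ciip(intra_pred, inter_pred, wt_intra=1, wt_inter=3):
    """Weighted average of intra and inter prediction blocks, with
    rounding, as an integer right shift by 2 (weights sum to 4)."""
    return [[(wt_intra * a + wt_inter * b + 2) >> 2
             for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(intra_pred, inter_pred)]
```

For example, with the default 1:3 weighting a sample predicted as 80 by intra and 40 by inter blends to 50.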
For handling complex dynamic scenes, BDOF (Bi-Directional Optical Flow) technology is used, applying optical flow to refine bi-directional prediction. This results in smoother motion and improved video quality in dynamic scenes. BCW (Bi-prediction with CU-level Weights) is another motion compensation technique in VVC that improves inter-frame prediction: instead of a fixed 50/50 average, it combines the predictions from the two reference frames using an explicitly signaled, block-level weight.
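The BCW averaging rule is compact enough to state directly: P = ((8 - w) * P0 + w * P1 + 4) >> 3, where w = 4 reproduces the plain average. The candidate weight set below is quoted from memory and should be checked against the specification:

```python
BCW_WEIGHTS = (-2, 3, 4, 5, 10)  # signaled weight candidates, in eighths
                                 # (from memory; verify against the VVC spec)

def bcw_bipred(pred0, pred1, w):
    """Bi-prediction with a CU-level weight in eighths;
    w = 4 is the ordinary 50/50 average."""
    return [[((8 - w) * a + w * b + 4) >> 3 for a, b in zip(r0, r1)]
            for r0, r1 in zip(pred0, pred1)]
```

Negative and greater-than-8 weights let the decoder extrapolate past the two references, which helps when brightness changes monotonically between them.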
Before the in-loop filters are applied, Luma Mapping with Chroma Scaling (LMCS) may be used. Applied as part of the decoding process, it maps luma samples onto a new value range and, in some cases, also scales the values of chroma samples accordingly. This enhances brightness and color representation in high-dynamic-range scenes.
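The forward luma mapping is piecewise linear over (in VVC) 16 equal input pieces. A simplified floating-point model builds the mapping LUT from per-piece slopes; the real decoder derives fixed-point slopes from signaled codeword counts:

```python
def build_lmcs_lut(scales, input_max=1024):
    """Forward luma-mapping LUT from per-piece slopes (simplified model
    of VVC's 16-piece piecewise-linear mapping)."""
    piece_len = input_max // len(scales)
    lut, mapped = [], 0.0
    for i in range(input_max):
        lut.append(round(mapped))
        mapped += scales[i // piece_len]  # slope of the piece containing i
    return lut

# all slopes equal to 1.0 gives the identity mapping; slopes above 1.0
# stretch (allocate more codewords to) the corresponding luma range
lut = build_lmcs_lut([1.0] * 16)
```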
VVC has several filters that can enhance image quality. The Deblocking filter smooths block boundaries over a width of up to 14 pixels, compared to 6 pixels in HEVC. Next is Sample Adaptive Offset (SAO), which improves the quality of the decoded image and reduces coding artifacts; this filter is unchanged from HEVC. The last in-loop filter is the Adaptive Loop Filter (ALF), which enhances visual quality and decreases errors by performing adaptive filtering at the pixel level. Cross-Component ALF (CCALF) enables even more precise processing of specific areas of the image; its design was refined with the help of machine learning during standardization. Deblocking, SAO and ALF can be disabled in the MainConcept VVC/H.266 Video Decoder for faster processing. This can be valuable for tasks that prioritize decoding speed over output conformance, such as preview.
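SAO's band-offset mode, for example, splits the sample range into 32 equal bands and adds a signaled offset to samples falling in four consecutive bands. The sketch below covers only that mode (SAO also has an edge-offset mode driven by neighbor comparisons):

```python
def sao_band_offset(samples, band_start, offsets, bit_depth=10):
    """Apply SAO band offsets: the sample range splits into 32 bands and
    the four consecutive bands from `band_start` get additive offsets,
    with the result clipped to the valid sample range."""
    shift = bit_depth - 5  # 32 bands over the full sample range
    out = []
    for s in samples:
        band = s >> shift
        if band_start <= band < band_start + 4:
            s = min(max(s + offsets[band - band_start], 0),
                    (1 << bit_depth) - 1)
        out.append(s)
    return out

# only the sample in bands 3..6 (values 96..223 at 10-bit) is adjusted
filtered = sao_band_offset([100, 500, 1000], 3, [5, 5, 5, 5])
```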
Multi-layer support
VVC uniquely supports multi-layer functionality as part of its core specification. This includes features for efficiently encoding multi-layer video, where one data stream contains multiple independent or related layers. This is especially useful for adaptive streaming, VR, video conferencing and multi-camera systems.
Multi-layer coding makes it possible to provide multiple streams at different levels of performance, quality and bitrate. It is a great solution for streaming video to devices with different capabilities and performance. The MainConcept VVC/H.266 Video Decoder provides an efficient implementation of multi-layer decoding.
In VVC, as in any modern video codec, the interaction between different video layers plays a crucial role. RPR (Reference Picture Resampling) technology adapts the prediction for reference frames with different resolutions. This provides flexibility during decoding and supports resolution changes during playback. The ILRP (Inter-Layer Reference Picture) method utilizes data from various video layers for efficient encoding and decoding. This is especially important in scenarios involving image overlays, adaptive streaming, or multi-camera systems. As a result, data usage efficiency is improved, and playback quality is enhanced.
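Conceptually, RPR means predicting from a reference picture after resampling it to the current resolution. A 1-D bilinear sketch conveys the idea; the real decoder uses longer separable interpolation filters in fixed-point arithmetic:

```python
def resample_row(row, out_len):
    """Bilinear resampling of one reference sample row to a new width,
    a minimal stand-in for RPR's separable interpolation filters."""
    in_len = len(row)
    out = []
    for i in range(out_len):
        # map the output position back into input coordinates
        pos = i * (in_len - 1) / (out_len - 1) if out_len > 1 else 0
        i0 = int(pos)
        i1 = min(i0 + 1, in_len - 1)
        frac = pos - i0
        out.append(row[i0] * (1 - frac) + row[i1] * frac)
    return out
```

Applying the same filter vertically turns this into a full 2-D resample, letting a lower-resolution reference predict a higher-resolution frame after a mid-stream resolution switch.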
MainConcept VVC/H.266 Video Decoder performance
Overall speed
The performance of the MainConcept VVC/H.266 Video Decoder is not significantly different from that of the HEVC implementation (Fig.2). The VVC decoder requires more powerful hardware than the HEVC decoder due to its increased algorithmic complexity. However, efficient parallelism mechanisms such as WPP, slices and tiles enable real-time video playback.
Fig. 2: Decoder performance comparison
The impact of coding tools
The MainConcept VVC/H.266 Video Decoder supports all the main tools of the VVC standard. The graph shows how performance depends on gradually enabling these features in the decoder: most of them have little impact (Fig.3). LMCS and BDOF do not yet have vectorized implementations; adding them could speed up the decoder further in future releases.
Fig. 3: MainConcept VVC/H.266 Video Decoder performance with gradual feature enablement
Vectorization
In the MainConcept VVC/H.266 Video Decoder, many low-level processing functions are vectorized: over 85% of the functions where vectorization is applicable. The decoder makes efficient use of AVX2 technology, which is widely available on modern Intel and AMD processors. On average, this improves performance by up to 2.4 times compared to the plain C version.
Parallel processing
As stated above, the MainConcept VVC/H.266 Video Decoder can process WPP, tiles or slices in parallel. Furthermore, it can process several frames in parallel, which helps performance when a single frame does not offer enough parallelization opportunities. However, this approach slightly increases resource usage, requires internal inter-frame synchronization and increases peak per-frame latency.
Fig. 4: Relative MainConcept VVC/H.266 Video Decoder performance
As illustrated in Fig. 4, to achieve the highest level of parallelism within a frame and better performance of the MainConcept VVC/H.266 Video Decoder, it is recommended to use several tiles combined with WPP. This combination benefits the decoder only if entry-point offsets are enabled.
Another way to improve CPU utilization is to share a thread pool between several MainConcept VVC/H.266 Video Decoder instances. This helps avoid excess thread contention and improves performance in applications where several streams must be decoded in parallel (editing, surveillance, etc.).
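The pattern is easy to demonstrate with Python's standard library: several decoder-like objects submit their per-frame work to one shared executor instead of each spawning its own threads. `ToyDecoder` is a made-up stand-in for illustration; the actual SDK exposes its own pool-sharing mechanism.

```python
from concurrent.futures import ThreadPoolExecutor

class ToyDecoder:
    """Toy stand-in for a decoder instance that borrows a shared thread
    pool instead of creating its own worker threads."""
    def __init__(self, pool):
        self.pool = pool

    def decode(self, frames):
        # submit one task per "frame" to the shared pool and collect results
        futures = [self.pool.submit(lambda n: n * 2, fr) for fr in frames]
        return [f.result() for f in futures]

pool = ThreadPoolExecutor(max_workers=4)        # one pool for all instances
decoders = [ToyDecoder(pool) for _ in range(3)] # e.g. three parallel streams
results = [d.decode([1, 2, 3]) for d in decoders]
pool.shutdown()
```

Capping the total worker count this way keeps three streams from oversubscribing the CPU with three independent sets of threads.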
Conclusion
The MainConcept VVC/H.266 Video Decoder is a highly efficient solution for modern video formats, offering superior quality and flexibility. Its support for most standard features ensures high performance across diverse scenarios. Key features such as parallel processing, advanced prediction technologies and improved post-processing filters ensure high-quality video decoding with efficient resource utilization. The MainConcept VVC Decoder continues to evolve. Expect new developments and refinements at a steady pace.
The MainConcept VVC Decoder, along with the rest of our codecs, is available as a trial download at www.mainconcept.com/vvc-demo. If you have the chance to try it out, let us know what you think.

Vladislav Vyushkov, Alexander Kulish, Nikolay Skleymov
Vladislav Vyushkov
Vladislav is a Senior Software Engineer. He is a graduate of Tomsk State University of Control Systems and Radioelectronics with a degree in Software Engineering. He is currently working on improving the performance of the VVC decoder.
Alexander Kulish
Alexander is a Staff Software Engineer specializing in the development and optimization of video codecs since 2008. He is a graduate of Tomsk State University of Control Systems and Radioelectronics with a degree in Software Engineering. He is currently working on improving the performance of the VVC decoder.
Nikolay Skleymov
Nikolay is a Staff Software Engineer. He has been working at MainConcept for over a decade, developing codecs and multimedia solutions. He graduated from Tomsk State University of Control Systems and Radioelectronics with a degree in Software Engineering. He is currently working on improving the performance of the VVC decoder.