Jens Schneider & Max Bläser | Jun 10, 2025 | 7 min read

Encoder performance tuning with Optuna

TL;DR

Encoding parameters can be tuned efficiently with the optimization framework Optuna. Even its introductory paper mentions Optuna’s capability to find almost optimal parameters for FFmpeg-based encoding. In this post, we demonstrate how to set up an Optuna optimization for the MainConcept HEVC/H.265 Encoder. This can serve as a starting point to tune parameters for specific content or to optimize for specific video quality measures. Such tuning can yield higher quality at the same throughput, or higher throughput at the same quality.

Introduction

For a given bitrate, Constant Rate Factor (CRF) or Quantization Parameter (QP), encoder settings determine the trade-off between computational complexity and the resulting video quality. These encoder settings may, for example, encompass the on/off switching of specific coding tools and their individual parameterization. Modern codecs such as HEVC or VVC typically have many settings, so the number of possible combinations, or operating points, of an encoder grows exponentially with the number of settings. Encoder implementations typically come with predefined parameter sets, e.g., the presets of x264/x265 or the performance levels of MainConcept encoders. These default parameter sets are obtained from simulations performed on a wide variety of video content, resulting in a good match for general-purpose applications and arbitrary video content. It is important to note that additional constraints may be imposed on the use of specific encoder settings in real-world applications. This can occur when encoding for a specific format or when targeting specific playback devices that define the capabilities and limits of a decoder (or encoder). However, these constraints are specific to the encoding scenario and lie beyond the scope of this article.

For content with specific characteristics, e.g., surveillance footage, cartoons or other animated content, the default encoding parameters may no longer represent the optimal operating points. In this case, it can be beneficial to tune the encoder parameters to the characteristics of the content. Manual tuning is labor-intensive and requires deep expertise, whereas Optuna offers an automated and scalable alternative. Optuna runs on a compute cluster or locally with minimal setup and finds a Pareto front of optimal encoding parameters at comparatively low cost. In this post, we demonstrate a straightforward local optimization for animated content.

The optimization logic

Assuming that Python and the other dependencies, such as Pandas, Seaborn, Matplotlib (required later for visualization) and, of course, Optuna are installed, we begin by importing the required modules, including utilities for parsing encoder logs. We then define a very basic encoding function that runs the MainConcept sample HEVC encoder with fixed settings, such as resolution, color space and a constant rate factor (CRF) of 36. The input sequence is a scene from the BigBuckBunny test video, used here to demonstrate the effectiveness of optimization for animated video content. In addition, the encoding function accepts a list of extra settings. These are the settings that Optuna will vary to gather performance points of the encoder. The function returns a tuple of the measured encoding speed in frames per second (fps) and the achieved video quality as peak signal-to-noise ratio (PSNR). While PSNR is used here for simplicity, other measures such as Netflix’s Video Multimethod Assessment Fusion (VMAF) could easily be substituted.

import optuna
import os
import subprocess
import re
import pickle
from typing import List


def encode(settings: List[List[str]]) -> tuple[float, float]:
    # run one encode with the given extra settings and return (fps, psnr)
    cmd = ["./sample_enc_hevc"]
    cmd += ["-I420"]
    cmd += ["-v", "1080pBigBuckBunnyScene10_1920x1080_25_8_yuv420p.yuv"]
    cmd += ["-w", "1920"]
    cmd += ["-h", "1080"]
    cmd += ["-f", "25"]
    cmd += ["-bit_rate_mode", "4"]
    cmd += ["-rate_factor", "36"]
    cmd += ["-quality_metric", "1"]
    cmd += ["-o", os.devnull]
    for s in settings:
        cmd += s

    result: subprocess.CompletedProcess = subprocess.run(
        args=cmd, capture_output=True, check=False, timeout=600
    )

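    stdout = result.stdout.decode()

    # raw strings avoid invalid escape sequences; "\." matches a literal decimal point
    fps_match = re.search(r"Average speed achieved\s+(\d+\.\d+)\s+fps", stdout)
    psnr_match = re.search(r"Overall PSNR \(A\)\s+(\d+\.\d+)\s+dB", stdout)
    if fps_match is None or psnr_match is None:
        raise RuntimeError("could not parse fps/PSNR from the encoder output")
    fps = float(fps_match.group(1))
    psnr = float(psnr_match.group(1))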

    return fps, psnr
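
The log parsing step can be tried out without running the encoder. The following is a minimal sketch, assuming the log contains the two lines matched by the regular expressions above; the helper name parse_metrics and the sample log excerpt are made up for illustration and are not actual encoder output.

```python
import re

# Patterns mirror the two log lines parsed in encode(); raw strings and an
# escaped dot keep the regular expressions exact.
FPS_PATTERN = re.compile(r"Average speed achieved\s+(\d+\.\d+)\s+fps")
PSNR_PATTERN = re.compile(r"Overall PSNR \(A\)\s+(\d+\.\d+)\s+dB")


def parse_metrics(log_text: str) -> tuple[float, float]:
    """Return (fps, psnr), or raise ValueError if a metric is missing."""
    fps_match = FPS_PATTERN.search(log_text)
    psnr_match = PSNR_PATTERN.search(log_text)
    if fps_match is None or psnr_match is None:
        raise ValueError("encoder log does not contain fps and PSNR lines")
    return float(fps_match.group(1)), float(psnr_match.group(1))


# Hypothetical log excerpt, made up for illustration only.
sample_log = (
    "Average speed achieved    151.32 fps\n"
    "Overall PSNR (A)          42.81 dB\n"
)
print(parse_metrics(sample_log))  # (151.32, 42.81)
```

Raising an explicit error for an unparsable log makes failed encodes visible instead of surfacing later as an attribute error on None. If a long study should continue despite occasional failures, the catch argument of study.optimize can be used to tolerate specific exception types.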

With the encoding function defined, we can measure the baseline performance of the default performance levels that come with the encoder. The snippet below calls the encoding function for all performance levels from 10 to 20 and stores the results in a file named default_perf_curve.pickle. Note that the performance level of the MainConcept HEVC/H.265 Encoder can be configured to any value between 0 (fastest, but lower quality) and 31 (slowest, but best quality). The range from 10 to 20 is centered on the default performance level of 15, which provides a balanced trade-off between speed and quality.

# one settings list per performance level, e.g. [["-perf", "10"]]
perf_level_settings = [[["-perf", str(x)]] for x in range(10, 21)]
default_perf_curve = [encode(s) for s in perf_level_settings]
with open('default_perf_curve.pickle', 'wb') as f:
    pickle.dump(default_perf_curve, f)

After collecting this reference performance curve, we proceed to set up the Optuna optimization. First, we specify which parameters should be varied and provide their value ranges. The MainConcept HEVC/H.265 Encoder supports a wide selection of tunable options, but for the sake of clarity in this post, we have deliberately kept the set of chosen parameter options small. Next, we define a wrapper function for Optuna that accepts an Optuna trial as input, draws one value per parameter from it and passes the resulting settings to the encoding function.

parameter_options = {
    "log2_min_cu_size": [3, 4],
    "inter_partitioning": [0, 1, 2],
    "motion_search_precision": [0, 2, 3, 4, 5, 6],
    "sign_data_hiding": [0, 1],
    "fast_type_decision": [0, 1, 2],
    "fast_me_skip_decision": [0, 1, 2],
}

def encode_optuna(trial: optuna.Trial) -> tuple[float, float]:
    # generate the settings to try in this trial
    settings = {
        k: trial.suggest_categorical(k, v)
        for k, v in parameter_options.items()
    }
    settings_list = [["-" + k, str(v)] for k, v in settings.items()]
    return encode(settings_list)
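
To get a feeling for the size of this search space, the combinations can simply be counted. The sketch below repeats parameter_options from above and enumerates every combination in the command-line format that encode() expects; the name all_settings is introduced here for illustration only.

```python
import itertools
import math

# Same search space as parameter_options in the post.
parameter_options = {
    "log2_min_cu_size": [3, 4],
    "inter_partitioning": [0, 1, 2],
    "motion_search_precision": [0, 2, 3, 4, 5, 6],
    "sign_data_hiding": [0, 1],
    "fast_type_decision": [0, 1, 2],
    "fast_me_skip_decision": [0, 1, 2],
}

# The number of combinations is the product of the option counts.
n_combinations = math.prod(len(v) for v in parameter_options.values())
print(n_combinations)  # 648

# Every combination, formatted like the settings list built in encode_optuna().
all_settings = [
    [["-" + k, str(v)] for k, v in zip(parameter_options, combo)]
    for combo in itertools.product(*parameter_options.values())
]
print(all_settings[0][:2])  # [['-log2_min_cu_size', '3'], ['-inter_partitioning', '0']]
```

With only six parameters, an exhaustive search over 648 encodes would still be feasible. Each additional parameter multiplies that number, however, which is where a sampling-based optimizer becomes the more practical choice.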

We can then create an Optuna study and launch the optimization. The keyword argument directions tells Optuna that both objectives, the encoding speed in frames per second (fps) and the PSNR, should be maximized. The storage argument ensures that the results are stored in an SQLite database. Note that the optimization can take some time when encoding longer video sequences or even sets of sequences. This is a one-time process that can significantly improve long-term performance. If you, for example, optimize for improved throughput (fps) at the same quality, this process will reduce your resource requirements (encoding time or number of encoding instances) and ultimately reduce costs.

study = optuna.create_study(
    directions=["maximize", "maximize"],
    study_name="optimize_enc_hevc",
    storage="sqlite:///optimize_enc_hevc.db",
    load_if_exists=True,
)

study.optimize(encode_optuna, n_trials=500)

Analysis of the study

Once optimization is complete, we can analyze the results using a few lines of Python code. After importing Pandas, Seaborn and Matplotlib, we load the baseline performance curve from the pickle file that was stored previously and extract the best trials from the Optuna study. Note that the best_trials attribute of an Optuna study contains only the Pareto-optimal trials, i.e., those trials for which no other trial is at least as good in both speed and quality and strictly better in one of them. Other trials exist, but are excluded from this analysis, as we focus only on the optimal points.
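
To make the notion of Pareto optimality concrete, here is a minimal pure-Python sketch of such a filter for two objectives that are both maximized. It is independent of Optuna and is not how Optuna computes best_trials internally; the function names and the (fps, psnr) values are made up for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (fps, psnr), both to be maximized


def dominates(a: Point, b: Point) -> bool:
    """True if a is at least as good as b in both objectives and differs from b."""
    return a[0] >= b[0] and a[1] >= b[1] and a != b


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (fps, psnr) measurements, made up for illustration only.
trials = [(100.0, 43.0), (150.0, 42.5), (140.0, 42.4), (200.0, 42.0), (90.0, 42.9)]
print(pareto_front(trials))  # [(100.0, 43.0), (150.0, 42.5), (200.0, 42.0)]
```

In this example, (140.0, 42.4) is dropped because (150.0, 42.5) is both faster and of higher quality, and (90.0, 42.9) is dropped because (100.0, 43.0) beats it in both respects. The actual analysis relies on Optuna's own best_trials and proceeds as follows.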

import optuna
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# read previously saved pickle as a pandas dataframe
default_performance_df = pd.DataFrame.from_records(pd.read_pickle("default_perf_curve.pickle"), columns=["fps", "psnr"])

# read the Pareto-optimal performance points that Optuna identified
study = optuna.load_study(study_name="optimize_enc_hevc", storage="sqlite:///optimize_enc_hevc.db")
best_trials_df = pd.DataFrame([t.values for t in study.best_trials], columns=["fps", "psnr"])

# visualize the default performance curve and the best points found by optuna
sns.lineplot(default_performance_df, x="fps", y="psnr", marker="o", color="r", label="default performance")
sns.scatterplot(best_trials_df, x="fps", y="psnr", label="optimized performance")
plt.grid()
plt.legend()
plt.show(block=False)

The resulting plot clearly demonstrates that Optuna finds parameter sets offering higher quality at the same encoding speed compared to the default performance curve. Additionally, Optuna finds some performance points with even higher quality but lower speed (top-left corner of the plot). These points cannot be compared directly in this experiment, as the reference curve does not extend into that speed range.

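[Figure: PSNR (dB) over encoding speed (fps) for the default performance curve and the optimized performance points found by Optuna]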

Viewed differently, Optuna's optimization can deliver faster encoding at the same level of quality. For example, at 42.8 dB PSNR, the encoding speed improves by more than 25%, or 40 fps in absolute terms.
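
A speedup at equal quality can be read off the two curves by interpolating each of them at the same PSNR. The following sketch shows one way to do this with linear interpolation. The helper fps_at_psnr and all numbers are hypothetical and chosen for easy arithmetic; they are not the measurements from this study.

```python
from typing import List, Tuple


def fps_at_psnr(curve: List[Tuple[float, float]], target_psnr: float) -> float:
    """Linearly interpolate the fps that a (fps, psnr) curve reaches at target_psnr."""
    pts = sorted(curve, key=lambda p: p[1])  # ascending PSNR
    for (fps_lo, psnr_lo), (fps_hi, psnr_hi) in zip(pts, pts[1:]):
        if psnr_lo <= target_psnr <= psnr_hi:
            if psnr_hi == psnr_lo:  # equal quality: take the faster point
                return max(fps_lo, fps_hi)
            weight = (target_psnr - psnr_lo) / (psnr_hi - psnr_lo)
            return fps_lo + weight * (fps_hi - fps_lo)
    raise ValueError("target PSNR lies outside the measured range")


# Hypothetical (fps, psnr) points, made up for illustration only.
default_curve = [(200.0, 42.0), (150.0, 43.0)]
optimized_curve = [(250.0, 42.0), (200.0, 43.0)]

base = fps_at_psnr(default_curve, 42.5)    # 175.0 fps
tuned = fps_at_psnr(optimized_curve, 42.5)  # 225.0 fps
speedup = tuned / base - 1.0
print(f"{tuned - base:.1f} fps faster ({speedup:.1%})")  # 50.0 fps faster (28.6%)
```

Linear interpolation is only an approximation between measured points, so for a quantitative claim it is safer to compare at PSNR values where both curves have measurements close by.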

In summary, this performance curve visualization gives an intuitive picture of the speedup that optimized settings could bring to your video encoding.

Conclusion

We have demonstrated how encoding parameters can be effectively tuned using Optuna. Adapting the encoder parameters to the characteristics of the encoded content can be accomplished easily with the Optuna optimization framework and a reasonable amount of computational resources. Moreover, the results of the example study reveal the potential benefits in terms of encoding speed. Substantial improvements to existing encoding pipelines could be realized with the methodology described above, leading to real-world cost savings. Thanks to the simplicity of the Optuna approach, the optimization objectives can also be exchanged easily, for example to maximize encoding speed and VMAF-based video quality.

In conclusion, Optuna serves as a powerful optimization tool, not only as a hyperparameter tuner for machine learning, as presented in the Optuna documentation, but also for video encoding workflows.

Jens Schneider & Max Bläser

Senior Video Coding Research and Development Engineers

Max focuses on applying machine learning algorithms to encoders and video applications. He previously worked for Germany’s largest private streaming and broadcast company where he co-developed large-scale transcoding systems for VOD streaming. He received the Dipl.-Ing. in electrical and communications engineering and Dr.-Ing. degrees from RWTH Aachen University. During his doctoral studies at the Institut für Nachrichtentechnik (IENT), Max performed research in the area of video coding and actively contributed to the standardization of Versatile Video Coding (VVC).

Jens received his Dr.-Ing. degree in communications engineering from RWTH Aachen University in 2021. His research focused on the link between machine learning and low-level coding tools in combination with higher level video coding concepts such as dynamic resolution coding. Jens joined MainConcept after working as a Software Engineer in the cloud native landscape. In his role at MainConcept, he is currently working on machine learning-based encoder optimizations and cloud native simulation setups.
