Interview with Benjamin Bross and Adam Wieckowski from HHI on VVC and Video Compression

Editor’s note: OTTVerse is pleased to present this free-flowing interview with Benjamin Bross and Adam Wieckowski from Fraunhofer HHI. Do not miss this interview as they talk about the Heinrich Hertz Institute, what goes on at the HHI, the VVC codec, VVenC, VVdeC, and share some thoughts on complexity as a driving factor for designing video compression tools. For a quick intro to the new codecs released by MPEG, check out OTTVerse’s review of VVC, EVC, and LCEVC.

If you want to contact them for further information, you can reach out to either Benjamin or Adam on LinkedIn or at this webpage.

Speaker Profiles

HHI Benjamin Bross

Benjamin Bross received the Dipl.-Ing. degree in electrical engineering from RWTH Aachen University, Aachen, Germany, in 2008. In 2009, he joined the Fraunhofer Institute for Telecommunications – Heinrich Hertz Institute, Berlin, Germany, where he is currently heading the Video Coding Systems group at the Video Coding & Analytics Department, and in 2011, he became a part-time lecturer at the HTW University of Applied Sciences Berlin. Since 2010, Benjamin has been very actively involved in the ITU-T VCEG | ISO/IEC MPEG video coding standardization processes as a technical contributor, coordinator of core experiments, and chief editor of the High Efficiency Video Coding (HEVC) standard [ITU-T H.265 | ISO/IEC 23008-2] and the emerging Versatile Video Coding (VVC) standard. In addition to his involvement in standardization, Benjamin is coordinating standard-compliant software implementation activities. This includes the development of an HEVC encoder that is currently deployed in broadcast for HD and UHD TV channels. Benjamin Bross is an author or co-author of several fundamental HEVC and VVC-related publications and an author of two book chapters on HEVC and Inter-Picture Prediction Techniques in HEVC. He received the IEEE Best Paper Award at the 2013 IEEE International Conference on Consumer Electronics – Berlin, the SMPTE Journal Certificate of Merit in 2014, and an Emmy Award at the 69th Engineering Emmy Awards in 2017 as part of the Joint Collaborative Team on Video Coding for its development of HEVC.

HHI Adam Wieckowski

Adam Wieckowski received the M.Sc. degree in computer engineering from the Technical University of Berlin, Berlin, Germany, in 2014. In 2016, he joined the Fraunhofer Institute for Telecommunications, Heinrich Hertz Institute, Berlin, as a Research Assistant. He worked on the development of the software that later became the test model for VVC development, and he made several technical contributions during the standardization of VVC. Since 2019, he has been a Project Manager coordinating the technical development of decoder and encoder solutions for the VVC standard.

Transcript of the Interview

Krishna Rao
Hi, everyone. Welcome to today’s discussion. I have two very special guests with me. And before I introduce them, let me give you a quick introduction. I’m Krishna, and I’m the Founder and Editor of OTTVerse. We are a magazine dedicated to video streaming; we talk about everything in the video pipeline, starting from compression, packaging, ad insertion, DRM, delivery, playout … the whole pipeline. So, we have interviews; we have deep dives, a lot of tech news. So, do check us out on ottverse.com.

Before I introduce my guests, a bit of background on why I chose this topic. So I have done a bit of compression myself: I did research, I did codec development, and whenever I would do a literature review, one name would pop up very often, and as you might have guessed, that’s the Fraunhofer HHI, the Heinrich Hertz Institute.

So I decided, let’s kick it off with a group which has been quite involved in all the codecs that we’re familiar with: H.264, HEVC, and now VVC. So we’ll talk more about the institute itself: what is it? What drives it? How does the decision-making process take place? And their thoughts on VVC; they have an implementation of VVC, which is open source. Plus a few philosophical questions, I guess, on compression.

Let me introduce our guests. We have Benjamin Bross, who’s a project manager in the image processing department of the HHI. Hey, Benjamin, thank you for joining. And we have Adam Wieckowski (I know I messed that up, Adam, sorry for that!), who is a research assistant at the HHI. So let me hand it over to both of you for quick introductions. Benjamin, you first?

Benjamin
Yeah. Thanks, Krishna, for inviting us, it’s a real pleasure being here. So my role is head of the Video Coding Systems group at Fraunhofer HHI, and in that function, I was the editor of the HEVC standard text specification, as well as of the Versatile Video Coding (VVC) specification text. And also, as the whole group, or even our department, where there are some more groups involved in that, we contributed technically to the standards by contributing compression algorithms.

Also, we are active in administrative support. So we host the repositories of the reference software, and Karsten is the software coordinator for the reference software. And, yeah, my role is heading the group that also implements the VVC standard, as well as coordinating and being the editor in the standardization activity itself.

Krishna Rao
When you said editor of the entire spec, my first question was, has anybody read the entire spec cover to cover?

Benjamin
You’ll be surprised. Yes, since we also have a bug tracker, where you can report bugs. And we get a lot of feedback from the people working at companies implementing the standards. And they find places where you think, wow, people really look at each part and each text and each bullet of that specification, in order to make their products compliant. That’s really amazing.

Krishna Rao
That’s wonderful. Because I’m usually stuck on the first page, where it’s like, this is what ‘must’ means, this is what ‘shall’ means, and I was like, oh God, now I have to keep all of that in mind when reading this spec.

Benjamin
When I started, I guess the basic structure has been the same since H.264/AVC. So if you went through the pain of familiarizing yourself with the way the spec is written, it’s not changing in the newer specs. So once you did that for H.264, you can use that knowledge to read H.265/HEVC, as well as H.266/VVC.

Krishna Rao
Adam — your introduction.

Adam
So first, hey, Krishna, thanks for the opportunity to be here. So you got my name almost right. That’s great. So I’ve been at HHI for around five years. I started just as we were starting to work towards VVC standardization and, you know, contributed here and there to some aspects of the standard. Also, I was involved with the software. So we’ve developed our own software, which then, you know, became the basis for the VVC reference software. And for the last two years, I’ve been doing the technical coordination for the development of our encoding and decoding software, VVenC and VVdeC, which I guess we want to discuss a little bit more later, right?

Krishna Rao
Yeah. So we have some questions on that. Okay, thanks for those introductions. I guess my first question, Benjamin, to you is: what is the HHI? What is Fraunhofer? What’s the relationship? What do you actually do there? Is it like a company? Is it a research institute? If you can give us a little more insight into what happens at the HHI?

Benjamin
Yeah, that’s a very, very good question. Because, yeah, that’s always the question: What are you? Are you an institute of a university? Are you a public institute? Are you like a government agency? Are you a company? And the answer is a little bit of everything. Our institute is called the Heinrich Hertz Institute, and it’s all about telecommunications. And this shows you that there are other institutes within the Fraunhofer Society that have a different focus. So if we go one level up, to the Fraunhofer Society, what is that? So the Fraunhofer Society, or Gesellschaft as it’s called in German, is the leading organization for applied research in Europe. And so it’s a research organization, and, don’t pin me down on the exact number, at some point in time it was 72 institutes and research units in Germany. And all these research institutes and units have cooperations around the world.

The staff members, just to give you some numbers about the size of that: at that point in time, and of course it’s changing, it was 26,600 staff members. So it’s kind of a huge organization. At that time, the budget was 2.6 billion euros. And from that research, some highlights were the mp3 audio codec, which was from the Fraunhofer IIS institute in the south of Germany, in Erlangen, and, as you already mentioned, the H.264, H.265, and VVC video codecs. We also did, together with the IIS, contributions to LTE and now 5G mobile communications. This is also where our institute is involved. And yeah, the research focus of the Fraunhofer institutes ranges from health and environment over life science, mobility, transport, energy, and production to safety and IT security. So there’s a broad range of research fields within Fraunhofer.

That is targeted at supporting companies and industry, as well as the service sector and public administrations, with research and development services. So if, for example, a public administration has a problem in some of our areas of expertise, but they don’t have an R&D department, they say, well, we have that problem, could you take a look and provide us a solution for that? And that’s what we do. And for these kinds of projects, we also get support from our German government, which is a way for them to support these kinds of R&D services that we offer.

Krishna Rao
That’s very interesting. So are there academic roles as well? I mean, I’ve seen positions, I’m not sure it was right at HHI, but PhD positions, and, like, Adam is a research assistant. So is there active academic work also going on, where somebody could come in as a master’s student, do a PhD, publish, and go into the industry, something like that?

Benjamin
Exactly. That’s the other important pillar of Fraunhofer. Typically, the head of the institute and also the heads of departments have a chair at the university. So they are also professors, and having that close relationship to academia, they can also tutor master’s students, bachelor’s students, and PhD students. And that’s why, with the applied research we are doing here, we also offer students the chance to do their bachelor’s, master’s, and PhD theses in cooperation with my colleagues who are also professors at the university.

So this is where we have close ties to academia. And of course, we publish our research. One very nice thing that I learned recently is that when we publish something, Fraunhofer also gives funding to make that open access. So for everything that we publish, we try as best we can to make it open access, so people can just read it without subscribing to expensive journals.

Krishna Rao
Yeah, that’s pretty important. Because I remember clearly, when I was in university, it was very easy for us to download papers. And that happens in a few companies, but not everybody can afford the costs to actually download these journal papers. So it is quite cost prohibitive I would say – especially when you don’t know that’s the right paper.

Benjamin
Yeah, and also if you want to share something: you send the link to somebody outside the organization, and they say, “Well, it asks me to pay like $200,” or whatever.

Krishna Rao
That’s true. Yeah. But that’s a great effort, actually, to make it open access; that would help a lot. I guess this is kind of a segue into something we spoke about the other day, where you mentioned that somebody has to pay for all your lights to stay on, while being a good research institute.

I mean, everybody who has done research knows that not every idea is going to work out. I mean, you might think that you’re on the right track, and at the end of the day, somebody else has already invented it, or it’s giving you the exact opposite results. So how do you as an institute prioritize? I mean, for a company, it’s pretty easy. So we say that, hey, this codec has to run in real time, so I might sacrifice a few tools here and there and turn off some knobs. But how do you approach it? Do you see a payoff? Or is it just research for the sake of research? What goes on in the back of your mind when someone says, “Hey, I want to improve CABAC!” What do you do then?

Benjamin
I guess this also highly depends on who you’re asking, right? If you ask the very technical colleagues with a clear research focus, they’d be like, “Well, I’m super psyched by these techniques, and I really want to push the compression capability and investigate more what’s behind that.” And then there are other colleagues caring that what you do should have an impact, meaning being used in standards and devices out there in the world. And we are, yeah, in the very favorable position that we can do both.

So with the success we had with previous standards, the revenue that we get from there we can reinvest, and, because we are a public research institute, not in shareholders but in our new ideas. The whole system is pretty complicated, but the bottom line is: whatever we get from our past success, we can reinvest into research without having a clear, narrow focus on commercialization. So we can just explore this direction, we can explore that direction. And even in the past, since you mentioned CABAC, there were a lot of people at that time saying, “Are you crazy? This won’t run on a computer, you have to have, like, super-fast hardware, it won’t run, it’s super complicated.”

So at that time, there were two ways to do entropy coding. One is a pretty simple one, based on variable-length codes, with adaptive codebook switching, and so on. And the other one is CABAC. And now, CABAC runs on your netbook; basically, on your phone, it’s running. So this is how it can change. But at that time, if somebody had said, “The resources you need for that, that’s commercially complete nonsense, stop that,” we wouldn’t have that very efficient entropy coding method.

And that’s why we have the freedom to explore what’s beyond. And that’s what we currently do, in cooperation with my colleagues in the other research groups, so they can look into video coding structures beyond what we do now. Even in VVC, we started with having a complex neural network for some prediction modes. And that was too complex, so during the course of standardization, we simplified it to be a matrix multiplication. But we never would have come to that place, to have a very efficient matrix multiplication, if we hadn’t played with super complex networks before. So that freedom that we have also allowed us to investigate directions where not everyone would go.
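Editor’s note: to give a rough feel for why an adaptive arithmetic coder can be so much more efficient than coding each symbol with its own codeword, here is a minimal sketch. It is not the CABAC algorithm: the count-based probability model, the function names, and the skewed test sequence are all assumptions made for illustration, and the arithmetic coder is represented only by its ideal cost of -log2(p) bits per symbol.

```python
import math


def fixed_code_bits(symbols):
    """Cost when every binary symbol is written with its own codeword.

    A codeword cannot be shorter than one bit, so this is the floor for
    any scheme that codes binary symbols one at a time with whole bits.
    """
    return len(symbols)


def adaptive_arithmetic_bits(symbols):
    """Ideal cost of an adaptive binary arithmetic coder.

    Each symbol costs -log2(p), where p is the probability the model
    currently assigns to it. The model is a plain count-based estimator
    (an illustrative assumption, not the CABAC state machine).
    """
    counts = [1, 1]  # one pseudo-observation of each value to start
    total = 0.0
    for s in symbols:
        p = counts[s] / (counts[0] + counts[1])
        total += -math.log2(p)
        counts[s] += 1  # adapt the model after coding the symbol
    return total


# Hypothetical, heavily skewed source: 95 zeros and 5 ones.
symbols = [1 if i % 20 == 19 else 0 for i in range(100)]

print(fixed_code_bits(symbols))
print(round(adaptive_arithmetic_bits(symbols), 1))
```

On this made-up sequence, the adaptive model needs roughly a third of the bits that one-bit-per-symbol coding needs, because it learns that zeros are very likely and charges only a fraction of a bit for each of them. The price is the extra arithmetic per symbol, which is the complexity concern Benjamin describes.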

Krishna Rao
How much pushback did you face from chip manufacturers during the standardization?

Benjamin
The discussions can get quite heated. But at the end, I guess, you’re having a beer at the bar. Everyone’s just doing his job, right? So we want to get our great compression ideas in, yeah. And the chip manufacturers have to make sure that it’s implementable and that the chip does not cost like 30 euros for low-cost devices. So that’s the challenge. And that’s the beauty of international standardization: everyone involved from the ecosystem is there when the technical aspects of the standard are being developed.

So we bring our crazy ideas, and then they have to somehow bring us back down to earth, saying, well, we need to work on that. And then we work on that. And sometimes it’s frustrating. Sometimes there’s this big discussion, there’s some fighting, but in the end, we come up with something that is implementable and efficient. And that’s the beauty of that process.

Krishna Rao
Okay, let’s circle back to this question at the end, but probably switch over to Adam at this point. I guess most of the people who are going to be watching this video are quite familiar with the reference implementations HM, JM, and VTM. And reference implementations are, as the name suggests, a reference: no one expects to use them commercially. You use them to see how the tools work, you can use the decoder to check if your stream is conformant, or just to get an idea of the codec and how it’s written.

Some people might use it as a starting point, but they’re not commercially viable. But this time you have taken the effort, and probably the impetus, to build out VVenC and VVdeC and make them open source. So what was the thought process behind all this? And if you can give us a glimpse of the roadmap, where are you going with this? Because I really saw some interesting features like frame-level parallelism and SIMD, and all these are expected in commercial codecs.

So what’s the thought behind VVenC and VVdeC?

Adam
So I think we started with exactly the thought that you just mentioned, that the reference software is not viable for practical use or commercial use. You know, it’s only viable for, like, people who want to cross-check their stuff, or for some very, very early demonstrations. So, you know, we knew that once VVC is done, we want to be able to do some good demonstrations. And we also wanted the standard to be accessible to everyone. And this requires tools to be out there. So, like, a decoder that works, you know, that can play out in real time. And an encoder that can encode the content in, you know, an amount of time that is available to a person who doesn’t have a, you know, high-performance cluster to encode 10 seconds of video.

Benjamin
Yeah, maybe to give a good comparison here: you’re probably familiar with x265 for HEVC. So this is what we had in mind. People are using that all over the world; if you want to have a freely available HEVC encoder, there’s x265. And that was our idea. And since, like, a VVC equivalent, an x266, is a little bit far down the road, we thought, well, let’s have something directly from the start that is available and accessible to everyone, with the main focus on not making compromises on the efficiency.

Krishna Rao
But when you say don’t make compromises on the efficiency, is it going to have, I mean, it already has different levels, like the ones you use with FFmpeg, like superfast, fast, where it kind of helps both ends of the spectrum. Somebody wants to do “quick and dirty” encoding, they get it out. Somebody wants to reduce their whole home library onto a small disk, they can use the placebo, I mean, not the placebo, but the very, very slow modes.

Benjamin
Yeah, that is a very good point. And yeah, maybe I can share one slide with you. And Adam, you can walk us through that slide, where we measured exactly what you described, the different trade-offs. And I was talking about the slowest setting, the setting where we don’t want to compromise on the efficiency.

Adam
Yeah, so it’s just as you mentioned. So we, you know, we configured a few presets that you can choose, like, depending on how much time you have to encode, or what quality you’re looking for in your encoded video. And the idea behind it is that with the slowest configuration, we always want to match the VTM software. We want to be able to provide, you know, everything that VVC promises, which is basically what the VTM encoder does.

And as you see, even with our implementation, it takes a lot of time. But then we really put a lot of work into implementing speedups. You know, when you implement speedups, you cut corners, right? So, yeah, less testing of modes, stuff like that. So of course, we have drops in the quality. But we really went a long way to ensure that the trade-offs are really, really good. So we ran a lot of tests to see which tools to enable when, and which speedups to activate or deactivate when, so that the points that you get, the presets, are really as fast and as good as they can be. Still, with every new release, as you can also see here, we are able to improve those points. So we still find places where we can do better.

So as you can see, between the 0.1 and 0.2 versions, the trade-offs improved. So basically, if a point is farther left or farther down than another point, it’s better, and for version 0.3, the presets are going to be improved even further. But yeah, you know, we’re putting in the work, and we want to provide those opportunities to use those very optimized settings, you know, as an open-source package.
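Editor’s note: the “farther left or farther down is better” reading of the slide is a Pareto-dominance test. A minimal sketch follows, assuming the horizontal axis is relative encoding time and the vertical axis is bitrate overhead at equal quality (the slide’s axes are not reproduced in this transcript, so that is an assumption). The preset labels and every number are made up for illustration and are not measurements of VVenC.

```python
from typing import NamedTuple


class Preset(NamedTuple):
    name: str
    runtime: float           # relative encoding time, lower is better
    bitrate_overhead: float  # extra bitrate in % at equal quality, lower is better


def dominates(a, b):
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.runtime <= b.runtime and a.bitrate_overhead <= b.bitrate_overhead
    better = a.runtime < b.runtime or a.bitrate_overhead < b.bitrate_overhead
    return no_worse and better


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical operating points for two releases (illustrative values only).
old = [Preset("faster-old", 1.0, 30.0), Preset("medium-old", 8.0, 12.0)]
new = [Preset("faster-new", 0.8, 27.0), Preset("medium-new", 7.0, 11.0)]

front = pareto_front(old + new)
print([p.name for p in front])
```

In this toy data, each new point sits to the left of and below its older counterpart, so only the new points survive on the front. Two presets of the same release do not dominate each other: one is faster, the other is more efficient, which is why several presets are worth offering.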

Krishna Rao
Very nice. So if you can clarify, is this fixed QP, or is there any rate control going on here?

Adam
This is fixed QP, but at least our two-pass rate control provides very, very similar results.

Krishna Rao
This is very, very nice. So how is the implementation going? Are you inviting participation from the public? Or is it more internal to HHI, but being released as open source?

Adam
Yes, so to pick up the question or the idea from before. So, as I mentioned at the beginning, the reference software is actually based on the software that we developed at HHI. So, because we’re so familiar with it, we picked it up for the development. So our implementations are based on the VTM and, in turn, on our previous software. So this allows us to do very fast-paced development, because we’re very familiar with the software basis.

And it’s also very easy to work with. And we’re doing most of the development internally, that is, you know, we have people working on different aspects. And we’re releasing those versions to the public. But in the public space, in the GitHub repository, we are also accepting external contributions. So we already had a few contributions to the decoder and one or two to the encoder, which we then, you know, incorporate into our internal development and basically keep as a part of the, you know, of the distribution.

Krishna Rao
That’s very encouraging. Because, I mean, like you mentioned, people need something to play with and to get used to the codec; otherwise these codec comparisons are endless: this is better than that.

Benjamin
So one other aspect there: what we see is that we provide and make public the basic functionality that you need, for example, if you want to do cloud-based encoding or if you want to do HDR video, and we added tools for screen content coding. But then there are use cases beyond that, for example, 360, VR, and AR video, and there are a lot of people out there who want to play with that.

So if they want to do that, and want to rely on an efficient implementation, they can use VVenC as a basis for their extensions. Or let’s take a scalable extension: if you want to extend that with scalable coding, that’s a good basis to do that. And if they bring part of it back, even better, right? That’s the idea.

But the software license is such that you can use it in your product, and you do not need to disclose your source code if you use it in your product. So we really made it very simple for people to just use it, to deploy VVC as a standard.

Krishna Rao
That’s very nice actually.

Benjamin
What we already get is feedback from the community, and that is super nice, also for us and everyone working on that, to see fixes, suggestions, improvements, and merge requests. So people are already starting to work with that and giving back fixes and improvements, and that’s really nice to see.

Krishna Rao
It’s very cool. So is there anything new that is coming up in the roadmap, if you’re allowed to reveal it? Any exciting feature that we can look forward to?

Adam
Should I take this one? Yeah. So yes, there are many new and exciting features that we’re working towards. So as I already mentioned, we’re going to be releasing even better preset configurations with some new speedups. So mostly towards the faster and fast presets, those are going to be getting a lot better. But also in the slower and slow part, there are going to be some minor improvements. Multithreading, for me it’s the feature for 0.3, is going to be way, way better. So if you run it with, like, eight threads, so, you know, something a modern workstation should have, you know, it will run basically two times faster than the previous version with the same number of threads.

So I’m very excited about that. You know, we’re almost there, to start, you know, measuring our encoding times not in relation to, like, VTM or HM, but rather in FPS or something, so I’m excited about that. We’re also going to, you know, every part will have improvements: the rate control, the QP adaptation, those are going to be getting much better. So, like, the subjective quality will get better. And also, there are a few versatile application aspects that are going to be included. But well, let’s not go into details just yet.

Benjamin
But maybe, to add a point on that, because that is something that will also be released soon. And the context of that is one of the versatile features of VVC: it’s reference picture resampling. So that allows you to reference previous pictures that have different resolutions. So that’s key for adaptive streaming, when your resolution changes. And also, we came across your article about open GOP and closed GOP. So you are already thinking a bit about that. And that’s exactly the point that was not covered before. So really, open GOP with resolution change: although open GOP is about, like, 8 to 9% more efficient than closed GOP, you could not use it, since the reference pictures would have a different resolution. And this is a big key feature, that VVC now enables you to resample the reference pictures.

And then you can use them. And there are a lot more of these features that do not show up in these coding efficiency graphs. For example, subpictures: you can chunk spatial regions of your picture into network units that you can already transport. So that enables, first, direct access in the bitstream, some kind of random access within a picture. It’s good for 360 video and also for low latency. So you can already send out packets that contain part of the picture. And I am coming more from the compression, coding-algorithm point of view, but thinking about these versatility features also in terms of latency and transport aspects in the systems layer, that is also a really exciting feature of VVC. So being able to have these sub-portions of a picture encapsulated in a data packet that can be sent out, that’s also an amazing feature.

Krishna Rao
Actually, very interesting. But, I mean, that does bring loss into the question as well, like, what happens if that small subpicture gets lost? So error recovery becomes a very interesting research topic. You could do an entire PhD on that, just recovering the subpicture.

Adam
Just to comment on the topic: still, recovering the subpicture must be much easier than recovering a whole picture, right? That’s true. Think about that.

Krishna Rao
Yeah.

My mind went back to HEVC and interlaced. So are there any surprises with interlaced compression and VVC? Any retrofitting, like we had to do in HEVC? Or are things good?

Benjamin
Yeah, I guess the interlaced game is pretty much played. So nothing new there. But basically, the support that you have in HEVC is maintained. And I’m not sure whether there were one or two things that needed to be corrected in later versions of HEVC.

Krishna Rao
The DVB spec I think … they have to be in the same order in display and decode. So yeah, that kind of a mess.

Benjamin
So this fix is carried over into the first version of VVC. So I guess the

Krishna Rao
People will be happy as long as the same fixes are carried over.

Benjamin
I accept.

Krishna Rao
So I guess the last thing that I want to ask is a little bit more of a philosophical question. As I say, every generation of codec says, hey, we’re going to give you 50% more compression efficiency and 50% more complexity. And then the complexity keeps growing, and compression efficiency obviously grows. So I want to get your thoughts on designing algorithms. I mean, this might not be related to codecs themselves, it might just be more of a research question. But how do you design codecs, keeping encoding complexity as one of the factors?

I can always go from macroblocks to quadtree, increase the complexity, improve the efficiency, but then, I didn’t think even one bit about the complexity, but somebody down the line has to. The guy who programs, Adam, he gets scared, he’s like, hey, how do I program this?

How do I make it run in real time, especially for live compression? And, like I mentioned the other time, there is LCEVC, which is a fundamentally new concept for our industry. So how does complexity factor into research when you propose things at standardization bodies?

Benjamin
So yeah, that’s a very important factor. Also, we, for HEVC, we wrote a paper on complexity, which has one sentence in the introduction saying “Complexity is a very complex topic.” And it gets to the core of that. So it’s hard to generally discuss. It’s very implementation specific, as you say, if you do software implementation, if you ask Adam, what’s his take on complexity, he has like software complexity, running on x85 x86 CPUs in mind.

If you ask guys from Ambarella, or some other chipset manufacturers, they have like a different, more memory bound problems with complexity and so on on hardware, so this is, but if you take the bottom line, it’s okay to be more complex, since we have more computing power, and also we have tools that allows us to deal with the increased search space thats the more options a new codec provides. And there you have to take into consideration two parts of coding towards the first part, our implicit tools, what does that mean? It means that it’s you apply an algorithm and it compresses your source video, and the decoder will do that as well.

So that is a pretty symmetric encoder-decoder, then you have and this is easy. So, this really gives you gain and does not increase the encoder search space. So, this is very nice, and this is also what we have factored into our development. So, these are the tools that we enable first for the fast mode, because they do not require advanced search strategies, because you do not search, then there are tools like motion vector, very simple example. So you need to search for the best motion vector, right. And this can be arbitrary complex until full search where you check every sample position. And there it’s super important to have a smart search strategy that and now we come to the possibilities of the new tools that we have nowadays can be data driven

So you can analyze your input signal your input data. And based on that characteristic, you can already narrow down your search space to a very manageable amount. So although theoretically, you introduce a tool that, as you mentioned, quadtree and whatever kind of tree combinations you want to have exponential growth of product space, that is super complex. However, if you have smart ways to analyze your signal to narrow it down to a reasonable amount of options, you need to check out the encoder is not that bad, and it still provides you the benefit.
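Editor’s note: the difference between a full search and a narrowed-down search can be sketched with a toy block-matching example. This is not how VVenC searches; the synthetic frame, the sum-of-absolute-differences (SAD) cost, the block size, and the predicted vector (2, -2), standing in for the result of some prior signal analysis, are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block at (bx, by) in
    the current picture and the block displaced by (dx, dy) in the reference."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))


def best_vector(cur, ref, bx, by, n, candidates):
    """Return (best vector, its cost, number of candidates checked)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost, checked = None, None, 0
    for dx, dy in candidates:
        # Skip candidates whose reference block would leave the picture.
        if bx + dx < 0 or by + dy < 0 or bx + dx + n > width or by + dy + n > height:
            continue
        cost = sad(cur, ref, bx, by, dx, dy, n)
        checked += 1
        if best_cost is None or cost < best_cost:
            best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost, checked


def window(center, radius):
    """All integer vectors within +/- radius of a center vector."""
    cx, cy = center
    return [(cx + dx, cy + dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)]


def texture(x, y):
    """Synthetic picture content (hypothetical, deterministic)."""
    return (3 * x * x + 5 * y * y + 7 * x * y) % 251


SIZE, TRUE_MV = 32, (3, -2)
ref = [[texture(x, y) for x in range(SIZE)] for y in range(SIZE)]
# The current picture is the reference content displaced by TRUE_MV.
cur = [[texture(x + TRUE_MV[0], y + TRUE_MV[1]) for x in range(SIZE)]
       for y in range(SIZE)]

# Full search: every position in a +/-4 window around the zero vector.
full = best_vector(cur, ref, 12, 12, 8, window((0, 0), 4))
# Narrowed search: a +/-1 window around a predicted vector (assumed given).
narrow = best_vector(cur, ref, 12, 12, 8, window((2, -2), 1))
print(full, narrow)
```

Both searches land on the same vector here, but the full search evaluates 81 candidates and the narrowed one only 9. The whole saving comes from the quality of the prediction: if the predicted vector were far off, the small window would miss the true match, which is the trade-off between speed and efficiency that the presets have to balance.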

So this is key here. And also, in the figure I was just showing, you compare with, for example, x265, the state of the art HEVC codec – fast one, then we can show with the same speed with the same frames per seconds. You can get higher efficiency with VVC, and this is why We have this implementation. So you can take a reference software, you can compare reference software, then it’s like factor x to get the efficiency like 10 times. But this is only part of the picture.

If you have a really optimized implementation, let’s take VVenC, and compare that to its predecessor standard HEVC with some other optimized implementation, x265, then you ask: what can you squeeze out of the VVC standard for the same processing time? And if that’s a compression gain, then, you know, that’s good. Then the last comment, with regard to LCEVC and other schemes that are low-complexity enhancements of a codec: I haven’t dipped deep enough into these techniques to make a comment there.

But certainly it shows that complexity is of concern, right? So people are taking that into account. And, of course, if you do encoding in the cloud, for example, and you need to pay for cloud instances on whatever cloud service you use, they’re not free. Of course, if you have one week of cloud rent versus two weeks of cloud rent, that makes a financial difference. So it also needs to be factored in commercially. So this is still relevant. But with VVenC, we have shown that you can keep most of the VVC gain, as promised by the very slow reference software, in an optimized implementation. So that’s the good news.

Krishna Rao
So that actually probably leads into pre-processing of video playing a very, very crucial role. I guess as codecs get more complex, your pre-processing matters more: even before you encode one frame or one pixel of video, the more work you can do up front probably dictates how much less you might have to do during compression. I mean, just off the bat, if you can detect that it’s a black video, or that the next one is a black frame, just skip it and go ahead. That’s the most naive form, I guess. But do you think that’s a good place for a lot of machine learning and neural networks to be kind of retrofitted in, at the pre-processing stage?

Benjamin
Yeah, that’s a very good point. And I would also pass that ball to Adam to explain a bit what the purpose of VVenC is, because we are more in the core compression domain. With regard to pre- and post-processing, we designed VVenC with specifically these codec-agnostic frameworks in mind, but Adam will maybe elaborate a bit more.

Adam
So again, there are a few points to look into here. We actually have a pre-processing step that we ported from VTM. It’s a bilateral filter that basically enables the codec to make more optimal decisions. And then when you go into machine learning aspects, there is always the factor that, you know, when you apply machine learning on VTM, VTM is so slow that any added complexity doesn’t show up. But if you go into VVenC, like the faster preset, it’s actually pretty fast already. And, you know, inference in machine learning also costs time.

So you know, if you put it in there, you need to factor in your trade-off: how much time you spend, how much time you save, and at what cost with regard to the BD-rate. But still, we are looking into those methods. We also see that with the kind of approach that we’ve been taking, we actually don’t need so much of the pre-processing. Because if there is an easy decision to be taken, the search algorithm that we have will know to take it. We don’t need to pre-process the video to know that the algorithm should take a specific decision. So in the case of a black frame, the cost for just skipping a block will be so good that the algorithm will know not to look any further, you know.
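
Editor’s sketch: the "skip cost is so good that the search stops" behaviour can be illustrated with a toy rate-distortion mode decision. This is not VVenC’s actual logic. The mode names, the bit estimates, the Lagrange multiplier `lam` and the `early_exit_cost` threshold are all hypothetical values picked for illustration.

```python
def ssd(block, prediction):
    """Sum of squared differences, used here as the distortion measure."""
    return sum((a - b) ** 2 for a, b in zip(block, prediction))


def decide_block_mode(block, candidates, lam, early_exit_cost):
    """Pick the mode with the lowest cost J = D + lam * R.

    candidates is an ordered list of (mode_name, predicted_samples, estimated_bits),
    cheapest-to-test mode first. The loop stops as soon as the best cost so far
    is at or below early_exit_cost.
    """
    best_mode, best_cost, evaluated = None, float("inf"), 0
    for mode, prediction, bits in candidates:
        cost = ssd(block, prediction) + lam * bits
        evaluated += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        if best_cost <= early_exit_cost:
            break  # good enough: do not look any further
    return best_mode, best_cost, evaluated


# A black block: skipping (copying the black reference) is already perfect.
black_block = [0] * 16
black_candidates = [
    ("skip", [0] * 16, 1),
    ("inter", [0] * 16, 12),
    ("intra", [0] * 16, 30),
]
black_decision = decide_block_mode(black_block, black_candidates, lam=4.0, early_exit_cost=8.0)

# A textured block: skip is poor, so every candidate gets evaluated.
textured_block = [10 * i for i in range(16)]
textured_candidates = [
    ("skip", [0] * 16, 1),
    ("inter", [10 * i + 1 for i in range(16)], 12),
    ("intra", [10 * i + 3 for i in range(16)], 30),
]
textured_decision = decide_block_mode(textured_block, textured_candidates, lam=4.0, early_exit_cost=8.0)
```

Each call returns `(mode, cost, candidates_evaluated)`. For the black block the loop ends after the first candidate, with no separate black-frame detector involved; for the textured block all three candidates are tested.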

So yeah, that’s the first point. And next, with regard to some more elaborate pre-processing steps: VVenC, as mentioned, is a core codec. We want to be able to provide people with the step in the pipeline that compresses the video. But, of course, there are super interesting things you can do, like another pre-processing step, or some kind of pre-analysis that can also be fed into the encoder, maybe some kind of region-of-interest detection. But this is more of an encoding framework. It goes beyond the core encoder, which is what we want to provide with VVenC.

Krishna Rao
Because, I mean, one thing that I’ve kind of toyed with in my head is: suppose you have the Home Shopping Network, or a similar network, which just plays the same thing 24 hours a day. You could provide hints for the next one second, I mean, just saying, hey, the previous one second used this as the search region and this as the QP. That probably helps in getting a slight bit of a head start instead of searching a lot. So,

Adam
One aspect to mention here: we have our QP adaptation, which basically analyzes the video all the time, also temporally. So it analyzes the temporal activity and adapts the bitrate distribution between the frames and within a frame, to optimally allocate the bitrate for the parts that the eye will see. And basically what we could do is boost this analysis with the information from the previous frames. But for now, the complexity of this algorithm is so low that there’s no point in us doing this right now.
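
Editor’s sketch: activity-based QP adaptation within a frame can be illustrated as follows. This is not the algorithm used in VVenC, and it covers only spatial activity, not the temporal analysis Adam mentions. The activity measure, the `strength` and `max_offset` parameters, and the rule "flat blocks get a lower QP, busy blocks a higher one" are assumptions in the spirit of common adaptive-quantization schemes.

```python
import math


def block_activity(samples):
    """Mean absolute deviation from the block mean, floored at 1.0 so log2 stays defined."""
    mean = sum(samples) / len(samples)
    deviation = sum(abs(s - mean) for s in samples) / len(samples)
    return max(1.0, deviation)


def adapt_qp(base_qp, activities, strength=1.5, max_offset=6):
    """Shift each block's QP by its log-activity relative to the frame average.

    Busy blocks (texture tends to mask errors) get a higher QP, flat blocks a
    lower one, so the offsets redistribute bits around the chosen base QP.
    """
    logs = [math.log2(a) for a in activities]
    mean_log = sum(logs) / len(logs)
    qps = []
    for value in logs:
        offset = round(strength * (value - mean_log))
        offset = max(-max_offset, min(max_offset, offset))  # clamp the offset
        qps.append(base_qp + offset)
    return qps


# Three blocks with low, medium and high activity around a base QP of 32.
block_qps = adapt_qp(32, [4.0, 16.0, 64.0])
flat = block_activity([0, 0, 0, 0])
striped = block_activity([0, 8, 0, 8])
```

The analysis is a handful of additions per sample, which matches the point that this kind of adaptation is cheap compared with the encoder search itself.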

And if we talk about, say, a broadcaster who has a very specific kind of content, then again we’re at the level of the full compression pipeline, which kind of goes beyond the core codec. Even if you would build some hooks into the core codec to respond to feedback from the other pipeline steps, you know, we only have so much manpower, and there are still so many interesting topics in engineering and research within the core codec that this is our focus. We don’t want to lose the focus.

Krishna Rao
No, like you mentioned, if you have kept the software design flexible enough for people to plug in the different use cases, I would say, that is brilliant.

Benjamin
But also, one other thing we would like to see is having VVenC inside multimedia frameworks like FFmpeg, because such a multimedia framework provides a huge set of tools, let’s say MP4 and MKV multiplexing, pre-filtering, the whole filtering pipeline, and to have that there is amazing. So this is what we envision for VVenC: a core encoding, compression component. And everything that you consider post-processing, pre-processing and codec-agnostic improvements, you know, things that you can apply to every video codec, is something I see as a very important factor in such a framework. So I really like these combinations of a powerful framework with core compression components, where I see VVenC as one of them.

Krishna Rao
Fantastic. I guess that brings me to the end of our discussion. Is there anything else you’d like to add? The floor is all yours, Benjamin and Adam.

Benjamin
Adam, do you have something? I guess you mentioned that we are about to release a new version within the next month that gets even a little bit faster. And, yeah, I guess we have some exciting projects that we still need to finalize, so stay tuned. We will hopefully have some more exciting announcements this year.

Krishna Rao
Fantastic. Thank you. Thank you so much for taking the time out of your schedules to join me here. I think this has been very educational for me, and hopefully for a lot of people, on what goes on inside Fraunhofer HHI, what goes on inside product development, and the thought process. It’s very different from just implementing the spec; there is the actual research behind it. So thank you very much. If people have any questions for either Benjamin or Adam, you can either reach out to me or, if it’s okay, I’ll link their LinkedIn profiles so you can reach out to them directly. So until next time, thank you very much. And stay tuned.

Benjamin
Thank you. It was a pleasure. And it really didn’t feel like one hour. Time flies. We just enjoy talking to people who are deeply familiar with the technical stuff.

Adam
Thank you. Thanks.

Krishna Rao
Thanks a lot. Thanks.

About The Author

I’m Krishna Rao Vijayanagar, Ph.D., and I am the Founder and Editor of OTTVerse.com. I've spent several years working hands-on with Video Codecs (AVC, HEVC, MultiView Plus Depth), ABR streaming, and Video Analytics (QoE, Content & Audience, and Ad). I hope to use my experience and love for video streaming to bring you information and insights into the OTT universe. Please use the Contact Page to get in touch with me.
