In this article (part of our video compression series), we take a look at the concepts of Closed GOPs and Open GOPs. Both of these GOP types are very common in video streams and can have an impact on your compression efficiency, stream’s error resilience, and switchability in ABR streaming.
What is a GOP or Group Of Pictures in Video Coding?
A Group of Pictures (GOP), as the name suggests, is a collection of pictures with a well-defined order in which they are to be encoded/decoded and displayed.
Note: If you are new to compression and frame-types, please refer to this article for a simplified explanation of I (IDR/Key) Frames, P-frames, and B-frames.
A GOP or Group of Pictures consists of pictures (frames) of different types. Fundamentally, these pictures are either
- I-frames (intra prediction only),
- P-frames (predicted from a frame that occurs before it in display order), or
- B-frames (predicted from frames that can occur before and after it in display order).
Okay, now with these fundamental picture types, we can construct GOPs (i.e., groups of pictures).
- A GOP generally starts with an I picture and has a sequence of P and B frames following it.
- The distance between two successive I frames is called the size of the GOP or the length of the GOP.
- The distance between successive P frames is called the mini-GOP size.
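To make these two definitions concrete, here is a minimal Python sketch that measures both distances on a string of frame types in display order. The helper name `distances` and the sample pattern are our own illustrative assumptions, not something produced by a particular encoder.

```python
def distances(frame_types, kind):
    """Return the gaps (in frames) between successive frames of the given type."""
    positions = [i for i, t in enumerate(frame_types) if t == kind]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


# Hypothetical pattern: one full GOP plus the I-frame that starts the next GOP.
stream = "IBBPBBPBBPBBI"

gop_size = distances(stream, "I")       # distance between successive I-frames
mini_gop_size = distances(stream, "P")  # distance between successive P-frames

print(gop_size)       # [12]
print(mini_gop_size)  # [3, 3]
```

In this sample, the GOP size is 12 frames and the mini-GOP size is 3 frames (two B-frames between successive P-frames).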
P and B-frames refer to other frames for the purpose of temporal prediction. The frame used as the predictor can be an I-frame, a P-frame, or a reference B-frame. Where multiple predictors are allowed, as in H.264/AVC, it can also be a combination of these three types.
The question then arises: where can the predictor/reference frames be located when analyzed in Display Order? In other words, can a P or a B-frame refer to a frame several GOPs before? It might be the best match, so why not?
Well, to answer these questions, we need to understand the concepts of Open and Closed GOPs. So, let’s move on!
What is a Closed GOP?
As the name suggests, a Closed GOP is closed to outsiders! A frame belonging to a Closed GOP can only refer to frames within its own GOP.
In the image above, the first GOP is ended with a P-frame instead of a B-frame and this allows the encoder to ensure that frames from the next GOP are not used as predictors. As a side note, what do you think will happen if the last frame of the GOP is a B-frame? Isn’t a B-frame designed to refer to frames before and after it?
Well, in this case, the encoder should ensure that the backward predictor (that references a future frame) is empty. You can produce an empty backward predictor and the B-frame now behaves like a P-frame! However, in all likelihood, your encoder will ensure that the last frame is a P-picture, which simplifies the process of “closing” the GOP.
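The "closed to outsiders" rule can be sketched as a simple check. The snippet below is a toy model under stated assumptions: the `Frame` class, the `closed_gops` function, and the hand-written reference lists are hypothetical, frames are listed in display order, the stream starts with an I-frame, and every I-frame starts a new GOP.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    ftype: str                                # "I", "P", or "B"
    refs: list = field(default_factory=list)  # display indices of its predictors


def gop_ids(frames):
    """Number the GOPs: a new GOP starts at every I-frame (display order)."""
    ids, current = [], -1
    for frame in frames:
        if frame.ftype == "I":
            current += 1
        ids.append(current)
    return ids


def closed_gops(frames):
    """For each GOP, report True if no frame in it references another GOP."""
    ids = gop_ids(frames)
    closed = [True] * (ids[-1] + 1)
    for i, frame in enumerate(frames):
        for ref in frame.refs:
            if ids[ref] != ids[i]:
                closed[ids[i]] = False
    return closed


# Two GOPs that each end on a P-frame: every reference stays inside its own GOP.
closed_stream = [
    Frame("I"), Frame("B", [0, 2]), Frame("P", [0]),
    Frame("I"), Frame("B", [3, 5]), Frame("P", [3]),
]

# The first GOP ends on a B-frame (index 3) that looks ahead to the next I-frame.
leaky_stream = [
    Frame("I"), Frame("B", [0, 2]), Frame("P", [0]), Frame("B", [2, 4]),
    Frame("I"), Frame("P", [4]),
]

print(closed_gops(closed_stream))  # [True, True]
print(closed_gops(leaky_stream))   # [False, True]
```

The second stream shows why ending a GOP on a B-frame needs care: unless its backward predictor is left empty, the GOP is no longer closed.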
Closed GOPs are very useful in video streaming and compression. They signify a clean break in the video and ensure that any problems that might occur in that GOP will be confined to that GOP only.
Closed GOPs begin with an I-frame called an IDR or Instantaneous Decoder Refresh. It’s called an IDR because when the decoder encounters an IDR frame, it can flush its picture buffer (Decoded Picture Buffer or DPB) because none of the frames that appeared prior to the IDR can be used as references for the pictures that appear after that IDR. This is a clean break in the sequence.
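Here is a toy illustration of that flush. It is a deliberately simplified model and not how a real H.264/AVC decoder manages its DPB: the function name `decode`, the buffer capacity of 4, and the assumption that every decoded frame is kept as a potential reference are all ours.

```python
from collections import deque


def decode(frame_types, dpb_capacity=4):
    """Walk frames in decode order and record the DPB contents after each one."""
    dpb = deque(maxlen=dpb_capacity)  # the oldest picture drops out when full
    snapshots = []
    for number, ftype in enumerate(frame_types):
        if ftype == "IDR":
            dpb.clear()               # nothing before the IDR can be referenced
        dpb.append(number)
        snapshots.append(list(dpb))
    return snapshots


for snapshot in decode(["IDR", "P", "P", "IDR", "P"]):
    print(snapshot)
# [0]
# [0, 1]
# [0, 1, 2]
# [3]
# [3, 4]
```

After the second IDR (frame 3), frames 0 to 2 are gone from the buffer, so frame 4 has no way to reference them.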
So, what is the use of an IDR or a Closed GOP?
- ABR streaming: as you might know, in ABR streaming the player can switch between different profiles (bitrate-resolution combinations of the video) depending on the bandwidth and the decoder’s buffer fullness. If the player has to switch from 1080p to 360p, then it needs a clean way of switching. The IDR helps here because the player can flush its buffers and make way for the 360p stream to come in. If you are new to ABR, please read our simplified introduction to ABR streaming.
- Error Resilience: if you are streaming using HLS, for example, and every segment starts with an IDR, then frames belonging to a particular segment cannot refer to frames belonging to previous/future segments. So, if you lose a segment due to some error, the player can continue streaming from the next segment onwards. An interesting point to note is that Apple’s HLS Spec mentions that an IDR should be used every two seconds. (Note: the spec does not say that the segment duration should be two seconds; it says that the GOP length should be two seconds.)
- Trick modes: as we mentioned earlier, IDRs are very helpful in implementing trick modes (seeking forward and back). A player only needs to seek to the nearest IDR, and it can begin streaming from that point onwards.
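Two of the points above boil down to simple arithmetic, sketched below. The function names and the IDR timestamps are hypothetical. We also assume that "nearest IDR" means the last IDR at or before the seek target, so that the requested frame can still be decoded; a real player may choose differently.

```python
from bisect import bisect_right


def gop_length_frames(fps, idr_interval_s=2.0):
    """Number of frames between IDRs for a given IDR interval in seconds."""
    return round(fps * idr_interval_s)


def seek_to_idr(idr_times, target_s):
    """Return the timestamp of the last IDR at or before target_s.

    idr_times must be sorted in ascending order.
    """
    index = bisect_right(idr_times, target_s)
    if index == 0:
        raise ValueError("target precedes the first IDR")
    return idr_times[index - 1]


print(gop_length_frames(30))                      # 60
print(seek_to_idr([0.0, 2.0, 4.0, 6.0], 5.3))     # 4.0
```

So at 30 fps, an IDR every two seconds means a GOP length of 60 frames, and a seek to 5.3 s starts decoding from the IDR at 4.0 s.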
Okay, with that understanding of Closed GOPs, let’s move on to Open GOPs.
What is an Open GOP?
An Open GOP is the opposite of a Closed GOP (duh!) and it allows frames from an Open GOP to refer to frames from another GOP. Take the second I-frame in the image below. B-frames from the previous GOP use it as a predictor implying that this is an Open GOP. It is represented by the yellow arrows.
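One consequence of this is visible in the decode order. A B-frame that uses a future frame as a predictor can only be decoded after that future frame, so the B-frames at the end of an Open GOP are decoded after the next I-frame. The sketch below assumes the classic simplified pattern in which each run of B-frames is predicted from the anchors (I or P) on either side of it; the function name and frame labels are our own.

```python
def decode_order(display_order):
    """Reorder frames from display order to decode order.

    B-frames are held back until the next anchor (I or P) has been emitted,
    because that anchor serves as their future predictor.
    """
    out, pending_b = [], []
    for label in display_order:
        if label.startswith("B"):
            pending_b.append(label)
        else:
            out.append(label)
            out.extend(pending_b)
            pending_b.clear()
    out.extend(pending_b)  # trailing B-frames that have no future anchor
    return out


display = ["I0", "B1", "B2", "P3", "B4", "B5", "I6", "B7", "B8", "P9"]
print(decode_order(display))
# ['I0', 'P3', 'B1', 'B2', 'I6', 'B4', 'B5', 'P9', 'B7', 'B8']
```

Here B4 and B5 sit before I6 in display order but are decoded after it, which is exactly the cross-GOP dependency that makes this an Open GOP.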
Open GOPs are particularly useful when:
- you do not need to start a new segment of video for ABR, but still need to end the current GOP with a new I-frame.
- you want to get more compression efficiency, because the B-frames now have access to one more high-quality predictor.
- you need to insert an I-frame (for example, to refresh the video quality) but you are not at a scene change, so it doesn’t really matter if prediction occurs across the I-frame.
I hope this clarified some of your questions on Closed GOPs, Open GOPs, and IDRs. Do check out the rest of our articles on Video Compression! If you want us to cover any topic in video compression, please let us know using the Contact Form.
Thank you and see you next time on OTTVerse.com!