Request Collapsing or Collapse Forwarding is a very important feature in CDNs that protects CDNs and Origin Servers from being overwhelmed by a large number of redundant requests.
In this article, we cover the basics of CDN cache hits and misses, then look at the Thundering Herd / Cache Stampede problem, and end with an intuitive understanding of Request Collapsing or Collapse Forwarding.
What is a Cache Hit & Cache Miss
When you put a CDN between the origin server and the clients, you are essentially adding a caching or storage layer between yourself (the origin) and the clients. The CDN stores popularly requested content and, being geographically distributed, can serve this content to your clients faster than your origin servers can. This is explained beautifully by this illustration from Wikipedia.
However, a CDN cannot store all the content that’s present on the origin server, right? A CDN will periodically purge less-frequently requested content from its caches. And, when a client requests a file that is not in the CDN’s cache, the CDN will have to ask the Origin Server for that file.
At this juncture, let’s brush up on two important CDN metrics: cache hit and cache miss.
Cache Miss: A cache-miss occurs when a client requests the CDN for some particular content, and the CDN has not cached that content. When a cache miss occurs, the CDN sends a request back to the origin server for that missing content. When the Origin responds, the CDN caches the content and serves it to the client.
Cache hit: A cache-hit occurs when a client requests the CDN for some particular content, and the CDN has cached that content. The CDN, in this case, will serve the content back to the client device.
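To make the two definitions concrete, here is a minimal Python sketch of an edge cache. It is a toy model, not how any particular CDN is implemented: `EdgeCache`, `fake_origin`, and the file path are hypothetical names made up for illustration, and the cache is just an in-memory dictionary with no eviction or TTL.

```python
class EdgeCache:
    """Toy CDN edge cache: a dictionary in front of an origin fetch function."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin  # hypothetical origin callable
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self.store:
            # Cache hit: serve straight from the edge.
            self.hits += 1
            return self.store[path]
        # Cache miss: go back to the origin, cache the response, then serve it.
        self.misses += 1
        body = self.fetch_from_origin(path)
        self.store[path] = body
        return body


origin_requests = []  # records every request that reaches the origin


def fake_origin(path):
    origin_requests.append(path)
    return f"contents of {path}"


cache = EdgeCache(fake_origin)
cache.get("/video/seg1.ts")  # first request: cache miss
cache.get("/video/seg1.ts")  # second request: cache hit
print(cache.hits, cache.misses, len(origin_requests))  # prints: 1 1 1
```

Only the first request reaches the origin; the second is answered from the edge.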
Ok, with this whirlwind introduction to CDNs, cache hits, and cache misses, next, let’s learn about something that can adversely affect a CDN.
Thundering Herd Problem or the Cache Stampede Problem
- A client device requests a rather large file from the CDN.
- There are two possibilities at this stage – either this file is cached on the CDN, or it is not.
- Let’s assume that the CDN doesn’t have it in its cache, and, so it has to fetch the file from the origin server.
- While the CDN is fetching the file from the origin server, another client requests the same file from the CDN.
- Now, we have a unique problem on our hands.
- The file isn’t in the CDN’s cache, and,
- unfortunately, the CDN hasn’t yet received the file from the origin server in response to its earlier request.
So, in the absence of any higher-order intelligence, the CDN requests the same file from the origin server – again!
Now, scale this problem –
- Several thousand clients request the same file (almost simultaneously) from the CDN.
- All of them result in cache misses.
- And, the CDN naively makes thousands of requests to the origin server for that particular file.
Doesn’t this sound like a stampede – one that can crush your origin server? Resembles a DDoS attack, right?
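The scaled-up scenario above can be reproduced with a small simulation. This is a sketch under stated assumptions: it uses Python's `asyncio` to model 1000 clients arriving while the first origin fetch is still in progress, and `slow_origin`, `naive_get`, the 50 ms delay, and the file path are all hypothetical.

```python
import asyncio

origin_hits = 0  # number of requests that actually reach the origin
cache = {}       # the edge cache, empty at the start


async def slow_origin(path):
    """Hypothetical origin that takes 50 ms to respond."""
    global origin_hits
    origin_hits += 1
    await asyncio.sleep(0.05)
    return f"contents of {path}"


async def naive_get(path):
    """Naive edge logic: every cache miss triggers its own origin fetch."""
    if path in cache:
        return cache[path]
    body = await slow_origin(path)
    cache[path] = body
    return body


async def main():
    # 1000 clients ask for the same file at (almost) the same time.
    await asyncio.gather(*(naive_get("/premiere/seg1.ts") for _ in range(1000)))


asyncio.run(main())
print(origin_hits)  # prints: 1000
```

Every client sees an empty cache because the first response has not arrived yet, so the origin receives one request per client.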
Such situations can occur in video streaming as well, and not just on servers that serve documents or render webpages. For instance, it can occur during
- the worldwide premiere of a movie.
- a live-streaming event that has gone viral.
And, it can also occur in a Multi-CDN configuration where multiple CDNs serve different regions and all of them start requesting the same file from the origin at the same time, without any communication between each other.
So, how do commercial CDNs handle this problem and prevent it from overwhelming their infrastructure?
Tackling the Thundering Herd / Cache Stampede Problem in CDNs
There are several techniques for handling the Thundering Herd or Cache Stampede problem in OTT (video streaming).
Collapse Forwarding or Request Collapsing
Collapse Forwarding is a logical way for CDNs to tackle multiple redundant requests being made almost at the same time. In this method,
- on the first cache miss, the CDN contacts the origin server and requests the file.
- if another cache miss occurs for the same file before the origin can respond, then the CDN adds this second request to a queue and tells it to wait.
- the CDN does the same for every other request that reaches it for the same file (i.e., while it is waiting for the origin to respond).
- after the origin responds with the file, the CDN serves it to the first client and to all the other clients whose requests it placed in the queue.
In effect, what the CDN did was to take all the requests and “collapse” them into a single request. When the origin responds with the file, the CDN delivers the file to all the clients that requested it.
This technique is known as Collapse Forwarding or Request Collapsing in CDN terminology.
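The queueing steps described above can be sketched in a few lines of Python. This is an illustration, not any vendor's implementation: `CollapsingEdge`, `slow_origin`, the 50 ms delay, and the file path are hypothetical, and the "queue" is modelled by having every waiting request await the same in-flight `asyncio` task.

```python
import asyncio

origin_hits = 0  # number of requests that actually reach the origin


async def slow_origin(path):
    """Hypothetical origin that takes 50 ms to respond."""
    global origin_hits
    origin_hits += 1
    await asyncio.sleep(0.05)
    return f"contents of {path}"


class CollapsingEdge:
    """Toy edge cache that collapses concurrent misses for the same file."""

    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin
        self.store = {}      # cached responses
        self.in_flight = {}  # path -> task for the pending origin fetch

    async def get(self, path):
        if path in self.store:
            return self.store[path]  # cache hit
        task = self.in_flight.get(path)
        if task is None:
            # First miss for this file: start the one and only origin fetch.
            task = asyncio.create_task(self._fill(path))
            self.in_flight[path] = task
        # Every request for this file, first or later, waits on the same task.
        return await task

    async def _fill(self, path):
        try:
            body = await self.fetch_from_origin(path)
            self.store[path] = body
            return body
        finally:
            # Clear the marker so a failed fetch can be retried later.
            self.in_flight.pop(path, None)


async def main():
    edge = CollapsingEdge(slow_origin)
    return await asyncio.gather(
        *(edge.get("/premiere/seg1.ts") for _ in range(1000))
    )


bodies = asyncio.run(main())
print(origin_hits, len(bodies))  # prints: 1 1000
```

All 1000 clients receive the file, but the origin sees a single request. One design note: if the origin fetch fails, every waiting request sees the same error, which is why real deployments usually pair collapsing with timeouts and retries.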
Some companies take this concept forward and add additional “Origin Shield” layers between the CDN and Origin Servers. These “origin shields” act as additional caching layers, prevent excessive trips to the origin, and can be referenced by multiple geographically-distributed PoPs.
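A rough way to picture an origin shield is as one more cache tier that several PoPs share. The sketch below assumes a simple two-tier layout; the `Tier` class, the PoP names, and the file path are hypothetical, and requests arrive one after another, so it shows the extra caching layer rather than request collapsing itself.

```python
class Tier:
    """Toy cache tier that asks its upstream on a miss."""

    def __init__(self, name, upstream):
        self.name = name
        self.upstream = upstream  # callable(path) -> body
        self.store = {}

    def get(self, path):
        if path not in self.store:
            # Miss at this tier: ask the next tier up and cache the answer.
            self.store[path] = self.upstream(path)
        return self.store[path]


origin_log = []  # records every request that reaches the origin


def origin(path):
    origin_log.append(path)
    return f"contents of {path}"


# One shield sits in front of the origin; every PoP fetches through it.
shield = Tier("shield", origin)
pops = [Tier(name, shield.get) for name in ("us-east", "eu-west", "ap-south")]

for pop in pops:
    pop.get("/live/seg42.ts")  # each PoP misses locally and asks the shield

print(len(origin_log))  # prints: 1
```

Three PoPs miss locally, but only the first miss travels all the way to the origin; the other two are answered from the shield.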
Pre-warming or Pre-Fetching Content into the Cache
Warming the cache is a defensive technique where the CDN’s caches are pre-loaded with content expected to go viral or be in-demand. Some CDNs do not allow this, as, quite frankly, it can defeat the purpose of a CDN and use more storage than what is really needed.
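As a rough illustration of pre-warming, the sketch below loads a list of expected-popular files into a cache before any client asks for them. The `warm` helper, `fake_origin`, and the paths are hypothetical; real CDNs that support this expose it through their own tools or APIs.

```python
origin_log = []  # records every request that reaches the origin


def fake_origin(path):
    origin_log.append(path)
    return f"contents of {path}"


def warm(cache, fetch_from_origin, paths):
    """Load each path into the cache before any client asks for it."""
    for path in paths:
        if path not in cache:
            cache[path] = fetch_from_origin(path)


cache = {}
expected_hot = ["/premiere/manifest.m3u8", "/premiere/seg1.ts"]
warm(cache, fake_origin, expected_hot)
fetches_before_launch = len(origin_log)

# At launch, client requests for these files are all cache hits.
served = [cache[path] for path in expected_hot]
print(fetches_before_launch, len(origin_log))  # prints: 2 2
```

The origin is contacted once per file ahead of time, and not at all when the clients arrive.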
Commercial Implementations of Collapse Forwarding
Most of the popular CDN vendors implement Collapse Forwarding in one form or another. I’ve linked the documentation of a few CDN vendors below, along with a snippet of text from their websites/docs where they refer to Collapse Forwarding.
Note that this is simply a representative list of CDN vendors who have implemented Collapse Forwarding, and it isn’t exhaustive! And, just FYI, all rights for the text below rest with the respective companies.
- Amazon CloudFront
- Quote: “Origin Shield is a centralized caching layer that helps increase your cache hit ratio to reduce the load on your origin. Origin Shield also decreases your origin operating costs by collapsing requests across regions so as few as one request goes to your origin per object. When enabled, CloudFront will route all origin fetches through Origin Shield, and only make a request to your origin if the content is not already stored in Origin Shield’s cache.”
- Quote: “….. When a Limelight PoP receives a request for content that is not cached locally, it will first request the content from one or more designated origin shield PoPs. If the content exists in one of the origin shield PoPs, it will be retrieved and delivered to the requesting PoP over Limelight’s private backbone. Only if the content does not exist in the cache of the origin shield PoPs will the request be forwarded to origin. This greatly reduces requests to origin, minimizes origin egress costs, and improves user experience.”
- Quote: “Request Collapsing causes simultaneous cache misses within a single Fastly data center to be “collapsed” into a single request to an origin server. While the single request is being processed by the origin, the other requests wait in a queue for it to complete.”
- Quote: “Multiple edge requests for uncached content get sent to the Origin Shield, where they are aggregated into a single origin request. And then the content is cached at the shield and further distributed to the requesting edge locations.”
- Quote: “With Origin Shield enabled, when the first request for content arrives at our edge server and that edge server does not have the content cached, it passes the request along to our shield server, which also doesn’t have the content cached. The shield server passes the request along to your origin server. The shield server caches the content that it has retrieved from your origin server and passes it along to our edge server. Finally, our edge server passes the content along to the client.”
- Quote: “To further reduce trips to origin, especially during popular livestreaming events, Cloud Wrapper provides request collapsing. This combines multiple end-user requests when retrieving from origin. The result is fewer, more efficient origin requests.”
That’s it, folks. A simplified explanation of the Collapse Forwarding solution for the Thundering Herd / Cache Stampede problem that can plague CDN vendors. If you feel that I’ve missed something, or said anything incorrectly, do reach out to me. Until next time, thank you and see you again at OTTVerse.com.
Krishna Rao Vijayanagar
Krishna Rao Vijayanagar, Ph.D., is the Editor-in-Chief of OTTVerse, a news portal covering tech and business news in the OTT industry.
With extensive experience in video encoding, streaming, analytics, monetization, end-to-end streaming, and more, Krishna has held multiple leadership roles in R&D, Engineering, and Product at companies such as Harmonic Inc., MediaMelon, and Airtel Digital. Krishna has published numerous articles and research papers and speaks at industry events to share his insights and perspectives on the fundamentals and the future of OTT streaming.