HTTP Adventure: [Part 2] The evolution of HTTP/2

This article is aimed at web developers who are new to the overall design of HTTP/2, and it focuses mainly on how HTTP/2 addressed the HOL blocking problem.

A quick recap of Part 1: HTTP/1.1 suffers from a head-of-line (HOL) blocking issue, meaning that if one response is large or slow, it holds up all the other responses waiting behind it. This happens because the protocol is text-based and has no way to separate chunks belonging to different resources. To work around this, browsers tend to open multiple TCP connections at once, but that's an inefficient solution that doesn't really scale.

I. HOL blocking in HTTP/2

The main aim of HTTP/2 was straightforward: make it possible to go back to a single TCP connection by solving the HOL blocking problem. Put differently, we want to achieve effective multiplexing of resource chunks. This was impossible in HTTP/1.1, which had no way to identify which resource a chunk belonged to, or where one chunk ended and the next began.

HTTP/2 introduced a binary framing mechanism that changes how data is exchanged between the client and server, and it tackles this problem nicely by placing small control messages, known as frames, in front of the resource chunks. For more technical depth on the design, you can read this article. You can see this illustrated in Figure 1:

In short, HTTP/2 places a DATA frame in front of every chunk of data. These DATA frames carry two key pieces of information. First, they specify which resource the chunk belongs to, using a unique identifier called the stream id for each resource's byte stream. Second, they indicate the size of the upcoming chunk. The protocol also defines various other frame types, such as the HEADERS frame shown in Figure 2. This frame also uses the stream id to link headers to their corresponding response, so headers can be separated from the actual response data, and all of it is multiplexed over a single TCP connection.
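
To make the frame layout concrete, here is a minimal Python sketch of the 9-byte HTTP/2 frame header defined in RFC 7540 (a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a 31-bit stream id). The helper names are my own, and a real implementation would also handle padding, flag semantics, and the other frame types:

```python
import struct

FRAME_TYPE_DATA = 0x0
FRAME_TYPE_HEADERS = 0x1

def encode_frame(frame_type, flags, stream_id, payload):
    # 9-byte HTTP/2 frame header (RFC 7540, section 4.1):
    # 24-bit payload length, 8-bit type, 8-bit flags, 31-bit stream id.
    length = len(payload)
    header = struct.pack(">BHBBI",
                         length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload

def decode_frames(data):
    # Walk the byte stream frame by frame: read a header, then consume
    # exactly `length` payload bytes, which tells us where the next frame starts.
    frames, offset = [], 0
    while offset + 9 <= len(data):
        hi, lo, ftype, flags, sid = struct.unpack_from(">BHBBI", data, offset)
        length = (hi << 16) | lo
        payload = bytes(data[offset + 9 : offset + 9 + length])
        frames.append((ftype, sid & 0x7FFFFFFF, payload))
        offset += 9 + length
    return frames
```

Two DATA frames belonging to different streams can now share one connection, and the receiver can split them apart again purely from the headers.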

Unlike our HTTP/1.1 examples, the browser now handles this situation perfectly. It starts by processing the HEADERS frame for script.js, followed by the DATA frame for the first JavaScript chunk. From the chunk length in the DATA frame, the browser can tell that the chunk only runs to the end of TCP packet 1, so it needs to look for a new frame starting in TCP packet 2. Sure enough, it finds the HEADERS for style.css there. The next DATA frame has a different stream id (2) from the first DATA frame (1), so the browser recognizes that it belongs to a different resource. The same happens with TCP packet 3, where the stream ids in the DATA frames let the browser sort the response chunks back into their respective resource streams.
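
The sorting step the browser performs can be sketched as a toy demultiplexer: given chunks tagged with stream ids in arrival order (the payloads below are made up), it reassembles each resource's byte stream.

```python
from collections import defaultdict

def demultiplex(chunks):
    # `chunks` is a list of (stream_id, data) pairs in arrival order,
    # possibly interleaved across several TCP packets.
    streams = defaultdict(bytes)
    for stream_id, data in chunks:
        streams[stream_id] += data
    return dict(streams)

# Chunks as they arrive, interleaved on one connection:
arrived = [(1, b"var a=1;"), (2, b"body{"), (1, b"var b=2;"), (2, b"margin:0}")]
streams = demultiplex(arrived)
```

Stream 1 reassembles into the JavaScript and stream 2 into the CSS, regardless of how their chunks were interleaved on the wire.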

HTTP/2 is more flexible than HTTP/1.1 because it uses "framing" for individual messages, which lets it send multiple resources at once over a single TCP connection by multiplexing their chunks together. This in turn addresses the Head-of-Line (HOL) blocking caused by a slow resource: instead of sitting around waiting for the slow index.html to be generated completely, the server can start sending data for other resources in the meantime.

As a consequence of HTTP/2's multiplexing, we now need a way for the browser to tell the server how it wants the bandwidth of that single connection divided among the different resources. Essentially, each resource is assigned a certain order in which it has to be sent.

For example, .html is the most important, so it gets a priority of 1; .css and .js get 2 and 3, while images get 4. When several resources waiting to be sent share the same priority, their data can be interleaved: the server sends a piece of each resource in turn, as illustrated in Figure 3.
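
A toy sender-side scheduler for this scheme might look as follows: lower priority numbers are sent first, and equal-priority resources have their chunks interleaved round-robin. The resource names, sizes, and chunk size are made up for illustration.

```python
def schedule(resources, chunk_size=300):
    # Toy scheduler: `resources` maps name -> (priority, data). Resources
    # with a lower priority number are fully sent first; resources that
    # share a priority have their chunks interleaved round-robin.
    order = []
    by_prio = {}
    for name, (prio, data) in resources.items():
        by_prio.setdefault(prio, []).append([name, data])
    for prio in sorted(by_prio):
        group = by_prio[prio]
        while any(data for _, data in group):
            for item in group:
                if item[1]:
                    order.append((item[0], item[1][:chunk_size]))
                    item[1] = item[1][chunk_size:]
    return order
```

With an .html file at priority 1 and a .css and .js file both at priority 2, the HTML is sent in full first, then the CSS and JS chunks alternate.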

HTTP/2 prioritization is much more complex than these examples and deserves a dedicated blog post of its own; you don't need to dive into it for the rest of this one, so I won't cover it here.

As you can see, thanks to HTTP/2's frames and its prioritization system, we've addressed the HOL blocking issue of HTTP/1.1. And indeed, browsers now open just one TCP connection per domain when requesting HTTP/2 websites.

So, does that mean we're all set to call it a day? Unfortunately, not quite yet. While we've solved HOL blocking at the HTTP/1.1 level, we still face HOL blocking in the transport protocol that HTTP/2 runs on: this is TCP HOL blocking.

II. TCP HOL blocking

It turns out that HTTP/2 has addressed Head-of-Line (HOL) blocking specifically at the HTTP level, which can be referred to as "Application Layer" HOL blocking. However, it is important to recognize that there are additional layers beneath this that must be taken into account within the standard networking model. This is illustrated clearly in Figure 4.

HTTP sits at the top of the stack; it optionally relies on TLS for security, which in turn is carried by TCP at the transport layer. Each of these protocols adds some metadata to the data from the layer above. For instance, the TCP packet header is added to our HTTP(S) data, which is then wrapped in an IP packet, and so on. This setup creates a clean separation between the protocols, which is great for reusability: a transport layer protocol like TCP doesn't need to care what kind of data it's carrying (whether it's HTTP, FTP, or SSH, it all works the same), and the IP layer works equally well with both TCP and UDP.
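
The encapsulation idea can be shown with a toy example: each layer simply prepends its own header to whatever the layer above hands it. The header bytes here are placeholders for illustration, not the real wire formats.

```python
def wrap(payload, header):
    # Each layer prepends its own metadata to the data from the layer above.
    return header + payload

http_data = b"GET /index.html HTTP/1.1\r\n\r\n"
tcp_segment = wrap(http_data, b"<tcp-hdr>")  # placeholder header, not the real format
ip_packet = wrap(tcp_segment, b"<ip-hdr>")
```

TCP never looks inside its payload, which is exactly why it can carry HTTP, FTP, or SSH without caring which one it is.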

This, however, has significant consequences for our ability to multiplex several HTTP/2 resources over a single TCP connection. Refer to Figure 5 for illustration.

Note: each HTTP/2 frame, like DATA and HEADERS, adds a few bytes of overhead, but I haven't included that overhead or the HEADERS frames here to keep the numbers straightforward.

While the browser and HTTP/2 both know that we are fetching JavaScript and CSS files (HTTP/2 manages the chunks from the various resource streams by their stream identifiers), TCP has no idea it is transmitting HTTP data. TCP merely understands that it is transferring a sequence of bytes from one computer to another. To accomplish this, it uses packets of a certain maximum size, generally around 1450 bytes. Each packet indicates which part of the data (byte range) it carries, enabling the original data to be accurately reassembled in the correct sequence.

In simpler terms, the two layers see things differently. HTTP/2 recognizes multiple independent streams of data, while TCP sees only one continuous stream. Take TCP packet 3 from Figure 5 as an example: TCP only knows it is carrying bytes 750 to 1599 of whatever data it is handling, whereas HTTP/2 understands that packet 3 actually contains chunks from two different resources.
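
We can mimic TCP's view with a small sketch: it slices one continuous byte stream into packets of at most `mss` bytes, tagging each with its byte range and paying no attention to where one resource ends and another begins. The stream contents and sizes below are made up to mirror Figure 5.

```python
def packetize(byte_stream, mss=850):
    # TCP sees only one continuous byte stream; it cuts it into packets
    # of at most `mss` bytes and records each packet's byte range.
    packets = []
    for start in range(0, len(byte_stream), mss):
        chunk = byte_stream[start:start + mss]
        packets.append((start, start + len(chunk), chunk))
    return packets

# 500 bytes of JS, then 600 bytes of CSS, then 600 more bytes of JS:
stream = b"J" * 500 + b"C" * 600 + b"J" * 600
packets = packetize(stream)
```

The first packet covers bytes 0 to 849 and happens to contain both JavaScript and CSS data; only HTTP/2, not TCP, knows where the boundary between the two lies.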

It might feel like we're diving into too many details, but the truth is, the Internet isn't the most reliable network out there. Packets can get lost or delayed while transmitting from one point to another. This is one of the main reasons TCP is widely used; it adds a layer of reliability to the unreliable IP layer. It achieves this by simply resending any lost packets.

We can now comprehend how this situation can result in HOL blocking at the Transport Layer. Let us revisit Figure 5 and consider the implications if TCP packet 2 is lost during transmission, while packets 1 and 3 are successfully received.

It is important to note that TCP is unaware that it is transmitting HTTP/2; it is solely focused on delivering data in the correct sequence. Consequently, TCP recognizes that the data in packet 1 is available for use and forwards it to the browser. However, it identifies a gap between the bytes in packet 1 and those in packet 3, which indicates the absence of packet 2. As a result, TCP cannot forward packet 3 to the browser at this time. Instead, it retains packet 3 in its receive buffer until it obtains a retransmission of packet 2, which requires at least one round-trip to the server. Only then can both packets be delivered to the browser in the proper order. In other words, the loss of packet 2 causes HOL blocking for packet 3.
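
This buffering behavior is easy to model. The minimal receiver below (a simplification of real TCP, with made-up segment contents) only hands data to the application once every byte before it has arrived; an out-of-order segment sits in the buffer until the gap is filled.

```python
class TcpReceiver:
    def __init__(self):
        self.next_seq = 0     # next byte offset the application expects
        self.buffer = {}      # out-of-order segments: seq -> bytes
        self.delivered = b""  # what has been handed to the browser so far

    def receive(self, seq, segment):
        self.buffer[seq] = segment
        # Deliver as much contiguous data as possible to the application.
        while self.next_seq in self.buffer:
            seg = self.buffer.pop(self.next_seq)
            self.delivered += seg
            self.next_seq += len(seg)

rx = TcpReceiver()
rx.receive(0, b"pkt1-data")   # packet 1 arrives: delivered immediately
rx.receive(18, b"pkt3-data")  # packet 3 arrives: held back, gap at bytes 9-17
```

At this point only packet 1 has reached the browser; packet 3 stays stuck in the buffer until packet 2 (bytes 9 through 17) is retransmitted and received.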

The issue may not be immediately apparent, so let us explore the contents of the TCP packets at the HTTP layer as illustrated in Figure 5. It is evident that TCP packet 2 contains solely the data for stream id 2, which corresponds to the CSS file, while packet 3 includes data for both stream 1, associated with the JS file, and stream 2. At the HTTP level, these two streams are independent and distinctly separated by the DATA frames. Therefore, in theory, it would be possible to transmit packet 3 to the browser without waiting for the arrival of packet 2. The browser would receive a DATA frame for stream id 1 and could utilize it immediately. Only stream 2 would need to be delayed, pending the retransmission of packet 2. This approach would be more efficient compared to the current TCP method, which results in the blocking of both streams 1 and 2.

In another scenario, if packet 1 is lost while packets 2 and 3 are successfully received, TCP will pause the delivery of both packets 2 and 3, holding them back until packet 1 is retransmitted. However, at the HTTP/2 level, the complete data for stream 2, which is the CSS file, is contained in packets 2 and 3, meaning it doesn't need to wait for the retransmission of packet 1. This means the browser could have easily parsed and utilized the CSS file, but it remains stuck waiting for the retransmission of the JavaScript file.

In conclusion, since TCP is unaware of the independent streams in HTTP/2, any head-of-line (HOL) blocking at the TCP layer—caused by lost or delayed packets—ultimately leads to HOL blocking in HTTP as well!

You might be wondering: what's the purpose of HTTP/2 if TCP head-of-line (HOL) blocking still exists?

The answer is that while packet loss can occur on networks, it’s actually quite uncommon. On high-speed wired networks, for instance, packet loss rates are typically around 0.01%. Even on the most unreliable cellular networks, you usually won’t encounter loss rates exceeding 2%.

Additionally, packet loss tends to happen in bursts. A 2% packet loss rate doesn’t mean you’ll consistently lose 2 out of every 100 packets (like packet 12 and packet 53). Instead, it’s more likely that you might lose a series of 10 packets within a batch of 500 (for example, packets 355 to 365). This burstiness is often due to temporary memory buffer overflows in routers along the network path, which leads to packet drops when they can’t be stored.

While the technical details aren't crucial here, what's essential to understand is this: TCP HOL blocking is indeed a factor, but its effect on web performance is significantly less severe than the HOL blocking of HTTP/1.1, which you're almost guaranteed to hit every time, and which suffers from TCP HOL blocking as well!

This is primarily true when we compare HTTP/2 on a single connection to HTTP/1.1 also on a single connection. However, as we've previously discussed, that isn't how things typically work in the real world, since HTTP/1.1 usually establishes multiple connections. This helps HTTP/1.1 reduce both HTTP-level and TCP-level head-of-line (HOL) blocking to some extent. Consequently, there are cases where HTTP/2 on a single connection struggles to be as fast as, let alone faster than, HTTP/1.1 operating over six connections, mainly because of TCP's congestion control mechanism. That topic runs quite deep and isn't central to our discussion of HOL blocking, but you can read about it elsewhere if you'd like to know more.

For illustrative purposes, let's look at site performance test results on WebPageTest (run with the Chrome browser on a FIOS connection).

This is a website using HTTP/1.1

And this one uses HTTP/2, which has almost the same performance as HTTP/1.1.

But (spoiler): with an appropriate reorganization, websites using HTTP/2 can gain a modest performance improvement and better visual progress overall, which will be discussed in a future blog post.

III. Conclusion

Overall, in real-world usage, HTTP/2 websites turn out to be generally as fast as, or even slightly faster than, HTTP/1.1 under most circumstances. However, there are situations, particularly on slower networks with more packet loss, where HTTP/1.1 with six connections can actually outperform HTTP/2 with just one. This often comes down to the TCP-level HOL blocking issue, and that challenge was a big reason behind the creation of QUIC, a new transport protocol designed to replace TCP.
