HTTP/3: from root to tip

The application-layer protocol HTTP is the foundation of the web. It began life in 1991 as HTTP/0.9, and by 1999 had evolved into HTTP/1.1, which was standardized by the Internet Engineering Task Force (IETF).

HTTP/1.1 served well for a long time, but the growing needs of the web called for an upgrade, and in 2015 HTTP/2 was adopted. The story does not end there: just recently, the IETF announced a new version, HTTP/3. For some, this came as a surprise and caused some confusion. If you do not follow the IETF, HTTP/3 may seem to have come out of nowhere. Nevertheless, we can trace its origins through a history of experiments and the evolution of web protocols, in particular the QUIC transport protocol.

If you are not familiar with QUIC, my Cloudflare colleagues have covered various aspects in some detail: for example, see the articles on the real-world shortcomings of modern HTTP and on the details of the transport layer protocol. We have collected these and other materials on cloudflare-quic.com. And if you're interested, be sure to check out quiche: our own open-source QUIC implementation, written in Rust.

HTTP/3 is the mapping of HTTP onto the QUIC transport protocol. The name HTTP/3 was made official only recently, in version 17 of the draft (draft-ietf-quic-http-17). It was proposed at the end of October 2018, and consensus was reached at the IETF 103 meeting in Bangkok in November.

Previously, HTTP/3 was known as HTTP over QUIC, before that as HTTP/2 over gQUIC, and earlier still as SPDY over gQUIC. But the bottom line is that HTTP/3 is just a new HTTP syntax that runs on the IETF QUIC protocol, a multiplexed and secure UDP-based transport.

In this article, we will look at the history behind some of HTTP/3's previous names and present the motivation for the latest name change. We will go back to the early days of HTTP and follow everything that has happened since. If you want the full picture straight away, you can jump to the end of the article or open this very detailed SVG version.


HTTP/3 layer cake

Setting the scene


Before focusing on HTTP, it's worth recalling that there are two protocols called QUIC. As we have already explained, gQUIC is commonly used to refer to Google QUIC (the original protocol), and QUIC to the IETF version, which differs from gQUIC.

Since the beginning of the 90s, the needs of the Internet have changed. We have new versions of HTTP and a new level of security in the form of the Transport Layer Security (TLS) protocol. In this article we will only touch on TLS, and in other articles of our blog you can study the topic in more detail.

The history of HTTP and TLS cannot be expressed as a simple list of dates, since some branches evolved in parallel and overlapped in time. When you try to connect the dots across almost 30 years of Internet history, you need a visualization. So I made this timeline: the Cloudflare Secure Web Timeline. (Note: technically it is a cladogram, although the word "timeline" is more familiar.)



For the sake of presentation, I have omitted some of the information, focusing only on the successful branches in the IETF space. For example, the efforts of the W3 Consortium's HTTP-NG working group are not shown, nor are some exotic ideas whose authors are keen to explain how to pronounce them: HMURR (pronounced "hummer") and WAKA (pronounced "wah-kah").

In the following sections, we will walk through this cladogram and look at some turning points in the history of HTTP. I hope this helps explain why standardization benefits everyone, and how the IETF approaches it. So let us begin with a very brief overview of that topic before returning to the cladogram itself. Feel free to skip the next section if you are already familiar with the IETF.

Types of Internet Standard


Typically, standards define common terms of reference, scope, limitations, applicability, and other considerations. Standards come in many shapes and sizes. They can be informal (de facto standards) or formal (agreed and published by a standards-setting organization such as the IETF, ISO or MPEG). Standards are used in many fields; there is even a formal British standard for making tea, BS 6008.

The first definitions of the HTTP and SSL protocols were published outside the IETF: they are marked with red lines in the graph. But widespread use made them de facto standards.

At some point, it was decided to formalize these protocols (some of the reasons are described below). Internet standards are usually defined in the IETF, which is guided by the informal principle of “rough consensus and running code”, grounded in experience with what is actually deployed on the Internet. This differs from a “clean room” approach, in which someone tries to develop ideal protocols in a vacuum.

IETF standards are commonly known as RFCs. This is difficult to explain in brief, therefore I recommend the article “How to read RFCs” from Mark Nottingham, co-chair of the QUIC working group. A working group or WG is, in essence, just a mailing list.

Three times a year, the IETF holds meetings where members of all working groups can meet in person if they wish. The agenda for these weeks can be very full, leaving too little time for in-depth discussion of technical issues. Therefore, some working groups prefer to hold additional interim meetings between the general IETF meetings. Since 2017, the QUIC working group has held several interim meetings; the full list is available on its meetings page.

These meetings also offer the chance to meet experts from related bodies, such as the Internet Architecture Board (IAB) or the Internet Research Task Force (IRTF). In recent years, an IETF hackathon has been held on the weekend before each IETF meeting. This is where real code is developed and, importantly, tested for interoperability. This helps to find problems in the specifications, which can then be discussed directly at the meeting.

It is important to understand that RFCs do not appear out of nowhere. They go through a whole process. It usually starts with an IETF Internet-Draft (ID) that is submitted for consideration. When a specification has already been published elsewhere, preparing the ID may be a simple reformatting exercise. An ID has a lifetime of 6 months from the date of publication. To keep it active, new versions need to be published. In practice, nothing terrible happens when an ID expires, and it happens quite often. The documents remain hosted on the IETF website and are open to all.

On the cladogram, drafts are shown in purple. Each has a name in the format draft-{author}-{working group}-{topic}-{version}. The working group field is optional; it may indicate a future IETF working group, and it sometimes changes. If the ID is adopted by the IETF or initiated directly within the IETF, the draft is named draft-ietf-{working group}-{topic}-{version}. IDs may branch, merge or die out. The version starts at 00 and increases by one with each new draft. For example, the fourth draft has version number 03. Each time the ID's name changes, its version resets to 00.
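The naming rules above can be sketched in a few lines of Python. This is only an illustration: `parse_draft_name` and `DraftName` are hypothetical names, not part of any IETF tool, and because the working group field is optional and topics may contain hyphens, the sketch does not try to split the middle of the name any further.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern that encodes only the rules described in the text:
# "draft-", a first field (author name or "ietf"), the rest of the name,
# and a two-digit version at the end.
DRAFT_NAME = re.compile(
    r"draft-(?P<source>[a-z0-9]+)-(?P<rest>[a-z0-9-]+)-(?P<version>\d{2})"
)


@dataclass
class DraftName:
    source: str   # author name, or "ietf" for IETF-adopted drafts
    rest: str     # optional working group plus topic, left unsplit
    version: int  # numbering starts at 00

    @property
    def adopted(self) -> bool:
        """True when the name follows the draft-ietf-... form."""
        return self.source == "ietf"

    @property
    def ordinal(self) -> int:
        """1-based position of this revision: version 03 is the fourth draft."""
        return self.version + 1


def parse_draft_name(name: str) -> DraftName:
    match = DRAFT_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"not a draft name: {name!r}")
    return DraftName(
        source=match.group("source"),
        rest=match.group("rest"),
        version=int(match.group("version")),
    )
```

For instance, `parse_draft_name("draft-fielding-http-spec-00")` yields an individual (non-adopted) draft at its first revision, while `parse_draft_name("draft-ietf-http-v10-spec-05")` yields an IETF draft at its sixth.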

It is important to note that anyone can submit a draft to the IETF, so drafts should not be treated as standards. But if the standardization process reaches consensus and the final document passes review, the result is an RFC. At this stage, the name changes again. Each RFC receives a unique number, for example RFC 7230. Documents with this status are shown as blue lines.

RFCs are immutable. That is, any change to an RFC requires publishing a document with a new number. Such changes may be made to correct editorial or technical errors, or simply to improve the layout of the specification. Newer RFCs can obsolete old ones (replace them completely) or update them.

All IETF documents are publicly available at http://tools.ietf.org. Personally, I find the IETF Datatracker a little more convenient, because it visualizes a document's path from ID to RFC.

Below is an example that shows the evolution of RFC 1945, that is, HTTP/1.0.


RFC 1945 history in the IETF Datatracker interface

Interestingly, in the course of this work I found that the visualization above is incorrect. For some reason, draft-ietf-http-v10-spec-05 is missing. Since the lifetime of an ID is 6 months, there appears to be a gap before the RFC was adopted, although in reality the draft was still active until August 1996.

Cladogram study


After this short theoretical introduction, we can move on to the cladogram. This section presents excerpts from its most important parts. Each dot marks the date on which a document or capability became available. For clarity, draft numbers are omitted from IETF documents, but they are all in the full version.

HTTP appeared in 1991 as the HTTP/0.9 protocol, and in 1994 the draft draft-fielding-http-spec-00 was published. It was soon adopted by the IETF, and its name changed to draft-ietf-http-v10-spec-00. After six draft versions, RFC 1945 (HTTP/1.0) was published in 1996.



However, even before work on HTTP/1.0 was complete, a separate HTTP/1.1 effort had begun. The draft draft-ietf-http-v11-spec-00 was published in November 1995, and it was formally published as RFC 2068 in 1997. The keen-eyed will notice that the cladogram does not quite reflect this sequence of events, an unfortunate glitch in the visualization tool. I tried to minimize such problems as much as possible.

In mid-1997, a revision of HTTP/1.1 began as draft-ietf-http-v11-spec-rev-00. It concluded in 1999 with the publication of RFC 2616. Until 2007, all was quiet in the IETF HTTP world. We will come back to this later.

SSL and TLS history




Let's turn to the SSL track. We see that the SSL 2.0 specification came out around 1995, and SSL 3.0 in November 1996. Interestingly, SSL 3.0 is described in RFC 6101, which appeared only in August 2011. It has Historic status. According to the IETF, this category exists “to document ideas that were reviewed and rejected, or protocols that already existed by the time they decided to document them.” In this case, it was useful to have an IETF document describing SSL 3.0 that could be used as a canonical reference everywhere.

We are more interested in how SSL inspired the engineers to develop TLS, which began with a draft draft-ietf-tls-protocol-00 in November 1996. It went through 6 draft versions and was published as RFC 2246 - TLS 1.0 in early 1999.

Between 1995 and 1999, the SSL and TLS protocols were used to secure HTTP connections on the Internet. This worked well as a de facto standard. Only in January 1998, with the publication of the draft draft-ietf-tls-https-00, did the formal standardization of HTTPS begin. The work concluded in May 2000 with the publication of RFC 2818 - HTTP over TLS.

TLS continued to evolve from 2000 to 2007, with the adoption of the TLS 1.1 and 1.2 standards. Then there was a seven-year pause before work began on the next version of TLS, which was published as draft-ietf-tls-tls13-00 in April 2014 and, after 28 drafts, was approved as RFC 8446 - TLS 1.3 in August 2018.

Internet standardization process


After this brief tour of the cladogram, I hope it is clearer how the IETF works. When standards are created, researchers or engineers develop experimental protocols for specific use cases. They experiment with these protocols at various scales, in public or in private. The data they gather helps identify problems or improve the protocol. Publishing the work helps to explain the experiment, gather input from a wider range of specialists, or find other implementers. If others adopt the work at an early stage, it can become a de facto standard, and eventually there may be enough momentum for formal standardization.

The official status of a protocol is an important factor for organizations considering whether to use it. A formal standardization process makes a de facto standard more attractive, because it usually brings stability. Stewardship and guidance are provided by a reputable organization, such as the IETF, that reflects the interests and experience of many participants. But it should be noted that not all formal standards succeed.

The process of creating a standard is almost as important as the standard itself. Taking an initial idea and inviting contributions from people with broader knowledge, experience and use cases helps to create something more useful to a wide audience. However, the standardization process is not always easy. There are pitfalls and obstacles. Sometimes the process takes so long that the result is no longer relevant.

Each organization that defines the standards usually has its own process, focused on its field of activity and participants. Explaining all the details of how the IETF works is well beyond the scope of this article. If you're interested, a great starting point would be the “How we work” page on the IETF website. As usual, the best way to figure it out is to take part yourself. Simply join the mailing list or the discussion in the corresponding GitHub repository.

Running code at Cloudflare


Cloudflare is proud to be an early adopter of new protocols, as was the case with HTTP/2 and other technologies. We also test experimental and not-yet-approved features, such as TLS 1.3 and SPDY.

Running real code helps us understand how well a protocol will work in practice. We combine our existing expertise with experimental data to help improve the running code and, where it makes sense, report problems or improvements to the working group that is standardizing the protocol.

Testing new things is not the only priority. Part of being an innovator is knowing when it is time to retire older innovations. Sometimes this concerns security-oriented protocols: for example, Cloudflare disabled SSLv3 by default because of the POODLE vulnerability. In other cases, protocols are superseded by more technologically advanced ones: for example, we replaced SPDY with HTTP/2.

The introduction and retirement of protocols at Cloudflare is shown by orange lines. Vertical guide lines help align Cloudflare events with the relevant IETF documents. For example, Cloudflare introduced support for TLS 1.3 in September 2016, and the final RFC 8446 document was published almost two years later, in August 2018.



Refactoring: HTTPbis


HTTP/1.1 is a very successful protocol. The cladogram does not show much IETF activity after 1999. But in reality, years of active use provided implementation experience and revealed latent problems in RFC 2616, including some interoperability issues. In addition, the protocol was extended by other RFCs, such as 2817 and 2818. As a result, in 2007 it was decided to start a new effort to improve the HTTP specification. It was called HTTPbis (where “bis” comes from the Latin for “two”, “twice” or “repeat”). The original charter of the new working group describes well the problems it set out to solve.

In short, HTTPbis set out to refactor RFC 2616. This included bug fixes and the incorporation of aspects of other specifications published in the meantime. It was decided to split the document into parts. As a result, six drafts were published in December 2007.




The diagram shows how the work progressed over a lengthy seven-year drafting process. Before final standardization, the documents went through 27 draft versions. In June 2014, the so-called RFC 723x series was published (where x ranges from 0 to 5). The chair of the HTTPbis working group celebrated this achievement with the phrase “RFC2616 is dead”. In case it was not clear, the new documents obsoleted the old RFC 2616.

What does this have to do with HTTP/3?


While the IETF was working on the RFC 723x series, the world did not stand still. People continued to extend and build on HTTP. Among them were Google engineers, who had begun to experiment with their own protocol called SPDY (pronounced “speedy”). They claimed that this protocol sped up the loading of web pages, a primary use case for HTTP. The first version was announced at the end of 2009, and SPDY v2 followed quickly in 2010.

I do not want to go into the technical details of SPDY, but it is important to understand that SPDY took the main HTTP paradigms and slightly changed the exchange format for optimization. Looking back, we see that HTTP has clearly defined semantics and syntax. Semantics describes the concept of exchanging requests and responses, including methods, status codes, header fields (metadata) and bodies (useful data). The syntax describes how to map semantics to bytes on a network.

HTTP/0.9, 1.0 and 1.1 have a lot of common semantics. They also use a common syntax in the form of character strings sent over TCP connections. SPDY took the HTTP/1.1 semantics and changed the syntax from strings to binary. This is a really interesting topic, but today we will not go down that rabbit hole.
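To make the distinction between semantics and syntax concrete, here is a minimal Python sketch. The `Request` class stands in for the semantics; the two serializers are two different syntaxes for it. The text serializer follows the familiar HTTP/1.1 request layout, while the binary one is a made-up length-prefixed format invented for this illustration. It is not the SPDY or HTTP/2 framing, and all function names here are hypothetical.

```python
import struct
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Request:
    """Semantics: method, target, header fields and body, with no wire format implied."""
    method: str
    target: str
    headers: Dict[str, str] = field(default_factory=dict)
    body: bytes = b""


def to_http11_text(req: Request) -> bytes:
    """Textual syntax in the style of HTTP/1.1: request line, header lines, blank line, body."""
    lines = [f"{req.method} {req.target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in req.headers.items()]
    return "\r\n".join(lines).encode("ascii") + b"\r\n\r\n" + req.body


def _pack(value: bytes) -> bytes:
    # 16-bit big-endian length prefix followed by the raw bytes.
    return struct.pack("!H", len(value)) + value


def to_toy_binary(req: Request) -> bytes:
    """Toy binary syntax (an assumption for illustration, NOT SPDY or HTTP/2)."""
    out = _pack(req.method.encode("ascii")) + _pack(req.target.encode("ascii"))
    out += struct.pack("!H", len(req.headers))
    for name, value in req.headers.items():
        out += _pack(name.encode("ascii")) + _pack(value.encode("ascii"))
    return out + _pack(req.body)


def from_toy_binary(data: bytes) -> Request:
    """Decode the toy binary syntax back into the same semantic object."""
    offset = 0

    def take() -> bytes:
        nonlocal offset
        (length,) = struct.unpack_from("!H", data, offset)
        offset += 2
        value = data[offset:offset + length]
        offset += length
        return value

    method = take().decode("ascii")
    target = take().decode("ascii")
    (count,) = struct.unpack_from("!H", data, offset)
    offset += 2
    headers: Dict[str, str] = {}
    for _ in range(count):
        name = take().decode("ascii")
        headers[name] = take().decode("ascii")
    return Request(method, target, headers, take())
```

The same `Request` produces two entirely different byte strings, yet decoding the binary form gives back an identical object: the syntax changed, the semantics did not.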

Experiments with SPDY showed that changing the HTTP syntax brings real benefits. At the same time, it is important to preserve the existing semantics. For example, keeping the https:// URL format avoided many problems that could have hindered adoption.

After seeing some positive results, the IETF decided it was time to consider options for HTTP/2.0. The slides from the HTTPbis session held during the IETF 83 meeting in March 2012 outline the requirements and goals the developers had set. They also state clearly: “HTTP/2.0 only signifies that the wire format isn't compatible with that of HTTP/1.x”.



During this meeting, the community was invited to share proposals. Among the drafts submitted for consideration were draft-mbelshe-httpbis-spdy-00, draft-montenegro-httpbis-speed-mobility-00 and draft-tarreau-httpbis-network-friendly-00. In the end, the SPDY draft was adopted, and in November 2012 work began on draft-ietf-httpbis-http2-00. After 18 drafts over a little more than two years, RFC 7540 - HTTP/2 was published. By 2015, the HTTP/2 syntax had diverged just enough to make HTTP/2 and SPDY incompatible.

These years were a very busy period for HTTP work at the IETF, with the HTTP/1.1 refactoring and the standardization of HTTP/2 taking place in parallel. This contrasts sharply with the many quiet years of the early 2000s. Be sure to check out the full cladogram to truly appreciate the amount of work done.

Although HTTP/2 was being standardized, there was still benefit in using and experimenting with SPDY. Cloudflare introduced support for SPDY in August 2012 and removed it only in February 2018, when our statistics showed that less than 4% of web clients still requested it. Meanwhile, we introduced HTTP/2 support in December 2015, shortly after the RFC was published, once our analysis showed meaningful support among web clients.

The SPDY and HTTP/2 protocols use TLS by default. The introduction of Universal SSL in September 2014 ensured that all Cloudflare users could take advantage of new protocols as they were rolled out.

gQUIC


Google continued to experiment and by 2015 had released two further versions, SPDY v3 and v3.1. They also began working on the gQUIC protocol, the first draft of which was published in early 2012.

Early versions of gQUIC used the SPDY v3 form of HTTP syntax. This choice made sense, because HTTP/2 had not yet been finalized. The SPDY binary syntax was packaged into QUIC packets, which were sent in UDP datagrams. This was a departure from the TCP transport that HTTP had traditionally relied on. Stacked together, it all looked like this:


SPDY over gQUIC layer cake

gQUIC used clever tricks to improve performance. One of them was to blur the clear boundary between application and transport. In practice, this meant that gQUIC supported only HTTP. The coupling was so strong that gQUIC, which at the time was simply called QUIC, was regarded as a candidate for the next version of HTTP. Although QUIC has changed a great deal since then, many people still believe that it supports only HTTP. Unfortunately, this leads to constant confusion when discussing the protocol.

gQUIC continued to evolve and eventually switched to a syntax much closer to HTTP/2. So close that most people began to call it “HTTP/2 over QUIC”. But due to technical constraints, some very subtle differences remained. One example is how HTTP headers are serialized and exchanged. It is a minor difference, but in practice it means that HTTP/2 over gQUIC is incompatible with the IETF's HTTP/2.

Last but not least, the security aspects of Internet protocols must always be considered. The gQUIC developers chose not to use TLS, in favor of a different approach called QUIC Crypto. One of its interesting innovations was a new method of speeding up handshakes. Once a client has established a secure session with a server, it can reuse that information to perform a “zero round-trip time” handshake, or 0-RTT, on later connections. This technique was later incorporated into TLS 1.3.

Can we finally find out what HTTP/3 is?


Nearly.

By now we understand how standardization works, and gQUIC followed the same path. In June 2015, the first draft, draft-tsvwg-quic-protocol-00, titled “QUIC: A UDP-based Secure and Reliable Transport for HTTP/2”, was submitted. Recall what I said earlier: by then, the protocol's syntax was almost, but not quite, HTTP/2.

Google announced that a BoF would be held at the IETF 93 meeting in Prague. If you are curious what a BoF is, see RFC 6771. In short, a BoF (Birds of a Feather) is an informal meeting at a conference.



Following the discussion at the IETF, it was decided that QUIC offered many advantages at the transport layer, and that it should be decoupled from HTTP, restoring a clean separation between the layers. In addition, it was decided to return to a TLS-based handshake (which was not such a loss, since TLS 1.3 was already in development by then and included 0-RTT handshakes).

About a year later, in 2016, a new set of drafts was submitted.


This is where the confusion crept back in: draft-shade-quic-http2-mapping-00 is titled “HTTP/2 Semantics Using The QUIC Transport Protocol” and describes “a mapping of HTTP/2 semantics over QUIC”. However, this name is a misnomer. The whole point of HTTP/2 was to change the syntax while keeping the semantics. In addition, “HTTP/2 over gQUIC” had never been an accurate description of the syntax either, for the reasons I gave earlier. Keep this in mind as we go on.

This IETF version of QUIC should be a completely new transport protocol. This is a serious undertaking, so the IETF tried to assess the interest in the project from its members. To this end, a BoF session ( slides ) was held at the IETF 96 meeting in Berlin in 2016. I was fortunate to be present in person at that meeting, which attracted hundreds of participants, as evidenced by the photograph of Adam Roach . As a result, consensus was reached: QUIC will be adopted and standardized in the IETF.

The first IETF QUIC draft for mapping HTTP onto the QUIC transport, draft-ietf-quic-http-00, sensibly simplified the protocol's name to “HTTP over QUIC”. Unfortunately, the job was not finished, and the term HTTP/2 still appeared in many places throughout the body of the document. Mike Bishop, the draft's new editor, spotted the problem and began correcting the erroneous HTTP/2 references. In the next version of the draft, the description changed to “a mapping of HTTP semantics over QUIC”.

Gradually, with successive versions, the term “HTTP/2” was used less and less, and where it was still needed it became a simple reference to RFC 7540. Two years later, in October 2018, the seventeenth version of the draft (number 16) was published. Although HTTP over QUIC resembles HTTP/2, it is in fact an independent and incompatible HTTP syntax. However, for people who do not follow the IETF closely (a very large percentage of the world's population), the document's title does not convey this difference. One of the main goals of standardization is to aid communication and interoperability. Yet something as simple as naming had become a major source of confusion in the community.

Recall what was said in 2012: “HTTP/2.0 only signifies that the wire format isn't compatible with that of HTTP/1.x”. The IETF followed this precedent. After much discussion before and during IETF 103, consensus was reached to rename “HTTP over QUIC” to HTTP/3.

The world is now a better place, and we can move on to more important debates.

But RFC 7230 and 7231 disagree with your definition of semantics and syntax!


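Sometimes document titles can be confusing. Here are the HTTP documents that describe syntax and semantics: RFC 7230, “Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing”, and RFC 7231, “Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content”.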


With names like these, one might assume that the fundamental semantics of HTTP are specific to one version of HTTP, namely HTTP/1.1. But this is an accidental side effect of the HTTP family tree. The good news is that the HTTPbis working group is trying to fix this. Some brave members of the working group have begun another round of document revision. This work is under way right now and is known as HTTP Core (you may also have heard of it under the names HTTPtre or HTTPter; naming is hard here too). The effort will condense the six drafts into three.


Under this new structure, it becomes more obvious that HTTP/2 and HTTP/3 are syntax definitions for the common HTTP semantics. This does not mean that they have no features of their own beyond syntax, but it should help frame future discussions.

Putting it all together


This article has skimmed the surface of the HTTP standardization process in the IETF over the past three decades. Without really touching on the technical details, I have tried to explain how we arrived at HTTP/3. If you skipped the middle and are looking for the gist in one sentence, here it is: HTTP/3 is just a new HTTP syntax that runs on IETF QUIC, a multiplexed and secure UDP-based transport. There are many interesting technical details, but those will have to wait for another time.

We have looked at the important steps in the development of HTTP and TLS, but separately from each other. To finish, here once more is the full cladogram. You can study it carefully at your leisure. And for the most dedicated readers, here is the complete version, including draft numbers.

Source: https://habr.com/ru/post/438810/