The internet keeps growing, and with it the demands and load on existing infrastructure. New and better protocols and specifications keep appearing in every direction, be it OpenAPI or HTTP/3. This is not limited to HTTP: over time there has also been a significant boom in messaging systems like RabbitMQ and related services like Amazon SQS. In this post I have tried to collect as much information as I can about the evolution of HTTP without going too deep into any particular part.
In today’s post I will dig in and show what has changed over time, with a little bit of context.
Hypertext Transfer Protocol (HTTP) is an application-layer protocol for transmitting hypermedia documents, such as HTML. It was designed for communication between web browsers and web servers, but it can also be used for other purposes. HTTP follows a classical client-server model, with a client opening a connection to make a request, then waiting until it receives a response.
Before getting started, for those who don’t know, when you send or receive an HTTP request, the packets go through these layers.
This video does a good job of explaining briefly how each layer works.
HTTP is just one application-layer protocol among many others in the same layer. It is implemented by many different HTTP clients and servers.
HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used.
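To make the point about transports concrete, here is a minimal sketch in Python. It is only an illustration: the helper names `serve_once` and `fetch_over_socketpair` are made up for this post, and the "server" is just a thread that answers with a fixed response. The two ends talk over `socket.socketpair()`, a local reliable byte stream, rather than a TCP connection to a remote host.

```python
import socket
import threading


def serve_once(conn: socket.socket) -> None:
    """Read one request head, then answer with a fixed HTTP/1.0 response."""
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(1024)
        if not chunk:
            break
        data += chunk
    body = b"<P>Hello</P>"
    head = (
        "HTTP/1.0 200 OK\r\n"
        "Content-Type: text/html\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    ).encode("ascii")
    conn.sendall(head + body)
    conn.close()  # closing the stream marks the end of the response


def fetch_over_socketpair(path: str) -> bytes:
    """Send a GET request over a local socket pair and return the raw reply."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=serve_once, args=(server,))
    worker.start()
    client.sendall(f"GET {path} HTTP/1.0\r\n\r\n".encode("ascii"))
    response = b""
    while True:
        chunk = client.recv(1024)
        if not chunk:  # the other end closed the stream
            break
        response += chunk
    worker.join()
    client.close()
    return response


if __name__ == "__main__":
    print(fetch_over_socketpair("/index.html").decode("ascii"))
```

Nothing in the exchange depends on TCP itself; the same bytes would work over any stream that delivers them reliably and in order.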
Without further ado, let’s get started.
HTTP was developed by Tim Berners-Lee, as you might have heard. It was built between 1989 and 1991 and had no version number back then. It was later dubbed HTTP/0.9.
Back then, HTTP was:
- a simple protocol to transfer HTML files from server to client
- there were no headers and no status codes, just one TCP connection
- the client sent a request, and the server sent a response
It was that simple.
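HTTP/0.9 was developed by Tim Berners-Lee at the European Organisation for Nuclear Research (CERN) in 1989. The initial proposal consisted of 4 building blocks:
- A textual format to represent hypertext documents (HTML)
- A simple protocol to exchange these documents (HTTP)
- A web browser
- A web server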
Yes, it all started at the same time.
The first servers were running out of CERN by early 1991.
A simple request made by an HTTP client might look like this today:
GET /index.html HTTP/1.1
But back then, HTTP/0.9 had no version number. Requests used to look like this:
GET /index.html
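As a rough sketch of the difference, the hypothetical helper below builds both kinds of request. `build_request` is not part of any library; it just assembles the bytes a client would send, assuming an HTTP/0.9 request is a single line with no version and no headers.

```python
from typing import Optional


def build_request(path: str, version: Optional[str] = None,
                  host: Optional[str] = None) -> bytes:
    """Build a GET request; version=None means the HTTP/0.9 one-line form."""
    if version is None:
        # HTTP/0.9: method and path only, no version, no headers.
        return f"GET {path}\r\n".encode("ascii")
    lines = [f"GET {path} HTTP/{version}"]
    if host is not None:
        lines.append(f"Host: {host}")
    # Later versions end the header section with an empty line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


if __name__ == "__main__":
    print(build_request("/index.html"))
    print(build_request("/index.html", "1.1", "example.com"))
```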
HTML also had some very basic tags, all written in all-caps. Some of the tags of that time, such as the A (anchor), P (paragraph), and OL tags, are still in use today. Others, such as the HP2 tags, are either no longer used or have been replaced by something else. You can still find the documentation on the CERN website.
Unlike subsequent evolutions, there were no HTTP headers, meaning that only HTML files could be transmitted, and no other type of document. There were no status or error codes: in case of a problem, a specific HTML file was sent back with a description of the problem, for human consumption.
The first official spec of HTTP, HTTP/1.0, came out in 1996. Some of the notable advancements in HTTP/1.0 include:
- Status codes were introduced to let the client know whether a request succeeded or failed. Many more status codes were added in later versions of HTTP.
- HTTP headers were introduced for both requests and responses, allowing metadata to be transmitted and making the protocol extremely flexible and extensible. Thanks to the Content-Type header, this was the first time documents other than plain HTML files could be transmitted.
- The HEAD and POST methods were introduced alongside GET.
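The pieces listed above (status code, headers, body) can be seen by pulling a raw response apart. This is a minimal sketch with a hypothetical `parse_response` helper; it assumes a complete, well-formed response held in memory and does none of the error handling a real client needs.

```python
from typing import Dict, NamedTuple


class Response(NamedTuple):
    version: str
    status: int
    reason: str
    headers: Dict[str, str]
    body: bytes


def parse_response(raw: bytes) -> Response:
    """Split a raw HTTP/1.x response into status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return Response(version, int(code), reason, headers, body)


if __name__ == "__main__":
    raw = (b"HTTP/1.0 404 Not Found\r\n"
           b"Content-Type: text/plain\r\n"
           b"\r\n"
           b"missing")
    print(parse_response(raw))
```

The Content-Type header is what tells the client how to interpret the body, which is why headers made non-HTML documents possible.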
The second iteration of HTML was introduced around the same time.
If you feel determined, here is the official spec for this version of HTTP: https://tools.ietf.org/html/rfc1945
HTTP/1.1 was a proper standardization of HTTP and stayed in use for around 18 years until HTTP/2 came out.
- In HTTP/1.1, a client can send further requests without waiting for earlier ones to complete.
- Additional cache control mechanisms were introduced. Servers became more flexible in delivering content based on criteria such as language and encoding.
- The Host header field became mandatory in all HTTP/1.1 request messages. A server must respond with a 400 (Bad Request) status code to any HTTP/1.1 request message that lacks a Host header field. With the Host header available, it became possible to host multiple websites on the same IP address.
- A connection can be reused, saving the time needed to reopen it repeatedly to fetch the resources embedded in the original document.
- Pipelining was added, allowing a client to send a second request before the response to the first one is fully transmitted, lowering the latency of the communication.
- Content negotiation, including language, encoding, or type, was introduced, allowing a client and a server to agree on the most suitable content to exchange. This means that if the language in your HTTP client is set to Spanish and a server has content in multiple languages, the server returns the content in Spanish.
- Finer-grained caching was introduced through the Cache-Control header. More on this at https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control
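To illustrate the content negotiation point from the list above, here is a simplified sketch of how a server might pick a language from the Accept-Language header. The function names are hypothetical, the fallback to `en` is an assumption, and real servers handle more cases (wildcards, for example) than this does.

```python
from typing import List, Sequence, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (language tag, quality) pairs, most preferred first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, *params = [part.strip() for part in item.split(";")]
        quality = 1.0  # a missing q parameter means full preference
        for param in params:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0
        prefs.append((tag.lower(), quality))
    # sorted() is stable, so equal qualities keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_language(header: str, available: Sequence[str],
                    default: str = "en") -> str:
    """Pick the best available language, falling back to `default`."""
    for tag, quality in parse_accept_language(header):
        if quality <= 0:
            continue
        # Try the exact tag first, then its primary subtag ("es-es" -> "es").
        for candidate in (tag, tag.split("-", 1)[0]):
            if candidate in available:
                return candidate
    return default


if __name__ == "__main__":
    print(choose_language("es-ES,es;q=0.9,en;q=0.8", ["en", "es", "fr"]))
```

With a browser set to Spanish, the header in the usage line leads the sketch to choose `es` from the available languages.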
In 2000, a new pattern for using HTTP was designed: representational state transfer (or REST), providing interoperability between computer systems on the internet. The term is intended to evoke an image of how a well-designed Web application behaves.
Transport Layer Security (TLS 1.0) was standardized around the same time, in 1999. CORS came later, during the long HTTP/1.1 era.
More about HTTP/1.1 in https://tools.ietf.org/html/rfc2616. RFC 2616 was later obsoleted by a six-part specification (RFC 7230 through RFC 7235), which is listed at the bottom of the https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#History section.
Officially standardized in May 2015, the new HTTP/2 protocol is faster than HTTP/1.1 and, in practice, only works over HTTPS, because browsers support it only over TLS. HTTP/2 was already being used by 8.7% of all websites.
Version 2 is focused on performance: the same server can deliver more with it. We still use HTTP requests, but a single TCP connection carries as many requests and responses as the client or the server wants to send. This is called multiplexing.
So if the client sends three requests and the server responds to all three, how does the client know which response belongs to which request? This is where streams come in. Each HTTP/2 frame is tagged with a stream ID.
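The stream idea can be sketched in a few lines. This is a toy model, not the real HTTP/2 wire format: `Frame` and `demultiplex` are hypothetical names, and the frames carry only a stream ID, a payload, and an end-of-stream flag.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class Frame(NamedTuple):
    stream_id: int
    payload: bytes
    end_stream: bool = False


def demultiplex(frames: Iterable[Frame]) -> Dict[int, bytes]:
    """Reassemble interleaved frames into one complete body per stream."""
    buffers = defaultdict(bytes)
    finished = {}
    for frame in frames:
        buffers[frame.stream_id] += frame.payload
        if frame.end_stream:
            # The last frame of a stream completes that response.
            finished[frame.stream_id] = buffers.pop(frame.stream_id)
    return finished


if __name__ == "__main__":
    # Three responses interleaved on one connection.
    wire = [
        Frame(1, b"<html>"),
        Frame(3, b"body{"),
        Frame(1, b"</html>", end_stream=True),
        Frame(5, b"{}", end_stream=True),
        Frame(3, b"}", end_stream=True),
    ]
    print(demultiplex(wire))
```

Because every frame names its stream, the responses can arrive in any order and in any interleaving without being mixed up.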
The HTTP/2 protocol has several prime differences from the HTTP/1.1 version:
- It is a binary protocol rather than a text-based one.
- It is a multiplexed protocol. Parallel requests can be handled over the same connection, removing the order and blocking constraints of the HTTP/1.x protocol.
- It compresses headers. As these are often similar among a set of requests, this removes duplication and overhead from the data transmitted.
- It allows a server to populate data in a client cache, in advance of being required, through a mechanism called server push.
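To give a feel for what "binary protocol" means in the list above, here is a small sketch that packs and unpacks the 9-byte frame header described in RFC 7540, section 4.1: a 24-bit payload length, an 8-bit type, 8 bits of flags, one reserved bit, and a 31-bit stream identifier. The function names are made up for this post, and the sketch covers only the header, not frame payloads or header compression.

```python
import struct
from typing import Tuple

DATA = 0x0        # frame type for DATA frames
END_STREAM = 0x1  # flag marking the last frame of a stream


def pack_frame_header(length: int, frame_type: int, flags: int,
                      stream_id: int) -> bytes:
    """Encode the fixed 9-byte header that precedes every HTTP/2 frame."""
    if not 0 <= length < 2 ** 24:
        raise ValueError("length must fit in 24 bits")
    if not 0 <= stream_id < 2 ** 31:
        raise ValueError("stream_id must fit in 31 bits")
    # struct has no 24-bit type: pack 32 bits and drop the leading byte.
    return struct.pack(">I", length)[1:] + struct.pack(
        ">BBI", frame_type, flags, stream_id)


def unpack_frame_header(header: bytes) -> Tuple[int, int, int, int]:
    """Decode a 9-byte header into (length, type, flags, stream_id)."""
    length = int.from_bytes(header[:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", header[3:9])
    return length, frame_type, flags, stream_id & 0x7FFFFFFF  # drop reserved bit


if __name__ == "__main__":
    header = pack_frame_header(5, DATA, END_STREAM, 1)
    print(header.hex(), unpack_frame_header(header))
```

Compared with the text-based request lines of HTTP/1.x, a fixed-size header like this is cheaper to parse and leaves no ambiguity about where a message ends.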
At the time of writing, HTTP/3 has come out, but covering it in detail is out of the scope of this post.
At the time of writing, this site is being served to you over HTTP/2.
More about HTTP/2 at https://httpwg.org/specs/rfc7540.html. For a gentle introduction, see https://www.ssl.com/article/an-introduction-to-http2/.
HTTP 3.0 - QUIC
Until now, TCP was used as the transport for HTTP. In HTTP/3 the transport is switched to QUIC, which runs over UDP.
It is still in development and is being adopted by various libraries in different languages.
As of now, the latest versions of Chrome and Firefox already support HTTP/3.
This is not an exhaustive list of features, but I have provided links wherever I could if you want to dig deeper.