Wednesday, February 29, 2012

The World Wide Web: Past, Present, and Future

Introduction

This post covers the World Wide Web from its inception to its current state, with a look into the future. Three core technologies are discussed: the URI, HTTP, and HTML.

Berners-Lee — A Pioneer

In 1991, a 35-year-old British computer engineer had finally cracked it. He had a good feeling about it. This was to be his legacy. He knew this was going to change the world. After around a decade of work, the hard-working Timothy John Berners-Lee, a scientist at CERN, had created what was to be the World Wide Web.

In 1980, Berners-Lee developed an innovative software project named ENQUIRE. It featured hyperlinks, with pages linking to each other inside a closed system. In 1989, realising some of the limitations of ENQUIRE, Berners-Lee proposed the creation of a "Hypertext project". The World Wide Web project was born.

By 1991, the ever-industrious Berners-Lee had created:

  • The first web browser (WorldWideWeb)
  • The first web server (CERN httpd)
  • The first web pages

That's a lot of firsts. Additionally, Berners-Lee's genius was to add a hyperlink system, allowing pages to freely (and easily) link to each other. This would create a "web" of pages, all of which were interlinked.

The best part? Berners-Lee knew the potential, and went ahead and made everything freely available.

Three Core Technologies

The World Wide Web (WWW) is essentially a client-server architecture that relies on three core technologies.

  • URI (Uniform Resource Identifier)
  • HTTP (Hypertext Transfer Protocol)
  • HTML (HyperText Markup Language)

URI

A URI identifies a resource on the web, such as a document or picture. The most familiar kind of URI is the URL (Uniform Resource Locator), which also says where the resource can be found. A typical web URI is split into four sections: the protocol, the computer name, the path, and the resource name. This is what a typical URI looks like:

[Image: URI scheme for an example URL]
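
As a rough illustration of the four sections described above, here is a minimal Python sketch that picks a URI apart using the standard library. The function name `split_uri`, the dictionary keys, and the example address are made up for this post; a full URI can also carry a port, a query string, and a fragment, which this sketch ignores.

```python
import posixpath
from urllib.parse import urlsplit


def split_uri(uri):
    """Break a web URI into the four sections used in this post."""
    parts = urlsplit(uri)
    # Separate the directory part of the path from the final resource name.
    directory, resource = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,
        "computer_name": parts.hostname,
        "path": directory,
        "resource_name": resource,
    }


# Hypothetical address, chosen only for illustration.
sections = split_uri("http://www.example.com/articles/web/history.html")
for name, value in sections.items():
    print(f"{name}: {value}")
```

The labels follow the wording of this post rather than the RFCs, which call the first two parts the scheme and the authority.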

The URI syntax is defined in a number of RFCs (1630, 1738, 2141, 2396, 2717, 2732, and finally, 3986).

HTTP

The Hypertext Transfer Protocol is the backbone of data transmission on the World Wide Web.

It works using a simple client-server model. The client (typically a web browser) sends an HTTP request to a program on another computer (the server). The server attempts to fulfil the request and issues a response (such as an HTML file).
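
To make the request-response exchange a little more concrete, here is a small Python sketch that builds the text of a request and picks apart a response. It never touches the network: `CANNED_RESPONSE` is a made-up stand-in for what a server might send back, and the function names are my own, not part of any library.

```python
def build_request(method, path, host):
    """Return the text of a minimal HTTP/1.1 request."""
    lines = [
        f"{method} {path} HTTP/1.1",  # request line: method, path, version
        f"Host: {host}",              # the one header HTTP/1.1 requires
        "Connection: close",
        "",                           # blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split a raw HTTP response into status code, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


# Hypothetical server reply, written by hand for this sketch.
CANNED_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 19\r\n"
    "\r\n"
    "<p>Hello World!</p>"
)

print(build_request("GET", "/index.html", "www.example.com"))
print(parse_response(CANNED_RESPONSE))
```

A real client would send the request over a TCP socket and read the reply from it; the parsing here is deliberately naive and skips things like repeated headers.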

Past — HTTP V0.9

HTTP V0.9 was the first version, developed in 1991, and had a single method: GET (the client asks the server to respond with an HTML file).
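
For a sense of just how bare-bones that first version was, this short Python sketch compares an HTTP/0.9 request with the request line used by later versions. The function names are hypothetical and nothing is sent anywhere; the code only builds the strings.

```python
def simple_request(path):
    """HTTP/0.9: the method, the document path, and nothing else."""
    return f"GET {path}\r\n"


def full_request(path, version="1.0"):
    """HTTP/1.0 and later: the request line names a version and a blank
    line closes the header section (no headers are sent in this sketch)."""
    return f"GET {path} HTTP/{version}\r\n\r\n"


print(repr(simple_request("/index.html")))
print(repr(full_request("/index.html")))
```

An HTTP/0.9 server replied with the HTML itself, with no status line or headers, and then closed the connection.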

Past — HTTP V1.0

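The next version of HTTP, documented in RFC 1945 in 1996, added the HEAD and POST methods, along with headers and status codes. (PUT and DELETE were only listed in an appendix as additional methods; TRACE and CONNECT came with HTTP/1.1, and PATCH was not standardised until 2010.) Version 1.0 also added support for caching.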

Present — HTTP V1.1

The current standard, first specified in 1997 (RFC 2068) and revised in 1999 (RFC 2616). Version 1.1 of HTTP added several speed-related enhancements. Persistent connections allow for multiple request-response transactions per connection, so the cost of opening a new TCP connection and ramping up through TCP slow start isn't paid again for every request. Caching and compression also received significant improvements.
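
Persistent connections are easy to try out with Python's standard library. The sketch below starts a throwaway server on the local machine, makes several requests through one `http.client.HTTPConnection`, and checks that the same socket was used each time. The handler class, function name, and paths are invented for this post, and the whole thing assumes the machine allows loopback connections.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    # Speaking HTTP/1.1 lets the server keep the connection open.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = f"You asked for {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends,
        # so the same connection can carry the next one.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_over_one_connection(paths):
    """Fetch every path through a single connection to a local test server."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    bodies, sockets = [], []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            bodies.append(response.read().decode("ascii"))
            sockets.append(conn.sock)  # None if the connection was closed
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    reused = sockets[0] is not None and all(s is sockets[0] for s in sockets)
    return bodies, reused


bodies, reused = fetch_over_one_connection(["/first", "/second"])
print(bodies, "same connection:", reused)
```

If the handler spoke HTTP/1.0 instead, the server would close the connection after each reply and the client would have to reconnect for every request.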

HTTP/1.1 added the OPTIONS method, although it isn't often used.

Future — SPDY?

While there doesn't seem to be a forthcoming HTTP/2.0, some organisations have stepped up to the plate to help speed up and improve the web. Google, a company famously obsessed with speed, has come up with an open protocol named SPDY.

SPDY does not replace HTTP directly. It sits between HTTP and the underlying connection and changes how requests and responses are carried over the wire. By multiplexing requests over a single connection and compressing headers, Google have claimed speed increases of around 55%. Is it effectively HTTP 2.0?

Keeping Safe — HTTPS

Standard HTTP traffic is essentially sent in plaintext — unencrypted. It is fairly trivial for a malicious computer user to read sensitive data being transmitted between a server and an unsuspecting client.

With the rise of the WWW, there grew a need to encrypt traffic between a client and a server.

Hypertext Transfer Protocol Secure is simply HTTP layered on top of the SSL (Secure Sockets Layer) protocol or its successor, TLS (Transport Layer Security).
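
Python's standard library reflects this layering quite directly. The sketch below creates a plain and a secure connection object for a placeholder host (nothing is actually sent, since neither object connects until a request is made) and prints what differs between them.

```python
import http.client
import ssl

# Default settings for the encryption layer: certificates are verified
# and the host name must match the certificate.
context = ssl.create_default_context()

# www.example.com is a placeholder host; no connection is opened here.
plain = http.client.HTTPConnection("www.example.com")
secure = http.client.HTTPSConnection("www.example.com", context=context)

print("HTTP port: ", plain.port)
print("HTTPS port:", secure.port)
print("HTTPS builds on HTTP:",
      issubclass(http.client.HTTPSConnection, http.client.HTTPConnection))
print("Certificates verified:", context.verify_mode == ssl.CERT_REQUIRED)
```

Everything above the encryption layer (methods, headers, status codes) is used in the same way on both kinds of connection.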

There are very few practical differences for the end user between HTTP and HTTPS. Modern web browsers are very good at showing the user that they are on an HTTPS connection.

[Image: SSL showing for Facebook on Google Chrome]

So why isn't the whole of the World Wide Web on HTTPS? Well, a number of reasons, none of them very good. Here are two of the most commonly-cited:

  • Caching isn't supported for data transferred via SSL.
  • Keys need to be exchanged, and data needs to be encrypted and decrypted, so SSL adds a performance hit.

Clint Ecker of Ars Technica lays out a very strong argument in favour of using HTTPS everywhere.

HTML

HyperText Markup Language is the main building block of any web page. It's a simple markup language made up of tags that define content. Most tags come in pairs, referred to as opening and closing tags (such as <p> and </p>), but some are standalone (such as <img>).

Below is an example HTML document:

<!DOCTYPE html>
<html>
  <head>
    <title>Hello HTML</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>
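
The difference between paired and standalone tags can be checked with a few lines of Python. This is only a sketch built on the standard library's `html.parser`; the class and function names are my own, and the second fragment (with the `logo.png` image) is made up for the example.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Remember which tags were opened and which were closed."""

    def __init__(self):
        super().__init__()
        self.opened = []
        self.closed = []

    def handle_starttag(self, tag, attrs):
        self.opened.append(tag)

    def handle_endtag(self, tag):
        self.closed.append(tag)


def unpaired_tags(markup):
    """Return the tags that were opened but never given a closing tag."""
    logger = TagLogger()
    logger.feed(markup)
    logger.close()
    return [tag for tag in logger.opened if tag not in logger.closed]


DOCUMENT = """<!DOCTYPE html>
<html>
  <head>
    <title>Hello HTML</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>"""

# Hypothetical fragment containing a standalone tag.
FRAGMENT = '<p>Hello <img src="logo.png" alt="Logo"></p>'

print(unpaired_tags(DOCUMENT))
print(unpaired_tags(FRAGMENT))
```

Every tag in the example document is paired, so only the fragment with the image reports a standalone tag.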

Past — HTML 1.0

In 1991, Tim Berners-Lee (yes, that man again) published the first description of HTML, a document called "HTML Tags". HTML is considered an application of SGML (Standard Generalized Markup Language), an ISO standard for document markup.

Past — HTML 2.0

HTML 2.0 was published in 1995 (RFC 1866) and incrementally improved upon up to 1997. Throughout its lifespan, it gained basic file uploading, tables, image maps, and support for more locales.

Past — HTML 3.2

In 1997, the W3C published HTML 3.2. It consolidated a lot of overlapping tags, and dropped the support for mathematical formulas that had been proposed for HTML 3.0.

Present — HTML 4.01

From 1997 up to 2000, HTML was incrementally improved upon: HTML 4.0 was published in December 1997 and revised as HTML 4.01 in December 1999. HTML 4.0 marked many visual tags (such as <font>) as deprecated. The W3C decided to position HTML as a content language, leaving the styling up to Cascading Style Sheets.

Future — HTML5


It's tough to categorise HTML5 as a "future" language, as it's been a working W3C draft since 2008, and it's in common use today.

New Features in HTML5

HTML5 is a big update. The W3C maintains a comprehensive list of all the new features added to HTML5; here are a few of the key ones.

Changes to Markup

HTML5 introduces some new tags that can replace generic <div> or <span> elements. These include <header>, <section>, and more. This creates a more semantic web, as well as cleaner, prettier code.

This is how a current web designer has to implement different sections of a page.

[Image: HTML4 DIV tag technique]

With HTML5, a designer will be able to use these more semantic tags.

[Image: HTML5 with more semantic tags]

(The above two images are taken from this article).
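
To give a feel for the two approaches in text form, here is a small Python sketch holding one page skeleton written both ways, plus a parser that lists how each section is labelled. Both fragments and the id names (`header`, `nav`, `content`, `footer`) are my own invention following common convention; they are not taken from the article the images came from.

```python
from html.parser import HTMLParser

# Hypothetical HTML 4 style: generic <div> tags told apart by their ids.
HTML4_PAGE = """
<div id="header">Site title</div>
<div id="nav">Links</div>
<div id="content">Article text</div>
<div id="footer">Copyright notice</div>
"""

# Hypothetical HTML5 style: the tag name itself says what the section is.
HTML5_PAGE = """
<header>Site title</header>
<nav>Links</nav>
<section>Article text</section>
<footer>Copyright notice</footer>
"""


class SectionLister(HTMLParser):
    """Record each opening tag together with its id attribute, if any."""

    def __init__(self):
        super().__init__()
        self.sections = []

    def handle_starttag(self, tag, attrs):
        self.sections.append((tag, dict(attrs).get("id")))


def list_sections(markup):
    lister = SectionLister()
    lister.feed(markup)
    lister.close()
    return lister.sections


print(list_sections(HTML4_PAGE))
print(list_sections(HTML5_PAGE))
```

In the first listing every section is a `div` and the meaning lives only in the id; in the second, a browser or search engine can tell the sections apart from the tag names alone.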

Certain deprecated elements have been completely dropped, including <font> and <center>.

New Doctype

A new, simplified doctype declaration is used: <!DOCTYPE html>.

Offline Storage

HTML5 offers the capability for some form of offline storage. This reduces the need for technologies like Google Gears, which had to be installed as a separate browser plugin.

Video and Audio

Unfortunately, one of the biggest problems with displaying videos on the web is the lack of a uniform way to do it. Flash came along and offered a reasonably efficient way of embedding videos securely, while simultaneously being offered on almost any platform.

The problem is that Flash isn't perfect. It stutters, stalls, doesn't work on every device (especially not on mobile devices), and is a proprietary plugin.

HTML5 attempts to solve this problem with its <video> and <audio> tags. One of the biggest problems that the W3C still faces is in choosing a single video format as a recommendation, as many of the good, efficient formats are proprietary (such as H.264).

Support

Browsers

The W3C has had to repeatedly push back the date to make HTML5 an official recommendation, as new features are constantly added and adjusted.

Almost all modern desktop browsers support some form of HTML5, including Chrome, Firefox, Safari, Opera, and Internet Explorer (yes, even IE).

In addition, major smartphone operating systems, such as Android and iOS, also support some of the features of HTML5 (including the <video> tag).

Website Adoption

Despite HTML5 not having reached W3C recommendation status, many of the largest websites have found the lure of its best features too hard to resist. A study conducted back in September 2011 found that 34 of the 100 most popular websites were actively using HTML5 as their primary doctype.


Conclusion

That concludes this post about the World Wide Web. While the current technologies (HTTP/1.1 and HTML 4.01) aren't bad, there's definitely room for improvement. As we move into a more mobile world, developers are receiving tools like SPDY and HTML5 to make the web browsing experience faster, safer, and richer in content than ever before.

It's pretty incredible to think that the brainchild of Sir Tim Berners-Lee a full twenty-three years ago is still in use today. It is a testament to his tremendous vision, which is still in the process of being realised.


