Monday, February 5, 2007

Server to Web Browser – What Happens?

Each time you click a link on a web page or type an address into your web browser, you are making a ‘request’ for a certain document. That request is formatted according to the Hypertext Transfer Protocol (HTTP) and sent over the Internet to the server that holds the document in question. If all goes well, the server responds by sending the document, usually a web page of text and graphics.

HTTP is an application-level protocol in the Internet protocol suite (often called TCP/IP). A ‘client’ such as a web browser uses it to talk to the server that hosts a particular website, over a connection established with TCP. The server waits for incoming requests by listening on TCP port 80.

Transmission Control Protocol (TCP) is used to create connections between two computers on the Internet so they can exchange data. TCP identifies each end of a connection by its IP address and port number, and it numbers the data it sends so that the data can be reassembled in the correct order once it arrives at its destination.

Several TCP ports have standardized uses. TCP port 21, for example, is reserved for the control connection of FTP (File Transfer Protocol), which is used for uploading and downloading files. Port 80 is the standard port for HTTP.

When the server receives a request on TCP port 80 whose first line has a form such as GET / HTTP/1.1, it replies with a status code that indicates whether the requested web page is available. A typical request looks like this:

GET /faq.html HTTP/1.1
Host: www.mywebsite.com

This is a request for http://www.mywebsite.com/faq.html. The ‘Host’ header carries only the host name, without the http:// prefix. HTTP/1.1 requires it so that a shared server, which hosts many websites on one address, can tell which site the request is for. If faq.html is available the server will respond:

HTTP/1.1 200 OK
Date: Wed, 12 Oct 2005 22:38:34 GMT
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT

…followed by the actual web page.
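The exchange above can be sketched in a few lines of Python. This is only an illustration, not how any particular browser or server is written: the function names build_request and parse_response_head are made up for this example, and nothing is sent over the network. It simply shows how the request text is put together and how the first lines of a response can be taken apart.

```python
def build_request(host, path):
    """Return the raw bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",  # host name only, no "http://" prefix
        "",               # an empty line marks the end of the headers
        "",
    ]
    # HTTP separates lines with a carriage return plus line feed.
    return "\r\n".join(lines).encode("ascii")


def parse_response_head(raw):
    """Split a raw response into (version, status code, reason, headers)."""
    # Everything before the first blank line is the status line and headers.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    status_line, *header_lines = head.split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers
```

Calling build_request("www.mywebsite.com", "/faq.html") produces the two request lines shown earlier, followed by the blank line that tells the server the request is complete.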

HTTP/1.1 200 OK means that the request succeeded and the requested web page follows. Other codes can also be returned. The code 404, for example, means that the server cannot find the requested page.

The web page is sent via TCP as a series of data packets. Each packet has headers that specify its destination address and port and its position in the data stream. The packets do not all have to take the same path to reach their destination. Each one is passed from router to router, and every router forwards it toward the destination based on what it currently knows about the network. If the link to one router is unavailable, the data can be sent through another one.

As the data is received, the TCP software on the client (the computer running the web browser) sends back acknowledgements. If the server does not receive an acknowledgement for part of the data within a certain time, it re-transmits that part. TCP also uses a checksum carried in each packet to check that the data is undamaged. The data is reassembled in the correct order thanks to the sequence number of each data packet. Voila! The web page appears on your computer screen.
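Here is a toy model of the reassembly step, assuming a much simplified picture of TCP. The names make_segments and reassemble are invented for this sketch, the sequence number is just the byte offset of each piece, and CRC-32 stands in for the checksum (real TCP uses a different, 16-bit checksum). Real TCP would also ask for missing data again; this version just reports the gap.

```python
import zlib


def make_segments(data, size):
    """Cut data into (sequence number, checksum, payload) tuples."""
    segments = []
    for offset in range(0, len(data), size):
        payload = data[offset:offset + size]
        # Assumption: CRC-32 is used here in place of TCP's own checksum.
        segments.append((offset, zlib.crc32(payload), payload))
    return segments


def reassemble(segments):
    """Discard damaged or duplicate segments, then rebuild the stream in order."""
    good = {}
    for seq, checksum, payload in segments:
        if zlib.crc32(payload) == checksum:
            good[seq] = payload  # a duplicate simply overwrites itself
    stream = b""
    for seq in sorted(good):
        if seq != len(stream):
            # A gap means a segment was lost or damaged and must be re-sent.
            raise ValueError(f"missing data at byte {len(stream)}")
        stream += good[seq]
    return stream
```

Feeding reassemble the segments in any order, even with duplicates, gives back the original bytes. A segment whose payload no longer matches its checksum is dropped, which leaves a gap and raises the error.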

The TCP connection can be kept alive for additional requests from the client. This allows several pages to be requested within a short time period without the overhead of opening and closing a new TCP connection for each one. Either client or server can close the connection at any time.
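To see a kept-alive connection in action, the sketch below uses Python's standard http.server and http.client modules. It is a small demonstration under stated assumptions rather than a real web server: PageHandler and fetch_over_one_connection are names made up for this example, and the server runs on the local machine on whatever free port the system hands out, not on port 80.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PageHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets the connection stay open between requests

    def do_GET(self):
        body = f"You asked for {self.path}\n".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length tells the client where this response ends.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demonstration quiet


def fetch_over_one_connection(paths):
    """Start a throwaway local server and request every path over one connection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            body = response.read()  # read the whole body before the next request
            local_port = conn.sock.getsockname()[1]  # client side of the TCP connection
            results.append((response.status, body.decode("ascii"), local_port))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results
```

Each result records the client's local port number. If that number is the same for every request, all of them travelled over a single TCP connection.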
