
How the webbrowser communicates with the webserver (simplified)

(Kristofer Gafvert, January 06, 2005)

Step 1

The user specifies which domain and port to connect to. Say that the user wants to visit www.ilopia.com. He or she then types the URL www.ilopia.com in the browser (Internet Explorer). Internet Explorer will in this case default to the HTTP protocol and to port 80. If the user wanted to connect to another port, he or she would have to write the port number after the host name (for example www.ilopia.com:8080).
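
The defaulting rule can be sketched in a few lines of Python. The helper name target_of and the DEFAULT_PORTS table are made up for this note; they only mimic the behaviour described above and say nothing about how Internet Explorer is implemented.

```python
from urllib.parse import urlsplit

# Well-known default ports per scheme (illustrative table for this sketch).
DEFAULT_PORTS = {"http": 80, "https": 443}

def target_of(typed):
    """Return (scheme, host, port) for what the user typed in the address bar."""
    # A bare "www.ilopia.com" has no scheme, so assume HTTP like the browser does.
    if "://" not in typed:
        typed = "http://" + typed
    parts = urlsplit(typed)
    # parts.port is None when the user did not write a port number.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port

print(target_of("www.ilopia.com"))       # ('http', 'www.ilopia.com', 80)
print(target_of("www.ilopia.com:8080"))  # ('http', 'www.ilopia.com', 8080)
```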

Step 2

The browser must now know which IP address to connect to. This is not specific to the communication between a web client and a webserver. The DNS name is just a name, which is translated into an IP address. This IP address is then used to connect to the server.

The browser will ask the local system for this IP. The system will then find the IP in one way or another. It will first see if the answer is cached on the local machine or listed in the local hosts file (the lmhosts file on Windows is only used for NetBIOS names, not for DNS names). If it is not, it will ask the DNS server specified in the network settings (probably your ISP's DNS server). If this DNS server does not have the answer cached, it will look it up on behalf of the client, starting at the root DNS servers and following their referrals until it reaches a server that knows the domain.
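
From a program's point of view, all of this is one call to the operating system. The sketch below uses Python's socket.getaddrinfo, which hands the name to the system resolver; the function name lookup_ipv4 is invented for this note, and the sketch only looks at IPv4 addresses.

```python
import socket

def lookup_ipv4(name, port=80):
    """Ask the operating system's resolver for the IPv4 addresses of a name."""
    # The program does not talk to a DNS server itself; the system decides
    # where the answer comes from (local data, cache or a DNS server).
    infos = socket.getaddrinfo(name, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (ip, port)).
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    # "localhost" is used so the example does not depend on Internet access.
    print(lookup_ipv4("localhost"))
```

Calling lookup_ipv4("www.ilopia.com") would go through the same steps, but the result depends on what the DNS returns at that moment.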

Step 3

The client connects to the server. This is done using the IP address and port only. The DNS name (www.ilopia.com) is not used in any way to make this connection. So far there is only a connection; the server still does not know what to do.
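
To make this concrete, here is a small self-contained sketch. It is an assumption-laden stand-in, not the real setup: instead of a webserver it opens a listening socket on the loopback address 127.0.0.1 and connects to it from the same script. Note that the connect call only receives an IP and a port.

```python
import socket

# Stand-in "server": listen on the loopback IP and let the OS pick a free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
ip, port = listener.getsockname()

# The client side: only an IP and a port go into the connect call.
client = socket.create_connection((ip, port), timeout=5)
server_side, peer = listener.accept()

# The server knows where the connection came from, but it has seen
# no host name and no request yet.
print("server accepted a connection from", peer)
connected = client.getpeername() == (ip, port)
print("connected:", connected)

for s in (client, server_side, listener):
    s.close()
```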

Step 4

The client now sends a request message using the HTTP protocol. The simplest request would look like:

GET /index.html HTTP/1.1

This request asks for the file index.html. A request like this does not specify any Host header. HTTP/1.1 requires one, so a webserver that follows the RFC does not try to guess which website is meant; it should reply with a "400 Bad Request". A valid request looks like:

GET /index.html HTTP/1.1
Host: www.ilopia.com
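
What actually goes over the wire is a few lines of text, each ended by a carriage return and line feed, followed by one empty line that marks the end of the header fields. A minimal sketch of building both requests shown above (the function name build_get_request is made up for this note):

```python
def build_get_request(path, host=None):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = ["GET {} HTTP/1.1".format(path)]
    if host is not None:
        lines.append("Host: {}".format(host))
    # Every line ends with CRLF; an extra empty line ends the header section.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")

print(build_get_request("/index.html"))                    # no Host header
print(build_get_request("/index.html", "www.ilopia.com"))  # valid HTTP/1.1
```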

Step 5

The server examines the request message and takes action. In the second example above, the webserver will use the Host header sent to see if it has a website matching that host header. If it has, it will serve index.html from the home folder of that website. If www.ilopia.com is not found as a host header on the webserver, it will use the default website, if there is one (this is the IIS default behavior; it might not be true for other webservers).
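
The selection logic can be sketched as a plain dictionary lookup. This is only an illustration of the idea, not how IIS is implemented; the folder paths, the names SITES and DEFAULT_SITE, and both functions are hypothetical. The sketch ignores IPv6 literals in the Host field.

```python
def host_of(request_text):
    """Return the Host header value (lowercased, without port), or None."""
    header_lines = request_text.split("\r\n\r\n", 1)[0].split("\r\n")[1:]
    for line in header_lines:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "host":
            # Drop an optional ":port" suffix such as "www.ilopia.com:8080".
            return value.strip().lower().split(":")[0]
    return None

def choose_site(request_text, sites, default_site=None):
    """Pick the home folder for a request, falling back to the default site."""
    return sites.get(host_of(request_text), default_site)

# Hypothetical configuration: host header -> home folder.
SITES = {"www.ilopia.com": "C:/sites/ilopia"}
DEFAULT_SITE = "C:/sites/default"

known = "GET /index.html HTTP/1.1\r\nHost: www.ilopia.com\r\n\r\n"
unknown = "GET /index.html HTTP/1.1\r\nHost: www.example.org\r\n\r\n"
print(choose_site(known, SITES, DEFAULT_SITE))    # C:/sites/ilopia
print(choose_site(unknown, SITES, DEFAULT_SITE))  # C:/sites/default
```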

Step 6

The server responds to the request by sending some header information, and the content of the webpage (see demonstration below).

Demonstration

So, let's make a demonstration (it does not work from the public Internet). Let's do a request for a page on the server www.ilopia.com.

telnet www.ilopia.com 80 <ENTER>
GET /index.html HTTP/1.1 <ENTER>
Host: www.microsoft.com <ENTER>
<ENTER>

First, we use telnet (telnet will be our "webbrowser") to connect to the server www.ilopia.com on port 80. Telnet will find out the IP to connect to. We then ask for the file "index.html" using the HTTP protocol, version 1.1. We also send the Host field with a value of "www.microsoft.com". The final empty line tells the server that the request is complete.

The server will now reply back with the following:

HTTP/1.1 200 OK
Content-Length: 55
Content-Type: text/html
Last-Modified: Sat, 06 Mar 2004 21:24:38 GMT
Accept-Ranges: bytes
ETag: "f2ed676ec13c41:4a5"
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
Date: Sat, 06 Mar 2004 21:39:43 GMT

<html>
<body>
This is NOT Microsoft
</body>
</html>

The first part is the header fields; the second is the body (in our case, the content of the file index.html). So, as you can see, even though we asked for the host www.microsoft.com, we got a reply from a server that is not www.microsoft.com. So, what happened?
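
The split between header fields and body is simply the first empty line. The sketch below takes a shortened copy of the reply above and separates the two parts. It assumes that the lines of the body end with CRLF and that there is no line ending after the last line; under that assumption the body is exactly the 55 bytes announced in Content-Length. The function name split_response is made up for this note.

```python
# Shortened copy of the reply above (some header fields left out).
RAW = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 55\r\n"
    b"Content-Type: text/html\r\n"
    b"Server: Microsoft-IIS/6.0\r\n"
    b"\r\n"
    b"<html>\r\n<body>\r\nThis is NOT Microsoft\r\n</body>\r\n</html>"
)

def split_response(raw):
    """Split a raw HTTP response into (status line, header dict, body bytes)."""
    # The first empty line separates the header fields from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body

status_line, headers, body = split_response(RAW)
print(status_line)
print(headers["content-type"])
print(int(headers["content-length"]) == len(body))
```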

We connected to the server www.ilopia.com on port 80. The name was looked up (at the time of writing) to be the IP 217.208.8.47. We then sent a request for the page index.html with the Host header field set to www.microsoft.com. Obviously, the webserver does not care whether the domain www.microsoft.com resolves to the webserver's own IP. The only thing the webserver cares about is whether it has a website configured for this host. It is either on the webserver or it is not. The webserver does not in any way use external resources to resolve the domain name to an IP. And since I have a host header named "www.microsoft.com" on this server (at the time of writing), the client got back the webpage in this website's home folder.

If we were to look in the log file on the webserver for this request, it would look like this:

#Software: Microsoft Internet Information Services 6.0
#Version: 1.0
#Date: 2004-03-06 22:01:59
#Fields: date time s-sitename s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs-version cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes cs-bytes time-taken 
2004-03-06 22:01:59 W3SVC2122583390 ILOPIA 217.208.8.97 GET /index.html - 80 - 192.168.0.5 HTTP/1.1 - - - www.microsoft.com 200 0 0 302 53 16645

Since we did not give it any other header fields, the log entry contains a lot of dashes.
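
The log entry is easier to read when each value is paired with its name from the #Fields line. A minimal sketch (the function name parse_entry is made up for this note; both strings are copied from the log above):

```python
FIELDS_LINE = (
    "#Fields: date time s-sitename s-computername s-ip cs-method cs-uri-stem "
    "cs-uri-query s-port cs-username c-ip cs-version cs(User-Agent) cs(Cookie) "
    "cs(Referer) cs-host sc-status sc-substatus sc-win32-status sc-bytes "
    "cs-bytes time-taken"
)
ENTRY = (
    "2004-03-06 22:01:59 W3SVC2122583390 ILOPIA 217.208.8.97 GET /index.html "
    "- 80 - 192.168.0.5 HTTP/1.1 - - - www.microsoft.com 200 0 0 302 53 16645"
)

def parse_entry(fields_line, entry):
    """Pair each space-separated value of a log entry with its field name."""
    names = fields_line.split()[1:]  # drop the leading "#Fields:"
    values = entry.split()
    if len(names) != len(values):
        raise ValueError("entry does not match the #Fields line")
    return dict(zip(names, values))

record = parse_entry(FIELDS_LINE, ENTRY)
print(record["cs-host"], record["sc-status"])
# A dash means that the client did not supply this value.
empty = [name for name, value in record.items() if value == "-"]
print(empty)
```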

Applies to

Windows Server 2003, IIS 6
Windows Server 2008, IIS 7

See also

Host in RFC 2616
RFC 2616