IIS Home @ it-notebook.org

Log Parser: Most popular files

(Kristofer Gafvert, August 16, 2005)

Introduction

Log Parser is a free tool developed by Microsoft that can be used to read and generate reports based on your log files. It lets you run SQL similar queries against your log files, making it extremely easy to get the information you want and need. This article will show you how to use Log Parser to find out the most popular pages and the most downloaded files.

Note: All files are downloaded by the client. By "most downloaded files" I mean files that are not viewed in the browser directly (*.zip, *.mpg, *.pdf).

Why it is important to know the most popular pages

By knowing the most popular pages you know why people visit your website. By knowing this you know "what's hot" and what topics you should focus on to increase the number of visitors and/or have your visitors stay longer.

Using Log Parser 2.2 to get the ten most popular pages

The following code shows how to get the ten most popular pages. In this example, a page has any of these extensions: htm, html, asp, aspx, php. If you have pages with other extensions, you can easily add the extension to the list.

SELECT TOP 10 TO_LOWERCASE(cs-uri-stem) AS File, COUNT(*) As Hits 
FROM 'C:\LogFiles\gafvert\ex*' 
   WHERE EXTRACT_EXTENSION(File) IN ('htm'; 'html'; 'asp'; 'aspx'; 'php') 
GROUP BY File 
ORDER BY Hits DESC

The path is converted to lower case because IIS logs the exact request. So a request for /File.htm is logged as /File.htm, while a request for /file.htm is logged as /file.htm. But these two refers to the same file, so we want them to be treated as one file here. In the FROM clause we specify the location of the log files (in my case, they are located in the folder C:\LogFiles\gafvert\).

The EXTRACT_EXTENSION function returns the extension of the file name. We compare this to the list of extensions we have defined to be "pages". To run this, open up the Log Parser command line and type this (after you have downloaded the file and placed the TopTenPages.sql file in the Log Parser folder):

logparser file:TopTenPages.sql -i:W3C

The ten most popular pages in a specific folder

To do the same as above, but for a specific folder, we can use this SQL query:

SELECT TOP 10 TO_LOWERCASE(EXTRACT_FILENAME(cs-uri-stem)) AS File, COUNT(*) As Hits 
FROM 'C:\LogFiles\gafvert\ex*' 
   WHERE EXTRACT_EXTENSION(File) IN ('htm'; 'html'; 'asp'; 'aspx'; 'php')
   AND  TO_LOWERCASE(EXTRACT_PATH(cs-uri-stem)) = '%folder%'
GROUP BY File 
ORDER BY Hits DESC

To run type this in the Log Parser command prompt (where "/notes" is the folder you want to find the most popular pages in):

logparser file:TopTenPagesFolder.sql?folder="/notes" -i:W3C

Since we only want the file name (we already know the path since we specified it...) we use the EXTRACT_FILENAME function. The other thing that is changed from the previous query is line four. On line four the path is extracted and matched against the folder we specify as parameter (the ?folder="/notes" part). If we did not want to use parameters we could have changed line four to:

AND  TO_LOWERCASE(EXTRACT_PATH(cs-uri-stem)) = '/notes'

Ten most popular downloads

As you have probably already guessed, to find the most popular downloads, all we need to do is change the extension list.

SELECT TOP 10 TO_LOWERCASE(cs-uri-stem) AS File, COUNT(*) As Hits 
FROM 'C:\LogFiles\gafvert\ex*' 
   WHERE EXTRACT_EXTENSION(File) IN ('zip'; 'pdf'; 'mpg') 
GROUP BY File 
ORDER BY Hits DESC

logparser file:TopTenDownloads.sql -i:W3C

Applies to [?]

IIS 6.0

Resources

Download SQL Code for this article
Log Parser at Technet Script Center
Download Log Parser
The Unofficial Log Parser Support Site
Book: Microsoft Log Parser Toolkit