25.3. Transforming Web Log Data

Many methods exist to transform the collected data and load it into your SQL and OLAP data warehouse. Common methods include using SQL Server Integration Services (SSIS), using Log Parser (www.logparser.com), or developing your own process to parse the data. In general, the difficulty lies not in transforming your transactional data but in making sense of the web log data. This section covers the issues involved in transforming the data rather than the various parsing technologies.

25.3.1. Filtering

As noted in the previous section, web logs record every single request made to the web site. This means that a lot of data must be filtered out, leaving only the data that describes customer actions and patterns. Recall that when a user clicks a web page, more than one hit is recorded in the web server log. (This does not apply when you use a Tracking Web Site with client-side JavaScript tagging.) The extra hits include images, style sheets, JavaScript, ASP include files, and other files that the web page calls on; they are integral to the web site, but not to the analysis. For example, the table that follows shows the URI stem and referrer of several hits:

URI stem             Referrer
/shop/WebPage.asp    -
/images/ico_A.gif    http://www.wrox.com/shop/WebPage.asp?lid=20&vID=1000&cat=books
/doc/OLAP.asp        -
/images/ico_B.gif    http://www.wrox.com/shop/WebPage.asp?lid=20&vID=1000&cat=books ...
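
One way to picture this kind of filtering is a small sketch that keeps only the hits whose URI stem looks like a page and drops the supporting files. This is an illustration rather than a prescribed implementation: the extension list, the `.inc` extension for ASP include files, and the names `SUPPORT_EXTENSIONS`, `is_page_request`, and `filter_hits` are all assumptions, and only the URI stem column of each hit is modeled.

```python
from pathlib import PurePosixPath

# Assumed list of extensions for supporting files (images, style sheets,
# JavaScript, include files). Adjust it to match your own web site.
SUPPORT_EXTENSIONS = {".gif", ".jpg", ".png", ".css", ".js", ".inc"}


def is_page_request(uri_stem: str) -> bool:
    """Return True when the URI stem does not end in a supporting-file extension."""
    return PurePosixPath(uri_stem).suffix.lower() not in SUPPORT_EXTENSIONS


def filter_hits(hits):
    """Keep only the hits that represent page requests."""
    return [hit for hit in hits if is_page_request(hit["uri_stem"])]


# URI stems of the four hits discussed in this section.
hits = [
    {"uri_stem": "/shop/WebPage.asp"},
    {"uri_stem": "/images/ico_A.gif"},
    {"uri_stem": "/doc/OLAP.asp"},
    {"uri_stem": "/images/ico_B.gif"},
]

page_views = filter_hits(hits)
for hit in page_views:
    print(hit["uri_stem"])
```

With these sample hits, only the two `.asp` requests remain. Matching on the file extension is the simplest choice; you could filter on folder names such as `/images/` instead if your site organizes supporting files that way.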

