Mapping Out the Entire Web Site

I want to return to our own exploration of current web sites. By following the links contained in a set of pages, either on the original server or within a local copy of the site, you can start to see the architecture of the entire site. Not only can that offer insight into how the site functions, but the directory structure itself may serve as a signature for that particular scam. Seeing the same structure on a second site may allow you to link the two together.
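As a rough illustration of the idea, the following Python sketch pulls the links out of a single page and reduces them to the set of directories they point into on the same host. It is only a minimal sketch using the standard library: the names LinkCollector and site_directories, the sample HTML, and the example.com URLs are all invented for this example, and a real crawl would repeat the step for every page it discovers.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
import posixpath


class LinkCollector(HTMLParser):
    """Collect the href/src attribute values found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def site_directories(html, page_url):
    """Return the sorted directory paths of same-host links in one page."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    directories = set()
    for link in collector.links:
        target = urlparse(urljoin(page_url, link))
        if target.scheme not in ("http", "https") or target.netloc != host:
            continue  # skip mailto:, javascript:, and off-site links
        path = target.path or "/"
        # A path ending in "/" is already a directory; otherwise drop the file name.
        directory = path if path.endswith("/") else posixpath.dirname(path)
        if not directory.endswith("/"):
            directory += "/"
        directories.add(directory)
    return sorted(directories)


# Hypothetical page content, invented for illustration.
SAMPLE_HTML = """
<a href="/login/verify.php">Sign in</a>
<img src="images/logo.gif">
<a href="http://other.example.net/help.html">Help</a>
<a href="mailto:someone@example.com">Contact</a>
"""
SAMPLE_DIRS = site_directories(SAMPLE_HTML, "http://www.example.com/secure/index.html")
print(SAMPLE_DIRS)
```

Running this over each page of a site, and merging the results, gives a picture of the directories that the site's own links reveal. Directories that no page links to will not appear.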

Making a local copy of a site using wget and looking at the directories that are created are easy ways to get an overview of its structure. But this shows you only the pages and files that are directly visible from a web browser. In those same directories, hidden from view, may be other scripts or data files that might offer up information about the operators of the site.
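Once a local copy exists, its directory layout can be reduced to something you can compare between sites. The sketch below is one possible approach, not an established technique: it walks a mirrored directory tree, records the relative directory paths as a "signature", and scores two signatures with a simple Jaccard overlap. The function names and the two throwaway directory trees are invented for the example, and the signature covers only the visible structure that the mirror captured.

```python
import os
import tempfile


def directory_signature(mirror_root):
    """Return the set of relative directory paths found under a local mirror."""
    signature = set()
    for current, _subdirs, _files in os.walk(mirror_root):
        relative = os.path.relpath(current, mirror_root)
        if relative != ".":
            signature.add(relative.replace(os.sep, "/"))
    return signature


def signature_overlap(first, second):
    """Jaccard similarity of two signatures: 1.0 is identical, 0.0 is disjoint."""
    if not first and not second:
        return 0.0
    return len(first & second) / len(first | second)


def _make_tree(root, directories):
    """Create the given relative directories under root (test scaffolding only)."""
    for directory in directories:
        os.makedirs(os.path.join(root, *directory.split("/")), exist_ok=True)


# Two throwaway directory trees standing in for mirrored sites (invented layout).
with tempfile.TemporaryDirectory() as site_a, tempfile.TemporaryDirectory() as site_b:
    _make_tree(site_a, ["images", "login/includes", "data"])
    _make_tree(site_b, ["images", "login/includes", "css"])
    SIG_A = directory_signature(site_a)
    SIG_B = directory_signature(site_b)
    OVERLAP = signature_overlap(SIG_A, SIG_B)
    print(sorted(SIG_A))
    print(f"overlap: {OVERLAP:.2f}")
```

A high overlap is only a hint that two sites may share an origin. Common directory names such as images appear almost everywhere, so unusual or deeply nested paths carry more weight than the raw score suggests.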

Directory Listings

If you are lucky, a directory listing, also known as an index, may be made available by your target web site. You can view this from your browser by supplying a URL that ends in a directory name, rather than that of a specific web page. Figure 5-3 is an example of what this looks like.
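To make the URL manipulation concrete, here is a small Python sketch that takes the URL of a page and works out the directory URLs above it, which are the ones you might try in a browser. It also includes a rough test for whether a returned page looks like an automatically generated index. Both function names are invented for this example, the example.com URL is hypothetical, and two assumptions are built in: a final path segment without a trailing slash is treated as a file name, and a listing is recognized by a title starting with "Index of /", which is how Apache titles its listings but is not guaranteed for other servers or configurations.

```python
from urllib.parse import urlsplit, urlunsplit


def candidate_directory_urls(page_url):
    """List the directory URLs above a page, deepest first, ending at the site root."""
    parts = urlsplit(page_url)
    segments = [s for s in parts.path.split("/") if s]
    # Assumption: a final segment without a trailing slash names a file, so drop it.
    if segments and not parts.path.endswith("/"):
        segments.pop()
    candidates = []
    while True:
        path = "/" + "/".join(segments) + ("/" if segments else "")
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
        if not segments:
            break
        segments.pop()
    return candidates


def looks_like_listing(html):
    """Heuristic guess only: check for a page title that starts 'Index of /'."""
    return "<title>index of /" in html.lower()


# Hypothetical URL, invented for illustration.
CANDIDATES = candidate_directory_urls("http://www.example.com/shop/images/logo.gif")
for url in CANDIDATES:
    print(url)
```

Each candidate still has to be requested to see what the server returns. Many sites disable listings or serve a default page instead, so a negative result from the heuristic says little.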

Figure 5-3. Example of a web server directory listing

This shows us all the files in the directory /autorank/images/.../template/ on a site. It contains mostly image files with two PHP scripts, a stylesheet, a JavaScript file, ...
