Summarizing 404 errors

The status code of the requested page is shown in field 9 of the log. A 404 status means that the server could not find the requested page, an error most of us have seen in a browser at some stage. It may indicate a misconfigured link on your site, or it may simply come from a browser looking for the icon image to display in its tab for the page. You can also identify potential threats to your site by spotting requests for standard pages that may give access to additional information on PHP-driven sites, such as WordPress.

First, we can print only the status of each request:

$ awk '{ print $9 }' access.log
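
As a rough sketch of where this leads, the same field can be tallied rather than printed line by line. The example assumes the Apache combined log format, in which the status is the ninth whitespace-separated field. The sample log lines and the temporary file are hypothetical and stand in for a real access.log.

```shell
# Hypothetical sample data in Apache combined log format (not from a real server)
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
192.168.0.2 - - [10/Oct/2023:13:55:40 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0"
192.168.0.3 - - [10/Oct/2023:13:56:01 +0000] "GET /wp-login.php HTTP/1.1" 404 209 "-" "curl/8.0"
192.168.0.2 - - [10/Oct/2023:13:57:12 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0"
EOF

# Count how often each status code (field 9) appears, then sort by code
tally=$(awk '{ count[$9]++ } END { for (s in count) print s, count[s] }' "$log" | sort -n)
echo "$tally"

rm -f "$log"
```

Against a real log, the here-document can be dropped and "$log" replaced with access.log.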

We can now extend the code a little and print only the 404 errors:

$ awk ...
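
A minimal sketch of one way to filter and summarize 404 responses, which is not necessarily the command the text goes on to show. It assumes the Apache combined log format, where field 7 holds the requested path and field 9 the status. The sample log lines and the temporary file are hypothetical.

```shell
# Hypothetical sample data in Apache combined log format (not from a real server)
log=$(mktemp)
cat > "$log" <<'EOF'
192.168.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"
192.168.0.2 - - [10/Oct/2023:13:55:40 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0"
192.168.0.3 - - [10/Oct/2023:13:56:01 +0000] "GET /wp-login.php HTTP/1.1" 404 209 "-" "curl/8.0"
192.168.0.2 - - [10/Oct/2023:13:57:12 +0000] "GET /favicon.ico HTTP/1.1" 404 209 "-" "Mozilla/5.0"
EOF

# Keep only requests whose status (field 9) is 404, count them per
# requested path (field 7), and list the most frequent paths first
summary=$(awk '$9 == 404 { hits[$7]++ } END { for (p in hits) print hits[p], p }' "$log" | sort -rn)
echo "$summary"

rm -f "$log"
```

A summary like this separates harmless noise, such as repeated favicon requests, from probes for pages such as a WordPress login script. Against a real log, replace "$log" with access.log.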
