mike chambers | about

Poor Man’s Log File Parser

Thursday, May 25, 2006

I often have to manually parse through web logs to quickly get the number of times something has been downloaded.

This is pretty simple on Unix-based OSes (Linux, OS X, etc.), except that I always forget the exact command (and always have to bug Christian to remind me how to do it).

So, I figured I would post the commands here in case anyone might find them useful (and so I can easily find them in the future):

cat access.log | grep "myfile.txt" | awk '{print $1}' | sort | uniq | wc -l

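Just replace access.log with your log file, and myfile.txt with the file name or regular expression you are searching for. Note that grep treats the pattern as a regular expression, so an unescaped . matches any character; escape it as \. or use grep -F if you need a literal match.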

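This will give you the number of unique downloads for the file. More precisely, it counts the distinct values in the first field of the matching lines, which in the common web server log formats is the client's IP address or host name.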
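
If you run this a lot, the pipeline could be wrapped in a small shell function. This is only a sketch: the name count_unique is made up for illustration, it assumes the client address is the first whitespace-separated field of each log line, and it swaps in grep -F (match the text literally rather than as a regular expression) and sort -u (equivalent to sort | uniq).

```shell
# count_unique PATTERN LOGFILE
# Hypothetical helper (not a standard command): prints how many distinct
# first fields (assumed to be the client address) appear on lines that
# contain PATTERN as literal text.
count_unique() {
    grep -F -- "$1" "$2" | awk '{print $1}' | sort -u | wc -l
}

# Example: count_unique "myfile.txt" access.log
```

Drop the -F if you want PATTERN to be treated as a regular expression, as in the original command.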

If you want to omit certain terms from the match, then just pipe it through grep -v, like so:

cat access.log | grep "index.cfm" | grep -v "remove_this" | awk '{print $1}' | sort | uniq | wc -l
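
The same include / exclude / unique count can also be done in a single awk call. This is a rough sketch rather than a drop-in replacement: the function name count_unique_awk is hypothetical, both patterns are treated as awk regular expressions, and it again assumes the client address is the first field.

```shell
# count_unique_awk INCLUDE EXCLUDE LOGFILE
# Hypothetical helper: counts distinct first fields on lines that match
# INCLUDE and do not match EXCLUDE. Both arguments are required (an empty
# EXCLUDE would match every line and so exclude everything).
count_unique_awk() {
    awk -v want="$1" -v skip="$2" '
        # seen[$1]++ is 0 (false) the first time an address shows up,
        # so each address is counted only once.
        $0 ~ want && $0 !~ skip && !seen[$1]++ { n++ }
        END { print n + 0 }   # n + 0 prints 0 when nothing matched
    ' "$3"
}

# Example: count_unique_awk "index.cfm" "remove_this" access.log
```

This avoids the sort step entirely, since awk keeps track of the addresses it has already seen.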

If you don’t care about unique downloads, then it is even easier:

cat access.log | grep "myfile.txt" | wc -l
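
For what it's worth, grep can do the counting itself with -c, which prints the number of matching lines. A minimal sketch (the function name count_hits is made up here, and -F is my addition so the pattern is matched literally):

```shell
# count_hits PATTERN LOGFILE
# Hypothetical helper: prints the number of lines containing PATTERN
# as literal text. grep -c counts matching lines, not individual matches.
count_hits() {
    grep -c -F -- "$1" "$2"
}

# Example: count_hits "myfile.txt" access.log
```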

Tons more stuff you can do with this. Post variants, suggestions or questions in the comments.
