Bash pipe fun
Posted: 16 September 2010 Filed under: productivity | Tags: bash, command line, linux
How about “recursively look through the logs of hostnames used to request my site content. Sort them and ensure that only unique IP address and hostname combinations are counted. Find how many use my ‘.biz’ hostname to land on my site”:
find . -iname '*ecommerce-host_log*' | nice xargs cut --delimiter=' ' -f 1,4 | nice sort | nice uniq | nice grep '\.biz' | nice wc -l
I wasn’t sure which commands would be most processor-intensive, so I used “nice” liberally.
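The same count can be sketched as a small reusable shell function. This is only a sketch under a few assumptions: every log line has the form IP [date zone] hostname "request", so that space-separated fields 1 and 4 are the IP address and the hostname, and the function name count_biz_pairs is hypothetical. The pattern is quoted so the dot stays literal, and anchored so that only hostnames ending in .biz are counted.

```shell
# Hypothetical helper: count unique IP/hostname pairs whose hostname ends in ".biz".
# Assumes each log line looks like: IP [date zone] hostname "request"
count_biz_pairs() {
    # $1 is the directory to search recursively for matching log files.
    find "$1" -type f -iname '*ecommerce-host_log*' -exec cat {} + |
        cut -d ' ' -f 1,4 |   # keep only the IP address and the hostname
        sort -u |             # one line per unique IP/hostname pair
        grep -c '\.biz$'      # count pairs whose hostname ends in .biz
}
```

Calling count_biz_pairs . from the log directory should give the same kind of number as the one-liner above, and putting nice in front of the call lowers the priority of the whole thing in one go.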
Apache custom logging
Posted: 31 August 2010 Filed under: system administration | Tags: apache, bash, command line, linux, system administration, web analytics
Aren’t you interested in seeing what requests users, bots, or script kiddies make of your site, especially those things that client-side JavaScript-based analytics packages don’t tell you?
Under Apache, custom logging can give you lots of information you might not have seen otherwise. I’ll let the documentation for Apache’s mod_log_config say most of this, but as a quick preview, you could try defining a custom log format up near the top of your httpd.conf with
LogFormat "%a %t %{Host}i \"%r\"" hostlog
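for example, then in each of your VirtualHost containers (CustomLog is valid in the main server config or a virtual host, not inside a Directory container), you could do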
CustomLog logs/forest-monsen-site-host-log hostlog
Then, in my case, /var/log/httpd/forest-monsen-site-host-log would contain lines like
192.168.0.3 [31/Aug/2010:08:53:24 -0500] www.forestmonsen.com "GET /aggregator/sources/2 HTTP/1.0"
192.168.0.5 [31/Aug/2010:08:53:24 -0500] www.forestmonsen.org "GET /images/house.gif HTTP/1.1"
And I’d be able to tell which hostname was originally requested by the user — before any of my mod_rewrite rules got to it. Good stuff.
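Once a log like this exists, a quick tally of requests per hostname is one way to put it to use. Here is a minimal sketch, assuming the hostlog format above, where the requested hostname is the fourth space-separated field because the %t timestamp itself contains a space. The function name requests_per_host is hypothetical.

```shell
# Hypothetical helper: tally requests per requested hostname, busiest first.
# Assumes the hostlog format above: IP [date zone] hostname "request"
requests_per_host() {
    # $1 is the path to the host log; field 4 is the Host header value.
    awk '{ count[$4]++ } END { for (h in count) print count[h], h }' "$1" |
        LC_ALL=C sort -k1,1nr -k2,2   # highest count first, ties by hostname
}
```

For example, requests_per_host /var/log/httpd/forest-monsen-site-host-log would print one line per hostname, each with its request count in front.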