Logging visitors to my site

Today I realised I could benefit from knowing what visitors find helpful on my site by logging the pages they visit. Here's how I thought it could work:

  1. Add a visitor tracker (primarily JavaScript that pings my tracking service) to the bottom of each HTML page
  2. Run a web server hosting the tracker service, which receives the ping from the visitor's browser
  3. Log the visitor's information to a file.

Adding the tracker to the bottom of the HTML page

I haven't thought much about how I'm going to add a JavaScript entry to all my static web pages; however, I know I can write an ad-hoc script to update all the pages in one batch.
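
A minimal sketch of what such a batch script might look like in Ruby. It assumes every page ends with a closing body tag and that the tracker markup is passed in as a string; add_tracker, tag_pages and TRACKER_PATH are hypothetical names used only for illustration.

```ruby
# Sketch only: add_tracker and tag_pages are hypothetical helper names.
TRACKER_PATH = '/do/visitor/tracker' # used to spot pages that are already tagged

# Returns the HTML with the snippet inserted just before </body>.
# Pages that already reference the tracker, or lack </body>, come back unchanged.
def add_tracker(html, snippet)
  return html if html.include?(TRACKER_PATH)
  html.sub(%r{</body>}i) { |closing_tag| "#{snippet}\n#{closing_tag}" }
end

# Rewrites every .html file under dir that needs the snippet
# and returns the paths that were changed.
def tag_pages(dir, snippet)
  Dir.glob(File.join(dir, '**', '*.html')).select do |path|
    html = File.read(path)
    updated = add_tracker(html, snippet)
    next false if updated == html
    File.write(path, updated)
    true
  end
end
```

Calling tag_pages with the site's root directory and the tracker markup would rewrite the pages in place, so trying it on a copy of the site first seems prudent. Because already-tagged pages are skipped, running it twice should not add the snippet twice.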

Here's the code which I have embedded into an HTML page for testing purposes:

<script type="text/javascript" src="http://a1.jamesrobertson.eu/do/visitor/tracker" id="visitortracker1446915683"></script>
<noscript><img src="http://a1.jamesrobertson.eu/do/visitor/count-png"/></noscript>

If the visitor has disabled JavaScript, the tracker can still contact the server using noscript and an image tag. I could have simply stuck with the image tag and forgotten about JavaScript; however, I'm interested in bots, which may ignore the image tag but execute the JavaScript instead.

Implementing the server side code

As you can see below, there's not much to the server-side code:

  <job id='count'>
    <script>
<![CDATA[ 

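  require 'logger'
  
  # Create the Logger once and reuse it; the log file rotates daily
  @services ||= {}
  @services['logger'] ||= Logger.new('/home/james/d/visitor.log','daily')
  logger = @services['logger']
  # Collect the referring page, forwarded IP address(es) and user agent
  details = %w(http_referer http_x_forwarded_for http_user_agent).map {|x| @env[x.upcase]}
  logger.info  details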
  
]]>   
    </script>
  </job>
  
  <job id='count-png'>
    <script>
<![CDATA[ 

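  require 'logger'
  require 'png'
  
  # Create the Logger once and reuse it; the log file rotates daily
  @services ||= {}
  @services['logger'] ||= Logger.new('/home/james/d/visitor.log','daily')
  logger = @services['logger']
  # Collect the referring page, forwarded IP address(es) and user agent
  details = %w(http_referer http_x_forwarded_for http_user_agent).map {|x| @env[x.upcase]}
  # Flag the entry as coming from the noscript image fallback
  logger.info  details + ['noscript']
      
  # Respond with a blank 1x1 PNG and its content type
  canvas = PNG::Canvas.new 1, 1
  png = PNG.new canvas
  [png.to_blob, 'image/png']  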

]]>   
    </script>
  </job>  
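
The logging half of these jobs can also be tried on its own, away from the job framework. This is a rough standalone sketch, assuming @env is a Rack-style environment hash with upper-case HTTP_* keys. build_counter and TRACKED_HEADERS are hypothetical names, and the 204 response is a stand-in, not what the jobs above return.

```ruby
require 'logger'
require 'stringio'

# The same three request headers the jobs above read from @env.
TRACKED_HEADERS = %w(HTTP_REFERER HTTP_X_FORWARDED_FOR HTTP_USER_AGENT).freeze

# Builds a Rack-style callable that logs the tracked headers for each request.
# Hypothetical helper: it answers with 204 No Content instead of an image.
def build_counter(logger)
  lambda do |env|
    logger.info(TRACKED_HEADERS.map { |key| env[key] })
    [204, {}, []]
  end
end

# Log to an in-memory buffer so the sketch leaves no files behind.
buffer = StringIO.new
counter = build_counter(Logger.new(buffer))
env = {
  'HTTP_REFERER'         => 'http://example.com/page.html',
  'HTTP_X_FORWARDED_FOR' => '203.0.113.7',
  'HTTP_USER_AGENT'      => 'ExampleAgent/1.0'
}
counter.call(env)
puts buffer.string
```

Passing the logger in makes it easy to swap the in-memory buffer for a real file such as the daily-rotated log used above.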

What I collect about my visitors

I'm only interested in which pages my visitors view, and it helps me to be able to tell the unique visitors to a page apart. This is why I collect the IP address (http_x_forwarded_for) along with the page visited (http_referer).

Here's a sample of the output collected today while testing it:

# Logfile created on 2015-11-07 18:01:17 +0000 by logger.rb/47272
I, [2015-11-07T18:01:17.741066 #27137]  INFO -- : ["http://www.jamesrobertson.eu/snippets/2015/nov/07/introducing-the-png-gem.html", "192.168.4.122, 192.168.4.159", "Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0 Iceweasel/38.2.1", "noscript"]
I, [2015-11-07T18:01:47.008732 #27137]  INFO -- : ["http://www.jamesrobertson.eu/snippets/2015/nov/07/introducing-the-png-gem.html", "192.168.4.122, 192.168.4.159", "Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0 Iceweasel/38.2.1"]

You can see above that there are two log file entries: the first from when I visited the page with JavaScript disabled, and the second with JavaScript enabled. Also note that there are two IP addresses in each entry: the first is the IP address of my laptop and the second is the IP address of the proxy I use to visit websites with that web browser.
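
For reading entries like these back in, here is a minimal Ruby sketch. It assumes the default Logger line format shown above and that every logged value is a plain string, so the logged array can be read as JSON; an entry containing nil would simply be skipped. Visit, LOG_LINE and parse_visit are hypothetical names.

```ruby
require 'json'
require 'time'

# One parsed log entry; the field names are illustrative.
Visit = Struct.new(:time, :page, :client_ip, :proxies, :user_agent, :noscript)

# Matches the default Logger layout: "I, [time #pid]  INFO -- : [ ... ]"
LOG_LINE = /\A[A-Z], \[(?<time>\S+)\s+#\d+\]\s+(?<severity>[A-Z]+) -- [^:]*: (?<payload>\[.*\])\s*\z/

# Returns a Visit, or nil for lines that are not visitor entries
# (such as the "# Logfile created on" header).
def parse_visit(line)
  m = LOG_LINE.match(line)
  return nil unless m
  referer, forwarded, agent, flag = JSON.parse(m[:payload])
  # X-Forwarded-For lists the original client first, then each proxy in turn.
  ips = forwarded.to_s.split(',').map(&:strip)
  Visit.new(Time.parse(m[:time]), referer, ips.first, ips.drop(1), agent, flag == 'noscript')
rescue JSON::ParserError
  nil
end
```

Applied to the first entry above, client_ip would be the laptop's address and proxies would hold the proxy's address. The header is supplied by the client and any proxies in between, so the first address is a hint, not a guarantee.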

I didn't have to log the user agent; however, I'm curious to know what web browsers visitors are using.


I had considered publishing the visitor logs; however, I have decided against it to protect the privacy of my web visitors. Hopefully at some point I will publish the most frequently visited pages on my site, but at the moment I've got plenty of other things that need looking at.
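
As a starting point for that page ranking, here is a minimal sketch. It assumes the log format shown earlier, where the visited page is the first quoted value in each entry; top_pages and PAGE_IN_LINE are hypothetical names. It reports only page addresses and hit counts, so no IP addresses or user agents would be published.

```ruby
# Captures the first quoted value of a log entry, which is the visited page.
PAGE_IN_LINE = / -- [^:]*: \["([^"]*)"/

# Returns [page, hits] pairs, most visited first (ties ordered by page address).
# lines can be any enumerable of log lines, such as File.foreach(path).
def top_pages(lines, limit = 10)
  counts = Hash.new(0)
  lines.each do |line|
    page = line[PAGE_IN_LINE, 1]
    counts[page] += 1 if page # skips the log header and malformed lines
  end
  counts.sort_by { |page, hits| [-hits, page] }.first(limit)
end
```

Something like top_pages(File.foreach('/home/james/d/visitor.log')) would rank the current day's pages. Since the log rotates daily, earlier days would need their rotated files read as well.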

Source:
1838hrs4980.txt
Published:
07-11-2015 18:38