Web Traffic Analysis of Christmas Lights Web Site

The Controllable Christmas Lights for Celiac Disease get a LOT of web traffic ... probably because it is just "fun" to not only view thousands of Christmas Lights, but turn them on & off ... plus it is for a good cause. Web Surfers from over 150 countries came by for Christmas/2006 - see the Google Maps mish-mash.

So Ian Dees sent me an Email asking if he could look at some of my raw log data and do some analysis of it. I'm a little concerned about releasing that data since there are privacy issues, which Ian completely understood. So we decided that a reasonable way to balance those privacy concerns while still providing him useful data would be for me to release a random sampling of visitors and XXX out the last octet of the IP address.

So I went through the Apache log data for December 24th, 2006 and pulled all the IPs hitting the christmas lights webcam. I then wrote a short Perl script that generated a random sub-sample of 10,000 records, sorted by timestamp and showing only the first 3 octets of the IP address. Note that the same IP addresses show up more than once - this is because people reload the page (not actually necessary, since iframes/AJAX are used), visitors behind the same proxy share an IP, etc. All times are MST (GMT-7) on the NTP'ed web server. Traffic was coming from all over the place with noticeable spikes from Slashdot (front page at 20:33:38) and DIGG (front page at 23:10:36) - yes, you can tell from the raw log data! ;-) Note that since it was Christmas Eve, traffic was probably lower than normal from these sites ... plus DIGG is a single link whereas the Slashdot article also had several links in it to other sites.
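
The sampling and masking steps described above can be sketched roughly as follows. This is not the original Perl script; it is a minimal Python illustration that assumes the log lines are in Apache's common/combined format. The function names (mask_ip_and_time, sample_log), the /webcam path, and the demo addresses (taken from the reserved documentation ranges) are all made up for the example.

```python
import random
import re
from datetime import datetime

# Assumed Apache common/combined log format: the client IP comes first and
# the timestamp sits in [brackets]. Only the first 3 octets are captured.
LOG_LINE = re.compile(r'^(\d+\.\d+\.\d+)\.\d+ \S+ \S+ \[([^\]]+)\]')
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def mask_ip_and_time(line):
    """Return (timestamp, masked_ip) for one log line, or None if it does not parse."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    prefix, stamp = match.groups()
    # Replace the last octet with XXX so individual visitors are not exposed.
    return datetime.strptime(stamp, TIME_FORMAT), prefix + ".XXX"


def sample_log(lines, sample_size, seed=None):
    """Pick a random sub-sample of parsed records and sort it by timestamp."""
    records = [r for r in map(mask_ip_and_time, lines) if r is not None]
    rng = random.Random(seed)
    # Sample without replacement; cap at the number of available records.
    chosen = rng.sample(records, min(sample_size, len(records)))
    chosen.sort(key=lambda record: record[0])
    return chosen


if __name__ == "__main__":
    # Hypothetical log lines, for illustration only.
    demo = [
        '203.0.113.9 - - [24/Dec/2006:23:10:36 -0700] "GET /webcam HTTP/1.1" 200 512',
        '192.0.2.17 - - [24/Dec/2006:20:33:38 -0700] "GET /webcam HTTP/1.1" 200 512',
        '198.51.100.4 - - [24/Dec/2006:21:02:11 -0700] "GET /webcam HTTP/1.1" 200 512',
    ]
    for when, ip in sample_log(demo, 2, seed=1):
        print(when.strftime("%H:%M:%S"), ip)
```

Sampling without replacement means a given log record is picked at most once, so repeated IP prefixes in the output come from repeated hits in the log itself (reloads, shared proxies), not from the sampling.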

You can access/download the 10,000 random sample data file here.

If you download this file and do anything "interesting" with it, I would ask that you please let me know. And if it is really useful and you want to express your appreciation, please consider donating to the University of Maryland Center for Celiac Research.