Event tracking in Google Analytics

Posted on April 28, 2009 Categories: Coding

Written by: Charles

Charles has spent the past few years as the big cheese at thrudigital. On any normal day you will catch him with a milky cup of tea (no bubbles on top thank you very much) and at least 30 browser tabs open.

One of the more exciting developments I have seen lately is the introduction of a feature called event tracking to Google Analytics.

It lets you register the occurrence of ‘events’ within your pages using JavaScript. Google Analytics then reports statistics on those events in the Event Tracking section of your dashboard.

You can also tag events with categories that allow for a breakdown within these statistics.

The JavaScript function that you use to register an event is as follows, straight from the Google Analytics Event Tracking Guide:

_trackEvent(category, action, optional_label, optional_value)

So you have three levels of categorisation through this function: category, action and optional_label. You can also record an optional_value.
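As a rough illustration of the three levels, here is a minimal sketch. It assumes the classic ga.js setup, where _trackEvent is called as a method of a tracker object (commonly named pageTracker) and the value is an integer. The stand-in tracker, the recordedEvents array and the category, action and label names are all hypothetical and only there so the snippet runs outside a browser.

```javascript
// Hypothetical stand-in for the ga.js tracker so the sketch runs anywhere.
// On a real page the tracker object comes from the Google Analytics snippet.
var recordedEvents = [];
var pageTracker = {
  _trackEvent: function (category, action, optionalLabel, optionalValue) {
    recordedEvents.push({
      category: category,
      action: action,
      label: optionalLabel,
      value: optionalValue
    });
    return true;
  }
};

// Two levels: category and action only
pageTracker._trackEvent('Sharing', 'Click');

// Three levels: the label breaks the same action down further
pageTracker._trackEvent('Sharing', 'Click', 'Facebook');

// All four arguments: the value here is a made-up count of fields filled in
pageTracker._trackEvent('Contact form', 'Submit', 'Footer', 3);
```

Leaving out the label or value simply records the event at a coarser level, so you can start with category and action and add detail later.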

I did some basic experimentation with integrating this into the Apprentice game that we have developed at thrudigital. I have implemented event tracking on five different categories of actions:

  • Clicked to vote
  • Clicked on a sharing link
  • Clicked on a Twitter profile pic
  • Submitted the contact thrudigital form
  • Clicked on the link to the thrudigital website

This gives us a nice little breakdown of what people are clicking on once they get to our one-page mini-site. We could have even gone one step further and tagged each of the vote events with the name of the contestant voted for – that might be a job for tomorrow…

Theoretically, you could trigger an event for almost any sort of interaction within the user interface of your web site. You could even do tricky stuff like record how far users get through a form before bailing out. Or even how long users are taking to get through a form on average, using the optional_value field.
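The form idea could look something like the sketch below. It is only one possible approach, not something we have deployed: createFormTimer, its method names and the injectable clock are all hypothetical, and it assumes a tracker object exposing _trackEvent as in ga.js. The last field touched goes into the label and the elapsed time in whole seconds goes into the value.

```javascript
// Sketch: report how long a user spent on a form, and where they stopped.
// tracker is assumed to expose _trackEvent(category, action, label, value).
function createFormTimer(tracker, formName, now) {
  // now() is injectable so the timer can be exercised without real waiting
  now = now || function () { return new Date().getTime(); };
  var startedAt = null;
  var lastField = '(none)';

  function report(action) {
    if (startedAt === null) { return false; } // user never touched the form
    var seconds = Math.round((now() - startedAt) / 1000);
    tracker._trackEvent(formName, action, lastField, seconds);
    startedAt = null; // report once per attempt
    return true;
  }

  return {
    // Call from each field's focus handler
    fieldFocused: function (fieldName) {
      if (startedAt === null) { startedAt = now(); }
      lastField = fieldName;
    },
    // Call from the form's submit handler
    submitted: function () { return report('Submitted'); },
    // Call when the user leaves the page without submitting
    abandoned: function () { return report('Abandoned'); }
  };
}
```

On a page you would wire fieldFocused to each input's focus event, submitted to the form's submit event and abandoned to something like the window's unload event.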

If you know of anyone using this new functionality in a creative way, please leave a comment here and let us know.

A particularly exciting API

Posted on March 27, 2009 Categories: Coding

Written by: Charles

idiomag, a partner company of ours, is an online music magazine. It serves articles and relevant song tracks, photos and videos about artists it believes you’ll like, based on your behaviour. Check it out here http://www.idiomag.com

The site allows you to enter your username from sites like Last.fm, Pandora, iLike, Strands or imeem and idiomag grabs your public listening profile from those sites. Those past interests are used to recommend playlists of music and videos that a user would probably like. idiomag then brings in semantically indexed articles from syndication partners about those artists, and videos and photos from around the web from events such as live concerts.

‘Your’ magazine is completely dynamic so it’s always fresh and relevant to you, and comes with some nice touches. For example, the background and text colours always work well together whatever the page content (dynamic page and text colouring is, incidentally, one of the tasks thrudigital asks prospective coders to undertake: http://www.thruserver.com/proveyoucanworkforus).

Recently, idiomag opened up its dynamically aggregated content to third-party developers through ‘a particularly exciting API’ (according to ReadWriteWeb). Beyond just media content, idiomag is also opening up access to user Attention Data through APML (attention profile markup language), which is the emerging open standard for Attention Metadata, and will soon offer a range of topical content coordinated to suit any user’s interests.

In essence, the API’s offering is:

  • Thousands of top news articles, reviews and interviews
  • Filtered videos, images and MP3s for tens of thousands of artists
  • Collaborative filtering and “discovery” music recommendation based on single or multiple artists, and via profile-import from a wide variety of 3rd party sites including Last.fm, Pandora, iLike and imeem
  • A leading implementation of APML, providing access to users’ music interest profiles, for both input and output of the recommendation technology



Several partners are already using idiomag’s data through the API, including entertainment discovery platform TheFilter (URL: http://www.thefilter.com), social music community MOG (URL: http://www.mog.com), and the leading platform for University websites, Oncampus.

The idiomag API is RESTful, with API calls requiring a unique identifier key to be passed. The calls are available in a range of formats including xml, json, apml, xspf and foaf. It is really simple to understand and work with. You simply specify which format you need by adding /format to the end of the URL path. For example, to receive the latest articles about Radiohead in RSS format, you use the following URI: http://www.idiomag.com/api/artist/articles/rss?key=<key>&artist=Radiohead. Or you could get a playlist of videos for the genre tag ‘indie’ in xspf format, by using the following: http://www.idiomag.com/api/tag/videos/xspf?key=<key>&tag=Indie. This, along with the smart implementation of APML, providing access to a user’s music interest profiles, could mean for example that you could even access your own personalised XSPF video feeds using VLC. Smart.
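To make the URL pattern concrete, here is a small sketch that assembles request URLs of the shape shown in the two examples above. The helper name buildIdiomagUrl and its argument names are hypothetical, and the pattern is inferred only from those two URLs, so check the idiomag documentation before relying on it for other calls.

```javascript
// Sketch: build an idiomag API URL of the form
//   http://www.idiomag.com/api/<resource>/<content>/<format>?key=<key>&<name>=<value>
// The structure is generalised from the two example URLs in this post.
function buildIdiomagUrl(resource, content, format, key, params) {
  var url = 'http://www.idiomag.com/api/' + resource + '/' + content + '/' + format;
  var query = ['key=' + encodeURIComponent(key)];
  for (var name in params) {
    if (Object.prototype.hasOwnProperty.call(params, name)) {
      query.push(encodeURIComponent(name) + '=' + encodeURIComponent(params[name]));
    }
  }
  return url + '?' + query.join('&');
}

// 'YOUR_KEY' is a placeholder for the unique identifier key
var articlesUrl = buildIdiomagUrl('artist', 'articles', 'rss', 'YOUR_KEY', { artist: 'Radiohead' });
var videosUrl = buildIdiomagUrl('tag', 'videos', 'xspf', 'YOUR_KEY', { tag: 'Indie' });
```

Encoding each parameter means artist names containing spaces or ampersands should survive the trip intact.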

The documentation is extensive and really easy to understand, which should make take-up a lot faster. Mashup and API guru John Musser called the first draft of the idiomag API “interesting, smart” and unusually thoughtful about the ways it serves up different kinds of data. We’re looking forward to seeing some really smart applications being built upon the API in the coming months, and will post up anything interesting. No doubt idiomag will at the same time be one degree ahead of the curve, as they always are, in their development of the system.

Script for getting Google Analytics data into your database

Posted on March 24, 2009 Categories: Coding

Written by: Charles

Here at thrudigital we recently struck an interesting problem when a client got in touch wanting us to create them a custom analytics solution that combined usage statistics from the forum section of their website with the whole-of-site statistics they already got from Google Analytics. They wanted to be able to see, for example, what proportion of pageviews are generated from the forum, as compared with the whole site.

We thought, ‘Too easy, we will just hook into the Google Analytics API and bring down the stats from the client’s account…’. Whoops, a quick search derailed that plan as we realised that Google Analytics cannot currently be accessed via an API! Guess they just haven’t got around to that feature yet…

What Google Analytics does have is an option to schedule emails containing data in a number of formats to an email address of your choice. The formats that are available are CSV, TSV, XML and PDF.

The CSV and TSV formats are limited to providing information about one particular metric, for example, ‘Whole-of-Site Pageviews’. The XML format is much more useful in that it provides the statistics for all the top-level metrics in a single document, which allows you to pick and choose what you want to extract.

We figured that what the client needed was a script that could:

  • check a given mailbox for a new email from Google Analytics
  • pull the XML attachment out of the email
  • parse the XML and insert the data into a table in the database

This script could then be scheduled to run at any specified interval (via cron) in order to keep the database copy of the Google Analytics data up-to-date.

We wrote the script in Ruby, but it is simple enough that it could be easily ported to other languages. We noticed that there is even an API for Java for parsing the Google Analytics XML format, if you wanted to get into more complex stuff.

The script relies on a few assumptions, if you are planning on ‘dropping it in’ and using it:

  • You are using a MySQL database
  • You have the mysql and tmail Ruby gems installed (you can install them by typing gem install mysql tmail)
  • You are sending your Google Analytics emails to a POP3 mailbox

Once you have the script you can simply require it and instantiate the ‘sucker’ in a simple Ruby script like the following:

require 'ga_tools'

sucker = GoogleAnalyticsTools::Sucker.new({
  :mail_host => 'somewhere.com',
  :mail_user => 'analytics@somewhere.com',
  :mail_password => 'password',
  :mail_subject_string => '** Somewhere.com Analytics **',
  :db_host => 'somewhere.com',
  :db_user => 'analytics',
  :db_password => 'password',
  :db_schema => 'somewhere_db',
  :db_table => 'google_analytics'
})

sucker.get_latest_data
sucker.update_database

Here is an explanation of the parameters that you need to feed into this thing:

  • mail_host, mail_user and mail_password: The details for accessing your POP3 mailbox.
  • mail_subject_string: A special string that the script looks out for in the subject line, that lets it know which emails to get the XML data from. You would have specified a subject when you set up your Google Analytics scheduled email.
  • db_host, db_user, db_password, db_schema: The details of the database that you wish to load the Google Analytics data into.
  • db_table: The name of the table that will hold your Google Analytics data. If it does not exist when you run the script, it will be created automatically.

The script itself follows. Any feedback is always appreciated.

ga_tools.rb
require 'net/pop'
require 'rexml/document'
require 'rexml/xpath'
require 'date'

# This module requires the tmail and mysql gems, install them by typing:
# gem install mysql tmail
require 'tmail'
require 'mysql'

module GoogleAnalyticsTools

  # Provides an interface through which to:
  # * communicate with a mailbox that contains emailed Google Analytics data; and
  # * insert the data into a MySQL database table
  class Sucker
    public

      # The data_file and data attributes are read-only outside this class
      attr_reader :data_file, :data

      # Returns a new Sucker with the specified set of configuration parameters
      # * mail_host
      # * mail_port
      # * mail_user
      # * mail_password
      # * mail_subject_string
      # * db_host
      # * db_user
      # * db_password
      # * db_schema
      # * db_table
      def initialize(parameters)

        # Set any configuration defaults
        @configuration = { :mail_port => 110 }

        # Update configuration
        @configuration.update(parameters)
      end

      # Checks the specified mailbox for the latest matching email and extracts
      # the XML from its attachment, returning a hash of the data.
      def get_latest_data
        output_status "Connecting to #{@configuration[:mail_host]}..."

        # Connect to mail server
        Net::POP3.start(@configuration[:mail_host], @configuration[:mail_port], @configuration[:mail_user], @configuration[:mail_password]) do |pop|

          # Check whether mailbox is empty or not
          if pop.mails.empty?
            output_status "No mail found in the mailbox of #{@configuration[:mail_user]}."
            return false
          end
          output_status "Total #{pop.mails.size} messages in mailbox of #{@configuration[:mail_user]}."

          # Get last email that has a subject containing the specified subject
          # string
          message = get_last_mail_with_subject(pop.mails, @configuration[:mail_subject_string])

          # Abort if there was no matching message
          if !message
            output_status 'No emails found from Google Analytics, aborting.'
            return false
          end

          # Get the first XML attachment from the message
          xml = get_attachment(message, 'text/xml', 'application/xml')

          # If XML was found, parse it; otherwise print a message and abort
          if xml

            # Print status
            output_status "Found XML in last Google Analytics email, posted at #{message.date}."

            # Parse XML data file and store contents
            @data_file = xml
            parse_data_file

            # Return hash of data from file
            return @data
          else
            output_status "No XML found in email, aborting."
            return false
          end
        end
      end

      # Updates the specified database table with the data currently stored in
      # the data hash.
      def update_database
        output_status "Saving data to table #{@configuration[:db_table]} in database #{@configuration[:db_schema]}@#{@configuration[:db_host]}..."

        # Connect to database
        db = Mysql.new(@configuration[:db_host], @configuration[:db_user], @configuration[:db_password], @configuration[:db_schema])

        # Select database
        db.query("USE #{@configuration[:db_schema]};")

        # If the target table does not exist yet, create it
        create_table(db)

        # For each date in the data hash, update the existing record if there is
        # one, otherwise insert a new record
        @data.each do |date, measures|

          # For each date key in the data hash, check if data already exists for
          # this date in the database
          result = db.query("SELECT *
            FROM #{@configuration[:db_table]}
            WHERE date = '#{date}';")
          if result and result.num_rows > 0

            # If a record does already exist, update the record to reflect the
            # latest value
            statement = db.prepare("UPDATE #{@configuration[:db_table]}
              SET page_views = '#{measures[:pageviews]}',
                visits = '#{measures[:visits]}',
                time_on_site = '#{measures[:tos]}'
              WHERE date = '#{date}';")
            statement.execute
            statement.close
          else

            # If no record exists yet, insert a new record
            statement = db.prepare("INSERT INTO #{@configuration[:db_table]} (date, page_views, visits, time_on_site)
              VALUES ('#{date}', '#{measures[:pageviews]}', '#{measures[:visits]}', '#{measures[:tos]}');")
            statement.execute
            statement.close
          end

          # Free resources
          result.free
        end

        # Free resources
        db.close
      end

      # Convenience method for running a set of statements on the database.
      def run_script(name, sql)
        output_status "Running script '#{name}' on database #{@configuration[:db_schema]}@#{@configuration[:db_host]}..."

        # Connect to database
        db = Mysql.new(@configuration[:db_host], @configuration[:db_user], @configuration[:db_password], @configuration[:db_schema])

        # Run script
        db.real_query("USE #{@configuration[:db_schema]};")
        db.real_query('START TRANSACTION;');
        sql.split(';').each do |statement|
          db.real_query(statement) if !statement.strip.empty?
        end
        db.real_query('COMMIT;');

        # Free resources
        db.close
      end

    protected

      attr :configuration
      attr_writer :data_file, :data

      # Prints a status message to standard output
      def output_status(message)
        puts message
        $stdout.flush
      end

      # Finds the last mail in the supplied array with the specified string in
      # the subject
      def get_last_mail_with_subject(mails, string)
        message = false

        mails.each do |mail|
          peek = TMail::Mail.parse(mail.pop)
          if peek.subject.to_s.index(string)
            message = peek
          end
        end

        return message
      end

      # Gets the first attachment of the supplied mail that has one of the
      # specified content types
      def get_attachment(mail, *content_types)
        attachment = false

        if mail.multipart? then
          mail.parts.each do |part|
            if content_types.include?(part.content_type)
              attachment = part.body
              break
            end
          end
        end

        return attachment
      end

      # Parses the XML data file and populates the data hash
      def parse_data_file
        output_status 'Parsing XML data file...'

        # Parse XML
        document = REXML::Document.new(@data_file)

        # Get starting and ending dates of document
        date_from = Date.strptime(REXML::XPath.first(document, "//PrimaryDateRange").text.split(' - ').first, '%B %d, %Y')
        date_to = Date.strptime(REXML::XPath.first(document, "//PrimaryDateRange").text.split(' - ').last, '%B %d, %Y')
        output_status "Date range: #{date_from} to #{date_to}"

        # Get page views from document
        output_status 'Getting page views...'
        pageviews = get_text_elements(document, '//Sparkline[@id="PageviewsSparkline"]/PrimaryValue')

        # Get visits from document
        output_status 'Getting visits...'
        visits = get_text_elements(document, '//Sparkline[@id="VisitsSparkline"]/PrimaryValue')

        # Get time on site from document
        output_status 'Getting time on site...'
        tos = get_text_elements(document, '//Sparkline[@id="TimeOnSiteSparkline"]/PrimaryValue')

        # Iterate through each date in the date range, adding the page views,
        # visits and time on site values to a hash
        number_of_days = (date_to - date_from).to_i
        @data = Hash.new
        for i in 0..number_of_days

          # Use date as key for each set of values
          date_key = (date_from + i).to_s

          # Each date has its own hash of the three values
          @data[date_key] = {
            :pageviews => pageviews[i],
            :visits => visits[i],
            :tos => tos[i]
          }
        end
      end

      # Finds all elements within the supplied document that match the specified
      # XPath expression and returns their text contents as an array
      def get_text_elements(document, xpath)
        text_elements = []

        document.elements.each(xpath) do |element|
          text_elements.push(element.text)
        end

        return text_elements
      end

      # Create table to hold Google Analytics data using the specified database
      # connection, if one does not already exist
      def create_table(connection)
        statement = connection.prepare("CREATE TABLE IF NOT EXISTS #{@configuration[:db_table]} (
            date DATE,
            page_views int(11) NOT NULL,
            visits char(8) NOT NULL,
            time_on_site int(11) NOT NULL,
            PRIMARY KEY  (date)
          );")
        statement.execute
        statement.close
      end
  end
end

Image Editing with JavaScript, AJAX, PHP and GD

Posted on March 9, 2009 Categories: Coding

Written by: Charles

Ever wanted to allow your users the ability to create and edit images within their browser?

Options

There are a number of different ways you can achieve this sort of functionality:

  • Using a Flash or Flex application
  • Using the <canvas> element from the HTML 5 Working Draft specification
  • Using AJAX, sending requests to the server which takes each edit request, manipulates the image and sends back the result

Canvas support across the browsers that our users have installed is still patchy at best: it is supported by recent versions of Safari, Opera and Firefox, but support from Internet Explorer is still lacking. Some talented individuals have started to find creative solutions to this current lack of support.

The AJAX Way

The example I am going to use is very simple: allowing a user to use a paint bucket tool to flood-fill an image with colour. This example could easily be expanded to use any of the image manipulation functions that the PHP GD library provides.

User interface

The first thing we need is an HTML page which will serve as the interface to our paint bucket app:

<img id="picture" src="picture.png"/>
<form action="paintbucket.php">
  <input id="x" name="x" type="text" value="" />
  <input id="y" name="y" type="text" value="" />
  <input id="red" name="red" type="text" value="255" />
  <input id="green" name="green" type="text" value="0" />
  <input id="blue" name="blue" type="text" value="0" />
  <input id="image" name="image" type="text" value="" />
</form>

Client-side script

Then we need a client-side script to detect when the user clicks on the image, populate our hidden form and send the form data as an AJAX request. The jQuery library has been used, as has the getRelativeCoordinates function from this post by Acko:

$(document).ready(function() {

  // Reset all form values
  $('form input').val('');
  $('form input#red').val('255');
  $('form input#green, form input#blue').val('0');

  // Add click and paint behaviour to picture
  $('#picture').click(function(event) { sendPaintRequest(event); });
});

function sendPaintRequest(event) {

  // Get co-ordinates of mouse click and put into form fields
  var coordinates = getRelativeCoordinates(event, $('#picture')[0]);
  $('#x').val(coordinates['x']);
  $('#y').val(coordinates['y']);

  // Post form data
  var formData = $('form').serialize();
  jQuery.post('paintbucket.php', formData, function(data) { receivePaintResponse(data); }, 'json');
}

function receivePaintResponse(data) {

  // Get image data from response and update the picture
  $('#picture').attr('src', data.image);

  // Put image data into form field for next request
  $('#image').val(data.image);
}

Server-side script

Then we need a server-side script (PHP) to process the paint request and pass back the modified image to the client as a base64 encoded PNG.

<?php

  $x = intval($_POST['x']);
  $y = intval($_POST['y']);
  $red = intval($_POST['red']);
  $green = intval($_POST['green']);
  $blue = intval($_POST['blue']);

  // Set configuration
  $encodedPngHeader = 'data:image/png;base64,';
  $tempFileName = 'temp.png';

  if ($_POST['image'] != '') {

    // Decode image data
    $encodedImageData = $_POST['image'];
    $decodedImageData = base64_decode(str_replace($encodedPngHeader, '', $encodedImageData));

    // Write decoded image data to file
    file_put_contents($tempFileName, $decodedImageData);

    // Create image from file
    $image = imagecreatefrompng($tempFileName);

    // Delete file
    unlink($tempFileName);

  } else {

    // Get the original image from on the server
    $image = imagecreatefrompng('picture.png');
  }

  // Fill image with color at the specified co-ordinates
  $color = imagecolorallocate($image, $red, $green, $blue);
  imagefill($image, $x, $y, $color);

  // Write filled image to file
  imagepng($image, $tempFileName);

  // Read image data from file and encode it
  $imageData = file_get_contents($tempFileName);
  $encodedImageData = 'data:image/png;base64,' . base64_encode($imageData);

  // Delete file
  unlink($tempFileName);

  // Send JSON response with encoded image data
  $response = array('image' => $encodedImageData);
  header('Content-Type: application/json');
  print json_encode($response);

?>

That’s it!

This is a very simplified example, but you can see how this method could be expanded to provide any sort of drawing functionality, and combined with the use of JavaScript controls in the user interface (e.g. a colour picker).
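If you want to check the encoding on its own, the data URI round trip that the server-side script performs can be sketched in JavaScript. This is an illustration rather than part of the app: it assumes a Node.js environment (for Buffer), the function names are hypothetical, and it is stricter than the PHP version because it rejects input that does not start with the expected prefix.

```javascript
// Prefix used by the PHP script for its base64-encoded PNG responses
var PNG_DATA_URI_PREFIX = 'data:image/png;base64,';

// Turn raw PNG bytes into a data URI that can be used as an <img> src
function encodePngDataUri(bytes) {
  return PNG_DATA_URI_PREFIX + Buffer.from(bytes).toString('base64');
}

// Strip the prefix and decode back to raw bytes, as the server does on receipt
function decodePngDataUri(dataUri) {
  if (dataUri.indexOf(PNG_DATA_URI_PREFIX) !== 0) {
    throw new Error('Not a base64 PNG data URI');
  }
  return Buffer.from(dataUri.slice(PNG_DATA_URI_PREFIX.length), 'base64');
}

// The eight-byte signature that starts every PNG file, used as sample input
var pngSignature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
var dataUri = encodePngDataUri(pngSignature);
var roundTripped = decodePngDataUri(dataUri);
```

Keep in mind that base64 adds roughly a third to the size of the image, and the whole image travels in both directions on every click, so this approach suits small images best.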

Incorporating GeoNames

Posted on March 2, 2009 Categories: Coding

Written by: Charles

For the last five months we’ve been working on a travel based social network, TravelScoops.  Some of the key features of the site include being able to plan your trips and see where your friends are right now, based on the trip they’re currently on.  Back at conception, we researched and toyed around with a few geomapping tools but only one provided enough data and versatility, and that was GeoNames.  GeoNames is a collaboratively produced online geographic database, available under the Creative Commons licence.  It’s one of those projects, like OpenStreetMap, whose community achievements will just awe you.  You’d be hard pressed to come up with a location that’s either missing or incorrect (and if you do, fix it!)

So back to the project.  We needed a data source that provided location names from village-level up to, but not including, country level.  It needed to be fast, and capable of handling the number of requests from our site.  Understandably, the GeoNames API has a rate limit.  However, the database can be downloaded and hosted locally, and with a bit of work you can tweak it to fit your needs.

The next hurdle was the search algorithm and data structure, which needed to efficiently search a 6.5 million row table with joins.  With a database of this size you’d be best off going with a full-text search engine such as Lucene or Sphinx.  If that’s not possible, indexing the right fields and using MySQL’s full-text features will do a reasonably good job.  Re-organising the GeoNames data into a structure that suits your project’s needs is recommended, alongside producing a script to automate updates from GeoNames.org.

Things to consider when producing your search algorithm:

  • If population is a factor, know that a lot of places are lacking that data in GeoNames.
  • London’s country is ‘United Kingdom’.  If your result needs anything in between (say, ‘England’) you’ll find it in the relevant administrative subdivision data (adminCodes).
  • Do some research and have a fiddle with filtering by class and code. London is class P (a city), while Greater London is class A (a state). Islands are class T and have a code of ISL. Make sure you know exactly what types of location your algorithm should find!
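As a rough sketch of the class and code filtering, the snippet below keeps only the types of place a search should return and ranks them by population, treating missing population as zero. The record shape (featureClass, featureCode, population), the function names and the sample records are all hypothetical; only the class and code values mentioned above are taken from GeoNames.

```javascript
// A rule matches on feature class, and on feature code too when one is given
function matchesRule(place, rule) {
  if (place.featureClass !== rule.featureClass) { return false; }
  return rule.featureCode === undefined || place.featureCode === rule.featureCode;
}

// Keep only places that match at least one of the allowed rules
function filterPlaces(places, rules) {
  return places.filter(function (place) {
    return rules.some(function (rule) { return matchesRule(place, rule); });
  });
}

// Sort comparator: largest population first, missing population counts as 0
function byPopulationDesc(a, b) {
  return (b.population || 0) - (a.population || 0);
}

// Hypothetical records; population is deliberately left out to mirror the
// gaps you will find in the real data
var samplePlaces = [
  { name: 'London', featureClass: 'P' },
  { name: 'Greater London', featureClass: 'A' },
  { name: 'Some island', featureClass: 'T', featureCode: 'ISL' },
  { name: 'Some mountain', featureClass: 'T', featureCode: 'MT' }
];

// Allow every populated place, but only islands from class T
var searchRules = [
  { featureClass: 'P' },
  { featureClass: 'T', featureCode: 'ISL' }
];

var searchable = filterPlaces(samplePlaces, searchRules).sort(byPopulationDesc);
```

In practice you would apply the same rules in the database query rather than in script, but writing them down as data like this makes it easier to agree on exactly what the search should find.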

GeoNames’ GeoTree is a handy tool to help understand how their data is structured.

So, GeoNames is a fantastic data source, but combining that with search and mapping can lead to a lot of work.  Today’s problem in social media is getting users to be more open with their location (personally, I think Google Latitude was launched a little early in that respect) – the result is that geomapping is becoming more prevalent every day, so we’re seeing a lot of activity regarding geomapping frameworks and development tools.

With the combination of OpenStreetMap’s up-to-the-minute data and CloudMade’s fantastic development tools, 2009’s going to be an exciting year for geomapping!

