Incorporating GeoNames

Posted on March 2, 2009 Categories: Coding

Written by: Charles

Charles has spent the past few years as the big cheese at thrudigital. On any normal day you will catch him with a milky cup of tea (no bubbles on top thank you very much) and at least 30 browser tabs open.

For the last five months we’ve been working on a travel-based social network, TravelScoops. Key features of the site include planning your trips and seeing where your friends are right now, based on the trip they’re currently on. Back at conception, we researched and toyed with a few geomapping tools, but only one provided enough data and versatility: GeoNames. GeoNames is a collaboratively produced online geographic database, available under a Creative Commons licence. It’s one of those projects, like OpenStreetMap, whose community achievements will simply awe you. You’d be hard pushed to come up with a location that’s either missing or incorrect (and if you do, fix it!).

So, back to the project. We needed a data source that provided location names from village level up to, but not including, country level. It needed to be fast and capable of handling the volume of requests from our site. Understandably, the GeoNames API has a rate limit. However, the database can also be downloaded and hosted locally, and with a bit of work you can tweak it to fit your needs.
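To give a feel for what hosting the data locally involves, here is a minimal Python sketch of reading the main GeoNames dump, which is a tab-separated text file with 19 columns per place. The `Place` type, the `parse_line` and `load_places` helpers, and the choice of which columns to keep are our own assumptions for illustration, not anything GeoNames prescribes.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple

# The main dump has 19 tab-separated columns per line.
EXPECTED_COLUMNS = 19


@dataclass(frozen=True)
class Place:
    """A trimmed-down view of one dump row (hypothetical structure)."""
    geonameid: int
    name: str
    latitude: float
    longitude: float
    feature_class: str
    feature_code: str
    country_code: str
    admin1_code: str
    population: int  # 0 when GeoNames has no figure


def parse_line(line: str) -> Place:
    """Turn one tab-separated dump line into a Place."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != EXPECTED_COLUMNS:
        raise ValueError(f"expected {EXPECTED_COLUMNS} columns, got {len(fields)}")
    return Place(
        geonameid=int(fields[0]),
        name=fields[1],
        latitude=float(fields[4]),
        longitude=float(fields[5]),
        feature_class=fields[6],
        feature_code=fields[7],
        country_code=fields[8],
        admin1_code=fields[10],
        population=int(fields[14] or 0),  # treat an empty column as unknown
    )


def load_places(
    lines: Iterable[str],
    wanted_classes: Tuple[str, ...] = ("P", "A", "T"),
) -> Iterator[Place]:
    """Yield only the feature classes the site cares about, skipping blank lines."""
    for line in lines:
        if not line.strip():
            continue
        place = parse_line(line)
        if place.feature_class in wanted_classes:
            yield place
```

Passing an open file handle for the downloaded dump to `load_places` streams the rows one at a time, so the whole file never has to sit in memory before being written to your own tables.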

The next hurdle was the search algorithm and data structure, which needed to search a 6.5-million-row table efficiently, joins included. With a database of this size you’d be best off with a full-text search engine such as Lucene or Sphinx; if that’s not possible, indexing the right fields and using MySQL’s full-text features will do a reasonably good job. We’d recommend re-organising the GeoNames data into a structure that suits your project’s needs, and writing a script to automate updates from GeoNames.org.
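As a rough illustration of a re-organised structure with indexes on the searched fields, the sketch below uses Python’s built-in `sqlite3` module as a stand-in so that it runs anywhere. It is not the schema we used: the table names, column names and the `build_db` and `search` helpers are hypothetical. In MySQL the name lookup would more likely use a `FULLTEXT` index with `MATCH ... AGAINST` rather than the `LIKE` prefix match shown here.

```python
import sqlite3


def build_db(places, admin1_names):
    """Create a small in-memory database with the indexes a name search needs.

    places:       (geonameid, name, feature_class, feature_code,
                   country_code, admin1_code, population) tuples
    admin1_names: (country_code, admin1_code, name) tuples
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE place (
            geonameid     INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            feature_class TEXT,
            feature_code  TEXT,
            country_code  TEXT,
            admin1_code   TEXT,
            population    INTEGER
        );
        CREATE TABLE admin1 (
            country_code TEXT,
            admin1_code  TEXT,
            name         TEXT,
            PRIMARY KEY (country_code, admin1_code)
        );
        -- Index the columns that searches filter on.
        CREATE INDEX idx_place_name  ON place (name COLLATE NOCASE);
        CREATE INDEX idx_place_class ON place (feature_class, feature_code);
    """)
    conn.executemany("INSERT INTO place VALUES (?, ?, ?, ?, ?, ?, ?)", places)
    conn.executemany("INSERT INTO admin1 VALUES (?, ?, ?)", admin1_names)
    return conn


def search(conn, prefix, limit=10):
    """Prefix search on place name, joined to the subdivision name.

    Note: '%' and '_' inside `prefix` act as LIKE wildcards in this sketch.
    """
    return conn.execute(
        """
        SELECT p.name, a.name, p.country_code
        FROM place AS p
        LEFT JOIN admin1 AS a
               ON a.country_code = p.country_code
              AND a.admin1_code  = p.admin1_code
        WHERE p.name LIKE ? || '%'
        ORDER BY p.population DESC
        LIMIT ?
        """,
        (prefix, limit),
    ).fetchall()
```

The `LEFT JOIN` keeps places whose subdivision is missing from the lookup table; they simply come back with `None` in the middle column.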

Things to consider when producing your search algorithm:

  • If population is a factor, know that a lot of places are lacking that data in GeoNames.
  • London’s country is ‘United Kingdom’. If your result needs anything in between (say, ‘England’), you’ll find it in the relevant administrative subdivision data (adminCodes).
  • Do some research and have a fiddle with filtering by feature class and code. London is class P (a populated place such as a city), while Greater London is class A (an administrative division such as a state). Islands are class T and have a code of ISL. Make sure you know exactly what types of location your algorithm should find!
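The points above can be sketched in a few lines of Python. This is only one possible approach, under our own assumptions: the `Candidate` type, the `ALLOWED` table and the ranking order are hypothetical, and the sketch treats a population of 0 as “unknown” rather than as a real figure.

```python
from typing import Iterable, List, NamedTuple


class Candidate(NamedTuple):
    name: str
    feature_class: str
    feature_code: str
    population: int  # 0 when GeoNames has no figure


# Which location types the search should return (hypothetical choice):
# every populated place, plus islands only from class T.
ALLOWED = {
    "P": None,      # None means "any code within this class"
    "T": {"ISL"},
}


def is_allowed(candidate: Candidate) -> bool:
    """Filter by feature class first, then by feature code where one is listed."""
    if candidate.feature_class not in ALLOWED:
        return False
    codes = ALLOWED[candidate.feature_class]
    return codes is None or candidate.feature_code in codes


def rank(candidates: Iterable[Candidate], query: str) -> List[Candidate]:
    """Prefix-match on name, then order the matches.

    Order: exact name matches first, places with a known population before
    those without, larger populations first, then alphabetical so that
    places lacking population data still come back in a stable order.
    """
    q = query.casefold()
    matches = [
        c for c in candidates
        if is_allowed(c) and c.name.casefold().startswith(q)
    ]
    return sorted(
        matches,
        key=lambda c: (
            c.name.casefold() != q,
            c.population <= 0,
            -c.population,
            c.name,
        ),
    )
```

Because population is only one part of the sort key, a village with no population figure is still returned; it just ranks below places where the figure is known.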

GeoNames’ GeoTree is a handy tool to help understand how their data is structured.

So, GeoNames is a fantastic data source, but combining it with search and mapping can lead to a lot of work. Today’s problem in social media is getting users to be more open with their location (personally, I think Google Latitude was launched a little early in that respect). As a result, geomapping is becoming more prevalent every day, and we’re seeing a lot of activity around geomapping frameworks and development tools.

With the combination of OpenStreetMap’s up-to-the-minute data and CloudMade’s fantastic development tools, 2009’s going to be an exciting year for geomapping!
