World News With Geographic Heatmaps

I stumbled across GeoIQ a few days ago, and I just couldn't resist. What you see above is a world-news heatmap for November 2004 as seen on Yahoo News. This is the same dataset I explored earlier in my 'Visual Database Explorer in Ruby'. There is no particular reason I chose that month; I just had the data handy.

To put things in perspective, the data shown consists of the lead stories for each day of November 2004. New York was mentioned numerous times, and there was a story on Florida, hence the stretched area over the US. Northern Europe also accounted for several mentions, hence the blip. The Middle East was a heavy contributor, and there were a couple of stories about Palestine, Sudan and Kenya, hence the stretched component over eastern Africa. Here are a few sample headlines:

Palestinian Leader Arafat Dies at 75
Iraqi Troops Reinforce Unsettled Mosul
Bush Selects Rice to Replace Powell
US Pounds Falluja Diehards, Violence in North...

Getting started with GeoIQ

This experiment was a proof of concept, and I think it came out really well. The GeoIQ guys have a great product with a very straightforward API. You simply wrap your Google Map in their handler and feed them your data points. In turn, they generate a PNG image on the fly and overlay it on your map. Simple and effective. The only downside is that a new image has to be regenerated and retrieved from their server every time you pan or otherwise modify the view. For some great examples, visit their official blog.

Geo-coding news stories

The more challenging aspect of this experiment was geo-coding the news stories themselves. I wanted to visualize where the news was playing out, not where the story was produced. Unfortunately, this information is usually embedded in the title or the story itself, which makes the task non-trivial. After some head-scratching, I came up with a Ruby script to help me out. I couldn't automate the procedure fully, so instead I made it query the user and learn from the selected choices. For example, if the story title contains 'Belgian', intuitively it should be mapped to 'Belgium' or 'Brussels', which in turn is mapped to a specific geographic coordinate. After a brief training period, I was able to build a fairly comprehensive database, and my script was able to parse most stories with little user intervention.
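To make the ask-and-learn idea concrete, here is a minimal sketch of how such a script could be structured. It is not the original script: the PlaceLearner class, its method names and the sample headlines are all made up for illustration, and it only handles single-word keywords (so 'Belgian' works, but 'New York' would need extra handling).

```ruby
# Hypothetical sketch: remembers which title keyword maps to which place,
# and only asks the user when no learned keyword appears in the title.
class PlaceLearner
  attr_reader :known

  # known: previously learned mappings, e.g. { "belgian" => "Brussels" }
  # ask:   callable that takes a title and returns [keyword, place] or nil;
  #        defaults to an interactive prompt on the terminal.
  def initialize(known = {}, ask: nil)
    @known = known
    @ask = ask || method(:prompt_user)
  end

  # Returns a place name for the title, or nil if the user skips it.
  def place_for(title)
    words = title.downcase.scan(/[a-z]+/)
    hit = words.find { |word| @known.key?(word) }
    return @known[hit] if hit

    keyword, place = @ask.call(title)
    return nil if keyword.nil? || place.nil?

    @known[keyword.downcase] = place # learn the choice for next time
    place
  end

  private

  def prompt_user(title)
    puts "No known place for: #{title}"
    print "Keyword that signals the location (blank to skip): "
    keyword = $stdin.gets.to_s.strip
    return nil if keyword.empty?

    print "Place name to map it to: "
    place = $stdin.gets.to_s.strip
    place.empty? ? nil : [keyword, place]
  end
end

# Usage with a canned answer standing in for the interactive prompt.
learner = PlaceLearner.new({}, ask: ->(_title) { ["Belgian", "Brussels"] })
learner.place_for("Belgian Court Rules on Party")   # asks, then learns "belgian"
learner.place_for("New Belgian Government Formed")  # answered from memory
```

Persisting the learned hash between runs (for example as YAML) would be the natural next step, since the whole point is that the training period only has to happen once.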

For the actual geo-coding component, I was really pleased to find the MaxMind World Cities database. 2,673,764 geo-mapped locations at your fingertips, for free! Putting the two together, I came up with this:

The map may take a few seconds to load (or more). Each heatmap overlay is regenerated when you modify the view, so give it a few seconds to appear. If nothing is showing, try viewing the file directly.
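For the lookup step itself, turning a place name into coordinates, something along these lines would do. This is a sketch under assumptions: it expects a comma-separated file with a header row that names City, Latitude and Longitude columns, which is how I remember the World Cities file being laid out, so check the header of your copy. The build_city_index helper and the sample rows (with rounded coordinates) are illustrative, not taken from the real file.

```ruby
require 'stringio'

# Builds a lowercase city name => [latitude, longitude] index.
# Assumes a header row naming "City", "Latitude" and "Longitude" columns.
def build_city_index(io)
  header = io.gets.to_s.strip.split(",").map(&:downcase)
  city, lat, lon = %w[city latitude longitude].map { |name| header.index(name) }
  raise ArgumentError, "missing expected columns" if [city, lat, lon].any?(&:nil?)

  index = {}
  io.each_line do |line|
    cols = line.strip.split(",")
    name = cols[city].to_s.downcase
    latitude = Float(cols[lat].to_s, exception: false)
    longitude = Float(cols[lon].to_s, exception: false)
    next if name.empty? || latitude.nil? || longitude.nil? # skip malformed rows

    # Keep the first row seen for a name; duplicate city names need a smarter rule.
    index[name] ||= [latitude, longitude]
  end
  index
end

# Illustrative rows only, with approximate coordinates.
ROWS = <<~DATA
  Country,City,AccentCity,Region,Population,Latitude,Longitude
  be,brussels,Brussels,,,50.85,4.35
  es,madrid,Madrid,,,40.42,-3.70
DATA

cities = build_city_index(StringIO.new(ROWS))
p cities["brussels"]
```

With the real file you would pass File.open on it instead of the StringIO. Holding a couple of million rows in a Hash is memory-hungry, so loading only the cities your keyword database actually refers to is probably the saner option.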

Mashup waiting-to-happen

This is a mashup waiting to happen, and one with a lot of potential. To make things easier, the Reuters RSS feed for international news carries the source of the news story right in the description. Looking at the latest feed:

MADRID (Reuters) - Two Spanish men, both charged with providing explosives for Islamist train bombings in Madrid in 2004, were given jail sentences on Wednesday in a separate trial for selling explosives in 2001, a court said.

As you can see, the hardest component is already taken care of - you have the exact location. All you have to do is geo-code it and then query GeoIQ for the overlay! Any takers?
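Pulling the location out of a description like the one above takes little more than a regular expression. Here is a minimal sketch; the split_dateline helper and the pattern are my own guess at the format, based only on the sample above (an upper-case place name, then "(Reuters)", then a dash), so expect to adjust it for other datelines.

```ruby
# Matches descriptions of the form "PLACE (Reuters) - story text".
# Pattern inferred from a single sample; real feeds may vary.
DATELINE = %r{\A\s*([A-Z][A-Z .'/-]*?)\s*\(Reuters\)\s*-\s*(.*)\z}m

# Returns { place:, text: } or nil when no dateline is found.
def split_dateline(description)
  match = DATELINE.match(description)
  return nil unless match

  # "NEW YORK" => "New York", ready to be looked up in a city index.
  place = match[1].split.map(&:capitalize).join(" ")
  { place: place, text: match[2] }
end

sample = "MADRID (Reuters) - Two Spanish men, both charged with providing " \
         "explosives for Islamist train bombings in Madrid in 2004, were " \
         "given jail sentences on Wednesday."
p split_dateline(sample)[:place]
```

The resulting place name can then be geo-coded like any other, which leaves only the call to GeoIQ for the overlay.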

The future of news visualizations

I have a number of ideas for extending this model, but I'm curious, what do you think? How effective are heatmaps for news visualizations? How can we make it better? What kind of additional information can we embed in such visualizations? I would love to hear any kind of input and outrageous blue-sky ideas.

Ilya Grigorik is a web ecosystem engineer, author of High Performance Browser Networking (O'Reilly), and Principal Engineer at Shopify.