Civic Hacking with Python – Part 1

Not long ago, @mtlouvert tweeted about this Transports Quebec database, saying it would be nice to have access to all the data in a usable format. I thought it would make a nice project, but I was busy working on other things.

Sunday night I had some spare time, so I decided to see what could be done. I could not find any way to download the complete data set through reports or similar means, so I turned to Python and the fantastic scraping framework Scrapy.

A few hours later I had a working crawler, and a bit later, output in several formats. I will post something more technical about the process soon.

Here are the results:

This morning, big surprise: I found out on Twitter that James McKinney had done a similar exercise last night. I guess Sunday night is hacking night for him too. I checked his data and realized that it did not cover the whole database, so I thought I’d release my stuff anyway. Others may find it useful.

Today at lunch and later in the afternoon I finalized a few things. I wanted to display the KML on a map, but the generated KMZ is too large to display directly in Google Maps, so I ended up using a Google Fusion Table.
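For readers unfamiliar with the two formats: a KMZ is simply a zip archive containing a KML file. The sketch below shows one way placemarks could be written to KML and packaged as KMZ using only the standard library. It is not the code used for this project; the record layout `(name, description, lon, lat)` and the function names `build_kml` and `build_kmz` are assumptions made for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(records):
    """Build a KML document string from (name, description, lon, lat) tuples.

    The tuple layout is hypothetical; adapt it to the scraped fields.
    """
    # Serialize the KML namespace as the default one (no prefix on tags).
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, description, lon, lat in records:
        pm = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
        # ElementTree escapes &, < and > in text for us.
        ET.SubElement(pm, f"{{{KML_NS}}}description").text = description
        point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
        # KML coordinates are written longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


def build_kmz(kml_text):
    """Return the bytes of a KMZ archive holding the KML as doc.kml."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("doc.kml", kml_text)
    return buf.getvalue()
```

Compression helps, but a large data set can still exceed what a map viewer accepts, which is why a hosted table was the practical choice here.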

Here are links to the Fusion Table and the data in different formats:

Have fun!

UPDATE 2011/11/10: I noticed a missing double-quote in the KML description field that caused two rows to be “merged”, so I have regenerated the data, created a new Fusion Table with it and updated the links. I also removed the “fusion_marker” field from the non-KML formats, as it was added specifically for the Fusion Table.
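Unbalanced quotes are a common way for two rows to end up merged in delimited exports. The sketch below shows one way to avoid that class of bug by letting Python's `csv` module do the quoting instead of building lines by hand. It is an illustration only, not the code used for this data set; the function name `rows_to_csv` and the field names in the usage example are hypothetical.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with every field quoted.

    Keys that are not listed in fieldnames (for example a helper column
    meant for one output only) are silently dropped.
    """
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=fieldnames,
        quoting=csv.QUOTE_ALL,    # quote every field, double any embedded quotes
        extrasaction="ignore",    # skip keys that are not in fieldnames
    )
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

A quick round trip, with made-up rows, checks that embedded quotes and line breaks do not change the row count:

```python
rows = [
    {"name": "A", "description": 'He said "hi"\nand left', "fusion_marker": "x"},
    {"name": "B", "description": "plain", "fusion_marker": "y"},
]
text = rows_to_csv(rows, ["name", "description"])
parsed = list(csv.DictReader(io.StringIO(text, newline="")))
assert len(parsed) == 2
```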
