Not long ago, @mtlouvert tweeted about this Transports Quebec database saying it would be nice to have access to all the data in a usable format. I thought it would be a nice project but was busy working on other things.
Sunday night I had some spare time, so I decided to see what could be done. I could not find a way to download the complete data set through the site's reports or similar means, so I turned to Python and the fantastic scraping framework Scrapy.
A few hours later I had a working crawler and, a bit later, output in several formats. I will post something more technical about the process soon.
Here are the results:
This morning, big surprise: I found on Twitter that James McKinney did a similar exercise last night. I guess Sunday night is hacking night for him too. I checked his data and realized that it did not cover the whole database, so I thought I’d release my stuff anyway. Others may find it useful.
Today at lunch and later in the afternoon I finalized a few things. I wanted to display the KML on a map, but the generated KMZ is too large to display on Google Maps directly, so I ended up using a Google Fusion Table.
Here are links to the Fusion Table and the data in different formats:
- Google Fusion Table (Click on Visualize then Map to get the map. Zoom in a bit to click on actual markers and see the description)
- KML (KMZ)
- CSV (Zipped)
- JSON (Zipped)
- Line JSON (One JSON entry per line) (Zipped)
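For anyone who has not used the Line JSON format before, here is a minimal Python sketch of how such a file could be read one record at a time, which avoids loading the whole data set into memory. The `read_json_lines` helper and the sample field names (`id`, `description`) are assumptions made for illustration; the actual fields in the download may differ.

```python
import io
import json


def read_json_lines(stream):
    """Yield one parsed object per non-empty line of a line-delimited JSON stream."""
    for line in stream:
        line = line.strip()
        if line:  # skip blank lines
            yield json.loads(line)


# Hypothetical sample records standing in for the real file contents.
sample = io.StringIO(
    '{"id": 1, "description": "first entry"}\n'
    '{"id": 2, "description": "second entry"}\n'
)

records = list(read_json_lines(sample))
print(len(records))
```

With the real file, the `io.StringIO` sample would be replaced by an opened file object, for example `open(path, encoding="utf-8")`, where `path` points to the unzipped download.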
UPDATE 2011/11/10: I noticed a missing double-quote in the KML description field that caused two rows to be “merged”, so I have regenerated the data, created a new Fusion Table with it and updated the links. I also removed the “fusion_marker” field from the non-KML formats, as it was added specifically for the Fusion Table.