Why Open Data Sharing Is Important

This week we were able to expand TransHub to 7 cities – now including Canberra. We were able to do this because the dataset for Canberra became available on data.gov.au.

Just over 5 years ago we saw the first campaigns to “Free Our Data” in the UK. Governments in every country collect and hold a massive amount of data. Much of it is very useful in our everyday lives, but quite often you have to pay to get hold of it, and it is not necessarily in a particularly useful format. The idea behind this campaign was that governments should collect the best quality data possible and then make it available for free to all.

In recent years there has been a push for Australian government agencies to make more data available to the public. As a result, data.gov.au was created.

Data.gov.au provides an easy way to find, access and reuse public datasets from the Australian Government and state and territory governments. The main purpose of the site is to encourage public access to and reuse of government data by providing it in useful formats and under open licences.

The great thing about putting up the data so the public can use it is that developers, businesses and hobbyists can mash it up in ways that were never thought of before and make useful sites, applications and creations for others to use.

How we’ve used Data.gov.au for TransHub Canberra

To show how powerful publishing data in an open format is, I’ll take you through the process we followed to create TransHub Canberra.

When we created TransHub Sydney, we had thought of expanding and reusing the same idea with other cities. Some minor code changes were made from the original competition entry to allow us to apply it to different modes of transport in Australia (and other metric countries). This allowed us to easily apply other datasets that we had at our disposal, such as Adelaide, Auckland, Cairns, Perth and Townsville.

In a little over a week we were able to add a whole new set of data through the following process:

  • We noticed the Journey Planner Data for the ACT had been made available on Wed night 5/10.
  • By Thursday 6/10 we had made the necessary adjustments and internal testing to prepare TransHub Canberra for Beta Testing.
  • Friday 7/10 we started the Beta and put out a call for Beta testers
  • Friday 14/10 TransHub Canberra is live on the marketplace

How we’ve used Data.gov.au previously

We’ve been fans of data.gov.au for a while. This year we were also involved in the Library Hack competition. We submitted 2 entries:

QLD Mosaic

The images are out-of-copyright photographs from the State Library of Queensland’s photograph collection “People and places from across Queensland across time”. The data source is here: http://data.gov.au/dataset/picture-queensland/

The image for the mosaic is NASA’s Blue Marble imagery cropped to the political boundary of Queensland. The mosaic was created using AndreaMosaic’s 64-bit professional version, Photoshop CS5 and DeepZoom tools. It was processed on a Dell T5500 workstation with dual 6-core Xeon X5680 processors and 24GB RAM. It took just under 7 days to compile the images, mosaic, process, tile and upload to our website. The energy used to power the hardware was offset by our 6 kW solar system and the awesome Queensland sun.

Historical Real Estate Maps

Soul Solutions has taken the State Library of Queensland’s collection of digitized estate maps, which advertised new housing estates in Queensland from the early to mid 20th century. The collection contains 165 digitized maps and can be found here: http://data.gov.au/dataset/real-estate-maps/

The maps are predominantly from Brisbane but also cover some regional areas of Queensland such as the Gold and Sunshine Coasts.

We have chosen to visualise these using an enhanced version of the Microsoft PivotViewer control, from LobsterPot Solutions. It allows us to present hundreds of items at once and adds value by offering multiple ways to filter and sort the collection and to view details and metadata, while showing the estate map image in a “zoomable” format.

Here’s a quick video demonstrating the application, which you can watch on YouTube here: http://www.youtube.com/watch?v=5RpauKnxp2w&feature=channel_video_title

What is this GTFS thing anyway?

Many of you have asked us what format we need the data in to be able to use it. There are a few that are supported, but we highly recommend GTFS, the General Transit Feed Specification.

GTFS defines a common format for public transport schedules and associated geographic information. For more information see http://en.wikipedia.org/wiki/GTFS

It’s now a format accepted by many APIs around the world for public transport data, including Bing Maps and Google Transit.
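To give a feel for how approachable the format is, here is a minimal Python sketch that reads stop information in the GTFS layout. A GTFS feed is a zip archive of comma-separated text files, and stops.txt with its stop_id, stop_name, stop_lat and stop_lon columns is one of them. The sample rows, the stop names and the load_stops helper are made up for illustration; this is not TransHub’s code, and a real feed contains more files and columns than shown here.

```python
import csv
import io

# A tiny, made-up stops.txt. In a real feed this file sits inside a zip
# archive alongside routes.txt, trips.txt, stop_times.txt and others.
SAMPLE_STOPS_TXT = """stop_id,stop_name,stop_lat,stop_lon
1001,Example Interchange,-35.2780,149.1300
1002,Example Street,-35.2800,149.1350
"""


def load_stops(text):
    """Parse the contents of a GTFS stops.txt file into a list of dicts.

    load_stops is a hypothetical helper written for this post.
    """
    stops = []
    for row in csv.DictReader(io.StringIO(text)):
        stops.append({
            "id": row["stop_id"],
            "name": row["stop_name"],
            "lat": float(row["stop_lat"]),
            "lon": float(row["stop_lon"]),
        })
    return stops


stops = load_stops(SAMPLE_STOPS_TXT)
for stop in stops:
    print(stop["id"], stop["name"], stop["lat"], stop["lon"])
```

Because every city that publishes GTFS uses the same column names, a loader like this should work on another city’s feed without changes, which is the whole appeal of a shared standard.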

Why don’t we have TransHub Brisbane and TransHub Melbourne?

Our top 2 feature requests for TransHub are to create versions for Brisbane and Melbourne. Many of you have asked why we haven’t done it. Currently neither Brisbane nor Melbourne publishes its data publicly for application developers to use. If we can’t get to the data, we can’t make an application.

As many of you have pointed out, we could screen-scrape or crowd-source the timetable and station information, but we really don’t want to go down this path. We’d prefer to encourage the other cities to get on board with the open formats and publish their data so everyone can benefit and do it properly!

Our only thought is that public pressure may encourage them to change their minds and follow what the other cities in Australia and the world have done. So keep voting for the feature!

Why is Open Data Sharing important?

In short, let governments do what they do best, and let us developers build the best apps possible on their data.

Just look at phone application developers: there are many different smartphones out there, and we all like different features and functions in applications, which is why there are about 66 different “torch” apps on the WP7 marketplace alone. The government is good at collecting this data and running its services, but not necessarily great at writing the applications we all want to use. The cost for them to develop a fully featured transit application on every smartphone platform, keep it current and evaluate the features that people want would be enormous, not to mention that the time to market across all of the devices would be substantial.

If they just publish the data to a known standard, then application developers can make their own applications, with reliable data and bring it to market using their specialist skills much quicker. I’d love to see 10 different Brisbane Transit applications on WP7 and I’m sure many of you would love to see it on your favourite devices.

For us, it would mean we can take the investment we made in TransHub Sydney, apply it to the Brisbane data and make an equivalent application in under 2 weeks. I don’t think any government department could match that time to market. As we have customers using each city version of the application, we can address feature requests for a specific city and apply them across the range of cities with little additional effort.