Under the Hood of the Open Data Engine
Data.gov started almost four years ago with a simple idea—opening up government data helps citizens make better-informed decisions and empowers businesses to be more innovative. In the years since then, Data.gov has grown from 47 datasets to 400,000, from a few agencies to 180, and from providing only the data to also providing context, community, and conversations around the data.
The new Executive Order and Open Data Policy are groundbreaking in requiring agencies to open up new data and information and to present them in human- and machine-readable formats. Together they will help usher in the next stage of open data innovation.
We are seeking your great ideas and constructive criticism as we move forward to the next phase of Data.gov. We want to scale up the quality and quantity of data, be more helpful to American businesses and entrepreneurs looking to use government data and research, more clearly support learning in classrooms, get government data in front of researchers and journalists, and bring the power of open data to American citizens.
- What can be done differently and better to make open government data more useful to you?
- What features do you want to see?
- What topics are missing or incomplete?
- How can we better connect with your community?
It’s all about getting you to the data you need as quickly as possible, in a variety of machine-readable formats, with better search, more APIs, easier ways to share data, and more federated data resources. You’ve told us via forums, listservs, hackathons, blogs, and meetups around the country that we need more and better capabilities for developers and innovators. We are listening.
So, what's new and different?
Search. We’ve taken a lesson from other open data sites and built our new catalog on a great tool, CKAN. The new catalog harvests data from all the US federal agencies, as well as from other organizations that are part of the government geospatial community. The improved search offers options ranging from simple keyword search to filtering and faceting by tags, formats, publishers, and locations. The geospatial search allows you to draw your own custom bounding box. Found some great data? You can save and share what you find via each dataset’s persistent, unique URL, which suits linked data fans and gives researchers and data journalists an easy reference. Check out a sneak peek at our new combined catalog. You can compare this to the old separate catalogs for “raw” and geospatial data. We are finalizing a few things on the catalog, so let us know what needs to be different. Need local data? Check out the data published at Cities.Data.gov, Counties.Data.gov, and States.Data.gov.
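For developers who want to see what keyword search plus filtering and faceting might look like programmatically, here is a minimal sketch. It assumes the catalog exposes CKAN's standard `package_search` action and that `tags` and `res_format` are indexed as filterable fields. The base URL and the helper name `build_search_url` are assumptions made for illustration, not details confirmed in this post.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; this post does not state the API address.
BASE_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(keyword, tags=(), formats=(), rows=10, base_url=BASE_URL):
    """Build a CKAN package_search URL for a keyword with tag/format filters.

    `build_search_url` is a hypothetical helper, not part of CKAN itself.
    """
    # Each filter becomes a Solr-style field:"value" clause.
    filters = [f'tags:"{t}"' for t in tags]
    filters += [f'res_format:"{f}"' for f in formats]

    params = [("q", keyword), ("rows", rows)]
    if filters:
        params.append(("fq", " AND ".join(filters)))
    # Ask the server to return facet counts for these fields.
    params.append(("facet.field", '["tags", "res_format", "organization"]'))
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("water quality", tags=["environment"], formats=["CSV"]))
```

The function only builds the request URL, so it can be tried without network access; fetching and paging through results is left to whatever HTTP client you prefer.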
APIs. What is the phrase heard most often at meetups and hack days? “Give me an API and get out of my way.” We hear you. As more and more agencies launch developer portals, an API catalog is under construction to provide an automated, filterable catalog of all APIs across government. While leaders like the Labor Department and Census Bureau already offer a range of advanced APIs, we recognize that other agencies are newer to this. To help, we’ve been rolling out a range of tools and resources to empower all federal agencies to adopt an “API first” model, which will speed up the growth of web services that developers can use to further their innovation. The new catalog comes with a full RESTful JSON API to all metadata fields, so everything in the web interface can also be done via the API, from search queries to downloading data files.
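As a rough illustration of going from a search query to downloadable files through the JSON API, the sketch below walks a search response and lists each dataset's file links. It assumes the response follows CKAN's usual envelope (`success`, then `result.results`, with each dataset carrying a `resources` list). The sample payload and the helper name `list_downloads` are made up for this example.

```python
import json


def list_downloads(response_text):
    """Return (dataset title, file format, file URL) tuples from a search response.

    `list_downloads` is a hypothetical helper; the field names follow the
    response shape CKAN normally returns, which is assumed here.
    """
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("the API reported a failed request")

    downloads = []
    for dataset in payload["result"]["results"]:
        for resource in dataset.get("resources", []):
            downloads.append(
                (dataset.get("title", ""), resource.get("format", ""), resource.get("url", ""))
            )
    return downloads


# Invented sample response, for illustration only.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 1,
        "results": [
            {
                "title": "Example Dataset",
                "resources": [
                    {"format": "CSV", "url": "https://example.gov/data.csv"},
                    {"format": "JSON", "url": "https://example.gov/data.json"},
                ],
            }
        ],
    },
})

if __name__ == "__main__":
    for title, fmt, url in list_downloads(SAMPLE_RESPONSE):
        print(f"{title} [{fmt}] {url}")
```

Checking the `success` flag before reading `result` keeps a failed request from being mistaken for an empty result set.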
Data publishing. Soon, gone will be the days when agencies have to enter their metadata into a Data.gov form or send over a spreadsheet (yikes!). Later this month we will start harvesting JSON files from agencies that are publishing catalogs.
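To give a feel for what harvesting a JSON catalog file could involve, here is a minimal sketch that reads a list of dataset entries and separates complete entries from ones with gaps. The required field names (`title`, `description`, `accessURL`), the sample catalog, and the helper `check_catalog` are all hypothetical; the actual catalog schema and harvesting rules are not described in this post.

```python
import json

# Hypothetical required fields, chosen for illustration only.
REQUIRED_FIELDS = ("title", "description", "accessURL")


def check_catalog(catalog_text):
    """Split catalog entries into complete ones and ones missing fields.

    Returns (accepted, rejected), where rejected holds
    (title, list of missing field names) pairs.
    """
    entries = json.loads(catalog_text)
    accepted, rejected = [], []
    for entry in entries:
        # A field counts as missing if it is absent or empty.
        missing = [field for field in REQUIRED_FIELDS if not entry.get(field)]
        if missing:
            rejected.append((entry.get("title", "<untitled>"), missing))
        else:
            accepted.append(entry)
    return accepted, rejected


# Invented sample catalog, for illustration only.
SAMPLE_CATALOG = json.dumps([
    {
        "title": "Example Agency Budget",
        "description": "Annual budget figures.",
        "accessURL": "https://example.gov/budget.csv",
    },
    {"title": "Incomplete Entry", "description": ""},
])

if __name__ == "__main__":
    ok, bad = check_catalog(SAMPLE_CATALOG)
    print(f"{len(ok)} entries ready to harvest, {len(bad)} need attention")
```

Reporting which fields are missing, rather than silently skipping an entry, gives the publishing agency something concrete to fix.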
Data pages. We are running through some options for new designs and formats to enhance the usability of the site, including the pages for the datasets. We’ll open these ideas up to you as they evolve. In the meantime, developers have suggested that each dataset page show what is related to the dataset you are viewing:
- News results
- Related datasets
- Ideas from you on how to use the data and comments about the data
- Apps and services that are using that dataset
- Questions and answers
Open source. Data.gov has also gone open source. Want to download and use the code or, better yet, contribute extensions, code, and ideas? Jump over to GitHub and start hacking. You’ll notice multiple forks there, contributed by our international partners in open data as part of the Open Government Platform (OGPL). The Government of India contributes a Drupal 6 and Drupal 7 code base, Canada is contributing its Web Experience Toolkit, and the Open Knowledge Foundation in the United Kingdom provides CKAN 2.0. You can contribute directly to one of these code bases, to OGPL overall, or create a new fork.
Open questions. We are encouraging the government data owners to chat with you in a new Open Data community at StackExchange (coming next week) and talk about and improve the quality of the data. This way, questions about open data also become a form of the open data itself.
So if you’re passionate about the possibilities of open data, about the new frontiers to be explored or the barriers to be demolished, share your ideas publicly or one-on-one! We will be launching some new features this month and throughout the summer and fall as we hear back from you. Help us put the data to work. Data liberación!
The Data.gov Team