How do I find data on Data.gov?

You can search Data.gov through its catalog of data from across the Federal Government. Once in the catalog, you can find datasets in several ways:

  • Enter keywords in the search box.
  • Browse on the left side through types, tags, formats, groups, organization types, organizations, and categories.  Clicking on multiple items narrows your search.  You can click on the “x” to the side of any single item to remove it from the search, or “clear all” to remove all selected items in a category.
  • Search by geospatial area by drawing a boundary box on the map at the left side and clicking “Apply” to find all datasets that are tagged for that geographic area.
  • “Interactive Datasets” at the top right takes you to datasets that can be explored online through a web browser.
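
For readers who prefer to script these searches, the keyword, filter, and bounding-box options above can be sketched as a URL builder. This is a minimal sketch, assuming the catalog exposes the standard CKAN action API at catalog.data.gov. The res_format filter field and the ext_bbox parameter (from the CKAN spatial extension) are assumptions about how the catalog is configured, and build_search_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed endpoint: the standard CKAN action API on the catalog host.
CATALOG_SEARCH_URL = "https://catalog.data.gov/api/3/action/package_search"


def build_search_url(keywords, formats=(), bbox=None, rows=10):
    """Build a package_search URL from keywords, optional formats, and a bounding box."""
    params = {"q": keywords, "rows": rows}
    if formats:
        # Each extra filter narrows the results, like clicking several items.
        params["fq"] = " AND ".join(f'res_format:"{fmt}"' for fmt in formats)
    if bbox is not None:
        # Assumed order for ext_bbox: min_lon, min_lat, max_lon, max_lat.
        params["ext_bbox"] = ",".join(str(value) for value in bbox)
    return f"{CATALOG_SEARCH_URL}?{urlencode(params)}"


# Example: CSV datasets about air quality within a rough continental US box.
print(build_search_url("air quality", formats=["CSV"], bbox=(-125, 24, -66, 50)))
```

If these assumptions hold, requesting the printed URL should return a JSON document listing the matching datasets.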

Once you find a dataset or tool of interest, click on the title and you will be taken to a page with more details on that specific dataset or tool. Some datasets are downloadable, while others are links to web sites or apps that help you access or use the data.

Please note that by accessing datasets or tools offered on Data.gov, you agree to the Data Policy, which you should read before accessing any data. If there are additional datasets that you would like to see included on this site, please suggest more datasets here.

How are the datasets in Data.gov collected?

Under the terms of the Federal Open Data Policy, newly generated government data must be made available in open, machine-readable formats, while continuing to ensure privacy and security.

Government data publishers looking to get their data on Data.gov should read the detailed guide: How to get your open data on Data.gov. The Data.gov team typically works with a designated open data point of contact as a liaison for each agency. Data publishers should consult with their agency point of contact to include any additional datasets on Data.gov. If you need help determining who your open data point of contact is, please contact us.

Federal CFO-Act agencies are required to:

  • Create a Single Agency Data Inventory. Agencies are required to catalog their data assets, just like they would inventory computers or desk chairs, to better manage and use these resources.
  • Publish a Public Data Listing. Agencies are required to publish a list of their data assets that are public, or could be made public. This list is made available as a data.json file hosted at the primary domain of the agency (e.g., gsa.gov/data.json).
  • Develop New Public Feedback Mechanisms. Agencies are required to set up feedback mechanisms to engage the public about where agencies should focus open data efforts, such as facilitating and prioritizing the release of datasets. Agencies are also required to identify public points of contact for agency datasets.

Agency Public Data Listings are made available on agency websites as JSON files following the Project Open Data metadata schema (at agency.gov/data.json) and are then harvested into the central catalog for Data.gov.  Each agency is responsible for its own data.
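
To make the shape of a Public Data Listing concrete, here is a minimal sketch of a data.json file and a check for the fields that the Project Open Data metadata schema (v1.1) requires on every dataset entry. The agency, contact, and identifiers are placeholders, and missing_fields is a hypothetical helper, not part of any Data.gov tooling. Federal agencies must also supply bureauCode and programCode, which this sketch leaves out.

```python
import json

# Fields required on every dataset entry in the Project Open Data schema v1.1.
REQUIRED_DATASET_FIELDS = (
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
)


def missing_fields(dataset):
    """Return the required fields that are absent or empty in one dataset entry."""
    return [name for name in REQUIRED_DATASET_FIELDS if not dataset.get(name)]


# Hypothetical listing; every value below is a placeholder.
listing = {
    "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
    "dataset": [
        {
            "title": "Example Dataset",
            "description": "Placeholder description.",
            "keyword": ["example"],
            "modified": "2015-01-01",
            "publisher": {"name": "Example Agency"},
            "contactPoint": {"fn": "Jane Doe", "hasEmail": "mailto:jane.doe@agency.example"},
            "identifier": "example-dataset-1",
            "accessLevel": "public",
        },
        {"title": "Incomplete Entry", "identifier": "example-dataset-2"},
    ],
}

# The text an agency would serve at agency.gov/data.json.
data_json_text = json.dumps(listing, indent=2)

for entry in listing["dataset"]:
    print(entry["identifier"], "missing:", missing_fields(entry))
```

The first entry passes the check; the second is reported as missing every required field except title and identifier.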

How can I add my government data to Data.gov?

Data.gov is primarily a federal open government data site. However, state, local, and tribal governments can also syndicate metadata describing their open data resources on Data.gov for greater discoverability. Data.gov does not host data directly, but rather aggregates metadata about open data resources in one centralized location. Once an open data source meets the necessary format and metadata requirements, the Data.gov team can pull directly from it as a Harvest Source, synchronizing that source’s metadata on Data.gov as often as every 24 hours.

Step 1: Organize your open data for the Data.gov Pipeline

Getting your data source ready for harvesting by the Data.gov catalog differs depending on the type of source:

  1. Federal Data with Project Open Data: The most common source is the Public Data Listing as required by the Federal Open Data Policy.
  2. Federal Geospatial Data: A number of federal agencies hold geospatial data which has separate requirements under different legal authorities.
  3. Non-Federal Data: Non-federal sources are not covered by the Federal Open Data Policy, but can be included in the Data.gov catalog voluntarily. Depending on your platform, creating this harvester might just be the push of a button or it could take a little more work, but the team will walk you through it either way.

Step 2: Coordinate with Data.gov

  1. Contact the Data.gov team. Email the Data.gov team (datagovhelp@gsa.gov) to let them know you’d like to get started. Please include a link to your metadata in the data.json format, along with any relevant links, or ask the team how to create a data.json file from your current database.
  2. Connecting the pipes. The Data.gov team will create a new Harvest Source that will automatically collect information about your datasets and update Data.gov whenever changes are made on your data catalog.
  3. Testing. The Data.gov team will test to ensure the harvester works properly. If anything seems wrong, the team will help you configure your data catalog so that Data.gov can collect your datasets without any errors.
  4. Live within 24 hours! Once the harvester has been tested successfully, Data.gov will start automatically consuming information about your datasets, and all of their basic details will be available on Data.gov with links to the source and your open data policy.
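
To illustrate what "update Data.gov whenever changes are made" could mean in practice, the sketch below compares two snapshots of a data.json listing and reports which datasets a sync would add, update, or remove. This is not the Data.gov harvester, whose internals are not described here. It assumes change detection by identifier and modified value, and plan_harvest and the sample identifiers are hypothetical.

```python
def plan_harvest(previous, current):
    """Compare two snapshots and list what a sync would add, update, or remove.

    Each snapshot maps a dataset identifier to its "modified" value.
    """
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    updated = sorted(
        ident for ident in set(current) & set(previous)
        if current[ident] != previous[ident]
    )
    return {"add": added, "update": updated, "remove": removed}


# Hypothetical snapshots taken 24 hours apart.
yesterday = {"ds-1": "2015-01-01", "ds-2": "2015-01-05", "ds-3": "2015-01-07"}
today = {"ds-1": "2015-01-01", "ds-2": "2015-02-01", "ds-4": "2015-02-02"}

print(plan_harvest(yesterday, today))
# prints {'add': ['ds-4'], 'update': ['ds-2'], 'remove': ['ds-3']}
```

Keying on the identifier is why a stable, unique identifier for each dataset matters: if it changes between runs, the same dataset would look like one removal plus one addition.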

Who developed Data.gov?

Data.gov is managed and hosted by the U.S. General Services Administration, Office of Citizen Services and Innovative Technologies.

What technology is Data.gov built with?

This catalog is built on an open source data management system called CKAN. CKAN is used by governments around the world, including Australia, Austria, Brazil, Germany, Norway, and the UK. CKAN started as a project of the Open Knowledge Foundation in 2007; all development takes place on GitHub, and contributors are welcome. The Open Knowledge Foundation is a non-profit organization focused on opening up all knowledge (data and content) to see it used and useful. Founded in 2004 and with chapters around the world, it has been pioneering open source tools and open data from the start.

What standards were used to develop the metadata displayed on Data.gov?

Data.gov follows the Project Open Data schema – a set of required fields (Title, Description, Tags, Last Update, Publisher, Contact Name, etc.) for every dataset displayed on Data.gov.
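
The display names above correspond to field names in the Project Open Data schema (v1.1). As a rough illustration, the sketch below maps each label to its schema field and pulls the values from one dataset entry. The sample entry is a placeholder, and DISPLAY_TO_SCHEMA, lookup, and display_fields are hypothetical names, not part of Data.gov.

```python
# Display label -> dotted path into a Project Open Data v1.1 dataset entry.
DISPLAY_TO_SCHEMA = {
    "Title": "title",
    "Description": "description",
    "Tags": "keyword",
    "Last Update": "modified",
    "Publisher": "publisher.name",
    "Contact Name": "contactPoint.fn",
}


def lookup(dataset, dotted_path):
    """Follow a dotted path through nested dicts; return None if any key is missing."""
    value = dataset
    for key in dotted_path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def display_fields(dataset):
    """Return the labeled values for one dataset entry."""
    return {label: lookup(dataset, path) for label, path in DISPLAY_TO_SCHEMA.items()}


# Placeholder entry for illustration only.
sample = {
    "title": "Example Dataset",
    "description": "Placeholder description.",
    "keyword": ["example", "demo"],
    "modified": "2015-01-01",
    "publisher": {"name": "Example Agency"},
    "contactPoint": {"fn": "Jane Doe"},
}

for label, value in display_fields(sample).items():
    print(f"{label}: {value}")
```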

What if I am having difficulty downloading a dataset from the catalog?

Some web browser configurations, particularly those that are designed for high-security computing environments, can interfere with access to certain datasets from the catalog. This is most commonly related to government websites that use security certificates and end user browsers that are not configured to recognize those certificates as being authoritative. If you are having difficulty downloading one or more datasets from the Data.gov catalog, please contact your local IT support staff to determine whether browser configuration issues can be addressed for your workstation.

What metrics are available about data on Data.gov?

Data.gov collects information on the total number of datasets and dataset collections. Fluctuations in dataset and dataset collection totals may occur as the result of resolving issues with dataset duplicates or merging datasets into dataset collections. Data.gov also tracks Federal Agency Participation and visitor metrics. To see how well CFO-Act agencies are complying with the Federal Open Data Policy, check out the Project Open Data Dashboard.