Term

Definition

API
(source: howto.gov/api)
An Application Programming Interface, or API, is a set of software instructions and standards that allows machine-to-machine communication, such as when a website uses a widget to share a link on Twitter or Facebook.
Catalog
(source: Data.gov)
A catalog is a collection of datasets. Data.gov has one catalog for all types of datasets at http://catalog.data.gov. The catalog contains both geospatial and non-geospatial datasets.
CKAN
(source: ckan.org)
CKAN stands for Comprehensive Knowledge Archive Network, an open source data management system that is the basis of the Data.gov catalog, as well as the open data catalogs of approximately 50 data hubs around the world.
CSV
(source: Wikipedia)
A comma-separated values (CSV) file is a computer data file that stores tabular data as plain text. Each line in the CSV file corresponds to a row in the table. Within a line, fields are separated by commas and each field belongs to one table column. CSV files are often used for moving tabular data between two different computer programs (for example, between a database program and a spreadsheet program).
Data
(source: Federal Enterprise Architecture: Data Reference Model)
A value or set of values representing a specific concept or concepts. Data become “information” when analyzed and possibly combined with other data in order to extract meaning and to provide context. The meaning of data can vary depending on its context.
Data Extraction Tool
(source: Data.gov)
Data extraction tools allow a user to select a data basket full of variables and then recode those variables into a form that the user desires. The user can then develop customized displays of any selected data.
Dataset
(adapted from: Wikipedia)
A dataset is an organized collection of data. The most basic representation of a dataset is data elements presented in tabular form. Each column represents a particular variable. Each row corresponds to a single record and holds one value for each column’s variable. A dataset may also present information in a variety of non-tabular formats, such as an Extensible Markup Language (XML) file, a geospatial data file, or an image file.
KML
(source: Wikipedia)
Keyhole Markup Language (KML) is an XML-based language schema for expressing geographic annotation and visualization of existing or future Web-based, two-dimensional maps and three-dimensional Earth browsers.
KMZ
(source: Wikipedia)
KML files are very often distributed in KMZ files, which are zipped files with a “.KMZ” extension. When a KMZ file is unzipped, a single “doc.kml” is found along with any overlay and icon images referenced in the KML and any network-linked KML files.
Metadata
(source: Federal Enterprise Architecture: Data Reference Model)
Metadata describes a number of characteristics or attributes of data; that is, “data that describes data” (ISO 11179-3). For any particular datum, the metadata may describe how the datum is represented, ranges of acceptable values, its label, and its relationship to other data. Metadata also may provide other relevant information, such as the responsible steward, associated laws and regulations, and the access management policy. The metadata for structured data objects describes the structure, data elements, interrelationships, and other characteristics of information, including its creation, disposition, access and handling controls, formats, content, and context, as well as related audit trails.
Shapefile
(source: ESRI Shapefile Technical Description)
A shapefile stores non-topological geometry and attribute information for the spatial features in a dataset. The geometry for a feature is stored as a shape comprising a set of vector coordinates. Shapefiles can support point, line, and area features.
XML
(source: Wikipedia)
XML (Extensible Markup Language) is a general-purpose specification for creating custom markup languages. It is classified as an extensible language because it allows the user to define the markup elements. XML’s purpose is to aid information systems in sharing structured data, especially via the Internet, to encode documents, and to serialize data.
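
The CSV, Dataset, and XML entries above describe the same idea from different angles: rows and columns of a table that can be serialized in more than one format. The following is a minimal Python sketch of that relationship, using only the standard library. The sample data, the function names, and the `dataset` and `record` element names are hypothetical and chosen purely for illustration; they are not part of any Data.gov format.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for illustration only.
SAMPLE_CSV = "station,year,rainfall_mm\nA1,2012,810\nB2,2012,655\n"


def csv_to_rows(text):
    """Parse CSV text into a list of dicts, one dict per table row."""
    return list(csv.DictReader(io.StringIO(text)))


def rows_to_xml(rows, root_tag="dataset", row_tag="record"):
    """Serialize rows as XML, with one child element per column."""
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, row_tag)
        for column, value in row.items():
            # Assumes each column name is also a valid XML element name.
            ET.SubElement(record, column).text = value
    return ET.tostring(root, encoding="unicode")


rows = csv_to_rows(SAMPLE_CSV)
xml_text = rows_to_xml(rows)
print(rows[0])   # first row of the table, keyed by column name
print(xml_text)  # the same data as user-defined XML elements
```

Each line of the CSV text becomes one row, each comma-separated field lands in one column, and the XML output uses element names defined by the user, which is what makes the language extensible.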

Metadata

(source: Project Open Data http://project-open-data.github.io)

Title Human-readable name of the asset. Should be in plain English and include sufficient detail to facilitate search and discovery.
Description Human-readable description (e.g., an abstract) with sufficient detail to enable a user to quickly understand whether the asset is of interest.
Tags Tags (or keywords) help users discover your dataset and should include terms that would be used by technical and non-technical users.
Last Update Most recent date on which the dataset was changed, updated, or modified.
Publisher The publishing agency.
Contact Name Contact person’s name for the asset.
Contact Email Contact person’s email address.
Unique Identifier A unique identifier for the dataset or API as maintained within an Agency catalog or database.
Public Access Level The degree to which this dataset could be made publicly available, regardless of whether it has been made available. Choices: Public (is or could be made publicly available), Restricted (available under certain conditions), or Private (never able to be made publicly available).
Data Dictionary URL to the data dictionary for the dataset or API. Note that documentation other than a data dictionary can be referenced using “related documents” as shown in the expanded fields.
Download URL URL providing direct access to the downloadable distribution of a dataset.
End Point Endpoint of the web service to access a dataset.
Format The file format or API type of the distribution.
License The license with which the dataset or API is published.
Spatial The range of spatial applicability of the dataset, which could include a spatial region like a bounding box or a named place.
Temporal The range of temporal applicability of the dataset (i.e., a start and end date of applicability for the data).
Release Date Date of formal issuance.
Frequency Frequency with which the dataset is published.
Language The language of the dataset.
Granularity Level of granularity of the dataset.
Data Quality Whether the dataset meets the agency’s Information Quality Guidelines.
Category Main thematic category of the dataset.
Related Documents Related documents, such as technical information about a dataset or developer documentation.
Size The size of the downloadable dataset.
Homepage URL An alternative landing page used to redirect a user to a contextual, Agency-hosted “homepage” for the Dataset or API when selecting this resource from the Data.gov user interface.
RSS Feed URL for an RSS feed that provides access to the dataset.
System of Records URL to the System of Records Notice related to this dataset.
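
To make the field list above concrete, here is a small Python sketch of what a metadata record for one dataset might look like, together with a check that its access level is one of the three listed choices. The dictionary keys simply reuse the field labels from the list; they are not the machine-readable key names used by Project Open Data, and every value is a made-up placeholder. The helper name `check_access_level` is likewise hypothetical.

```python
import json

# The three choices listed under "Public Access Level" above.
ACCESS_LEVELS = {"Public", "Restricted", "Private"}


def check_access_level(record):
    """Return True if the record's access level is one of the listed choices."""
    return record.get("Public Access Level") in ACCESS_LEVELS


# Hypothetical record: all values are placeholders, not a real dataset.
record = {
    "Title": "Example Rainfall Measurements",
    "Description": "Annual rainfall totals by station (illustrative only).",
    "Tags": ["rainfall", "weather"],
    "Last Update": "2013-01-15",
    "Publisher": "Example Agency",
    "Contact Name": "Jane Doe",
    "Contact Email": "jane.doe@example.gov",
    "Unique Identifier": "example-agency-rainfall-001",
    "Public Access Level": "Public",
}

print(check_access_level(record))
print(json.dumps(record, indent=2))
```

Serializing the record as JSON is only one possible choice here; the same fields could equally be written to XML or a CSV row.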