Beyond making flat files available to download on the Web, and providing applications that enable programmatic access to backend databases through the Web, imagine using the Web itself as a database: a massively distributed, decentralized database. This is what Linked Data is about – putting data in the Web. As part of our ongoing collaboration with Data.gov to democratize open government data, the Centers for Medicare and Medicaid Services are now publishing Clinical Quality Linked Data on Health.data.gov, beginning with Hospital Compare.
Linked Data adds an emphasis on structured data, in order to better facilitate machine processing and mashups. Schema languages based on open standards are used to classify the kinds of data entities in our domain of interest, so that our Web content can be “typed” as instances of these schema classes. Links, the foundational feature of the Web of Documents, have also matured beyond a single kind of “source page to target page” link and are now much more fine-grained, allowing us to create custom relationships between the things described within our pages. We’re used to relating tables in a single database, but now we can think of Web pages as tables, and relate their data with custom links across the entire Web of Data.
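To make the idea of typed things and custom links concrete, here is a minimal sketch in plain Python that treats each statement as a (subject, link, object) triple. The `example.org` addresses, the `locatedIn` link and the helper names are hypothetical, chosen only for illustration; they are not the identifiers published on Health.data.gov. The `rdf:type` address is the standard RDF term for saying that a thing is an instance of a class.

```python
# Hypothetical base address for illustration; not a real Health.data.gov URI.
EX = "http://example.org/hospital-compare/"
# Standard RDF term that "types" a thing as an instance of a class.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Each statement is a (subject, link, object) triple.
triples = [
    (EX + "hospital/1234", RDF_TYPE, EX + "def/Hospital"),
    (EX + "state/XX", RDF_TYPE, EX + "def/State"),
    # A custom, fine-grained link between two described things (assumed name).
    (EX + "hospital/1234", EX + "def/locatedIn", EX + "state/XX"),
]


def instances_of(class_uri, data):
    """Return every subject typed as an instance of class_uri."""
    return sorted(s for s, p, o in data if p == RDF_TYPE and o == class_uri)


def links_from(subject_uri, data):
    """Return the custom (non-type) links leaving subject_uri."""
    return [(p, o) for s, p, o in data if s == subject_uri and p != RDF_TYPE]


hospitals = instances_of(EX + "def/Hospital", triples)
outgoing = links_from(EX + "hospital/1234", triples)
```

Because every subject, link and object is itself an address, the same pattern works whether the triples come from one page or from many pages across the Web.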
Like other Hospital Compare releases, this Hospital Compare Linked Data provides reports and survey results about how well hospitals treat various conditions, each with specific metrics that apply to measures designed to give citizens an understanding of how well hospitals perform when compared with state and national statistics. What’s different about this Linked Data implementation is that the definition of each class of thing in the Hospital Compare domain (including but not limited to Hospital, Condition, Measure and Metric) and the identity of every instance of each class have a globally unique address on the world wide network of computers, independent from the temporal datasets that contain periodically sampled statistical values about them. This makes it easier to accumulate more samples about how well a specific hospital is doing over time, as subsequent publications will automatically aggregate new data around each domain concept and its instances.
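The effect of stable addresses on aggregation can be sketched as follows. This is an illustration under stated assumptions, not the site's implementation: the release labels, addresses and numeric values are invented placeholders. The point is only that samples from separate publications line up because they refer to the same hospital and measure addresses.

```python
from collections import defaultdict

# Hypothetical addresses; the real identifiers live on Health.data.gov.
HOSPITAL_A = "http://example.org/hospital-compare/hospital/1234"
HOSPITAL_B = "http://example.org/hospital-compare/hospital/5678"
MEASURE = "http://example.org/hospital-compare/measure/example-measure"

# Placeholder samples: (publication label, hospital, measure, value).
# The values are made up for the sketch and carry no real meaning.
publications = [
    ("release-1", HOSPITAL_A, MEASURE, 0.91),
    ("release-2", HOSPITAL_A, MEASURE, 0.94),
    ("release-2", HOSPITAL_B, MEASURE, 0.88),
]


def aggregate(samples):
    """Group samples by (hospital, measure), keeping publication order."""
    history = defaultdict(list)
    for release, hospital, measure, value in samples:
        history[(hospital, measure)].append((release, value))
    return dict(history)


history = aggregate(publications)
series_a = history[(HOSPITAL_A, MEASURE)]
```

Here `series_a` holds both samples for the first hospital, one per release, without any matching on names or local database keys.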
There are lots of different ways you can investigate and interact with the Hospital Compare Linked Data, whether you’re a carbon- or silicon-based user agent. You can still download each dataset in its entirety, but now you can also access and refer to the data they contain in a much more fine-grained way. Most of the datasets are published as a collection of RecordSets with Records, each Record containing the statistical values about a particular instance of a domain entity (a hospital, a state, or the US) regarding various Conditions, each with corresponding Measures and Metrics.
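One way to picture the RecordSet and Record arrangement described above is the following sketch. The field names, the `about` helper and all addresses and values are assumptions made for illustration; the published datasets define their own schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Record:
    """One statistical value about one domain entity (assumed field names)."""
    subject: str    # address of a hospital, a state, or the US
    condition: str  # address of the Condition
    measure: str    # address of the Measure
    metric: str     # address of the Metric
    value: float    # placeholder number, not real data


@dataclass
class RecordSet:
    """A labelled collection of Records from one dataset publication."""
    label: str
    records: List[Record] = field(default_factory=list)

    def about(self, subject: str) -> List[Record]:
        """Return only the Records that describe the given subject."""
        return [r for r in self.records if r.subject == subject]


BASE = "http://example.org/hospital-compare/"  # hypothetical base address
example_set = RecordSet(
    label="example-release",
    records=[
        Record(BASE + "hospital/1234", BASE + "condition/c1",
               BASE + "measure/m1", BASE + "metric/rate", 0.91),
        Record(BASE + "state/XX", BASE + "condition/c1",
               BASE + "measure/m1", BASE + "metric/rate", 0.89),
    ],
)

hospital_records = example_set.about(BASE + "hospital/1234")
```

Downloading a whole dataset corresponds to taking the entire `RecordSet`; the finer-grained access corresponds to asking for the Records about a single address.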
You might also begin by browsing these domain entities, where each provides a dynamically created list of its instances and links to other things it is related to. So if you start at the definition of the Hospital class, you’ll find a list of hospitals. When you click on a specific hospital, it will provide links to more information about that hospital, and links to records that have been automatically aggregated from all of the dataset publications containing data about that hospital. In doing so, you’re interacting with the conceptual data model that relates all the domain entities. If new hospitals come under the purview of these reports and surveys, they’ll show up on the list of things that are instances of the class Hospital. If new Conditions are tracked against new Measures and Metrics, these new instances will show up on their respective pages as well.
For a more detailed explanation, start with this presentation, which provides lots of links to various entry points that walk through the features and functionality. Keep an eye out for upcoming Data.gov/semantic community blog posts, where we’d like to engage in transparent and participatory collaboration with you as we add more Clinical Quality Linked Data domains to augment this first Hospital Compare release. You can also participate in the Data.gov Semantic Web/Linked Data community of interest, where you can learn more about this Health and Human Services work and what other federal agencies are doing with Linked Data – just send an email to george dot thomas 1 at hhs dot gov – all are welcome. We think that exposing more data as a service will help us meet many of the challenges we face when seeking to integrate and federate data across our health data ecosystem partners. Of course, we also hope that these new data access techniques will stimulate reuse, and we’re excited about enabling the network effect and cross-domain correlation potential as we continue to add more Government Linked Data.
HHS Enterprise Architect
Data.gov PMO Semantic Web and Linked Data Lead
W3C Government Linked Data Working Group Co-Chair