…deserves a curator,
…needs a promoter,
…requires crowdsourcing, and
…smells like money.
by Robert L. Read, Michelle Hertzfeld, and Hillary Hartley
Data.gov publishes 87,000 different datasets free of charge and with no strings attached. Some big datasets, like satellite images and weather data, have become the basis of entire industries and need no promotion.
Most datasets are not so exciting. Many are small, some contain errors, most need interpretation, and all require searching.
But every dataset, even the most boring, deserves a little love and attention from a curator who catalogs it and makes it easier to understand. Therefore, every Open Dataset published at Data.gov represents a business opportunity in one way or another. Finding the people who are willing to pay for the datasets, or at least to receive ads when viewing them, is of course a problem, and therefore a business opportunity.
In addition, private companies can add value to the data more easily than the government can. For example, the government can provide basic search functionality, but it has trouble providing more advanced context-specific and content-specific search effectively. Why? Firstly, the government employs very few programmers compared to its needs. Secondly, the government is not motivated by making money. Finally, the government is bound by laws of privacy, security, and fairness in ways that private firms are not.
For example, imagine that the government attempted to crowdsource some data cleanup and rating for a large dataset. It might have to:
- Conform to the Paperwork Reduction Act, requiring a strict procedure that takes months before citizen input can be collected;
- Conform to the Federal Information Security Management Act (FISMA), which requires much stricter security than private firms generally provide;
- Possibly publish a System of Records Notice (SORN); and
- Perform a Privacy Impact Assessment as required by the E-Government Act of 2002.
All of these things not only add to the expense of a government-hosted web or mobile app presence, but also stall the all-important process of rapid, agile development that is responsive to user feedback.
In contrast, what would a private company have to do to create a similar data cleanup process? A private firm could build a crowdsourcing platform quickly or could simply provide a website that presents a curated view of one or more datasets. More likely, it would reuse an existing platform that could be stood up in one day. And private firms should do so, because the American people deserve not just to see the data, but to see the data in an unobscured and comprehensible way.
This raises the question: how much money can really be made? How much are people willing to pay for a little extra added value? One is tempted to say: not much. But it also raises another question: how much does it cost to provide basic curation, editing, commenting, crowdsourced analysis, searching, and sorting? The answer: not much. Or at least not much using modern, off-the-shelf open-source software, which allows extraordinarily rapid prototyping and high reuse. There are 87,000+ datasets at Data.gov right now, representing opportunities big and small. Do the math.