Google may eventually solve the problem of finding data on the web. Too bad its first effort reports the wrong numbers for unemployment.
Since leaving public service, I have occasionally pondered whether to start a company or organization to transform the way that data are made available on the web. The data are out there, but they remain a nuisance to find, a nuisance to manipulate, and a nuisance to display. I cringe every time I have to download CSV files, import them into Excel, manipulate the data (in a good sense), make a chart, and fix the dumb formatting choices that Excel makes. All those steps should be much, much easier.
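To make the "manipulate" step concrete, here is a minimal Python sketch of what it looks like when scripted rather than done by hand in Excel. It assumes a two-column CSV with `month` and `rate` headers; that layout, the function names, and the sample values are all hypothetical placeholders for illustration, not official figures or any real download format.

```python
import csv
import io

# Placeholder values for illustration only; these are NOT official figures.
SAMPLE_CSV = """month,rate
2000-01,5.0
2000-02,5.2
2000-03,5.1
2000-04,5.5
"""

def load_rates(text):
    """Parse 'month,rate' CSV text into a list of (month, rate) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["month"], float(row["rate"])) for row in reader]

def monthly_change(rates):
    """Return (month, change) pairs: each month's rate minus the prior month's."""
    return [
        (month, round(rate - prev_rate, 1))
        for (_, prev_rate), (month, rate) in zip(rates, rates[1:])
    ]

def window(rates, start, end):
    """Keep months in the inclusive range; 'YYYY-MM' strings sort chronologically."""
    return [(m, r) for m, r in rates if start <= m <= end]

rates = load_rates(SAMPLE_CSV)
changes = monthly_change(rates)            # levels converted to changes
recent = window(rates, "2000-02", "2000-03")  # a shorter time period
```

Even this toy version takes a couple dozen lines before any chart is drawn, which is rather the point: the steps are simple, but someone still has to do them.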
There are good solutions to many of these problems if you have a research assistant or are ready to spend $20,000 on an annual subscription. With ongoing technology advances, however, there ought to be a much cheaper (perhaps even free) way of doing this on the net. With some good programming, some servers, and careful design (both graphic and human factors), it should be possible to disintermediate research assistants and democratize the ability to access and analyze data. At least, that’s my vision.
Many organizations have attacked various pieces of this problem, and a few have even made some headway (FRED deserves special mention in economics). But when you think about it, this is really a problem that Google ought to solve. It has the servers, software expertise, and business model to make this work at large scale. And with its launch of a search service for public data it has already signaled its interest in this problem.
As a major data consumer, I wish Google every success in this effort. However, I’d also like to use their initial effort, now almost three months old, as a case study in what not to do.
Google’s first offering of economics data is the unemployment rate for the United States (also available for the individual states and various localities). Search for “unemployment rate united states” and Google will give you the following graph:
Your first reaction should be that this is great. With absolutely no muss and no fuss, you have an excellent (albeit sobering) chart of the unemployment rate since 1990. I would add myriad extensions to this – e.g., make it easier to look at shorter time periods, allow users to look at the change in the unemployment rate, rather than the level, etc. – but the basic concept is outstanding.
Unfortunately, there is one major problem: That’s the wrong unemployment rate.
Click over to the Bureau of Labor Statistics, open a newspaper (remember them?), or stay right here on my blog – all of them will tell you that the unemployment rate in June was 9.5%, not 9.7%.