Wednesday, April 14, 2010


I attended a close cousin's wedding this weekend (congratulations John and Callie), and on the trip out there I was arguing about how important collecting and understanding data will be to our society's future.  My father kept stressing that 'the ones who control whatever data it is will continue to have the ability to manipulate it'.  Yes, maybe, but with diminishing effectiveness.
FlowingData posted an article today about TransparencyData, a new project aimed at making data more accessible to the public.  TransparencyData is one of a number of new projects intent on allowing the public to inform themselves.  These are mostly government-data-type sites for now, but these projects will inspire development in more specific sectors of our economy and our daily lives.

My argument was that, in time, we will reach a point where data has the ability to validate and invalidate itself.  Corruption and fraudulent uses of data will be more visible to anyone interested in the information that a given dataset provides.

We're only now seeing what might be the start of this notion.  Making data and information more observable than ever is the first step.  Included in this is the concept of linked data.

In my argument for what data will do for our society, I went so far as to say that at some point we will be governed more and more by data.  On the surface, that sounds like very USSR/planned-economy/scary type talk.  But I don't think that's what I mean to suggest.  The ultimate social decisions, I would think, should still be made via the democratic process.  Data simply provides an avenue for more intelligent decisions to be made, and leaves less room for fraud and other mishandling in government.


  1. The "data skeptics" sound Schwartzian with their "Paradox of Choice" type argument: as the amount of data (choices) increases, we will become less effective at processing all that information.

    Tyler Cowen disagrees with that argument because as the mountain of data has grown, we developed methods to increase our ability to cope. We developed better methods of search. I fall into this camp.

    That's a long-winded way of saying I think you are on the right track here. What we need to do is just collect stuff. Everything we can. We will figure out what to do with it later; it will emerge. Using meta-data to determine accuracy is probably going to be a huge component of this new world order.

  2. Yes. Let's collect "everything we can." I think one issue nowadays is trying to figure out what to collect... where's the important data? I think you're right, let's just collect it all and let it lead us to the important data. It is costly, but let's see what we can find.