Today, the White House is announcing that it’s spending $200 million on its “Big Data” initiative, which aims to “advance state-of-the-art core technologies needed to collect, store, preserve, manage, analyze, and share huge quantities of data.”
So, what are these state-of-the-art technologies? The funding is split among several agencies, including the National Science Foundation, the National Institutes of Health, the Department of Defense, and the Department of Energy.
Perhaps the coolest project is the push to put the data from the 1,000 Genomes Project into the cloud. As you might imagine, the world’s largest set of data on human genetic variation is pretty big — 200 terabytes, to be exact.
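To put 200 terabytes in perspective, here's a rough back-of-the-envelope sketch of how long a lab would need just to download the dataset over its own connection (the link speeds are illustrative assumptions, not figures from the announcement):

```python
# Back-of-the-envelope: how long would it take a lab to download
# the full 200 TB dataset instead of working with it in the cloud?
# Link speeds below are illustrative assumptions.

DATASET_BITS = 200e12 * 8  # 200 terabytes expressed in bits

def transfer_days(link_bps):
    """Days needed to move the dataset over a link of the given speed."""
    return DATASET_BITS / link_bps / 86_400  # 86,400 seconds per day

print(f"100 Mbps: {transfer_days(100e6):.0f} days")  # roughly half a year
print(f"  1 Gbps: {transfer_days(1e9):.1f} days")
```

Even on a fast university link, moving the raw data around is impractical, which is the point of hosting it where the compute already lives.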
All of that data will now be freely available, hosted on the Amazon Web Services (AWS) cloud. Of course, labs will still need the computing horsepower to manipulate and sort that data, but it's still a big help for researchers with relatively limited resources. Another project getting a grant is EarthCube, which sounds like an obscure PlayStation game from the 1990s but is actually "a system that will allow geoscientists to access, analyze and share information about our planet."
The Department of Defense is one of the biggest funders of the "Big Data" initiative, committing $60 million to new projects. Of that, DARPA, the DoD's research arm, is getting $25 million annually for four years for its XDATA program. The wording of the White House's press release is a bit vague, but it looks like XDATA's two main aims will be analyzing semi-structured data (metadata, etc.) and unstructured data such as text documents.
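For a sense of the distinction XDATA is targeting, here's a minimal sketch: semi-structured data carries its own field names, while unstructured text has to be mined for the same facts. (The record and sentence below are invented examples, not XDATA data.)

```python
import json
import re

# Semi-structured: the metadata names its own fields, so a generic
# parser can pull values out directly.
record = json.loads('{"author": "J. Smith", "date": "2012-03-29", "tags": ["genomics"]}')
author = record["author"]  # direct lookup by key

# Unstructured: plain text has no schema, so extracting the same
# fact means pattern matching (or, at scale, natural-language tools).
text = "This report was written by J. Smith in March 2012."
match = re.search(r"written by ([A-Z]\. \w+)", text)
author_from_text = match.group(1) if match else None

print(author, author_from_text)
```

The first lookup is trivial; the second only works because the sentence happens to fit the pattern, which is why unstructured analysis is the harder research problem.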
Sure, it’s not the most exciting announcement in the world, but when you consider how much money companies like Google are committing to analyzing data, it makes sense that the U.S. government wouldn’t want to be left behind.