These are very interesting arguments and they do begin to get to the heart of the matter – which is that our data sets have grown beyond our ability to analyse and process them without sophisticated automation. We simply have to rely on technology to cope with this enormous wave of content and metadata.
Analysing human-generated big data has enormous potential. More than that, harnessing the power of metadata has become essential to managing and protecting human-generated content. File shares, emails, and intranets have made it so easy for end users to save and share files that organisations now have more human-generated content than they can sustainably manage and protect using small-data thinking. Many organisations face real problems because questions that could be answered 15 years ago on smaller, more static data sets can no longer be answered. These questions include: Where does critical data reside, who accesses it, and who should have access to it? As a consequence, IDC estimates that only half the data that should be protected is protected.
The problem is compounded by cloud-based file sharing, as these services create yet another growing store of human-generated content requiring management and protection, one that lies outside corporate infrastructure and has different controls and management processes.
David Weinberger of Harvard University’s Berkman Center said: "We are just beginning to understand the range of problems Big Data can solve, even though it means acknowledging that we're less unpredictable, free, madcap creatures than we'd like to think." If harnessing the power of human-generated big data can make data protection and management less unpredictable, free, and madcap, organisations will be grateful.