How is big data different from normal data?
I got into a bit of a discussion with a friend of mine recently about this. I said big data is nothing special – data has been expanding exponentially since the beginning of time (which was about 1950), and what we are now calling big data is simply a continuation of this trend.
Not so, said my friend. Big data is not just quantitatively different, it is qualitatively different, which means it needs to be treated and analysed in different ways.
Big data is unstructured, he pointed out. It often needs to be analysed in real time, and it cannot be handled by existing data management tools such as relational databases, data mining or business intelligence.
We ended up agreeing that we were not really in disagreement, and that we were both right. Yes, big data is a logical extension of the sorts of increases in data that we have always seen. But yes, this growth is so massive that we simply cannot approach data analytics in the same way we have in the past.
I still hate the term, but in the absence of an alternative I suppose we are stuck with it. Big data. How big is it? As Douglas Adams might have said, it’s really big – really, really big.
For many years EMC has sponsored a report called Digital Universe. It comes out every two years, and has most recently been conducted by IDC. The most recent report, from 2011, said that the amount of digital data in the world exceeded 1.8 zettabytes, which is 1.8 trillion gigabytes, and that this figure had increased by an order of magnitude just in the last few years.
A zettabyte is 1000 exabytes (and a yottabyte is 1000 zettabytes). The two prefixes were added to the international standards back in 1991 because we were running out of numbers. They start with the last and second-last letters (z and y) of the Latin alphabet, which suggests the next multiple of 1000 might start with x, perhaps something like xenta.
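For anyone who wants to check the arithmetic, here is a rough sketch in Python of how these decimal units relate. The table and the convert function are my own illustrative names, not anything from the EMC report, and the sketch assumes the decimal (power of 1000) definitions rather than the binary ones some storage tools use.

```python
# Decimal storage units expressed as powers of 1000 bytes.
# Names and helper are illustrative only, not from the EMC/IDC report.
UNIT_POWERS = {
    "kilobyte": 1,
    "megabyte": 2,
    "gigabyte": 3,
    "terabyte": 4,
    "petabyte": 5,
    "exabyte": 6,
    "zettabyte": 7,
    "yottabyte": 8,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity between two decimal storage units."""
    exponent = UNIT_POWERS[from_unit] - UNIT_POWERS[to_unit]
    return value * 1000 ** exponent


if __name__ == "__main__":
    # A zettabyte in exabytes, and the report's 1.8 zettabytes in gigabytes.
    print(convert(1, "zettabyte", "exabyte"))
    print(f"{convert(1.8, 'zettabyte', 'gigabyte'):,.0f}")
```

A zettabyte sits four steps of 1000 above a gigabyte, so 1.8 zettabytes works out to 1.8 followed by twelve decimal places' worth of gigabytes, which is where the "1.8 trillion gigabytes" comes from.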
There is now just too much data to store. That is one of the drivers for cloud computing, says the EMC report.
“Cloud computing is enabling the consumption of IT as a service. Couple that with the big data phenomenon, and organisations increasingly will be motivated to consume IT as an external service versus internal infrastructure investments.
“This period of ‘space exploration’ of the digital universe will not be without its challenges. But for the ‘astronauts’ involved—CIOs and their staff—it represents a unique, perhaps once-in-a-career opportunity to drive growth for their enterprises. They will need to lead their enterprise in the adoption of new information-taming technologies, best practices for leveraging and extracting value from data, and the creation of new roles and organisational design. Each step will require organisational change, not just a few new computers or more software. The success of many enterprises in the coming years will be determined by how successful CIOs are in driving the required enterprise-wide adjustment to the new realities of the digital universe.”
So there you have it. The volume of data is expanding so quickly that we can’t handle it in conventional ways. IT professionals who understand this, and learn how to adapt to the new ways, will be much more successful than those who don’t.