As I write this, there is a lot of buzz around Big Data, analytics, and technologies such as Hadoop, NoSQL and MapReduce that are used in the “Big Data” context. McKinsey, Gartner and many others have forecast the value and potential of the Big Data business. We have been consuming large amounts of digital content for a few decades, so why has data become so popular only recently?
Facebook’s data warehouses grow by “over half a petabyte every 24 hours”, Walmart handles one million customer transactions every hour, internet traffic is predicted to reach 667 exabytes by 2013, and research says that content doubles every 1.8 years. This explosion of data and content poses a big challenge for companies, which must deal with complex data across three dimensions: volume, velocity and variety, the 3Vs of data.
Companies have to process large data sets from various sources, including emails and other unstructured content, web logs, GIS, RFID, social feeds, events, marketing, documents, audio and video, to make critical decisions about their business. A plethora of large-scale data gathering technologies, analytical technologies and visualization tools is emerging in the industry. Companies that shape their strategies around smart data analytics are expected to transform themselves and achieve sustained business growth. While structured data stored in relational databases made important decisions a cakewalk, unstructured Big Data is the game changer.
Data by itself is raw and doesn’t carry any meaning or intent. A whole range of techniques, from creation and curation through review, moderation, structuring and evaluation, makes data useful to people. Content is generated by humans, unlike most data, which is usually machine generated. Content includes your emails, tweets and documents, which carry sentiment and emotion and are mostly unstructured. The trick lies in identifying hidden patterns inside unstructured data and content to generate value. I’m not going to dig into the different technologies that support this analysis, which include machine learning, OCR, semantic analysis, pattern recognition, distributed file systems and cloud infrastructures.
What it means for us at the end of the day is how much smarter the world we live in becomes, from the gadgets we use to the services we consume across different sectors!