Data Explosion – It’s the “Big Data” trend…

Can we know which seeds (data) to plant in order to get such a beautiful combination of flowers (business insights) and a healthy, fresh garden (company)?

When we discuss data and information management in companies nowadays, we point directly at “structured data”: data in spreadsheets, databases, data marts, ERPs, and data warehouses.

Do you know where we live now? We are in the DATA EXPLOSION stage…

Data is now everywhere: apps, the web, sensors, smart energy meters, mobile phones, social media networks, blogs, forums, email, and documents.

Each and every one of us is now creating, sharing, browsing, communicating, buying and also collaborating, thus, we are not just end users anymore. We are involved in creating what’s called “Big Data”.

Most of us are on Facebook, but did you know that in just 20 minutes on Facebook, over 1 million links are shared and almost 3 million messages are sent (2011)?

That’s huge, right?

The statistics below show just how large this data explosion is!

All the words ever spoken in the history of mankind represent about 5 exabytes, which is 5 million terabytes (UC Berkeley).

“Wait! Now you’ll know why it’s a real data explosion ….”

Besides, by the end of this year, consumers and companies will have created 1.8 zettabytes (1 zettabyte = 1,000 exabytes) of digital information (IDC)!

Now, let’s explain what Big Data is all about and how it differs from structured data.
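To make the scale concrete, here is a minimal Python sketch of the unit arithmetic behind these figures. It assumes decimal (SI) units, where each step up is a factor of 1,000. The `convert` helper and the `EXPONENTS` table are hypothetical names introduced only for this illustration. The 5 exabyte and 1.8 zettabyte figures are the ones quoted above.

```python
# Power-of-ten exponent (in bytes) for each decimal (SI) storage unit.
# Assumption: decimal units, i.e. 1 terabyte = 10**12 bytes.
EXPONENTS = {
    "terabyte": 12,
    "petabyte": 15,
    "exabyte": 18,
    "zettabyte": 21,
}


def convert(amount, from_unit, to_unit):
    """Convert an amount from one decimal storage unit to another."""
    return amount * 10 ** (EXPONENTS[from_unit] - EXPONENTS[to_unit])


# All words ever spoken, expressed in terabytes.
spoken_words_tb = convert(5, "exabyte", "terabyte")

# One year of digital information, expressed in exabytes.
yearly_data_eb = convert(1.8, "zettabyte", "exabyte")

print(f"5 exabytes = {spoken_words_tb:,} terabytes")
print(f"1.8 zettabytes = {yearly_data_eb:,.0f} exabytes")
print(f"Ratio: about {yearly_data_eb / 5:,.0f}x")
```

In other words, the digital information created in that single year is roughly 360 times all the words ever spoken.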

As we know, we have pipes for gases and liquids, and usually any chemically stable substance can be sent through a pipeline. For instance, we have sewage and water pipes, and more valuable pipes like oil and fuel pipes. Engineers designed these pipes so that each substance can be moved in a structured way through a predefined network to its destination.

Let’s say this model of pipelines resembles our structured data in databases, where the pipes (databases) can handle the volume and structure of the data.

But what if the pipes exploded and we faced a pipe leak? Substances of different types would now spread everywhere at high speed, with no predefined, structured process for their movement. That is what unstructured data looks like.

Unstructured data includes free-form text (articles, blogs, and forums), videos, images, social media streams, etc.

Let’s now discuss how this unstructured data is handled.

Unstructured data is managed with Big Data technologies. Big Data refers to datasets whose size is beyond the ability of traditional databases or data warehouses to handle or manage. We call it Big Data because of three characteristics:

Variety: Complexity of multiple data types and schemas from several data sources and streams.

Velocity: Streaming data and large data movement in real time.

Volume: Data is growing from terabytes to zettabytes.

On the other hand, can we just get rid of our traditional models and stop building databases and data warehouses?

NO!

Big Data is never a silo! It’s just one part of our enterprise architecture, alongside our traditional relational databases and data warehouses. Thus, we’ll have an iterative approach capable of creative data exploration.

If we move from data explosion to imagination and to the garden of flowers and deeper advanced analytics…

Imagine that we can gather information from everywhere (the web, sensors, devices, apps) and integrate high-quality structured and unstructured data.

Let’s also think of radical flexibility, where we have easy-to-use analytics running in the cloud and a better ability to handle opportunities in real time.

It’s just a BIG Picture of BIG Data….

To be continued…

I will leave you with a glimpse of our future, revealed recently by Microsoft. And YES, data is easily accessible everywhere…

[youtube http://www.youtube.com/watch?v=a6cNdhOKwi0]

I will launch a series of blog posts about Big Data. Watch for the next post, about Microsoft’s investment in Big Data and its new strategy toward open source: at the PASS conference two weeks ago, it revealed SQL Server 2012 support for Hadoop with an Apache-derivative distribution, along with a commitment to support the Hadoop community.

Sounds like good news, right? Watch for my continuing series of blog posts very soon…

- Big Data technologies and their potential in business.

- Big Data solution design approach.

- Companies providing Big Data solutions.

- Techniques for analyzing Big Data.

- Big Data & Hadoop.

And a lot more!

By Ali Rebaie

4 Comments

  1. Cannot agree more on the BIG DATA EXPLOSION problem, and on the opportunities that come with it.
    Not only companies but also professionals are getting more and more into a phase in which they will be facing tons of data.
    Processing, managing, distributing, communicating and using that data are interesting problems to solve.

    Eager to get the coming big data posts.

    1. Hello Diego,

      Indeed, the possibilities of big data continue to evolve rapidly, driven by the growth
      of several technologies and platforms that can analyze and handle big data.

      I’m glad you liked my first post, and I’m looking forward to your feedback on my upcoming posts.

      Thanks,
      Ali.

  2. You mention Hadoop in the upcoming topics, but I hope you’ll also be covering Streams in addition to Big Insights. The need for real-time processing on data in motion is arguably just as important as, if not more so than, batch mode on large data sets.

    1. Hello Jim,

      Absolutely! There is a growing need to analyze data on the fly in several domains, such as retail, healthcare, and location data. So I will cover real-time processing and IBM’s great platforms like InfoSphere Streams, BigSheets, etc.

      Thank you for your feedback. I look forward to hearing from you on my upcoming posts.
