Where can we use Big Data?

In the previous article on Big Data, we talked about its basic characteristics, the so-called 4 Vs. The concept is still very new, and many companies and IT professionals are struggling to understand the ideas behind it and to imagine a “real world” Big Data platform. I will try to give a few examples where Big Data could be very useful, and also to draw some analogies with techniques and solutions from other industries that could be relevant for the Big Data world.

Let’s take as a first example an oil drilling platform that has between 20,000 and 40,000 sensors on board. You can imagine the huge amount of data these sensors produce on a daily basis. Now try to guess how much of that data is actively used. If you’re thinking about 5-10%, then I think you’re starting to get it…

Now think about the logs that a company produces every day from its systems monitoring. In one real-world case, a company had a retention policy of just a few days for its logs, to avoid buying expensive storage. So different departments copied the same data to their own SANs so they could use it for longer. In the end about 70% of the data was still taking up storage, and it was present 4 or 5 times because different departments had replicated it across several storage environments. (Kind of ironic, right? :-)) Centralizing the logs and changing the retention policy to several months saved millions of dollars…
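To make the duplication problem a bit more concrete, here is a minimal sketch of how one might estimate how much stored log data is redundant, by hashing each copy and counting identical content only once. The function name `redundant_bytes`, the storage locations and the sample log lines are all made up for illustration; they do not come from the case described above.

```python
import hashlib


def redundant_bytes(copies):
    """Estimate duplication across storage locations.

    copies: iterable of (location, content_bytes) pairs.
    Returns (total_bytes, unique_bytes, redundant_fraction).
    """
    size_by_digest = {}
    total = 0
    for _location, content in copies:
        total += len(content)
        digest = hashlib.sha256(content).hexdigest()
        # Identical content hashes to the same digest, so it is counted once.
        size_by_digest.setdefault(digest, len(content))
    unique = sum(size_by_digest.values())
    fraction = (total - unique) / total if total else 0.0
    return total, unique, fraction


# Hypothetical data: one log replicated by four departments, one stored once.
log_a = b"2014-01-01 12:00:00 disk ok\n" * 100
log_b = b"2014-01-01 12:00:05 cpu high\n" * 100
copies = [
    ("ops-san/log_a", log_a),
    ("finance-san/log_a", log_a),
    ("security-san/log_a", log_a),
    ("dev-san/log_a", log_a),
    ("ops-san/log_b", log_b),
]

total, unique, fraction = redundant_bytes(copies)
print(f"total={total} bytes, unique={unique} bytes, redundant={fraction:.0%}")
```

A real log estate would of course be scanned file by file (or chunk by chunk) rather than held in memory, but the idea is the same: once you can see how much of the storage holds copies, the case for centralizing becomes easy to argue.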

In the past few years customer satisfaction has dropped dramatically, and a lot of companies have difficulty keeping their customers satisfied. Big Data can help show what customers need and expect, and give the service provider better visibility.

A whole book could be written about fraud detection, but just take the following examples: unusual events can be tracked and investigated to prevent possible fraud, and social media can be analyzed to track down possible piracy…
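As a rough idea of what "tracking unusual events" can mean in practice, here is a small sketch that flags values far away from the typical one, using the median absolute deviation (a common robust outlier score). This is only one of many possible techniques, chosen here for illustration; the function name `flag_unusual`, the threshold of 3.5 and the sample amounts are assumptions, not something a particular fraud system is known to use.

```python
from statistics import median


def flag_unusual(amounts, threshold=3.5):
    """Return the indices of values that look unusual.

    Uses the modified z-score: 0.6745 * |x - median| / MAD.
    The threshold of 3.5 is a commonly quoted rule of thumb (an assumption here).
    """
    med = median(amounts)
    mad = median([abs(x - med) for x in amounts])
    if mad == 0:
        # All values (or most of them) are identical; nothing can be scored.
        return []
    return [
        i
        for i, x in enumerate(amounts)
        if 0.6745 * abs(x - med) / mad > threshold
    ]


# Hypothetical transaction amounts; one of them is clearly out of line.
amounts = [42.0, 39.5, 41.2, 40.8, 38.9, 43.1, 40.0, 950.0, 41.7, 39.9]
for i in flag_unusual(amounts):
    print(f"transaction {i} looks unusual: {amounts[i]}")
```

A flagged event is only a candidate for investigation, not proof of fraud; the point is that the machine narrows down where a person should look.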

The next question is: how can you achieve this? It seems hard to grasp if you are used to thinking in terms of structured databases… You can surely understand the limitations, and that certain types of data are not suitable for a traditional Data Warehouse, but not how you can get good value out of them. I like to explain this with what I call the Gold Rush analogy, and it goes like this: years ago gold could be spotted with the naked eye, and mining was a task people could do by hand. After all that gold was gone, people were sure that more gold was still in the mountains, but trying to find it was too much of a risk. There were too many resources involved and no guarantee of success. So the model that was adopted was to use machines that could process immense amounts of dirt and find the gold hiding in there. Think about it in terms of data now…

Of course, Big Data doesn’t mean that you should throw your structured data systems away. Both are part of the picture and they work in tandem, like the hands of a baseball player (one catches the ball and the other throws it). Other related topics include natural language processing, the Predictive Model Markup Language (PMML), machine learning techniques and so on, but I’ll let you read about those by yourselves.

I will end with two questions, and in the next part I will try to give you a proper explanation so you can figure out the answers by yourselves: “Is finding patterns easier or harder when dealing with Big Data?” and “Is finding outliers easier or harder as data volumes and velocity increase?”
