There are two words that have been creating a lot of commotion for the past few years in fields from business to philanthropy to government. No one seems to agree precisely on how to define them, but there’s a consensus that the concept they describe is going to change everything. The words are ‘big data’.
The idea is that more digital information is being generated online than ever before – the volume doubles every couple of years, according to one report. While there has been plenty of focus over the past decade on the impact this deluge of data has had on connectivity, productivity, and the democratisation of knowledge, scientists are just starting to uncover the many ways that this information can be mined for patterns used to predict, shape and react to events happening in the real world.
“There is nearly as much digital information as there are stars.”
To borrow an example from Viktor Mayer-Schönberger and Kenneth Cukier’s book Big Data, Walmart’s analysts have trawled through the million-plus customer transactions that are logged digitally by the chain every hour, and one of the many surprising micro-trends they uncovered was that sales of Pop-Tarts spike just before a hurricane. Now, whenever a storm is on the horizon, store managers put Pop-Tarts on display near the entrance. The tweak worked: Walmart increased its profits, and while no one has come up with a theory as to why inclement weather will provoke a craving for that particular breakfast snack, no one needs to. On a big enough scale – so the thinking goes – the numbers speak for themselves.
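As a rough illustration of this kind of pattern-hunting, the sketch below compares average daily sales on storm-warning days with average daily sales on ordinary days. It is a minimal example under stated assumptions, not Walmart's method: the function name `pre_storm_lift`, the record layout and all of the figures are made up for illustration.

```python
from collections import defaultdict

def pre_storm_lift(sales, storm_warning_days):
    """For each product, divide mean daily units sold on storm-warning days
    by mean daily units sold on all other days (a simple 'lift' ratio)."""
    totals = defaultdict(lambda: {"storm": 0, "normal": 0})
    storm_days, normal_days = set(), set()

    for day, product, units in sales:
        if day in storm_warning_days:
            totals[product]["storm"] += units
            storm_days.add(day)
        else:
            totals[product]["normal"] += units
            normal_days.add(day)

    lift = {}
    for product, t in totals.items():
        # Skip products we cannot compare (no baseline or no storm days).
        if not storm_days or not normal_days or t["normal"] == 0:
            continue
        storm_mean = t["storm"] / len(storm_days)
        normal_mean = t["normal"] / len(normal_days)
        lift[product] = storm_mean / normal_mean
    return lift

# Made-up records: (day label, product, units sold).
sales = [
    ("day1", "pop_tarts", 10), ("day1", "milk", 50),
    ("day2", "pop_tarts", 10), ("day2", "milk", 50),
    ("day3", "pop_tarts", 40), ("day3", "milk", 55),
]
lift = pre_storm_lift(sales, storm_warning_days={"day3"})
for product, ratio in sorted(lift.items(), key=lambda kv: -kv[1]):
    print(product, round(ratio, 2))
```

A ratio well above 1 flags a product that sells unusually well ahead of a storm. It says nothing about why, which is exactly the point the Pop-Tarts story makes.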
There are countless uses for this type of large-scale number-crunching. During the run-up to the 2012 presidential election, Barack Obama hired a staff of 100 data analysts to “measure everything”, in the words of campaign manager Jim Messina, who spent US$100m on technology and ran 66,000 computer simulations each day. Each swing-state voter was assigned numbers based on metrics such as whether they could be persuaded to get to the polls and how much they could be swayed by a particular issue. These voters could then be targeted precisely.
“Google’s Eric Schmidt has stated that every two days we create as much information as we did from the beginning of civilisation up until 2003.”
Data can be used to track epidemics and make sure aid goes to those who are most in need. It can be used to teach Google’s search engine what users mean by their misspelled words, and it can maximise efficiency in business, sometimes in counter-intuitive ways. In his book Social Physics, the computer scientist Alex Pentland describes a call centre that improved the speed at which calls were handled by making sure that employees went for coffee breaks at the same time and mingled together. The insight was revealed after tracking social interactions and running these numbers alongside productivity rates. Before that, the call centre had told workers to take their breaks one at a time.
“Big data disrupts every industry,” says Dubai-based analyst and consultant Ali Rebaie, who delivered the keynote speech at Dubai’s 2014 Smart Data Summit. “Healthcare, manufacturing, marketing, telecoms, oil and gas, real estate, retail, fashion, transportation – there is no industry not affected.” Alongside his work helping companies come up with data strategies, Rebaie has taught at the School of Data, which teaches civil-society organisations, journalists and ordinary citizens what they can do with data. The school’s tagline is ‘Evidence is power’.
The types of information it’s possible to track are almost limitless, and it can be bewildering for businesses to know where to start when it comes to their data strategy. An oil and gas company, Rebaie points out, may want to track weather reports along with information from sonar, satellite images and airborne data streams. A retail outlet may be interested in a customer’s online ‘clickstream’, their movements around a physical store and the tone of their voice on customer service calls.
“Don’t start by thinking about what data you already have,” advises Gaurav Chhaparwal, who works in Dubai as Head of Analytics at the internet marketplace Souq.com. “Start with the question.” Rather than looking at how many people are already customers, for example, ask where the next 1,000 customers are going to come from.
As businesses start mastering the basics of big data, the field is already changing around them – growing and evolving rapidly. The next big thing, Rebaie says, is going to be about processing data streams in realtime and having them power ‘recommendation engines’. Rather than getting analysts to make a business decision based on past performance, the system will make automatic adjustments to the way the business is run on the fly. To take the Walmart example, this means a hurricane warning would automatically trigger an increased order of Pop-Tart stocks, without any need for human intervention. Programming these systems may not be easy, but those who master them will have an edge.
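The automatic adjustment described above can be sketched as a simple rule that reacts to an incoming event. This is only an illustration under stated assumptions: the `WeatherAlert` type, the `REORDER_RULES` table and the multiplier of 4 are hypothetical, and a real recommendation engine would learn such rules from data rather than hard-code them.

```python
from dataclasses import dataclass

@dataclass
class WeatherAlert:
    kind: str    # e.g. "hurricane_warning" (hypothetical label)
    region: str

# Hypothetical rule table: alert kind -> {product: order multiplier}.
REORDER_RULES = {
    "hurricane_warning": {"pop_tarts": 4.0},
}

def adjust_orders(alert, baseline_orders):
    """Return new order quantities, scaling any product that has a rule
    for this kind of alert and leaving every other product unchanged."""
    multipliers = REORDER_RULES.get(alert.kind, {})
    return {
        product: round(quantity * multipliers.get(product, 1.0))
        for product, quantity in baseline_orders.items()
    }

# An incoming alert triggers the adjustment with no analyst in the loop.
alert = WeatherAlert(kind="hurricane_warning", region="example-region")
orders = adjust_orders(alert, {"pop_tarts": 100, "milk": 200})
print(orders)
```

Keeping the rules in a table rather than in the code path means they can be updated as new patterns are found, without rewriting the logic that applies them.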
“Bad data or poor data quality costs US businesses US$600bn each year.”
That’s not to say there aren’t pitfalls. Correlation and causality are two very different things, and when police start using statistics to assess whether certain groups of people are more likely to commit crime, for instance, they can begin treating certain sectors of society as guilty until proven innocent. Data about people is never quite as cut and dried as data about raw mathematical or scientific processes, and interpreting the information is at least as important as the information itself.
“The digital universe will grow from 3.2 zettabytes today to 40 zettabytes in only six years.”
Still, when used with caution, sifting through complex data streams can reap very real rewards for organisations, and no one knows for certain where things will go from here. In Gaurav Chhaparwal’s words, the era of big data has “only just started”.
Indeed, big data is going to change everything. Think about areas such as ecology, sports and agriculture. What other areas are being disrupted by these two words, ‘big data’? Feel free to share your thoughts and comments below.
Safety in numbers?
As the world gets to grips with big data, authors of all stripes have been exploring some of its consequences, some more optimistically than others.
Among the cheerleaders is Christian Rudder, the OkCupid founder and author of Dataclysm, who shows that the way people use dating sites points to unconscious biases. While 84 per cent of OkCupid users said racism in a partner was unacceptable, there was a preference for partners of the same race and certain minorities were consistently ranked lower than others.
Erez Aiden and Jean-Baptiste Michel, co-authors of Uncharted, mined the text of the 130 million digitised books. Ngram Viewer, the app they developed with Google, shows that the phrase ‘Middle East’ has been on the wane since the 1980s.
More sceptical is Julia Angwin, who recorded her attempts to remove all her personal data from the public domain in Dragnet Nation. Her details were in the hands of more than 200 data brokers, and after a long, costly struggle, she managed to disconnect from only 91.
Viktor Mayer-Schönberger and Kenneth Cukier’s Big Data applauds the way number-crunching helps fight problems such as climate change, but warns against over-reliance on stats. We don’t want to be like Icarus, they say, who “adored his technical power of flight, but used it improperly and tumbled into the sea”.
This article, featuring an interview with Ali Rebaie, originally appeared in VISION Magazine – Dubai in January 2015. Written by Jessica Holland, a regular contributor to The Observer in the UK, The National in the UAE and The Wall Street Journal.