Big Data’s Jekyll-and-Hyde Moment

Stephen DeAngelis

May 29, 2018

Whether you choose to call the time period in which we live the Information Age, the Digital Age, or the Big Data Age, the principal feature of this era is data. Analysts predict that by the end of 2019 the world will generate two zettabytes of data annually. A zettabyte equals a trillion gigabytes. That’s a lot of data. Data by itself, however, is of limited value. Data needs to be analyzed. Unfortunately, Big Data analytics is gaining a Jekyll-and-Hyde reputation. Gregg Easterbrook (@EasterbrookG) explains, “‘Big Data’ is the Big Bad of our moment. Companies and governments amass enormous troves of information about our online and offline activities, so they can understand them better than we do.”[1] He unveils the Jekyll-and-Hyde nature of Big Data by asking a few rhetorical questions: “In the future, will Big Data help physicians cure diseases or help health insurers deny claims? Make factories and products safer or accelerate layoffs? Ultimately spawn some kind of hostile artificial intelligence? Right now it’s fair to suppose that many people would favor putting the Big Data genie back into the bottle.” Like most genies, however, Big Data is not going back into the bottle.

Extracting Value from Big Data

Regulators always find themselves in a tail chase with technology. We’ll never know whether regulations like the European Union’s General Data Protection Regulation would have prevented the Facebook/Cambridge Analytica fiasco or reduced the number of past data breaches, but businesses can expect to see more regulation in the future. What we do know is that there is value locked within large datasets and that advanced analytics can release that value. Cynthia Harvey explains, “Big data analytics is the process of using software to uncover trends, patterns, correlations or other useful insights in large stores of data.”[2] Harvey cites Gartner analysts who place big data analytics tools into four different categories:

1. Descriptive Analytics: “These tools tell companies what happened. They create simple reports and visualizations that show what occurred at a particular point in time or over a period of time. These are the least advanced analytics tools.”
2. Diagnostic Analytics: “Diagnostic tools explain why something happened. More advanced than descriptive reporting tools, they allow analysts to dive deep into the data and determine root causes for a given situation.”
3. Predictive Analytics: “Among the most popular big data analytics tools available today, predictive analytics tools use highly advanced algorithms to forecast what might happen next. Often these tools make use of artificial intelligence and machine learning technology.”
4. Prescriptive Analytics: “A step above predictive analytics, prescriptive analytics tell organizations what they should do in order to achieve a desired result. These tools require very advanced machine learning capabilities, and few solutions on the market today offer true prescriptive capabilities.”
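
To make the difference between the first and third categories concrete, here is a minimal sketch in Python. It is not drawn from Harvey or Gartner: the sales figures, the function names, and the choice of a straight-line trend are all assumptions made for illustration, and real predictive tools rely on far more sophisticated models. The descriptive function only summarizes what already happened, while the predictive function extrapolates one step into the future.

```python
# Hypothetical monthly unit sales, invented purely for illustration.
sales = [100, 110, 120, 130, 140, 150]


def describe(values):
    """Descriptive analytics: summarize what happened."""
    return {"total": sum(values), "mean": sum(values) / len(values)}


def forecast_next(values):
    """Predictive analytics (toy version): fit a least-squares line
    through the observations and extrapolate one period ahead."""
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    # Ordinary least-squares slope and intercept for y = intercept + slope * x.
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return intercept + slope * n


summary = describe(sales)
next_month = forecast_next(sales)
print(f"Total so far: {summary['total']}, monthly mean: {summary['mean']:.1f}")
print(f"Trend-based forecast for next month: {next_month:.1f}")
```

Diagnostic and prescriptive analytics are left out of the sketch on purpose; both require context (causes and available actions) that a single series of numbers cannot supply.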

The quality of analysis depends on the quality of the data. Larry Greenemeier (@lggreenemeier) observes, “Never before has so much data been available covering so many areas of interest, whether it’s online shopping trends or cancer research. Still, some scientists caution that, particularly when it comes to data, bigger isn’t necessarily better.”[3] How important is quality data? Bob Violino (@BobViolino) reports that a study conducted by International Data Corporation (IDC) for Alteryx Inc. concluded, “Data professionals are spending more time governing, searching and preparing data than they are on extracting business value. … One of the key findings is that data professionals spend 60 percent of their time getting to insight, but just 27 percent of that time is spent on actual analysis while 37 percent is spent searching for data and 36 percent on preparing data.”[4]
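
The percentages in that finding are nested, which makes them easy to misread. The short Python sketch below works through the arithmetic under one assumption: that “that time” in the quotation refers to the 60 percent spent getting to insight, so the 27/37/36 split is a share of that 60 percent rather than of the whole work week. The variable and function names are illustrative only.

```python
# Shares quoted from the IDC/Alteryx study as reported by Violino.
TIME_TO_INSIGHT = 0.60  # share of total working time spent getting to insight
BREAKDOWN = {"analysis": 0.27, "searching": 0.37, "preparing": 0.36}


def share_of_total(breakdown, parent_share):
    """Convert shares of a sub-period into shares of total working time."""
    return {task: round(share * parent_share, 4) for task, share in breakdown.items()}


totals = share_of_total(BREAKDOWN, TIME_TO_INSIGHT)
for task, share in totals.items():
    print(f"{task}: {share:.1%} of total working time")
```

Read this way, actual analysis accounts for roughly 16 percent of a data professional’s total working time, while searching for and preparing data together take up about 44 percent.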

Even if you get the right data in the right format, you need to ensure algorithms are right as well. There have been some well-publicized cases of biased algorithms. Kristen Clark writes, “Mathematician and data scientist Cathy O’Neil has a name for these wide-reaching and discriminatory models: Weapons of Math Destruction. … Weapons of Math Destruction have a way of creating their own reality and then using that reality to justify their model, says O’Neil. An algorithm that, say, targets financially vulnerable people for predatory loans creates a feedback loop, making it even harder for them to get out of debt.”[5] The lesson to be learned is that good analytics takes a lot of hard work. Sam Ransbotham (@Ransbotham), David Kiron (@DavidKiron1), and Pamela Kirk Prentice (@pamelakprentice) observe, “The reality is that many companies still struggle to figure out how to use analytics to take advantage of their data. The experience of managers grappling, sometimes unsuccessfully, with ever-increasing amounts of data and sophisticated analytics is often more the rule than the exception. In many respects, the hype surrounding the promise of analytics glosses over the hard work necessary to fulfill that promise. It is hard work to understand what data a company has, to monitor the many processes necessary to make data sufficient (accurate, timely, complete, accessible, reliable, consistent, relevant, and detailed), and to improve managers’ ability to use data. This unsexy side of analytics is where companies need to excel in order to maximize the value of their analytics initiatives, but it is also where many such efforts stall.”[6] They note the competitive advantage gained through analytics is diminishing because everyone is doing it. Companies not taking advantage of advanced analytics are falling behind their competition and the gap will continue to grow.

Harvey lists six advantages Big Data analytics can provide organizations. They are:

1. Business Transformation. “In general,” Harvey explains, “executives believe that big data analytics offers tremendous potential to revolutionize their organizations.”
2. Competitive Advantage. Ransbotham and his colleagues note, “Optimism about the potential of analytics remains strong, despite the decline in competitive advantage. Most managers are still quite positive about the potential of analytics. They’ve seen increased interest in analytics over the past few years, and they expect its use to continue to grow in their organizations.”
3. Innovation. Harvey writes, “Big data analytics can help companies develop products and services that appeal to their customers, as well as helping them identify new opportunities for revenue generation.” Ransbotham et al. add, “Use of analytics for innovation remains steady.”
4. Lower Costs. Harvey reports, “In the NewVantage Partners Big Data Executive Survey 2017, 49.2 percent of companies surveyed said that they had successfully decreased expenses as a result of a big data project.”
5. Improved Customer Service. Guy Greenberg, co-founder and president at CoolaData, asserts, “With a fully managed advanced analytics solution, businesses don’t need a staff of BI analysts or data scientists. There’s no down time to wait for implementation. Businesses are using the same resources as before, but simply having the freedom to ask any question and receive deeper insights.”[7] Harvey adds, “Organizations often use big data analytics to examine social media, customer service, sales and marketing data. This can help them better gauge customer sentiment and respond to customers in real time.”
6. Increased Security. “Another key area for big data analytics,” writes Harvey, “is IT security. Security software creates an enormous amount of log data. By applying big data analytics techniques to this data, organizations can sometimes identify and thwart cyberattacks that would otherwise have gone unnoticed.” In the new GDPR era, IT security is an absolute necessity for companies collecting, storing, and analyzing personal data.

Ransbotham and his colleagues believe companies willing to put in the hard work will reap the rewards. They write, “More analytically advanced organizations ensure that the right data is being captured or created on an ongoing basis. In these organizations, information management is an organizational goal, not a technical one. For many organizations, especially among the growing numbers of the Analytically Challenged companies, it is time to recognize that to get the most out of data and effectively improve decision making with data across the organization, better algorithms and better analytical talent are necessary but not sufficient. … The right information might not exist if the right questions have yet to be asked.” To ensure they remain a Dr. Jekyll rather than transform into a Mr. Hyde, companies need to ensure their collection and use of big data is both ethical and transparent.

[1] Gregg Easterbrook, “‘The Efficiency Paradox’ Review: Big Data, Big Problems,” The Wall Street Journal, 22 April 2018.
[2] Cynthia Harvey, “Big Data Analytics,” Datamation, 27 July 2017.
[3] Larry Greenemeier, “Why Big Data Isn’t Necessarily Better Data,” Scientific American, 13 March 2014.
[4] Bob Violino, “Data pros waste half of their work time chasing costly data,” Information Management, 20 February 2018.
[5] Kristen Clark, “Is Your Big Data Project a ‘Weapon of Math Destruction’?” IEEE Spectrum, 5 October 2016.
[6] Sam Ransbotham, David Kiron, and Pamela Kirk Prentice, “Beyond the Hype: The Hard Work Behind Analytics Success,” MIT Sloan Management Review, 8 March 2016.
[7] Guy Greenberg, “Taking a painless path to advanced analytics,” Information Management, 31 August 2017.