Big Data Dilemmas

Stephen DeAngelis

December 14, 2012

We’ve all heard the old saying, “There are three kinds of lies: lies, damned lies, and statistics.” The adage has been variously attributed to the 19th-century British Prime Minister Benjamin Disraeli and to Mark Twain. Although the use of statistics to bolster weak arguments or paint a skewed picture is an ethically questionable practice, I suspect the average person on the street generally views number crunching as a fairly innocuous activity. As a result, he probably doesn’t see how ethics has any significant role to play in big data analytics. On the other hand, Kord Davis, a former business strategist and technical consultant, believes there are a lot of ethical issues that must be considered. He lays out his arguments in a short (82-page) book entitled Ethics of Big Data: Balancing Risk and Innovation. In the book, Davis reports that more and more companies are coming to “depend on big-data technologies including dozens of familiar names and a growing number you’ve never heard of.”

Big data is gathered from almost every kind of human and machine activity that is connected in some way to a network. We create mountains of data each and every second of each and every day. Davis’ book discusses how the use of this data can create ethical dilemmas for organizations. In a review of the book, Wayne Hurlbert writes, “The author describes how large companies, in possession of vast amounts of personal information and data, face many ethical challenges regarding their policies on privacy and identity.” I have repeatedly noted in past posts that privacy issues are probably the greatest challenges that need to be faced by organizations involved in big data collection and analysis. Although some people would simply like organizations to stop collecting data, that is not going to happen. It’s too valuable. In fact, the World Economic Forum has labeled data as a new class of asset. Big data is not only valuable in economic terms, but the insights that can be drawn from it hold the promise of a brighter future in a number of areas, including smarter cities and a cleaner environment. That is why handling big data can create a dilemma. Hurlbert continues:

“Kord Davis recognizes that companies face real ethical dilemmas and potential for abuse due to their collection of massive quantities of data through an array of big-data technology. The author points out that big data also has an impact on brand quality, customer relationships, and revenue. Due to the enormous volume of information collected by big-data technology, the data can have far reaching consequences, serious repercussions, and move ever faster in its effect on those same brands. Kord Davis provides insights into how this increasing pressure from big data forces companies to reexamine their organizational values and ethical behavior. With more ways for brands and customers to engage, communicate, and interact, these ever more pressing ethical questions must be considered and become part of the company behavior.”

An example of a dilemma created by data collection was recently discussed by Amy Dockser Marcus and Christopher Weaver. [“Heart Gadgets Test Privacy-Law Limits,” Wall Street Journal, 28 November 2012] They describe how implanted heart defibrillators beam “all kinds of data” to the company that makes the implant. They write:

“The U.S. has strict privacy laws guaranteeing people access to traditional health files. But implants and other new technologies—including smartphone apps and over-the-counter monitors—are testing the very definition of medical records. … Companies, including Medtronic, are pushing to turn the data into money. … The company is contemplating selling the data to health systems or insurers that could use it to predict diseases and possibly lower their costs. At a July industry event, a senior Medtronic executive, Ken Riff, called these kinds of data ‘the currency of the future.’”

It’s not difficult to understand why patients receiving these implants might feel violated, especially if they don’t have access to the collected data themselves. Patients may not want to share that kind of data and yet they may be forced to sign away their privacy in order to save their lives. That’s an ethical dilemma! Or how do you feel about a store mannequin that surreptitiously gathers “demographic data on the customers”? Ben Coxworth reports that such mannequins are being developed. [“EyeSee store mannequins gather intelligence on shoppers,” Gizmag, 23 November 2012] He writes:

“Using facial recognition software, they can reportedly determine things such as a person’s age range, gender and race. The mannequins will also keep track of the number of people to pass through a certain area within a given amount of time, and how much time each person spends there. Almax suggests that store owners could then use that data to develop targeted marketing strategies, to place salespeople in the parts of the store with the highest traffic, to see what times of day are busiest (and with what sort of customers), and to gauge the effectiveness of window displays or the popularity of displayed items. Needless to say, privacy concerns are definitely an issue. According to the company, all the data is processed within the mannequins, so no outside computers are involved, and nothing is transmitted. Nonetheless, that doesn’t change the fact that the mannequins would actually be watching you – and scrutinizing you.”

Hurlbert continues:

“Davis understands that while the ethical questions are usually framed in terms of individuals, that the real impact of how the organization acts on its values have far deeper ramifications. The author addresses the implications of big data for both individuals and for companies. As a result, … Davis offers analysis of how both individuals and organizations of any size or type must understand the handling and applications of data. The author considers the benefits, risks, and unintended consequences of this immense volume of data on very large numbers of people. [He] finds four common principles in the framework for the ethics of big data. Those four components are as follows:

“* Identity: The relationship between online and offline identity

“* Privacy: Who and how is access to data controlled

“* Ownership of data: Who owns it and what are the responsibilities of holders

“* Reputation: Is the data trustworthy”

Perhaps the most profound contribution of Davis’ book is its identification of those four components, because almost all discussions concerning big data can be centered on them. Hurlbert concludes:

“For me, the power of the book is how Kord Davis combines an analysis of the ethical and values based challenges of big data, with a practical strategy for ensuring the organization behaves ethically with that data. The author provides evidence that the very overwhelming volume of data collected is changing the very way people value privacy, identity, ownership, and reputation. … It’s critical for organizations to develop strategic and tactical plans to overcome the rift between ethics and behavior. The author demonstrates how to create alignment between values and practices.”

One of the ways to address the privacy and ethical issues raised by big data collection and analysis is to establish voluntary relationships. For all the talk about big data collection and analysis, and the concerns they raise, Peter Tufano, the dean of Oxford’s Saïd Business School, claimed during an Oxford-sponsored conference in November that “while awareness of the topic was high among enterprises, only about 6% of companies have got beyond a pilot stage, and 18% are still in one. That means three-quarters of industries are looking at this and saying ‘what is this all about?’” [“Big Data Is on the Rise, Bringing Big Questions,” by Ben Rooney, Wall Street Journal, 29 November 2012] Perhaps that’s a good thing because it means that companies still have time to consider all of the issues raised by Davis as they enter this brave new world of big data analytics. That raises the question: Is getting involved with big data worth all the fuss? Another speaker at the Oxford conference answered that question with a resounding “yes.” Rooney reports:

“Michael Chui has extensively researched the area for McKinsey Global Institute. His conclusion is emphatic: ‘The use of data and analytics in general is going to be a basis of competition going forward for individual firms, for sectors and even for countries. Those companies that are able to use data effectively are more likely to win in the marketplace.’ MGI’s research showed that in just one field—personal location data—some $100 billion of value can be created globally for service providers through use of data. He suggested at a talk last year that the benefits for consumers could be six times that. ‘We find that Big Data tends to accelerate the capture of surplus [value] by consumers.’ In other words, not only do companies do well, but customers do even better. And if companies need even more persuading, what about the claim at the conference that Big Data played a part in re-electing Barack Obama to the White House? John Aristotle Phillips, Chief Executive of Aristotle International — a nonpartisan company that applies technology to politics and political communication — said the use of data analytics had a material effect on outcomes.”

Since it appears that both organizations and consumers benefit from big data collection and analysis, there should be a path to a win-win scenario that addresses most of the concerns raised by Davis. Rooney says that there are corporate cultural obstacles that many organizations must face in order to make big data an important part of their business model. He also reports, “There are a raft of other obstacles, including a regulatory framework that was designed for a different data world and a lack of skills to actually do the work.” Stephen Sorkin, vice president of Engineering for the U.S.-based Splunk, told conference participants that companies had to watch out for the “creepy factor” (i.e., many of the issues raised by Davis and by spying mannequins). “The richest examples of Big Data are to understand consumer behavior and optimize your product for it,” he stated. “That is where the danger can lie.” Rooney concludes:

“He suggests that unless companies are careful, optimizing a product can end up putting consumers off, like ads that follow you from site to site. ‘Companies will customize some aspect based on the consumer and the consumer can think it is a violation of their privacy, or it can just feel creepy. I can make something perfect, but perfect may not be what the consumer is looking for.’”

Despite the issues, challenges, and obstacles that lie ahead, big data collection and analysis is likely to increase in importance. That makes it imperative that organizations entering the big data arena do so with an awareness of the potentially treacherous landscape they must traverse.