Addressing the Ethics of Big Data: Is Your Board Ready?

(Image courtesy of IBM Big Data Hub)

“Big Data” is a promising new trend that can help organisations streamline operations, predict trends, and personalise services. But with more data being collected than ever, thorny privacy and accessibility issues are coming to the forefront. Society is beginning to demand accountability from organisations, putting boards in the powerful position of being able to set boundaries and dictate future policy around this tool.

Does Big Data live up to its promise? And are the benefits worth the ethical issues raised by the mass collection and exploitation of personal data? The answers to these questions, and their impact on society, are far from clear, and it is vital that boards consider all the implications.

Defining “Big”

As computing power, networking, and storage have become more widely available, the amount of data traversing the globe has grown dramatically, keeping the local and global effects of Big Data at the forefront of business discussions for the last several years. Despite being a somewhat vague term, Big Data is important because it allows us to make predictions, solve problems, and increase awareness of everything from outbreaks of illnesses to changes in the cultural zeitgeist.

What is Big Data? The term is predominantly used to refer to personal data gathered from network-connected devices, including location history, calls, social connections, searches, purchases, and metadata (information about data). Most definitions focus on the sheer amount of data now available, with conversations revolving around storage size, analytics, and predictions, and around terms like “high volume,” “high velocity,” “high variety,” and “high speed” or real-time, rather than on the decisions made with that data [1].

Many analysts believe that Big Data heralds a new kind of Industrial Revolution, one where machine learning will allow computers to make decisions previously limited to humans. And it’s more than a pipe dream—global providers like Google, Facebook, Microsoft and Amazon are all investing in their own fibre-optic networks to manage the expected increases in data across their systems.

What Boards Need to Know

Big Data is often treated in popular literature as a near-mythical power, when it is more accurately an emerging theory of knowledge developed by academics in fields as diverse as communications, anthropology, geography, information science, sociology and media.

According to researchers Crawford, Miltner and Gray in their 2014 article Critiquing Big Data: Politics, Ethics, Epistemology, Big Data aims to translate the world into a problem of coding. “Differences in the world are ‘submitted to disassembly, reassembly, investment and exchange’ in what is described as ‘the informatics of domination’.” [2]

This idea is not new, of course, but Big Data did not become truly popular until enough data was available for computers to find patterns that no other method could detect, in areas as different as disaster relief and e-commerce.

Big Data is currently used for everything from census tracking to the quantification of climate shifts. Financial services companies use Big Data to perform rapid analysis and projection of financial data, and intelligence agencies use data collected from communications systems to monitor populations. As the world reaches a “tipping point” of data collection and use, the application of Big Data could become far more wide-reaching. Big Data is fundamentally tied to the technology that enables it and is constrained by technical limits such as computing power and bandwidth.

Balancing Act

It can be tough to balance the potential profitability of innovations like Big Data with seemingly less-pressing ethical concerns. But while innovation, convenience, and profit are important, it is vital to establish norms and values that govern Big Data, especially as it transitions from its military and corporate origins into the hands of almost anyone in the developed world [1].

Boards must consider how Big Data is affecting their industries as well as the environment and future commercial movements. When it comes to doing research, Crawford et al. [2] point out that any research that draws on ‘passively collected’ Big Data will require thorough evaluation and the establishment of ethical frameworks.

Ethical Boundaries for Boards to Consider

There are five key ethical areas that boards need to discuss in order to implement Big Data safely and effectively [1]:

Privacy: What rules should regulate the flow of information? How can we define privacy and what do we expect from it? What do we keep secret, what do we collect, and what do we share? What laws govern the flow of information?
Confidentiality: What shared information should be kept secret? A good case study is the reaction of citizens to the NSA’s seemingly indiscriminate data gathering in 2013. Service providers especially may lose customers if confidentiality terms are broken, so this point is vital to consider. Confidentiality can be thought of as a particular form of privacy based on trust and specific promises within the context of a relationship.
Transparency: Boards need to consider the mixed messages about the need to share consumer information while protecting corporate and government information. Like confidentiality, proper transparency about what data is being collected and how it will be used increases trust, even though it sits within the tension between openness and secrecy.
Identity: Like privacy, identity is a vital asset to be protected. One element of identity is the ability to associate with specific people or organisations. By predicting and inferring connections, Big Data may be able to compromise identity by determining ‘ideal’ interactions for individuals. Should the type of information that is shared be considered in the context of identity shifts? Can knowledge of Big Data inform individuals and change how they make decisions about identity?
Free choice: Given how influential and persuasive metadata can be, which Big Data should be made available and which should be withheld in important situations? For example, voting decisions are now often influenced by Big Data, because individuals are swayed by reports from other areas. Is this healthy for society?

The Ethics of Prediction

Crawford et al. [2] offer several further points for a valuable board discussion, framed around the reduction of individual choices to data points that can be run through predictive algorithms. These predictions can have real-world consequences that go beyond statistics.

Some proponents of Big Data suggest that given enough data, we could create a predictive universe where metadata can help us alter human behaviour, avoiding things like market crashes, ethnic and religious violence, widespread political corruption, and dangerous concentrations of power. The problem is that too much data in the wrong hands can lead to new concentrations of power, subject to human design and bias. “Big Data” is not neutral. Algorithms are designed by humans with biases and goals.

There are other problems, too. Large volumes of aggregated information can be prone to error and fail to account for minority and outlier experiences, potentially resulting in discrimination. Big Data also lends itself to reductionism, but society is greater than the sum of its parts. No amount of data can fully illustrate the complexity of social dynamics and completely avoid unintended consequences.

Even Dick Costolo, former CEO of Twitter, recognises the importance of access. “It is critically important that we work to make this access reach even further – to more people in more parts of the world. Our commitment is that we as a company will continue to navigate an increasingly complicated political landscape. We will deal with issues at the intersection of ethics, content and technology that have not been confronted before. We will make difficult decisions every day to ensure that as many people as possible have access, and that the smallest voices in the world can be heard.” [3]

Other questions include whether people should have more access to their own data as well as whether champions of Big Data are allowing space for contributions from other fields of knowledge. “To claim the dynamics of human interaction and complexity of the social world can be reduced to a self-explanatory set of nodes and edges defies important insights from fields as diverse as machine learning, sociology, and economics.” [2]

Can We Trust It?

Even supporters of Big Data have raised questions about accuracy and the need for robust models of information. Social media platforms, especially smaller ones, tend to attract particular demographics, which can make them poor sources for research. Information science can apply an important corrective to insights gleaned from Big Data projects, taking sample size, diversity, and relevance into account.

Unfortunately, metaphors persist of Big Data as “a resource to be consumed and a natural force to be controlled”[2]—both of which position Big Data as a reliable, value-neutral source of information regardless of source, obscuring the many ways that data is socially constructed. It may be “big,” and therein lies its power, but it does not necessarily follow that it is a source of objective truth. We need to understand how it was gathered and manipulated before basing business decisions on it.

Organisational Values

Once a board has decided its position on the ethics of Big Data, it can make decisions that support the technologies and political policies that further its values. The organisation can then begin to distance itself from those who do not share them.

Even though Big Data is changing the tech industry and has clear, recognised benefits, it is unwise to promote widespread implementation without questioning the ethics behind it. For Big Data to reach its full potential in a way that supports society rather than harms it, more discussion and research will be needed, building on the work already under way.

In many ways, boards are the first line of defence. By integrating ethical matters into their organisational decision making, boards can raise awareness of the issues surrounding Big Data. Their choices can set the direction of organisations and even industries. They can also contribute to the changing environment and policies that govern Big Data, guarding against the unintended consequences of unlimited data acquisition.


[1] Richards & King, 2014, Big Data Ethics. Wake Forest Law Review (Washington University School of Law)

[2] Crawford, Miltner & Gray, 2014, Critiquing Big Data: Politics, Ethics, Epistemology. International Journal of Communication, Vol 8 (1663-1672)

[3] Costolo, 1 July 2015, Dick Costolo: why tech firms are set to face complex ethical issues. The Guardian

