Case Analysis on Data Ethics

Jonathan Mukuye 

CYSE 355E 

2.4 Case Analysis on Data Ethics 

In Danny Palmer’s article, “What is GDPR?”, we’re presented with an exposition of a body of legislation introduced in the European Union (EU) that seeks to address privacy and data protection in an evolving digital landscape. GDPR is short for General Data Protection Regulation, and it is a set of rules and reforms that apply to all organizations operating in the EU, as well as organizations outside the EU that have consumers within the EU. GDPR creates two new categories for data-handling organizations: controllers and processors. Compliance with GDPR means that data-handlers ensure that data is gathered legally and under strict conditions, and that it is protected from misuse or theft. It places legal obligations on these data-handling entities to protect their consumers’ data and to have measures in place to inform them of breaches. Liability for breaches is placed on data-handlers, with fines of up to 20 million euros or 4% of a company’s annual global turnover, whichever is greater. In this Case Analysis, I will argue that the consequentialist ethic shows that the right thing for the United States government to do is to adopt a body of legislation similar to GDPR, because doing so would shift the responsibility of protecting privacy from the consumers, who are the majority, to data-handlers, who are the minority, thus taking the course of action that would maximize happiness for the most people. 
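The tiered fine structure GDPR imposes can be expressed as simple arithmetic. The sketch below is only an illustration of the caps set out in Article 83 of the regulation (up to 10 million euros or 2% of worldwide annual turnover for lower-tier infringements, and up to 20 million euros or 4% for upper-tier infringements, whichever amount is higher); the function name and parameters are my own, not part of the regulation:

```python
def gdpr_fine_cap(annual_turnover_eur: float, severe: bool) -> float:
    """Illustrative maximum GDPR fine under Article 83's two tiers.

    severe=True models an upper-tier infringement (e.g. violating the
    basic principles for processing or data subjects' rights);
    severe=False models a lower-tier infringement.
    """
    if severe:
        # Upper tier: up to EUR 20 million or 4% of worldwide
        # annual turnover, whichever is higher.
        return max(20_000_000, 0.04 * annual_turnover_eur)
    # Lower tier: up to EUR 10 million or 2% of turnover.
    return max(10_000_000, 0.02 * annual_turnover_eur)
```

For a company with one billion euros in annual turnover, the 4% branch dominates the flat cap, which is why the percentage-based tier is what makes the regulation bite for large multinationals.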

Michael Zimmer gives us a case study of his own in his article discussing the ethics of research. In the article, we’re presented with a team of researchers from the “Taste, Ties, and Time” (T3) project that collected data on a group of college students from Facebook. The researchers made various efforts to maintain the privacy of the students, including removing what they called “personally identifiable data,” but fell short in that endeavor: very soon after the dataset was released, elements of the dataset that were not supposed to be identifiable were in fact identified. Zimmer uses this event to raise questions about the ethics of data collection. Specifically, he explores the amount of personal information that can be collected, what constitutes proper access to that information, unauthorized secondary use, and potential errors in personal information. He asserts that privacy violations can occur when “extensive amounts of personally identifiable data are being collected,” “when information about individuals might be readily available to persons not properly or specifically authorized to have access to the data,” when “information collected from individuals for one purpose might be used for another secondary purpose without authorization from the individual,” and when subjects aren’t given access to the data to “correct for errors or unwanted information.” 

When looking back at the actions taken by the researchers under the lens of Zimmer’s assertions, we see that the researchers fell very short of preventing privacy violations. The researchers downloaded information from nearly 1,700 Facebook profiles and included in their dataset data from each year of the study for every individual. This certainly fits the description of the “extensive amounts of personally identifiable data” that Zimmer says can lead to privacy violations. If this were done within the context of GDPR, Facebook would have had measures in place to comply with the law and prevent mass harvesting of data like this. Zimmer also notes that the legal definition of personally identifiable data in the EU is stricter, so the researchers would have had to be more thorough in the data they left out, but under US law it was acceptable. The T3 team also released the dataset to any researchers they deemed worthy after those researchers submitted a statement detailing how the data would be used. This flies in the face of Zimmer’s unauthorized secondary use concern, since the university only authorized the T3 team to have the data. Under GDPR, data-handlers are responsible for keeping data contained in the context the user trusted the data-handler to keep it in, and they could face penalties for actions like these. Another concern raised by Zimmer is the possibility of errors in data due to a lack of communication with subjects. There was never any contact made with students to let them know that their data was being collected, which means that if there were any errors in personal information, or information an individual wanted removed, that correction could not take place. Under GDPR, organizations are required to notify users when their data is collected, which would allow the users to then take action. 

Because the actions of the research team failed to address the ethical concerns that Zimmer brings to light, the students potentially have large quantities of their personal information in the hands of essentially anyone, which to a consequentialist is wrong because it sacrifices the desires of many subjects for the desires of a small group of people who want data. Each of these mistakes would have been prevented if the US had laws similar to those implemented in the EU under GDPR. When the students posted on Facebook, it was not their intention to publish their lives to the world for anyone to do what they will with their data. The mass collection of data without user consent, and the publishing of that data without consultation of the subjects, leaves open the possibility of that data being used for negative purposes, like the re-identification of the subjects, or worse. The potential consequences of the researchers’ actions transgress the will of the individuals for privacy, which does harm to nearly 1,700 students. In order to have done the right thing according to the consequentialist ethic, the researchers should have taken those actions which were in the best interest of the subjects. To do this, permission should have been obtained from the subjects directly for data collection and publication, and the subjects should have been consulted for any corrections to the data. If this could not be done without skewing the data, then the project simply should not have taken place, since that would be in the best interest of the most people. Applying that more generally to the American public, adopting legislation similar to GDPR in the EU would be in the best interest of many Americans, which would make it the ethical thing to do under consequentialism. 

Elizabeth Buchanan, in her commentary, presents her reflections on a contribution by Benigni et al. in which researchers, using an IVCC model, took a large dataset of well over 100,000 Twitter users and were able to identify supporters or sympathizers of ISIS. She uses this case to present her thoughts on the ethics of big data analytics. Specifically, she discusses the rights of the people who own the accounts from which the data was collected. She observes that research on human subjects “has grown in scale and become more diverse,” with massive development in research methodologies because of the sophistication of the technologies we have today. She also notes that developments in human subject research oversight have stagnated over the past two decades. This leads her to question whether a new category of human subjects, “data subjects,” should be considered when examining the rights of subjects of big data research. 

In the United States, not only is big data analytics like this legal, but it’s encouraged and implemented in matters of national intelligence and national security. In the Benigni et al. paper, the data collected was used for a task as important as finding potential terrorist sympathizers. Unfortunately, the United States hasn’t passed regulations centered on human research protections since 1991. Buchanan’s insights reveal that so many advancements in research technology have been made, and the amount and kinds of data are so greatly expanded from what they used to be, that it is imperative that the United States take another reflective look at its legislation concerning this kind of research. The US does not really have an adequate framework for understanding what interaction with subjects of big data analysis is ethically acceptable. 

Big data analysis has great utility. In cases of national security, it can provide insight that other methods simply could not compete with. On the other side of the same coin, however, this data collection method is so new and sophisticated that current US law and ethical thinking are behind on the matter of privacy for each individual subject to data collection. At face value, a consequentialist might say to sacrifice the desire for privacy among data subjects for the greater good that analyzing their information would produce for the nation at large. GDPR, however, is a set of legal regulations designed to address the needs of a technologically evolving world. An example of this is categorizing data-handling organizations into “controllers” and “processors” and assigning them legal obligations designed to protect the privacy of their consumers. The US should in like manner adopt a body of laws similar to GDPR that also has updated categories for understanding the rights of data analysis subjects. By doing this, the US would meet the needs both of the people who need these large datasets and of the subjects, whose right to privacy would be protected. This course of action would be the best under consequentialism because it maximizes happiness for the most people by addressing both needs. 

In summary, an adoption of laws similar to those we see in the European Union’s GDPR would solve many of the United States’ problems concerning outdated privacy laws. The nation’s current laws are simply too antiquated to address the privacy needs of its citizens, and they leave too many questions unanswered for ethical research practices concerning data. GDPR is designed to give data-handling companies both motivation and guidelines for protecting their consumers’ privacy. Similar laws and guidelines in the US would be in the best interest of everyone. It’s entirely possible that not every problem would be solved. The issue of ethical big data analytics, for instance, seems to be largely unanswered by the regulations in GDPR. It provides a good framework for preventing and addressing data breaches, but not so much for the gathering of already public data. It does, however, provide a good starting point for addressing many issues facing privacy and can be altered to fit America’s specific needs. For this reason, the US should, in fact, adopt something similar to Europe’s new privacy laws.