It is no secret that I have a love of the EU privacy law known as the “General Data Protection Regulation” (GDPR). Its lines in the sand are very clear: companies know what to do when they have a data breach and know when they must employ a specialist known as a Data Protection Officer (DPO). Boundaries are clearly drawn for outside entities as well; to do business on EU “soil,” companies must be GDPR compliant. Conversely, privacy laws in the U.S. are either less comprehensive or do not exist at all. The GDPR requires transparency about how user data is utilized and, in some cases, requires user consent. The rights of users to view their data and to have it deleted are of particular note, since exercising those rights requires that users be informed whenever their data is collected or stored. Using the cases presented by Zimmer and Buchanan, I will argue that virtue ethics shows the United States should follow Europe’s lead in user data privacy because of the GDPR’s comprehensive and simple guidelines.
In the case “But the Data is Already Public”: On the Ethics of Research on Facebook, Zimmer presents research conducted on a single freshman cohort at Harvard, followed through their four years until graduation. The research coupled information gathered from Facebook with the University’s housing records to map physical as well as social networks within the data. The first item to note in this case is that the researchers obtained permission to collect this data from the University and from Facebook, rather than from the individuals themselves, a complete dereliction of responsibility on each party’s part. If the GDPR applied in this situation, Harvard and Facebook would clearly have been in the wrong from the outset of the research. Through the lens of virtue ethics, both entities had a responsibility to, at the very least, use the data only in the manner originally intended. Their loyalty and integrity with regard to their users and students are also in question; the moral obligation to safeguard information, whether private or freely displayed as most of this information was, is absolute.
Zimmer continues to open up the case by observing that a “good-faith” attempt was made to de-identify the information prior to its inevitable release to the public. But a group of sociologists does not make a great judge of what should be considered private and excluded from the research. The virtue of prudence can be argued on both sides. A prudent sociologist would include as much data as needed, tailored to the specific research aim. Conversely, a prudent steward of privacy would look at the situation from all angles, especially considering this research was slated for public release under a grant from the National Science Foundation. Prudence and humility in this case should have driven the employment of a person whose specific job is to look at privacy; within the bounds of the GDPR, that means a Data Protection Officer. This person would have examined the situation from the outside, like a red-team hacker or curious bystander, and, with the prudence of a privacy professional, identified what information belonged in the dataset for any use and what information could safely be included in the fully public “codebook.” Additionally, the DPO, being the privacy expert, would have recognized that Facebook grants different levels of access to in-network and out-of-network users (friends and non-friends, respectively). This would have prevented the breach that did occur: some students’ private data, intended only for in-network users, was exposed when a few in-network research assistants were used to pull data from Facebook for the researchers. Under the GDPR, this would have meant substantial fines for Harvard, Facebook, and the research team. Much like a kid touching a hot stove, they only have to do it once, and they never do it again.
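To make the re-identification risk concrete, consider a minimal sketch of a linkage attack. The field names and records below are invented for illustration, not the actual study data; the point is that if quasi-identifiers such as a student’s major and hometown survive into a “de-identified” codebook, a curious bystander can join them against public records and recover identities.

```python
# Hypothetical illustration of a linkage attack on "de-identified" data.
# Field names and records are invented; this is not the actual study data.

deidentified = [
    {"id": "subj_01", "major": "Folklore", "hometown": "Butte, MT"},
    {"id": "subj_02", "major": "Economics", "hometown": "Boston, MA"},
]

# Public information an outsider could gather (e.g., a student directory).
public_records = [
    {"name": "Alice Example", "major": "Folklore", "hometown": "Butte, MT"},
    {"name": "Bob Example", "major": "Economics", "hometown": "Boston, MA"},
]

# Join on the quasi-identifiers: a rare major plus a hometown is often
# unique within one cohort, so the "anonymous" subject is re-identified.
for subject in deidentified:
    matches = [
        rec["name"] for rec in public_records
        if rec["major"] == subject["major"]
        and rec["hometown"] == subject["hometown"]
    ]
    if len(matches) == 1:
        print(f'{subject["id"]} re-identified as {matches[0]}')
```

A DPO reviewing the codebook would flag any combination of attributes that matches only one person, which is the intuition behind k-anonymity.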
Soon after the “codebook” was released, the University where the research was conducted had already been identified. With this, the virtues of honesty, humility, and justice come into play. Fair and equitable treatment of the data subjects would have necessitated honesty and humility with them, the University, Facebook, and the National Science Foundation. Once it was clear that the publicly released “de-identified” information had been re-identified, the GDPR would have required notifying the supervisory authority of this breach within 72 hours and the affected subjects without undue delay, but more action is required. Removing public access to the data, then protecting the information before any re-release, would be paramount for the reputation of the school and the research team; that is what eventually happened, but not before plenty of other entities had accessed the data. This could have been another time a DPO earned their paycheck. Recommending actions such as repaying the grant to the National Science Foundation, thereby removing the obligation to make the information publicly available, would have been prudent, though not financially sound. Stopping the research and moving to a different school would have shown just treatment of the data subjects and would have been at the top of my mind were I in the DPO position.
In her case Considering the Ethics of Big Data Research: A Case of Twitter and ISIS/ISIL, Buchanan discusses the ethics of data-mining large swaths of public data to identify and categorize people into groups. The Iterative Vertex Clustering and Classification model, or IVCC, was used to identify supporters and sympathizers of the premier terrorist group ISIS/ISIL. Once again, prudence, honesty, and justice are at the forefront. Though identifying these terrorists is a prudent action for national security, it is also important to recognize that, alongside the terror suspects identified by the IVCC model, there were plenty of non-suspects whose data was mined and analyzed for this purpose. Justice and honesty would dictate informing these innocent people of the use of their data after the fact, and would require Twitter to disclose this possibility in its privacy statement during initial onboarding. The GDPR is still very clear on the employment of a DPO or privacy professional; even agencies of the government are not exempt. It could be argued that warning users of this form of analysis is counter-productive, and that any use of the information beyond the scope of the research is unethical. Yet the virtues of loyalty, responsibility, and honor would ethically allow this information to be used to conduct official investigations, possibly saving many lives.
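Buchanan does not detail the internals of IVCC, but the general pattern its name describes, clustering a social graph and then classifying accounts by the company they keep, can be sketched in miniature. The data and the flagging rule below are toy assumptions of mine; the real model is far more sophisticated.

```python
# Toy sketch of the "cluster, then classify" pattern behind graph-based
# account analysis. This is NOT the actual IVCC model, only the general idea.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical interaction graph: an edge means two accounts interact.
G = nx.Graph()
G.add_edges_from([
    ("acct1", "acct2"), ("acct2", "acct3"), ("acct1", "acct3"),  # community A
    ("acct4", "acct5"), ("acct5", "acct6"), ("acct4", "acct6"),  # community B
    ("acct3", "acct4"),                                          # weak bridge
])

# A few accounts carry known labels (e.g., confirmed propaganda accounts).
known_suspects = {"acct1"}

# Step 1: cluster the graph into communities.
communities = greedy_modularity_communities(G)

# Step 2: flag every member of any community containing a known suspect.
for community in communities:
    if community & known_suspects:
        print("Flagged by association:", sorted(community - known_suspects))
```

The sketch makes the ethical problem visible: every account in a flagged community is swept in by association, which is exactly how non-suspects end up analyzed alongside genuine suspects.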
Buchanan makes a very prudent observation in this case: what stops this data or model from being used by another group, or the same one, to target a different set of people for counter-virtuous reasons? Re-use by the same group makes a strong argument for a GDPR in the U.S., because the group would be required to inform users that their data was being utilized for analysis not previously disclosed to them. Not only would this deter directly nefarious uses, but it would allow users to throw a red flag if the group in question tried to sell or use their data for indirectly or unintentionally nefarious ends.
Implementation of the GDPR would also hold researchers and data-miners to the same cybersecurity standards as regular businesses, where under the current regime they may skirt by in a gray area. This would give the data a standard framework of protection beyond what the GDPR itself covers, and it implies the need for more than a DPO: perhaps not a full Chief Information Security Officer, but absolutely a team dedicated to protecting the confidentiality, integrity, and availability of the data, making a data breach far less likely. An additional benefit of this team would be having someone whose job it is to respond to a breach. Without such a team, a group of data-miners or researchers would leave themselves open to serious fines both for a breach itself and for failing to adequately protect against one. Taking cybersecurity into consideration shows honor, wisdom, justice, and responsibility when utilizing the IVCC model, something many may not consider when dealing with user data, but which would be enforced by proxy were the GDPR implemented in the U.S.
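As one concrete example of the kind of control such a team would enforce, consider a minimal sketch of keyed-hash pseudonymization, a safeguard the GDPR itself names as appropriate. The key and record format below are invented for illustration: identifiers are replaced with stable tokens before analysts ever see the data.

```python
# Minimal sketch of keyed-hash pseudonymization, one confidentiality control
# a dedicated data-protection team might enforce. Key and record are invented.
import hashlib
import hmac

SECRET_KEY = b"rotate-me-and-store-in-a-vault"  # held apart from the dataset

def pseudonymize(user_id: str) -> str:
    """Replace a real identifier with a stable, non-reversible token."""
    return hmac.new(SECRET_KEY, user_id.encode(), hashlib.sha256).hexdigest()[:16]

record = {"user_id": "@real_handle", "tweet": "some public post"}
safe_record = {**record, "user_id": pseudonymize(record["user_id"])}
print(safe_record)
```

Because the key is stored separately from the data, a leak of the analysis dataset alone does not reveal identities.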
The EU’s privacy law, the General Data Protection Regulation, is a simple and comprehensive set of rules surrounding personal data. The United States and other countries should consider adopting something like it to protect their citizens and businesses alike. Its very simplicity drives virtuous actions and resists non-virtuous ones. It could be argued that the U.S. need not drive this federally and should leave it to the states to manage individually, but that is effectively the status quo: a patchwork of personal data privacy laws in only a handful of states. The inconsistencies create ambiguity over whether a crime was actually committed, allowing reprieve from punishment both to unintentional lawbreakers and to massive organizations profiling citizens for manipulation. Additionally, without federal involvement, the federal government itself would not be bound, leaving space for its agencies to collect and use citizen data as they see fit, with no standard of notification or consent.