Article Review #2: Testing Human Ability to Detect Deepfake Images of Human Faces

Robyn Brown

School of Cybersecurity, Old Dominion University
CYSE 201S: Cybersecurity and the Social Sciences
Diwakar Yalpi
November 18, 2025

Introduction/BLUF

In this study, Bray, Johnson, and Kleinberg (2023) examine how well everyday people can tell the difference between real human face images and AI-generated “deepfake” versions. Participants performed slightly better than chance, but not well enough, and even simple tips did not help much. That leaves us in a vulnerable spot when it comes to image-based fraud and deception.

Relation to Social Science Principle

This topic intersects deeply with several social science principles: human behavior, perception, identity, inequality, and social change. On the behavioral side, the study explores how people interpret visual cues and how confident they are in their judgments, linking to psychological principles of heuristics, overconfidence, and decision making under uncertainty. On identity and social structures, deepfakes challenge what we trust and how institutions such as social media, dating apps, and banks rely on imagery for identity verification. This ties into social institutions and inequality because some groups, such as people with low digital literacy, visually impaired persons, and newcomers to technology, may be more at risk of being deceived or less able to verify what they see. The study therefore sits well within the social sciences' tradition of examining how technological change affects human societies, behaviors, and power dynamics.

Research Questions

1. Are the participants able to differentiate between deepfake and real human face images above chance levels?

2. Do simple interventions improve participants’ deepfake detection accuracy?

3. Does a participant’s level of confidence align with their accuracy at detecting deepfakes?

Hypothesis

Though the authors do not state a single formal hypothesis, the implied expectation is that participants would perform above chance, that interventions would improve accuracy, and that confidence would align with accuracy. The results, however, showed that the interventions did not significantly help and that confidence did not correlate with accuracy.

Independent Variables

The independent variables are the image type (real vs. AI-generated deepfake) and the intervention condition to which a participant was assigned (control or one of three interventions).

Dependent Variables

The dependent variables are participants' detection accuracy, their confidence in their own answers, and the reasoning/explanation behind their label decisions.

Types of Research Methods Used

The study employed a quantitative experimental method. Participants (N = 280) were randomly assigned to one of four groups (a control and three intervention groups) and shown a sequence of 20 images, drawn from a pool of 50 deepfake and real images, to label as AI-generated or real. The conditions included no intervention (the control), a familiarization exercise, and advice about how to detect tell-tale features. Data were collected via online surveys and web applications.
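To make the design concrete, here is a minimal Python sketch of how the random assignment and image sampling could be simulated. It is an illustration under stated assumptions, not the authors' materials; the variable names are my own, and the fourth condition label is a placeholder since the review names only the control, familiarization, and advice conditions.

import random

N_PARTICIPANTS = 280       # sample size reported in the study
POOL_SIZE = 50             # pool of deepfake and real images
IMAGES_PER_PARTICIPANT = 20

# Placeholder labels: the review names a control, a familiarization
# exercise, and tell-tale-feature advice; "intervention_3" is hypothetical.
conditions = ["control", "familiarization", "advice", "intervention_3"]

image_pool = list(range(POOL_SIZE))
participants = []
for pid in range(N_PARTICIPANTS):
    participants.append({
        "id": pid,
        "condition": random.choice(conditions),   # random assignment
        "images": random.sample(image_pool, IMAGES_PER_PARTICIPANT),  # 20 of 50
    })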

Types of Data and Analysis Done

The data collected are categorical detection judgments (correct/incorrect), continuous participant confidence ratings, and qualitative open-ended reasoning behind labeling decisions. The analyses included calculating accuracy rates for each condition, comparing accuracy across intervention groups, and correlating confidence with accuracy. A per-image analysis showed variation in detectability across images, with accuracy ranging from 85% down to 30% depending on the image.
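For illustration, a short Python sketch of how such analyses might be run is shown below. The data frame, column names, and values are assumed for the example; they are not taken from the study's dataset or code.

import pandas as pd

# Hypothetical per-trial data: one row per participant-image judgment.
df = pd.DataFrame({
    "participant": [1, 1, 2, 2],
    "condition":   ["control", "control", "advice", "advice"],
    "image":       [3, 17, 3, 42],
    "correct":     [1, 0, 1, 1],       # categorical: labeled correctly?
    "confidence":  [80, 65, 50, 90],   # continuous self-rating
})

# Accuracy rate for each intervention condition
print(df.groupby("condition")["correct"].mean())

# Per-image detectability (the review reports roughly 30-85% across images)
print(df.groupby("image")["correct"].mean())

# Correlation between each participant's mean confidence and mean accuracy
per_p = df.groupby("participant").agg(acc=("correct", "mean"),
                                      conf=("confidence", "mean"))
print(per_p["acc"].corr(per_p["conf"]))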

Connections to Course Concepts

From our CYSE 201S class, this article touches on multiple key concepts. On risk perception, participants felt confident, but their accuracy did not match, showing how humans misjudge risk. On human behavior and decision making, the study exposes heuristics (for example, judging faces as fake because of lighting) and overconfidence biases. On technology and society, it shows how a technical threat interacts with social systems such as identity, verification, trust, and fraud. On complex systems, the fake/real image ecosystem, social media, AI generation, and human detection form an interlinked socio-technical system. On deterrence and prevention, while technical tools exist, the study suggests human training may be an underutilized defense, something we discussed in class about layered defenses.

Connections to the Concerns or Contributions of Marginalized Groups

While the primary dataset and sample were not explicitly focused on marginalized groups, the implications are quite relevant. People with visual impairments, like me, may rely on other cues or assistive technologies that might not flag deepfakes. If detection accuracy is already only 62% for sighted users, the gap could be worse for visually impaired users who may not perceive subtle visual cues. Adults with low digital literacy, such as older adults, non-native English speakers, or newcomers to technology, may not understand the threat of deepfakes or may be more trusting of images online, making them more vulnerable to fraud. Minority or underrepresented groups can become targets of deepfake-enabled impersonation or scams. The scale and low detectability shown in the study suggest these groups are disproportionately at risk. One contribution is the suggestion that human-based training alone may not suffice; policy and design must consider accessibility and equity to ensure marginalized communities are included in educational tools and mitigation efforts.

Overall Societal Contributions/Conclusion

In sum, this study shines a light on how weak our natural defenses are when it comes to AI-generated imagery. For society, the implications are broad, from social media fraud and identity theft to political manipulation and the erosion of trust in visual evidence. For cybersecurity and the social sciences, the study contributes empirical evidence of human limitations in the deepfake detection domain. This calls for improved training and suggests that relying only on user detection is risky.

For me personally, as an artist and content creator, the findings are also relevant: deepfakes could influence how visuals are consumed, how authenticity is judged, and how creators can be impersonated or misrepresented. Understanding this helps me stay ahead ethically and creatively.

In conclusion, while humans can detect deepfakes slightly better than random guessing, the margin is small, and the mismatch between confidence and accuracy makes the threat urgent. The study suggests meaningful improvements are needed in technology, education, and policy to protect individuals and society.

Reference

Bray, S. D., Johnson, S. D., & Kleinberg, B. (2023). Testing human ability to detect ‘deepfake’ images of human faces. Journal of Cybersecurity, 9(1). https://doi.org/10.1093/cybsec/tyad011

