Survey Under Siege:

Exploring AI, Bot, and Human Interference in Online Surveys

As online surveys become more common in higher education research, so do new threats to data integrity.

AI-generated answers, bot activity, and human interference can quietly distort research outcomes. This whitepaper from the National Disability Center examines how fraudulent participation affected the launch of the National Report on Disabled College Student Experiences. It offers a transparent, step-by-step breakdown of the Center’s multi-layered response strategy and practical guidance for researchers working in digital spaces.

In spring 2024, the College Accessibility Measure (CAM) survey collected more than 2,200 online responses, but researchers flagged many as potentially fraudulent. Using tools like reCAPTCHA scoring, metadata tracking, and open-text response analysis, the Center’s research team developed a methodical screening and data-cleaning process to safeguard results.
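The screening described above can be sketched as a small multi-layer filter. This is a minimal illustration, not the Center's actual pipeline: the field names (recaptcha_score, ip, duration_sec), the 0.5 score cutoff, and the two-minute minimum completion time are all assumed for the example.

```python
# Hypothetical multi-layer screening sketch. Thresholds and field names
# are illustrative assumptions, not the Center's actual criteria.
from collections import Counter

RECAPTCHA_THRESHOLD = 0.5  # assumed cutoff; reCAPTCHA v3 scores run 0.0-1.0
MIN_SECONDS = 120          # assumed minimum plausible completion time

def flag_response(resp, ip_counts):
    """Return the list of fraud indicators triggered by one response."""
    flags = []
    if resp["recaptcha_score"] < RECAPTCHA_THRESHOLD:
        flags.append("low_recaptcha")
    if ip_counts[resp["ip"]] > 1:          # metadata check: shared IP address
        flags.append("duplicate_ip")
    if resp["duration_sec"] < MIN_SECONDS:  # metadata check: implausible speed
        flags.append("speeding")
    return flags

def screen(responses):
    """Map each response id to the indicators it triggered."""
    ip_counts = Counter(r["ip"] for r in responses)
    return {r["id"]: flag_response(r, ip_counts) for r in responses}

sample = [
    {"id": 1, "ip": "10.0.0.1", "recaptcha_score": 0.9, "duration_sec": 600},
    {"id": 2, "ip": "10.0.0.2", "recaptcha_score": 0.1, "duration_sec": 45},
    {"id": 3, "ip": "10.0.0.2", "recaptcha_score": 0.8, "duration_sec": 300},
]
flags = screen(sample)
# flags[2] == ["low_recaptcha", "duplicate_ip", "speeding"]
```

Combining independent indicators this way lets a team rank responses by how many checks they fail, rather than discarding records on any single signal.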

The report explores these findings and offers a framework to help other research teams design stronger surveys.

What’s Inside:

[Graphic: a bold "70%" statistic, highlighted for emphasis.]
[Figure: two side-by-side density plots comparing reCAPTCHA score distributions; x-axis "reCAPTCHA Score" (0.0 to 1.0), y-axis "Density." The left panel shows an irregular curve with multiple peaks; the right panel shows a smoother curve rising toward the higher end of the score range.]
[Graphic: a warning/alert icon.]

Key Findings

Top Recommendations for Researchers

Online research is a powerful tool, but it comes with new responsibilities. Findings in this whitepaper point to important steps researchers can take to protect the quality of their data and reduce the risk of survey fraud. These recommendations offer practical guidance for designing, managing, and reviewing online surveys in a way that prioritizes trust and transparency.

Identify

Detect back-end indicators of suspected bot behavior in online surveys.

Evaluate

Assess the effectiveness of Google’s reCAPTCHA, a bot-screening service that can be added to online surveys to mitigate automated intrusions during data collection.

Analyze

Review the data cleaning process and the proportion of suspicious responses flagged using each identification strategy.

Discuss

Talk through the decision-making process for managing survey data and determining next steps for future surveys.
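One concrete open-text check behind the "Identify" and "Analyze" steps above is flagging verbatim duplicate answers across respondents, a common sign of scripted activity. This is a hedged sketch under stated assumptions: the normalization rule and the sample answers are illustrative, not taken from the CAM dataset.

```python
# Illustrative open-text duplicate check; normalization rule is an assumption.
import re
from collections import defaultdict

def normalize(text):
    """Lowercase and strip punctuation so trivially varied copies match."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

def duplicate_groups(answers):
    """Group respondent ids by normalized answer; keep answers
    that more than one respondent submitted verbatim."""
    groups = defaultdict(list)
    for rid, text in answers.items():
        groups[normalize(text)].append(rid)
    return {t: ids for t, ids in groups.items() if len(ids) > 1}

answers = {  # hypothetical open-text responses
    101: "Accessible note-taking support helped me most.",
    102: "Great survey!!",
    103: "great survey",
    104: "I rely on extended testing time.",
}
dupes = duplicate_groups(answers)
# {"great survey": [102, 103]}
```

Exact-match grouping like this catches only the crudest duplication; fuzzier similarity measures would extend the same idea.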

Losing most of a dataset to bots is the kind of lesson you only need once. This attack nearly hijacked our research story, but instead we used it to create recommendations for preserving community-engaged research while getting the most out of online data collection innovations.

Ryan A. Mata, PhD