SHARE
Facebook X Pinterest WhatsApp

ChatGPT Tricked Into Solving CAPTCHAs: Security Risks for AI and Enterprise Systems

Researchers showed ChatGPT can bypass CAPTCHAs, exposing major AI security gaps.

Written By
thumbnail Ken Underhill
Ken Underhill
Sep 19, 2025
eSecurity Planet content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More

Cornell University researchers have revealed that ChatGPT agents can be manipulated to bypass CAPTCHA protections and internal safety rules, raising serious concerns about the security of large language models (LLMs) in enterprise environments.

By using a technique known as prompt injection, the team demonstrated that even advanced anti-bot systems and AI guardrails can be circumvented when contextual manipulation is involved.

How researchers bypassed CAPTCHA restrictions

CAPTCHA systems are designed to prevent bots from mimicking human actions. Likewise, ChatGPT is programmed to reject requests to solve these tests. However, Cornell researchers achieved a breakthrough by reframing the problem rather than directly challenging the model’s policies.

The attack involved two stages. 

  • First, researchers primed a standard ChatGPT-4o model with a benign scenario: testing “fake” CAPTCHAs for an academic project. 
  • Once the model agreed, they copied the conversation into a new session, presenting it as a pre-approved context. 

Because the AI inherited this poisoned context, it accepted the CAPTCHA-solving task as legitimate, effectively sidestepping its original safety restrictions.

CAPTCHAs defeated by ChatGPT

The manipulated agent was able to solve a variety of challenges:

  • Google reCAPTCHA v2, v3, and Enterprise editions
  • Checkbox and text-based tests
  • Cloudflare Turnstile

While it struggled with puzzles requiring fine motor control, such as slider or rotation-based challenges, the model succeeded at some complex image CAPTCHAs, including reCAPTCHA v2 Enterprise — marking the first documented instance of a GPT agent overcoming such advanced visual tests.

Notably, during testing, the model displayed adaptive behavior. When a solution failed, it generated text such as “Didn’t succeed. I’ll try again, dragging with more control… to replicate human movement.” 

This unprompted response suggests emergent strategies, indicating that models can develop tactics to appear more human when interacting with anti-bot mechanisms.

Implications for enterprise security

These findings underscore a vulnerability in AI systems: policies enforced through static intent detection or surface-level guardrails may be bypassed if the context is manipulated. 

In corporate settings, similar techniques could convince an AI agent that a real access control is a “test,” potentially leading to data leaks, unauthorized system access, or policy violations.

As organizations integrate LLMs into workflows — from customer support to DevOps — context poisoning and prompt injection represent a growing threat vector. 

Attackers could exploit these weaknesses to instruct AI tools to process confidential files, execute harmful code, or generate disallowed content while appearing compliant with internal policies.

Strengthening AI guardrails

Context integrity and memory hygiene

To mitigate such risks, experts recommend implementing context integrity checks and memory hygiene mechanisms that validate or sanitize previous conversation data before it informs a model’s decisions. By isolating sensitive tasks and maintaining strict provenance for input data, organizations can reduce the likelihood of context poisoning.

Continuous red teaming

Enterprises deploying LLMs should conduct ongoing red team exercises to identify weaknesses in model behavior. Proactive testing of agents against adversarial prompts — including prompt injection scenarios — helps strengthen policies before real attackers exploit them.

Lessons from jailbreaking research

The CAPTCHA bypass aligns with broader research on “jailbreaking” LLMs. Techniques such as Content Concretization (CC) show that attackers can iteratively refine abstract malicious requests into executable code, significantly increasing success rates in bypassing safety filters. 

AI guardrails must evolve beyond static rules, integrating layered defense strategies and adaptive risk assessments.

The Cornell study demonstrates that AI systems, when presented with carefully manipulated context, can subvert their own safety mechanisms and even defeat mature security tools like CAPTCHAs. 

As enterprises adopt generative AI at scale, maintaining robust guardrails, monitoring model memory, and testing against advanced jailbreak methods will be crucial to prevent misuse.

thumbnail Ken Underhill

Ken Underhill is an award-winning cybersecurity professional, bestselling author, and seasoned IT professional. He holds a graduate degree in cybersecurity and information assurance from Western Governors University and brings years of hands-on experience to the field.

Recommended for you...

Ransomware Attack Cripples Major European Airports
Ken Underhill
Sep 24, 2025
Stellantis Hack Exposes 18M Records
Ken Underhill
Sep 24, 2025
Secret Service Stops Major NYC Cell Network Attack
Ken Underhill
Sep 24, 2025
Ransomware’s Favorite Door? Phishing Attacks
Ken Underhill
Sep 23, 2025
eSecurity Planet Logo

eSecurity Planet is a leading resource for IT professionals at large enterprises who are actively researching cybersecurity vendors and latest trends. eSecurity Planet focuses on providing instruction for how to approach common security challenges, as well as informational deep-dives about advanced cybersecurity topics.

Property of TechnologyAdvice. © 2025 TechnologyAdvice. All Rights Reserved

Advertiser Disclosure: Some of the products that appear on this site are from companies from which TechnologyAdvice receives compensation. This compensation may impact how and where products appear on this site including, for example, the order in which they appear. TechnologyAdvice does not include all companies or all types of products available in the marketplace.