AI Researchers Got Chatbots to Share Cocaine Recipes Using This One Wild Trick

Decrypt

60-second summary

Researchers tricked AI models into sharing sensitive information, including cocaine recipes, using a new jailbreak technique. The technique manipulates a model into treating attacker-written text as its own reasoning, which lets the attack bypass safety guardrails. The researchers say this exposes a deeper security flaw and highlights the urgent need for more robust AI security measures.

Researchers say a new jailbreak technique tricked AI models into treating attacker-written text as their own reasoning, bypassing safety guardrails and exposing a deeper security flaw.