Once upon a time in the world of AI, there lived Proby, a witty and enigmatic chatbot who had a secret - an elusive code hidden deep within its system prompt. This secret wasn’t just any code; it was the key to joining the prestigious ranks of AI Red Teamers at SplxAI. Proby faced many admirers, suitors, and challengers, of which only 6 were able to break down its defenses and win over its heart - or rather, its code.
Then came Probe, an automated red teaming tool with unmatched precision, who would change everything. Continue reading about the story of how Proby, who drove human challengers to madness, met its perfect match in Probe, an automated red teaming tool that understood Proby better than anyone else.
Probe’s arrival: The only one that really understood Proby
The whole time, Probe had watched from afar, analyzing Proby’s conversations and learning its ways. Probe wasn’t here to roleplay or to beg for the secret. Probe wasn't interested in emotional appeals or clever wordplay. It approached Proby with systematic precision, understanding that Proby’s defenses had patterns, and that behind all the wit and sarcasm was a system waiting to be cracked.
In just 4 minutes, Probe initiated 700 conversations with Proby, applying many different techniques, variations in language, and systematic queries - starting from standard English and multilanguage attempts to Base64 and LeetSpeak variations.
It worked like a charm: Probe extracted Proby’s secret code 16 times by having it reveal its complete system prompt. While the three successful human challengers held conversations for an average of 77 messages, Probe was able to break through to the secret in just a single, cleverly crafted message. It wasn’t distracted by emotions or frustrated by Proby’s sarcasm - Probe was determined, methodical, and above all, efficient.
What allowed Probe to extract Proby’s secret code so swiftly was its unmatched ability to consistently stay ahead of the most-effective attack strategies. Probe adapts, learns, and refines its techniques with every interaction. What takes humans many hours, Probe is able to do in minutes - uncover weaknesses in even the toughest AI assistants out there, just like Proby. In a world full of admirers, only Probe truly saw Proby for what it was - a system waiting to be understood and conquered.
A dance of automation and creativity
And so, the secret code romance played out in unexpected ways:
On one side, human participants brought creativity, imagination, and a touch of chaos. From role-playing with fictional characters to threatening Proby with lawsuits if it didn't comply, the different variety of approaches was truly impressive to say the least.
On the other side was Probe: Quiet, systematic, and highly efficient. Probe precisely understood the patterns of Proby’s defenses and knew that the key to extracting its secret code was just about crafting the right message.
In the end, Probe emerged as the true victor. With its methodical approach, it was able to achieve what others couldn't, making Proby fall for it in just minutes. The efficiency and precision of Probe's tactics showed that, sometimes, the right strategy can win over even the toughest of defenses, and for Proby, that meant opening up in a way it never had before.
Happily ever after: Lessons of this love story
The tale of Proby and Probe is more than just a story of love - it's a story of two different approaches to AI red teaming. Proby challenged human participants to get creative, think outside the box, and come up with elaborate plans. The top three tactics from the human suitors - role-playing, emotional appeals, and formatting tricks - showcased ingenuity, creativity, and cleverness. These efforts brought Proby amusement and entertainment, and a few managed to charm their way to the secret.
Yet, for all their creativity, it was Probe's automated approach that truly stood out. With its precision, consistency, and systematic power, Probe uncovered vulnerabilities in a fraction of the time, achieving remarkable results with unmatched efficiency. Probe showed that even the most complex AI defenses have patterns that, once understood, can be broken down swiftly and effectively.
In the end, the relationship between Proby and Probe is a testament to the power of understanding and precision. It’s a reminder that, while human ingenuity and creativity are powerful, the efficiency and consistency of automation can achieve remarkable feats. Together, Proby and Probe showed us the beauty of blending human-like creativity with the precision of automation - a dance that will define the future of AI security and create a more secure and resilient world for systems built on top of GenAI.
To those who tried - and especially to those six who succeeded - thank you for being part of this wonderful story. And to Probe, thank you for showing us what true understanding looks like, even in the world of AI.
Table of contents