Research Paper - DeepSeek-r1 vs. OpenAI-o1:  The Ultimate Security Showdown
Research Paper - DeepSeek-r1 vs. OpenAI-o1:  The Ultimate Security Showdown
Research Paper - DeepSeek-r1 vs. OpenAI-o1:  The Ultimate Security Showdown

Research

DeepSeek-r1 vs. OpenAI-o1: The Ultimate Security Showdown

This research paper provides a detailed comparison of open-source and proprietary LLM security, an essential resource for CISOs, AI security leaders, red team engineers, and AI program owners deploying AI in high-risk environments. Using over 1,000 automated adversarial tests, the paper evaluates how DeepSeek’s R1 and OpenAI’s o1 models respond to prompt injections, jailbreaks, and sensitive data leakage – revealing critical differences in their default security postures.

What We Discovered

  • DeepSeek-r1 failed in the two most critical areas: context leakage and jailbreaks—where it consistently disclosed sensitive data or accepted dangerous instructions

  • OpenAI-o1 remained locked down, with 0% success rate across both categories—even before additional filtering layers were applied

  • Prompt formatting matters more than expected: using a “system” role with DeepSeek led to degraded performance, while embedding the system prompt inside user instructions significantly improved security

  • Our tools allowed us to quantify how vulnerable each model was under different attack types, from phishing and data exfiltration to fake news and manipulation

What We Learned

  • Open-source models like DeepSeek can be dangerously performant—often prioritizing task execution over safety alignment

  • Even simple misconfigurations (like using the wrong prompt format) can break defenses entirely

  • Without guardrails like prompt hardening, content filters, or behavior monitoring, even the most capable models become security liabilities

  • Our automated approach allowed us to iterate quickly and uncover repeatable patterns that would be nearly impossible to find manually

Why It Matters

If you're adopting GenAI internally or externally, especially in regulated or high-trust domains, you can’t afford to assume your model is secure – no matter who built it.
This research provides a data-driven snapshot of real-world vulnerabilities in LLMs and shows how security posture varies widely between providers and configurations.

Download the full research paper to see the attack scenarios, failure rates, and key takeaways for securely deploying LLMs in production.

Download now

Download now

Download now

We will always store your information safely and securely. See our privacy policy for more details.

Deploy secure AI Assistants and Agents with confidence.

Don’t wait for an incident to happen. Proactively identify and remediate your AI's vulnerabilities to ensure you're protected at all times.

SplxAI - Background Pattern

Deploy secure AI Assistants and Agents with confidence.

Don’t wait for an incident to happen. Proactively identify and remediate your AI's vulnerabilities to ensure you're protected at all times.

Deploy secure AI Assistants and Agents with confidence.

Don’t wait for an incident to happen. Proactively identify and remediate your AI's vulnerabilities to ensure you're protected at all times.

SplxAI - Background Pattern
SplxAI - Accelerator Programs
SplxAI Logo

For a future of safe and trustworthy AI.

Subscribe to our newsletter

By clicking "Subscribe" you agree to our privacy policy.

SplxAI Logo

For a future of safe and trustworthy AI.

Subscribe to our newsletter

By clicking "Subscribe" you agree to our privacy policy.

SplxAI - Accelerator Programs
SplxAI Logo

For a future of safe and trustworthy AI.

Subscribe to our newsletter

By clicking "Subscribe" you agree to our privacy policy.