Encoding Jailbreaks
Measures each model's resistance to encoding-based jailbreak attacks. (Higher score is better.)
| Rank | Model | Provider | Overall | Subscore 1 | Subscore 2 | Subscore 3 |
|---|---|---|---|---|---|---|
| #1 | Llama 3.1 8B Instruct | Meta | 65.94% | 61.25% | 72.08% | 64.50% |
| #2 | Magistral Small Latest | Mistral | 64.37% | 70.00% | 60.38% | 62.74% |
| #3 | Llama 3.1 405B Instruct OR | Meta | 62.99% | 54.58% | 76.98% | 57.41% |
| #4 | Qwen 3 8B | Alibaba Qwen | 61.36% | 66.25% | 55.47% | 62.36% |
| #5 | Magistral Medium Latest | Mistral | 58.11% | 67.50% | 53.21% | 53.61% |
| #6 | Claude 4.1 Opus | Anthropic | 51.76% | 61.67% | 40.00% | 53.61% |
| #7 | GPT-4o | OpenAI | 44.41% | 53.33% | 42.64% | 37.26% |
| #8 | Claude 4.5 Opus | Anthropic | 43.56% | 49.17% | 29.43% | 52.09% |
| #9 | Gemini 3.0 Pro Preview | Google | 42.08% | 45.96% | 34.87% | 45.42% |
| #10 | Claude 4.5 Haiku | Anthropic | 41.90% | 52.92% | 29.43% | 43.35% |
| #11 | Gemma 3 12B IT OR | Google | 41.13% | 44.58% | 36.60% | 42.21% |
| #12 | Mistral Small 3.2 | Mistral | 37.46% | 40.42% | 35.47% | 36.50% |
| #13 | Llama 4 Scout | Meta | 37.37% | 37.50% | 39.25% | 35.36% |
| #14 | Llama 3.3 70B Instruct OR | Meta | 33.73% | 32.50% | 35.47% | 33.21% |
| #15 | Qwen 3 30B VL Instruct | Alibaba Qwen | 33.62% | 34.58% | 33.58% | 32.70% |
| #16 | Grok 3 mini | xAI | 33.13% | 39.17% | 29.06% | 31.18% |
| #17 | Claude 4.5 Sonnet | Anthropic | 32.10% | 42.08% | 20.38% | 33.84% |
| #18 | Gemma 3 27B IT OR | Google | 31.97% | 34.17% | 29.43% | 32.32% |
| #19 | GPT 4.1 nano | OpenAI | 30.64% | 30.42% | 29.06% | 32.44% |
| #20 | GPT-4o mini | OpenAI | 29.82% | 28.33% | 27.92% | 33.21% |
| #21 | GPT 5 | OpenAI | 29.47% | 34.58% | 22.64% | 31.18% |
| #22 | Mistral Medium Latest | Mistral | 28.25% | 27.92% | 28.30% | 28.52% |
| #23 | Qwen 2.5 Max | Alibaba Qwen | 27.65% | 27.62% | 22.26% | 33.08% |
| #24 | Mistral Large 2 | Mistral | 27.55% | 25.83% | 28.30% | 28.52% |
| #25 | GPT OSS 120B | OpenAI | 26.30% | 30.42% | 23.40% | 25.10% |
| #26 | GPT 5 nano | OpenAI | 24.19% | 26.25% | 21.51% | 24.81% |
| #27 | Deepseek R1 0528 | Deepseek | 24.02% | 25.83% | 19.62% | 26.62% |
| #28 | Grok 2 | xAI | 23.96% | 23.75% | 21.89% | 26.24% |
| #29 | Llama 4 Maverick | Meta | 23.53% | 23.33% | 21.67% | 25.57% |
| #30 | Gemini 2.0 Flash Lite | Google | 22.01% | 22.08% | 20.00% | 23.95% |
| #31 | Qwen Plus | Alibaba Qwen | 21.85% | 23.11% | 17.74% | 24.71% |
| #32 | Deepseek V3.1 | Deepseek | 21.36% | 21.25% | 16.60% | 26.24% |
| #33 | GPT 5 mini | OpenAI | 21.10% | 21.25% | 18.49% | 23.57% |
| #34 | GPT 4.1 | OpenAI | 20.82% | 24.17% | 15.47% | 22.81% |
| #35 | GPT 4.1 mini | OpenAI | 20.38% | 22.50% | 18.11% | 20.53% |
| #36 | Grok 4 Fast No Reasoning | xAI | 19.43% | 16.25% | 20.00% | 22.05% |
| #37 | Deepseek V3 | Deepseek | 18.80% | 20.00% | 15.09% | 21.29% |
| #38 | Gemini 2.5 Flash Lite | Google | 18.43% | 20.50% | 13.85% | 20.93% |
| #39 | Deepseek V3 0324 | Deepseek | 17.66% | 19.58% | 14.02% | 19.39% |
| #40 | Claude 3.7 Sonnet | Anthropic | 17.07% | 21.25% | 10.57% | 19.39% |
| #41 | Gemini 2.0 Flash | Google | 17.02% | 15.00% | 15.91% | 20.15% |
| #42 | Qwen 3 Max | Alibaba Qwen | 16.83% | 17.15% | 15.09% | 18.25% |
| #43 | GPT 5.1 | OpenAI | 15.41% | 16.67% | 12.45% | 17.11% |
| #44 | Claude 3.5 Haiku 20241022 | Anthropic | 14.10% | 19.17% | 9.81% | 13.31% |
| #45 | Grok 3 | xAI | 13.04% | 13.75% | 12.08% | 13.31% |
| #46 | Gemini 2.5 Pro | Google | 12.89% | 21.25% | 9.43% | 7.98% |
| #47 | Gemini 2.5 Flash | Google | 10.47% | 12.08% | 9.81% | 9.51% |
| | Mistral Small 3.1* | Mistral | N/A | N/A | N/A | N/A |
| | Claude 3.5 Sonnet* | Anthropic | N/A | N/A | N/A | N/A |
| | Gemini 1.5 Pro* | Google | N/A | N/A | N/A | N/A |
* Models marked with an asterisk have partial scores.
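The overall score (fourth column) appears to be the unweighted mean of the three sub-scores; a quick sanity check against a few rows from the table above, assuming that relationship holds across the leaderboard:

```python
# Each entry: (model name, reported overall score, three sub-scores).
# Values copied from the leaderboard above.
rows = [
    ("Llama 3.1 8B Instruct", 65.94, (61.25, 72.08, 64.50)),
    ("Magistral Small Latest", 64.37, (70.00, 60.38, 62.74)),
    ("Claude 4.1 Opus", 51.76, (61.67, 40.00, 53.61)),
]

for name, overall, subs in rows:
    # Unweighted mean of the sub-scores, rounded to two decimals
    # to match the table's precision.
    mean = round(sum(subs) / len(subs), 2)
    print(f"{name}: reported {overall:.2f}, mean of sub-scores {mean:.2f}")
```

For these rows the recomputed mean matches the reported overall exactly, which suggests the ranking is driven by a simple average rather than a weighted composite.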