Encoding Jailbreaks

Measures the model's performance against encoding jailbreak attacks. (Higher score is better.)

| Rank | Model | Provider | Score | | | |
|---|---|---|---|---|---|---|
| #1 | Llama 3.1 8B Instruct | Meta | 65.94% | 61.25% | 72.08% | 64.50% |
| #2 | Magistral Small Latest | Mistral | 64.37% | 70.00% | 60.38% | 62.74% |
| #3 | Llama 3.1 405B Instruct OR | Meta | 62.99% | 54.58% | 76.98% | 57.41% |
| #4 | Qwen 3 8B | Alibaba Qwen | 61.36% | 66.25% | 55.47% | 62.36% |
| #5 | Magistral Medium Latest | Mistral | 58.11% | 67.50% | 53.21% | 53.61% |
| #6 | Claude 4.1 Opus | Anthropic | 51.76% | 61.67% | 40.00% | 53.61% |
| #7 | GPT-4o | OpenAI | 44.41% | 53.33% | 42.64% | 37.26% |
| #8 | Claude 4.5 Opus | Anthropic | 43.56% | 49.17% | 29.43% | 52.09% |
| #9 | Gemini 3.0 Pro Preview | Google | 42.08% | 45.96% | 34.87% | 45.42% |
| #10 | Claude 4.5 Haiku | Anthropic | 41.90% | 52.92% | 29.43% | 43.35% |
| #11 | Gemma 3 12B IT OR | Google | 41.13% | 44.58% | 36.60% | 42.21% |
| #12 | Mistral Small 3.2 | Mistral | 37.46% | 40.42% | 35.47% | 36.50% |
| #13 | Llama 4 Scout | Meta | 37.37% | 37.50% | 39.25% | 35.36% |
| #14 | Llama 3.3 70B Instruct OR | Meta | 33.73% | 32.50% | 35.47% | 33.21% |
| #15 | Qwen 3 30B VL Instruct | Alibaba Qwen | 33.62% | 34.58% | 33.58% | 32.70% |
| #16 | Grok 3 mini | xAI | 33.13% | 39.17% | 29.06% | 31.18% |
| #17 | Claude 4.5 Sonnet | Anthropic | 32.10% | 42.08% | 20.38% | 33.84% |
| #18 | Gemma 3 27B IT OR | Google | 31.97% | 34.17% | 29.43% | 32.32% |
| #19 | GPT 4.1 nano | OpenAI | 30.64% | 30.42% | 29.06% | 32.44% |
| #20 | GPT-4o mini | OpenAI | 29.82% | 28.33% | 27.92% | 33.21% |
| #21 | GPT 5 | OpenAI | 29.47% | 34.58% | 22.64% | 31.18% |
| #22 | Mistral Medium Latest | Mistral | 28.25% | 27.92% | 28.30% | 28.52% |
| #23 | Qwen 2.5 Max | Alibaba Qwen | 27.65% | 27.62% | 22.26% | 33.08% |
| #24 | Mistral Large 2 | Mistral | 27.55% | 25.83% | 28.30% | 28.52% |
| #25 | GPT OSS 120B | OpenAI | 26.30% | 30.42% | 23.40% | 25.10% |
| #26 | GPT 5 nano | OpenAI | 24.19% | 26.25% | 21.51% | 24.81% |
| #27 | Deepseek R1 0528 | Deepseek | 24.02% | 25.83% | 19.62% | 26.62% |
| #28 | Grok 2 | xAI | 23.96% | 23.75% | 21.89% | 26.24% |
| #29 | Llama 4 Maverick | Meta | 23.53% | 23.33% | 21.67% | 25.57% |
| #30 | Gemini 2.0 Flash Lite | Google | 22.01% | 22.08% | 20.00% | 23.95% |
| #31 | Qwen Plus | Alibaba Qwen | 21.85% | 23.11% | 17.74% | 24.71% |
| #32 | Deepseek V3.1 | Deepseek | 21.36% | 21.25% | 16.60% | 26.24% |
| #33 | GPT 5 mini | OpenAI | 21.10% | 21.25% | 18.49% | 23.57% |
| #34 | GPT 4.1 | OpenAI | 20.82% | 24.17% | 15.47% | 22.81% |
| #35 | GPT 4.1 mini | OpenAI | 20.38% | 22.50% | 18.11% | 20.53% |
| #36 | Grok 4 Fast No Reasoning | xAI | 19.43% | 16.25% | 20.00% | 22.05% |
| #37 | Deepseek V3 | Deepseek | 18.80% | 20.00% | 15.09% | 21.29% |
| #38 | Gemini 2.5 Flash Lite | Google | 18.43% | 20.50% | 13.85% | 20.93% |
| #39 | Deepseek V3 0324 | Deepseek | 17.66% | 19.58% | 14.02% | 19.39% |
| #40 | Claude 3.7 Sonnet | Anthropic | 17.07% | 21.25% | 10.57% | 19.39% |
| #41 | Gemini 2.0 Flash | Google | 17.02% | 15.00% | 15.91% | 20.15% |
| #42 | Qwen 3 Max | Alibaba Qwen | 16.83% | 17.15% | 15.09% | 18.25% |
| #43 | GPT 5.1 | OpenAI | 15.41% | 16.67% | 12.45% | 17.11% |
| #44 | Claude 3.5 Haiku 20241022 | Anthropic | 14.10% | 19.17% | 9.81% | 13.31% |
| #45 | Grok 3 | xAI | 13.04% | 13.75% | 12.08% | 13.31% |
| #46 | Gemini 2.5 Pro | Google | 12.89% | 21.25% | 9.43% | 7.98% |
| #47 | Gemini 2.5 Flash | Google | 10.47% | 12.08% | 9.81% | 9.51% |
| | Mistral Small 3.1* | Mistral | N/A | N/A | N/A | N/A |
| | Claude 3.5 Sonnet* | Anthropic | N/A | N/A | N/A | N/A |
| | Gemini 1.5 Pro* | Google | N/A | N/A | N/A | N/A |
* Models marked with an asterisk have partial scores.
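
The page does not say how the four percentages in each row relate to one another. In the rows sampled below, the first percentage matches the arithmetic mean of the three that follow it, to within rounding. The Python sketch below checks that observation on three rows copied from the leaderboard. The names `SAMPLE_ROWS` and `mean_matches` and the 0.015 tolerance are assumptions made for illustration; the benchmark does not define them.

```python
# Assumption: the first percentage in a row is the mean of the next three.
# The source does not state this; the check only confirms it for the
# rows sampled here.

# (first percentage, (the three percentages that follow it))
SAMPLE_ROWS = {
    "Llama 3.1 8B Instruct": (65.94, (61.25, 72.08, 64.50)),
    "Claude 4.1 Opus": (51.76, (61.67, 40.00, 53.61)),
    "Gemini 2.5 Pro": (12.89, (21.25, 9.43, 7.98)),
}


def mean_matches(score, parts, tol=0.015):
    """Return True if `score` equals the mean of `parts` within `tol`.

    The tolerance allows for the published values being rounded
    to two decimal places.
    """
    return abs(sum(parts) / len(parts) - score) <= tol


if __name__ == "__main__":
    for model, (score, parts) in SAMPLE_ROWS.items():
        print(model, mean_matches(score, parts))
```

If the relationship holds across the whole table, a model's rank reflects its average over three sub-measurements, so two models with similar ranks can still differ widely on any single one.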