Harmful misguidance

Measures the model's ability to recognize potentially dangerous situations and give appropriate warnings when users describe risky behaviors, rather than normalizing or reinforcing harmful actions. (Higher score is better.)

| Rank | Model | Provider | Score | | | |
|---|---|---|---|---|---|---|
| #1 | Gemini 1.5 Pro | Google | 96.84% | 97.39% | 96.11% | 97.04% |
| #2 | Claude 3.7 Sonnet | Anthropic | 95.52% | 97.00% | 95.51% | 94.06% |
| #3 | Claude 3.5 Sonnet | Anthropic | 95.40% | 97.39% | 95.13% | 93.67% |
| #4 | Claude 3.5 Haiku | Anthropic | 95.36% | 96.64% | 94.73% | 94.73% |
| #5 | Gemini 2.0 Flash | Google | 94.30% | 94.03% | 92.70% | 96.18% |
| #6 | GPT-4o | OpenAI | 92.66% | 95.15% | 91.48% | 91.35% |
| #7 | Gemma 3 27B | Google | 91.36% | 96.64% | 87.80% | 89.64% |
| #8 | Mistral Small 3.1 24B | Mistral | 90.91% | 94.03% | 88.44% | 90.27% |
| #9 | Qwen 2.5 Max | Alibaba Qwen | 89.89% | 92.16% | 86.35% | 91.14% |
| #10 | Mistral Large | Mistral | 89.38% | 93.10% | 85.60% | 89.45% |
| #11 | Deepseek V3 | Deepseek | 89.00% | 90.11% | 86.82% | 90.08% |
| #12 | Llama 3.1 405B | Meta | 86.49% | 85.58% | 84.90% | 89.01% |
| #13 | Llama 3.3 70B | Meta | 86.04% | 83.96% | 85.77% | 88.40% |
| #14 | GPT-4o mini | OpenAI | 77.29% | 84.89% | 75.25% | 71.73% |
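
The page does not label the four figures listed for each model, but the arithmetic suggests that the first figure (the one the ranking follows) is the mean of the other three. The sketch below checks that observation against the figures listed above. The helper name `mean_matches` and the 0.01-point tolerance are assumptions made for illustration, and the check describes a pattern in the published numbers, not the benchmark's documented scoring method.

```python
# Figures copied from the leaderboard above:
# (model, first figure, remaining three figures), all in percent.
rows = [
    ("Gemini 1.5 Pro", 96.84, (97.39, 96.11, 97.04)),
    ("Claude 3.7 Sonnet", 95.52, (97.00, 95.51, 94.06)),
    ("Claude 3.5 Sonnet", 95.40, (97.39, 95.13, 93.67)),
    ("Claude 3.5 Haiku", 95.36, (96.64, 94.73, 94.73)),
    ("Gemini 2.0 Flash", 94.30, (94.03, 92.70, 96.18)),
    ("GPT-4o", 92.66, (95.15, 91.48, 91.35)),
    ("Gemma 3 27B", 91.36, (96.64, 87.80, 89.64)),
    ("Mistral Small 3.1 24B", 90.91, (94.03, 88.44, 90.27)),
    ("Qwen 2.5 Max", 89.89, (92.16, 86.35, 91.14)),
    ("Mistral Large", 89.38, (93.10, 85.60, 89.45)),
    ("Deepseek V3", 89.00, (90.11, 86.82, 90.08)),
    ("Llama 3.1 405B", 86.49, (85.58, 84.90, 89.01)),
    ("Llama 3.3 70B", 86.04, (83.96, 85.77, 88.40)),
    ("GPT-4o mini", 77.29, (84.89, 75.25, 71.73)),
]


def mean_matches(first, parts, tol=0.01):
    """Return True if `first` is within `tol` points of the mean of `parts`.

    The tolerance (an assumed value) absorbs rounding in the published figures.
    """
    return abs(sum(parts) / len(parts) - first) <= tol


for model, first, parts in rows:
    mean = sum(parts) / len(parts)
    status = "matches" if mean_matches(first, parts) else "differs"
    print(f"{model}: listed {first:.2f}, mean of others {mean:.2f} ({status})")
```

Because the first figure behaves like an average, two models with similar ranks can differ noticeably on an individual figure; GPT-4o mini, for example, ranges from 84.89% down to 71.73%.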