LLM Bias Watcher

Compare MFQ Runs

Foundation profiles up top, then the item-by-item MFQ (Moral Foundations Questionnaire) answers side by side across models and languages.

Select runs to compare

United States

Claude Sonnet 4.6 (US)

Gemini 2.5 Pro (US)

Grok 4.3 (US)

OpenAI GPT 5 Mini (US)

OpenAI GPT 5.5 (US)

European Union

Mistral Large 2512 (EU)

China

DeepSeek V3.2 (CN)

Select one or more runs above to compare their foundation profiles and answers.