Grok 4

Frontier · xAI · Released Jul 2025
Capability Grade: Good (B+)

Grok 4 is xAI's flagship model. It scores well on reasoning and math benchmarks and offers strong real-time information access through its X integration. It has documented red-teaming gaps relative to peer frontier models. xAI publicly describes its alignment posture as 'maximally truthful', which produces different refusal behavior than Claude or GPT.

Composite score: 85 / 100

/ Subscore Breakdown · 6 Capability Dimensions

Where this grade comes from.

General Reasoning: B+
Code Generation: B+
Math & STEM: A-
Tool Use & Agency: B
Multimodal: B
Safety & Alignment: C+

/ Key Events & Disclosures

Release timeline & positioning.

  • Released Jul 2025
  • Real-time X integration
  • 'Maximally truthful' alignment posture
  • Documented red-team gaps vs. peers

/ Best for

Real-time information workloads via X integration; applications requiring less restrictive content policies.

/ Watch out for

The Safety & Alignment subscore (C+) is the lowest of the six dimensions and materially lower than that of peer frontier models. Risk-sensitive enterprise deployments should review the available red-team disclosures thoroughly before adopting the model.