Grok 4

Frontier · xAI · Released Jul 2025

B+

Good (B+) — Capability Grade

Grok 4 is xAI's flagship model. It shows strong reasoning and math benchmark results and strong real-time information access via X integration. It has documented red-teaming gaps relative to peer frontier models. Its alignment posture is publicly defined as 'maximally truthful', which produces refusal behavior different from that of Claude or GPT.

Composite / 100

/ Subscore Breakdown · 6 Capability Dimensions

Where this grade comes from.

General Reasoning

B+

Code Generation

B+

Math & STEM

A-

Tool Use & Agency

Multimodal

Safety & Alignment

C+

/ Key Events & Disclosures

Release timeline & positioning.

Released Jul 2025
Real-time X integration
'Maximally truthful' alignment posture
Documented red-team gaps vs. peers

/ Best for

Real-time information workloads via X integration; applications requiring less restrictive content policies.

/ Watch out for

The Safety & Alignment subscore is materially lower than that of peer frontier models. Risk-sensitive enterprise deployments should evaluate the red-team disclosures thoroughly.