Long‑running sessions have nearly doubled in length – The 99.9th‑percentile turn duration grew from under 25 minutes in late September 2025 to over 45 minutes by early January 2026, a smooth trend across model releases, suggesting that factors beyond raw capability are at play [1].
Experienced users grant more autonomy but also intervene more – Auto‑approve usage climbs from roughly 20 % of sessions for newcomers to over 40 % after 750 sessions, while per‑turn interrupt rates rise from about 5 % to 9 % as users become seasoned [1].
Claude Code asks for clarification twice as often as humans interrupt on complex tasks – On the most demanding goals, the model’s self‑initiated pauses exceed human‑initiated interruptions by a factor of two, indicating built‑in uncertainty handling [1].
Agents operate mainly in low‑risk software engineering, with limited high‑risk use – Nearly 50 % of public‑API tool calls involve software engineering; 80 % include safeguards, 73 % retain a human in the loop, and only 0.8 % are irreversible, while emerging activity appears in healthcare, finance, and cybersecurity [1].
Internal Claude Code usage shows higher success with fewer human interventions – Between August and December, success rates on the hardest internal tasks doubled and average human interventions fell from 5.4 to 3.3 per session, reflecting growing trust and efficiency [1].
Study limited to Anthropic data and cannot reconstruct full agent sessions on the public API – The analysis covers only Anthropic‑hosted agents, treats API calls in isolation, and lacks visibility into downstream human review or multi‑step workflows, constraining generalizability [1].