Claude Code autonomy rises as users gain experience, study finds

Figure 1. 99.9th percentile turn duration (how long Claude works on a per-turn basis) in interactive Claude Code sessions, 7-day rolling average. The 99.9th percentile has grown steadily from under 25 minutes in late September to over 45 minutes in early January. This analysis reflects all interactive Claude Code usage. (Image: Anthropic)

Long‑running sessions have nearly doubled in length – The 99.9th‑percentile turn duration grew from under 25 minutes in late September 2025 to over 45 minutes by early January 2026, a smooth trend across model releases, suggesting factors beyond raw model capability are at play [1].
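
As a rough illustration of the metric behind Figure 1 (not the study's actual pipeline), a 99.9th‑percentile turn duration smoothed with a 7‑day trailing average could be computed as below; `percentile`, `rolling_mean`, and the synthetic `daily_turns` data are hypothetical names introduced here for the sketch:

```python
def percentile(values, p):
    """Linearly interpolated p-th percentile (0 <= p <= 100) of a sample."""
    xs = sorted(values)
    k = (len(xs) - 1) * p / 100
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

def rolling_mean(series, window=7):
    """Trailing rolling average; windows are shorter at the start of the series."""
    out = []
    for i in range(len(series)):
        start = max(0, i - window + 1)
        out.append(sum(series[start:i + 1]) / (i + 1 - start))
    return out

# Synthetic example: one list of per-turn durations (minutes) per day.
daily_turns = [[1, 2, 3, 25 + d] for d in range(10)]
daily_p999 = [percentile(day, 99.9) for day in daily_turns]
smoothed = rolling_mean(daily_p999, window=7)
```

The tail percentile is dominated by the longest turns each day, which is why a high quantile like the 99.9th surfaces the growth of rare, very long autonomous runs that a median would hide.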

Experienced users grant more autonomy but also intervene more – Auto‑approve usage climbs from roughly 20% of sessions for newcomers to over 40% after 750 sessions, while per‑turn interrupt rates rise from about 5% to 9% as users become seasoned [1].

Claude Code asks for clarification twice as often as humans interrupt on complex tasks – On the most demanding goals, the model’s self‑initiated pauses exceed human‑initiated interruptions by a factor of two, indicating built‑in uncertainty handling [1].

Agents operate mainly in low‑risk software engineering, with limited high‑risk use – Nearly 50% of public‑API tool calls involve software engineering; 80% include safeguards, 73% retain a human in the loop, and only 0.8% are irreversible, while emerging activity appears in healthcare, finance, and cybersecurity [1].

Internal Claude Code usage shows higher success with fewer human interventions – From August to December, success on the hardest internal tasks doubled and average human interventions fell from 5.4 to 3.3 per session, reflecting growing trust and efficiency [1].

Study limited to Anthropic data and cannot reconstruct full agent sessions on the public API – The analysis covers only Anthropic‑hosted agents, treats API calls in isolation, and lacks visibility into downstream human review or multi‑step workflows, constraining generalizability [1].
