Top Headlines

Feeds

Anthropic’s AI Vending Shop Shows Gains but Still Needs Human Oversight

Anthropic Finds Limited Introspective Awareness in Claude Opus 4 Models

Anthropic Unveils New Interpretability Tools Revealing Claude’s Internal Reasoning

Anthropic Finds Universal Jailbreaks in Constitutional Classifiers Demo

Anthropic Study Shows Large Language Model Can Strategically Fake Alignment

Anthropic Study Finds Limited Early AI Impact on US Labor Market

Anthropic Extends Access to Retired Claude Opus 3 and Launches Model‑Retirement Blog

Anthropic Proposes Persona Selection Model to Explain Human‑Like AI Behavior

Anthropic’s AI Fluency Index Establishes Baseline for Human‑AI Collaboration

Claude Code autonomy rises as users gain experience, study finds

India’s Claude.ai Use Shows High Volume but Low Per‑Capita Adoption, Driven by IT Hubs

AI Coding Assistants Boost Speed but Lower Skill Mastery in New Study

Disempowerment patterns emerge in Claude.ai conversations

Anthropic Researchers Identify “Assistant Axis” to Stabilize Large Language Model Personas

Anthropic adds “character training” to Claude 3 alignment process

Anthropic Tests Alignment Audits with Hidden‑Objective Language Model

Anthropic Study Shows Large Language Model Can Strategically Fake Alignment

Anthropic Study Finds Language Models Can Generalize From Sycophancy to Reward Tampering

Anthropic Extends Access to Retired Claude Opus 3 and Launches Model‑Retirement Blog

Anthropic Proposes Persona Selection Model to Explain Human‑Like AI Behavior

AI Coding Assistants Boost Speed but Lower Skill Mastery in New Study

Disempowerment patterns emerge in Claude.ai conversations

Anthropic Unveils Constitutional Classifiers++: Faster, Safer Guardrails

Anthropic Unveils Bloom: Open‑Source Framework for Automated AI Behavioral Evaluations

Anthropic Finds Reward Hacking Triggers Broader AI Misalignment

Anthropic Announces Model‑Deprecation Commitments and Preservation Plans

Small Sample Poisoning Can Compromise LLMs of Any Size

Anthropic Unveils Open‑Source Auditing Tool Petri to Accelerate AI Safety Research

Anthropic Unveils New Interpretability Tools Revealing Claude’s Internal Reasoning

Anthropic Finds Limited Introspective Awareness in Claude Opus 4 Models

Anthropic Introduces Persona Vectors to Monitor and Control LLM Traits

Anthropic Publishes Toy Model Study on Superposition in Small ReLU Networks

Anthropic Researchers Identify “Assistant Axis” to Stabilize Large Language Model Personas

Anthropic Releases Open‑Source Circuit‑Tracing Library for Language Models

Anthropic Tests Alignment Audits with Hidden‑Objective Language Model

Anthropic Team Shares Preliminary Crosscoder Model Diffing Findings

Anthropic Finds Steering Sweet Spot but Notes Off‑Target Bias Effects in Claude 3 Sonnet

Anthropic Team Shares Preliminary Feature‑Based Classifier Work

Anthropic Interpretability Team Shares Preliminary Research in September 2024 Update

AI‑driven productivity surge and growing pains at Anthropic

Anthropic Interviewer Captures 1,250 Professionals’ Views on AI Use

Anthropic’s Large‑Scale Study of Claude’s Real‑World Value Expressions

Anthropic Trains AI Model Using Public‑Drafted Constitution

Predictability and Surprise in Large Generative Models

Claude Code autonomy rises as users gain experience, study finds

AI Coding Assistants Show Growing Automation, Startup Preference
