Translating Claude's thoughts into language
7 May 2026 • 19:01


Author: Anthropic – Duration: 00:03:17
AI models like Claude speak in words but think in numbers. These numbers, called activations, encode Claude's thoughts, but not in readable language. We introduce natural language autoencoders, or NLAs, that translate AI model activations into readable text. NLAs have already helped us improve the way we test the security of our models and better understand why they do what they do. Learn more about this research on our blog: https://www.anthropic.com/research/natural-language-autoencoders






