Mechanistic Interpretability lead DeepMind. Formerly @AnthropicAI, independent. In this to reduce AI X-risk. Neural networks can be understood, let's go do it!
Neel Nanda leads the mechanistic interpretability team at DeepMind, where he focuses on reverse-engineering neural networks to reduce AI existential risk. He was formerly at Anthropic. Tagged AI Safety because at least 60% of his public output centers on alignment, interpretability, and governance work.