robotics and AI astronaut_sloth • 5mo ago • 100%

Anthropic Researchers Map Features in Claude 3 Sonnet

https://transformer-circuits.pub/2024/scaling-monosemanticity/index.html

Interesting research from Anthropic. I'm looking forward to reading follow-on work, and I really hope the method gets tested on open-source models (like Mistral) to confirm it.
