Language models can explain neurons in language models
We use GPT-4 to automatically write explanations for the behavior of neurons in large language models and to score those explanations. We release a dataset