Comments on “Fight DOOM AI with SCIENCE! and ENGINEERING!!”
How to avert a moderate apocalypse... and create a future we would like
Mechanistic Interpretability
I second your belief in the importance of mechanistic interpretability. Although my math skills aren’t up to the job, I’ve gotten a lot out of reading some of that research. Neel Nanda has an informal discussion of some of that work that’s worth reading through, and he has some helpful videos on his YouTube channel. FWIW, the grokking work seems quite important. What seems to be going on is that, early in training, the engine is, in effect, building tools. When a tool or tools finally come together, you get a phase change in learning, as the tool(s) are now doing further construction.
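To make that phase-change picture concrete, here is a minimal sketch of the kind of experiment where grokking shows up: modular addition learned by a small network with heavy weight decay, where training accuracy saturates long before test accuracy suddenly jumps. This is an illustrative assumption on my part, not Nanda’s actual setup (his grokking work used a one-layer transformer); the modulus P, the train fraction, and the MLP architecture below are arbitrary choices.

```python
import torch
import torch.nn as nn

# Illustrative grokking-style setup: learn (a + b) mod P with a small MLP
# over concatenated one-hot inputs. Parameters here are arbitrary, chosen
# only to sketch the shape of the experiment.
P = 97                      # prime modulus
FRAC_TRAIN = 0.4            # fraction of all (a, b) pairs used for training

# Build the full dataset of (a, b) -> (a + b) mod P.
pairs = torch.cartesian_prod(torch.arange(P), torch.arange(P))
labels = (pairs[:, 0] + pairs[:, 1]) % P
x = torch.cat([nn.functional.one_hot(pairs[:, 0], P),
               nn.functional.one_hot(pairs[:, 1], P)], dim=1).float()

perm = torch.randperm(len(pairs))
n_train = int(FRAC_TRAIN * len(pairs))
train_idx, test_idx = perm[:n_train], perm[n_train:]

model = nn.Sequential(nn.Linear(2 * P, 256), nn.ReLU(), nn.Linear(256, P))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

def accuracy(idx):
    with torch.no_grad():
        return (model(x[idx]).argmax(-1) == labels[idx]).float().mean().item()

# Full-batch training. In grokking runs, train accuracy typically hits 1.0
# early while test accuracy stays near chance, then jumps much later --
# the "phase change" once the internal circuit (the "tool") comes together.
for step in range(20000):
    opt.zero_grad()
    loss = loss_fn(model(x[train_idx]), labels[train_idx])
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        print(f"step {step:6d}  train acc {accuracy(train_idx):.3f}  "
              f"test acc {accuracy(test_idx):.3f}")
```

Watching the printed train/test accuracies over training is the point of the exercise: the gap between when memorization is complete and when generalization kicks in is the phenomenon the grokking papers analyze.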
In the case of LLMs, I’ve reached the tentative conclusion that something like a classical GOFAI semantic or cognitive network gets constructed, albeit in latent form, and that it handles most of the sentence-level syntax. It’s sort of like building a high-level programming language on top of assembly language. I discuss this here.