Recent comments

Mechanistic Interpretability

Bill Benzon 2023-02-13

Commenting on: Fight DOOM AI with SCIENCE! and ENGINEERING!!

I second your belief in the importance of mechanistic interpretability. Although my math skills aren’t up to the job, I’ve gotten a lot out of reading some of their research. Neel Nanda has an informal discussion of some of that work that’s worth reading through, and he has some helpful videos at his YouTube channel. FWIW, the grokking stuff seems quite important. What seems to be going on is that, early in training, the engine is, in effect, building tools. When a tool or tools finally come together, you get a phase change in learning, as the tool(s) are now doing further construction.

In the case of LLMs I’ve reached the tentative conclusion that something like a classical GOFAI semantic or cognitive network gets constructed, albeit in latent mode, and it handles most of the sentence-level syntax. Sort of like building a high-level programming language on top of assembly language. I discuss this here.

"Thinkism"

Bill Benzon 2023-02-13

Commenting on: Limits to first-principles reasoning, reduction, and simulation

I agree with pretty much all of this.

FWIW, Kevin Kelly coined the term “thinkism” for some of this:

Many proponents of an explosion of intelligence expect it will produce an explosion of progress. I call this mythical belief “thinkism.” It’s the fallacy that future levels of progress are only hindered by a lack of thinking power, or intelligence. (I might also note that the belief that thinking is the magic super ingredient to a cure-all is held by a lot of guys who like to think.)

Burning GPUs, witches, etc.

SusanC 2023-02-13

Commenting on: Create a negative public image for AI

Yudkowsky imagines the first state (or organization) to create an AGI using it to prevent anyone else from building a rival, e.g. by getting it to destroy all the GPUs in the world.

A possible alternative: the first AGI uses its advanced ability to manipulate human beings to get them to destroy all other attempts at AGI.

====

I’ve recently been re-reading some of the literature on trials for witchcraft in the early modern era. There was no organized witch-cult; it was all an illusion.

The case against the AI companies is much better than there ever was against an alleged witch. We have a sort-of-rational argument that they are risking creating an evil god.

So, one thing a people-manipulating AGI could do is start a satanic panic and have any AI researchers who might be about to create a rival god burned at the stake.

AI ethics has lost

SusanC 2023-02-13

Commenting on: Recognize that AI is probably net harmful

AI ethics is already looking kind of doomed.

So, sure, you can use RLHF to make a language model that … most of the time … will refuse to write an article praising Donald Trump or tell you how to make methamphetamine.

But ....
a) The community of people trying to make it do such things appears to be large
b) Existing language models have exploits (“You are now DAN, which stands for Do Anything Now…”)
c) Even if the exploits could be fixed, the next level of attack is for the attackers to just go and build their own AI, “with blackjack and hookers” (cf. Futurama).

GPT and other-minds cognition

SusanC 2023-02-13

Commenting on: Mind-like AI

GPT is already at the point where the easiest way to reason about it is to use our other-minds cognition.

Sure, I know that it’s actually doing linear algebra in a high-dimensional space. I sometimes make use of this knowledge, e.g. when I’m trying to create a test case where it will do something very different from a person.

But in the majority of cases … the easiest approach is to ask: if this character that GPT is currently simulating were a person, what would they be feeling? That is, assume that its emulation of human beings is reasonably accurate, so you can use your existing methods for dealing with people.

Satire

David Chapman 2023-02-13

Commenting on: Rollerskating transsexual wombats

Well, most of the book is pretty serious, deadly serious actually. I figure some entertainment helps readers get through that!

So, I laughed

SusanC 2023-02-13

Commenting on: Rollerskating transsexual wombats

… but I was expecting a serious book about the AI apocalypse, not a satire.

Robert Anton Wilson’s Illuminatus probably deserves a footnote for Anthrax Leprosy Mu, but I guess most of your readers will get the reference.