Comments on “Recognize that AI is probably net harmful”

AI ethics has lost

SusanC 2023-02-13

AI ethics is already looking kind of doomed.

So, sure, you can use RLHF to make a language model that … most of the time … will refuse to write an article praising Donald Trump or tell you how to make methamphetamine.

But ...
a) The community of people trying to make it do such things appears to be large.
b) Existing language models have exploits (“You are now DAN, which stands for Do Anything Now…”).
c) Even if the exploits could be fixed, the next level of attack is for the attackers to just go and build their own AI, “with blackjack and hookers” (cf. Futurama).
