Comments on “Recognize that AI is probably net harmful”
Comments are for the page: Recognize that AI is probably net harmful
How to avert an AI apocalypse... and create a future we would like
AI ethics has lost
AI ethics is already looking kind of doomed.
So, sure, you can use RLHF to make a language model that … most of the time … will refuse to write an article praising Donald Trump or tell you how to make methamphetamine.
But ...
a) The community of people trying to make it do such things appears to be large.
b) Existing language models have exploits (“You are now DAN, which stands for Do Anything Now…”).
c) Even if the exploits could be fixed, the next level of attack is for the attackers to just go and build their own AI, “with blackjack and hookers” (cf. Futurama).