> It would be super cool if the people who were currently fretting about paperclips [...]
That's textbook trivializing of an issue that has been extensively analyzed by experts and found to have civilization-ending consequences with near certainty, unless extensive countermeasures are taken well in advance. Your comment is the equivalent of climate change denial.
Have any countermeasures actually been implemented besides blog posts and podcast appearances? Climate activists regularly disrupt traffic, glue themselves to things, and protest pipelines. By contrast, folks who believe AI will end the world have a propensity to start AI companies.
At the moment, we're still trying to turn the question from an easy and obvious natural language form ("give me what I meant to ask for") into a form that is amenable to being solved in the world of weights and biases.[0]
Several of the AI labs have published work on this topic, but it tends to look like e.g. "interpretable models" or philosophy of ethics, etc.
So what we're doing today with AI alignment research is the equivalent of asking Ada Lovelace how to prevent the Therac-25 deaths, and the best response is "Why would you even do that, just don't write the wrong commands on the punched cards".
[0] Some, myself included, are optimistic that early AI can help us do that. Yudkowsky appears to be dismissive of all options, but I don't see why any near-term AI would care to insert errors into later types of AI, so we've only got our own blind spots to look out for, and we had those already…
…but all the AI doomers I hang out with seem to consider me a terrible optimist, so YMMV.