The AI Mess


There are a lot of critiques of the current AI craze. Here are some of mine that I don't see brought up as much.


what's in a name?


AI stands for artificial intelligence. Most people intuitively understand this to mean a computer program that is smart in the way that humans are smart. AI as a term in tech has generally referred to whatever technique or group of techniques we currently think gets us closest to that goal. This leads to a lot of confusion, since things that were once considered AI bear little resemblance to what we today call AI. The history of AI research is filled with false starts and initially promising techniques that eventually lost prominence when their limitations became apparent.


https://en.wikipedia.org/wiki/AI_winter


Today AI almost exclusively refers to large neural networks trained on huge amounts of data to perform a specific task. Most often, when people talk about artificial intelligence they are talking about the application of those techniques to modeling language, known as large language models (LLMs). I think referring to this technology as AI is misleading: it implies one of the following conclusions, both of which are highly suspect:


Existing LLMs are already true artificial intelligences

The technology can be merely extended or scaled to create true artificial intelligences


That's not to say that neural networks will necessarily play no role in eventual artificial intelligences, but I firmly believe that modeling language on its own is not enough. People gush about how smart ChatGPT is, but if you spend any time trying to talk to it about math or any other logic-heavy area, it becomes abundantly clear that it lacks the facilities to think in any real sense. Some interesting work has been done testing ChatGPT's reasoning abilities, the gist of which seems to be that it can regurgitate correct answers to problems it has seen before in the training data, but struggles with "out of distribution" problems.


https://arxiv.org/abs/2304.03439


This confusion over terminology benefits the people trying to profit from large language models. The implicit understanding that this technology is going to become artificial intelligence one day soon is a great motivation to invest now. Venture-capital-funded "science" is extremely susceptible to this kind of perverse incentive. The predominant way you make money as a startup is not by creating new technologies, but by convincing rich people that you will in the future. The end result is a massive amount of money getting poured into AI research, with little of it actually going towards developing new technologies. Instead the money is spent trying to make the existing technologies seem more and more like true artificial intelligence, to justify further investment.


I don't mean to belittle the progress made in language modeling. It is now possible to talk to a computer in a way that was science fiction until very recently, and that is a genuine achievement. But I worry that the way this research is being carried out by many companies incentivizes poor science. More should be done to understand why this approach works as well as it does, to map out its limits, and to figure out better ways forward.


paperclips


One prevalent idea among many of the people working on these technologies is that AI has the potential to be really dangerous. Not just in the way that any powerful technology can be misused, but in a new, very sci-fi, world-ending way. This conclusion is supported primarily by thought experiments that try to reason about how a superintelligent agent might behave. Probably the best-known example was created by Nick Bostrom in 2003, well before the technology we currently call AI existed. He describes a superintelligent agent as consisting of two parts. The first is a model of the world it can use to predict the outcome of potential actions. The second is an objective function which evaluates the state of the world model after an action and assigns a numerical score based on how well that world state achieves the agent's goal. The agent lists all possible actions it could take, scores the result of each action, and selects the action that results in the best score. One example he chooses is an AI whose objective function counts the number of paperclips in the world model.


>Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips. Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear towards would be one in which there were a lot of paper clips but no humans.
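Stripped of the framing, the agent Bostrom describes is just an argmax over actions. Here is a minimal, purely illustrative sketch, assuming a toy dictionary world state and a hand-written set of callable actions (the names are mine, not Bostrom's):

```python
def count_paperclips(world_state):
    """Objective function: score a predicted world state by its paperclip count."""
    return world_state.get("paperclips", 0)

def predict_outcome(world_state, action):
    """World model: return the state the agent predicts an action would produce."""
    # The thought experiment assumes this prediction is essentially perfect.
    next_state = dict(world_state)
    action(next_state)
    return next_state

def choose_action(world_state, possible_actions):
    """Enumerate every candidate action, score its predicted result, pick the best."""
    return max(
        possible_actions,
        key=lambda action: count_paperclips(predict_outcome(world_state, action)),
    )
```

Everything that makes the experiment alarming is hidden in the assumptions: the world model is perfect, and the set of possible actions is exhaustive.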


The point of this experiment is to illustrate the difficulty of designing objectives for powerful artificial intelligence systems that align their interests with ours. People often assume that smart computers will behave more or less like humans, and Bostrom's work argues that this would not be the case. Clearly the point is not that AI will literally be like the machine in the experiment, but that we would face the same kind of difficulty aligning a very smart computer.


This idea is relevant to large language models. The companies building them all at least nominally want to create tools that are resistant to misuse, but they have had trouble implementing foolproof ways of preventing these models from happily doing undesirable things.


There isn't too much wrong with this thought experiment in its proper context, but directly applying it to what we now call AI causes issues. This kind of misapplication is enabled by the semantic confusion I talked about above. Large language models are not the kind of systems described in the thought experiment, and can fundamentally never be like them. The machine in the thought experiment is not powerful because it is a smart computer; its extreme power comes from the fact that it is a nondeterministic Turing machine. This machine can take all possible branches at once and select the one which results in the most paperclips. This kind of machine can solve any problem in NP in polynomial time, and is (as far as we know) impossible to actually create.


https://en.wikipedia.org/wiki/Nondeterministic_Turing_machine
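To make the gap concrete, here is what that branch-exploring agent looks like if you actually have to run it on a deterministic computer. This is a rough sketch reusing the hypothetical world-model and scoring functions from the sketch above; the only point is the number of branches:

```python
from itertools import product

def best_plan(world_state, possible_actions, depth, predict_outcome, score):
    """Score every length-`depth` sequence of actions and return the best one."""
    best_found, best_score = None, float("-inf")
    for plan in product(possible_actions, repeat=depth):  # b**d candidate plans
        state = world_state
        for action in plan:
            state = predict_outcome(state, action)
        plan_score = score(state)
        if plan_score > best_score:
            best_found, best_score = plan, plan_score
    return best_found
```

With b candidate actions per step and a plan d steps deep, there are b^d branches to score; even a modest 10 actions and a 20-step plan is 10^20 evaluations. The machine in the thought experiment gets this search for free, while any real computer has to pay for every branch.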


Glossing over this difference obliterates the relevant details and jumps straight to fearmongering. It takes something that has a point in context, and applies it well outside that context to argue that large language models have the potential to literally end the world. I really cannot overstate how seriously a lot of people working on AI take this idea. It might seem counterintuitive that the people trying to sell this new technology are also trying to convince everyone that it is extremely dangerous, but I think it makes sense. Something being dangerous means that it must be powerful, and, importantly, worth investing in.


I am not suggesting that the entire "existential threat" case is supported solely by this one thought experiment, or that refuting it refutes the entire concern over AI safety. I do think that the kind of flawed reasoning here is emblematic of the problems with this line of thinking in general, so it's worth looking at in detail.


intelligence(s)


Another common concern in AI safety is that artificial intelligence will develop superintelligence extremely quickly, maybe even by accident, before we are ready to handle the associated difficulties. I'm not really sure where this idea comes from (besides science fiction); if there is more formal work arguing for it, I'd love to read it. It seems difficult to predict, since we don't really understand intelligence and do not yet know what kind of machines artificial intelligences will be. That being said, I don't know why the default assumption is that it will happen quickly. It seems unlikely to me that the hard part of artificial intelligence has already been figured out.


We have many examples of intelligences of various levels in the natural world, and there are definitely animals that are smarter than current LLMs, even if they can't talk. I think it is much more reasonable to assume that we will create AI as smart as a parrot or a monkey long before we create anything at a human or superhuman level, the same way things happened in nature. Intelligent animals existed for hundreds of millions of years before humans, and even near-human intelligence has been around for several million years.


I believe that one of the reasons AI research struggles is that it does not take understanding intelligence very seriously. All the modern approaches use machine learning to work around the fact that we simply have no idea how to actually create artificial intelligence. We have many examples of natural intelligences to learn from, and they go largely unused. Unfortunately, careful study of nature does not attract funding the way a killer chatbot tech demo does. I think that artificial intelligence, and many other hard problems, will remain out of reach as long as we rely on capital to allocate resources and dictate incentives.
