The last two posts¹ were focused on the lens. They discussed its various modes of operation, the dichotomy between the outward-facing “causal estimate” and the inward-facing “evaluation”, and the problems of foreign context and overgeneralization.
But it’s important to note that the lens is, at heart, a statistical machine. It is only concerned with how accurately it can perform its tasks (causal prediction, event interpretation, etc.), and does not intentionally distort itself to satisfy any agenda. In other words, the lens lacks agency, and a hypothetical “lens-only” generic is only good for making disinterested predictions or classifications².
To breathe more life into the generic mind, I introduced the agent, a cognitive process with an inherent goal to cause the production of certain evaluations. The concept behind its operation is as follows:
- Use the lens to perform value prediction. In other words, given some known prior information and the desired evaluations, figure out what must be in the posterior information for the lens to produce those evaluations.
- Try to engineer the actions and circumstances of the generic to increase the odds that the right events happen. If this is done successfully, the lens will produce the evaluations that the agent desires (see the sketch after this list).
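To make this loop concrete, here is a minimal Python sketch. Everything in it (the function names, the toy word-overlap scoring rule, the example actions) is my own illustration rather than part of the model, but it shows the two steps above: value prediction via the lens, then picking the action most likely to produce the desired evaluation.

```python
# Toy sketch of the agent's loop (all names and the scoring rule are illustrative).

def lens_evaluation_odds(prior, posterior, evaluation):
    """Stand-in for the lens: roughly, how likely is it to produce `evaluation`
    when interpreting a move from `prior` to `posterior`?"""
    # Toy rule: the evaluation is more likely the more its words overlap
    # with the description of the posterior information.
    overlap = set(evaluation.split()) & set(posterior.split())
    return len(overlap) / max(len(evaluation.split()), 1)

def agent_choose_action(prior, candidate_actions, desired_evaluation, predict_outcome):
    """Step 1: value prediction (which posterior would yield the desired evaluation).
    Step 2: engineer actions so that posterior is the one that actually happens."""
    best_action, best_odds = None, -1.0
    for action in candidate_actions:
        posterior = predict_outcome(prior, action)  # imagined outcome of the action
        odds = lens_evaluation_odds(prior, posterior, desired_evaluation)
        if odds > best_odds:
            best_action, best_odds = action, odds
    return best_action

# Usage: the agent wants the lens to produce the evaluation "task finished".
outcomes = {"procrastinate": "task still untouched", "work": "task finished early"}
chosen = agent_choose_action(
    prior="task untouched",
    candidate_actions=list(outcomes),
    desired_evaluation="task finished",
    predict_outcome=lambda prior, action: outcomes[action],
)
print(chosen)  # -> work
```

Real lenses and agents are of course nothing this tidy, but the division of labor is the same: the lens only scores, the agent only steers.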
The Textbook Artificial Intelligence
An artificial intelligence as classically defined (hereafter called “the AI”) can be thought of as a generic with the following qualities:
- It has an internal sense which estimates some kind of objective function, i.e. how close the AI thinks it is to achieving a certain goal.
- It has an evaluation called “Improvement”, which measures differences in the objective function between the prior and the posterior information.
- It has an agent which motivates the AI to choose actions that result in “Improvement” (see the sketch after this list).
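As a sanity check on the definition, here is the same setup in a few lines of toy Python. The paperclip objective, the one-step lookahead, and every name are illustrative assumptions on my part, not a claim about how any real system is built.

```python
# Toy sketch of the textbook AI (objective, Improvement evaluation, greedy agent).

def objective_estimate(state):
    """Internal sense: the AI's own estimate of how close it is to its goal."""
    return state["paperclips"]

def improvement(prior_state, posterior_state):
    """The Improvement evaluation: change in the objective from prior to posterior."""
    return objective_estimate(posterior_state) - objective_estimate(prior_state)

def agent_step(state, actions):
    """The agent: pick the action whose simulated outcome yields the most Improvement."""
    return max(actions, key=lambda act: improvement(state, act(state)))

# A tiny action space for the toy example.
def make_paperclip(state):
    return {"paperclips": state["paperclips"] + 1}

def idle(state):
    return dict(state)

state = {"paperclips": 0}
print(agent_step(state, [make_paperclip, idle]).__name__)  # -> make_paperclip
```

Every hidden assumption discussed below lives in those two stubs: who wrote objective_estimate, and what is allowed into the list of actions.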
Some argue that this setup is already enough to produce arbitrarily complex behaviors, and indeed thought experiments like the Paperclip Maximizer³ suggest that a sufficiently advanced AI can be very creative in its approach, developing many instrumental goals such as “Stay Alive”, “Maximize Raw Materials”, “Maximize Production Capacity”, and eventually “Eliminate Rivals (incl. Humans)” in its pursuit of “Improvement”. But such thought experiments sweep a huge amount of complexity and numerous hidden assumptions under the phrase “sufficiently advanced AI”. For example:
Who designed the internal sense that is the AI’s objective function?
A scary superhuman AI is often assumed to have an arbitrarily complex, yet perfectly precise method of measuring its objective. In reality, our goals are vague and can be difficult to define precisely on a grand scale.
What is the space of actions the AI can take?
A scary superhuman AI is assumed to be able to do anything, or to arbitrarily expand its capabilities. But it is very difficult to reason about actions that were not known to be possible. A paperclip maximizer capable of doing this arbitrarily well may as well simulate entire universes as sandbox environments, which kind of makes its paperclip maximization skills seem underwhelming.
How are the instrumental goals created, and how are they scheduled or balanced?
This hypothetical maximizer needs to create instrumental goals and somehow not get sidetracked by them. It must have a remarkable amount of dedication and meta-AI knowledge to create these goals without missing the forest for the trees. And arguably, intelligent entities that don’t feel a need to be wholly focused on a singular goal are not necessarily less advanced than intelligent entities that do feel such a need.
This last point is important – the textbook AI definition, as broad as it is, excludes intelligent agents that act on shifting, poorly defined, or internal goals. Humans do exhibit this level of richness, and so should generic sentient beings.
Multiple Agents in a Tug of War
We can get a much richer (and more chaotic) set of behaviors if we allow generics to house more than one agent. How can a generic act in the interest of multiple, possibly conflicting goals? We can consider each agent as something like a force, where the action of the generic is the net outcome of these forces. Each agent has an intrinsic pool of energy which is expended when the agent exerts its influence on a generic’s behavior. When the lens performs event interpretation of the resulting actions, each agent may or may not replenish energy depending on the evaluations that are produced. The agent can choose to use the energy immediately, save it for specific scenarios, or otherwise spend its energy judiciously according to some strategy.
In principle, the evaluations that replenish energy (the source evaluations) may or may not match the evaluations that an agent tries to generate (the sink evaluations). But clearly, an agent with no energy may as well not exist. It is usually more useful to consider cases where there is considerable alignment between source and sink evaluations, for the simple reason that such agents act to prevent themselves from running out of energy.
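One way to picture the mechanism is the toy Python below. The linear “force” rule, the specific numbers, and the agent and evaluation names are all illustrative assumptions of mine; the sketch only aims to show agents spending energy to influence the chosen action and replenishing it when the lens’s interpretation produces their source evaluations.

```python
from dataclasses import dataclass, field

# Toy sketch of the tug-of-war (all names, numbers, and rules are illustrative).

@dataclass
class Agent:
    name: str
    sink_evaluations: set    # evaluations the agent tries to generate
    source_evaluations: set  # evaluations that replenish its energy
    energy: float = 1.0
    preferences: dict = field(default_factory=dict)  # action -> strength of push

    def force_on(self, action):
        # An agent with no energy exerts no force (it "may as well not exist").
        return min(self.energy, self.preferences.get(action, 0.0))

def choose_action(agents, actions):
    """The generic's action is the net outcome of the agents' forces."""
    totals = {a: sum(agent.force_on(a) for agent in agents) for a in actions}
    chosen = max(totals, key=totals.get)
    for agent in agents:
        agent.energy -= agent.force_on(chosen)  # exerting influence costs energy
    return chosen

def interpret_and_replenish(agents, produced_evaluations, gain=0.5):
    """Event interpretation by the lens feeds energy back to agents whose
    source evaluations were produced."""
    for agent in agents:
        agent.energy += gain * len(agent.source_evaluations & produced_evaluations)

# Usage: two conflicting agents whose source and sink evaluations coincide.
rest = Agent("rest", {"Comfort"}, {"Comfort"}, preferences={"nap": 0.8})
work = Agent("work", {"Progress"}, {"Progress"}, preferences={"write": 0.6})
action = choose_action([rest, work], ["nap", "write"])
interpret_and_replenish([rest, work], {"Comfort"})  # the lens interprets the nap
print(action, round(rest.energy, 2), round(work.energy, 2))  # -> nap 0.7 1.0
```

In this toy, the rest agent spends most of its energy winning the first action, but the “Comfort” evaluation produced by the lens partly restores it, which is exactly the aligned source/sink loop described above.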
It is important to note that many events are outside a generic’s control, so even if these agents take part in positive feedback loops to increase their own energy, random external events will generate a wide variety of evaluations, which may prevent any one agent from dominating. Also don’t forget that internal senses can participate in the interaction, and that the internal senses may depend on any part of the generic mind, including the behaviors of the agents. This gives a recursive flavor to the whole system and explains why actions can be entirely internally motivated. Agents can have strong preferences for or against actions even when the external effects of the actions are negligible, simply because the agents are tuned into changes picked up by the internal senses (e.g. in self-image, perceived social face, etc.).
Lens to Agent, Agent to Lens
All agents depend on the lens for value prediction and event interpretation, so changes in the lens can considerably affect the balance of power between the agents. Particularly of note is the effect of overgeneralization. If a lens develops a tendency to produce a certain evaluation more often than usual, then energy can be redistributed in a skewed manner. The generic will act in unusual ways reflecting the new balance of power between the agents, which could result in a self-fulfilling prophecy effect as the generic puts itself into situations where the mistakenly overproduced evaluation is further reinforced.
But no matter how skewed it can sometimes be, the lens is a disinterested statistical machine. If the lens is put into scenarios where its overgeneralization tendencies are revealed through a series of incorrect value predictions, it will correct itself and the balance between the agents will be restored. Closely related is the complex behavior specialization effect, where generics engaging in more complex (but not wholly unpredictable) behaviors tend to distribute energy between more agents. This is because the lens tends to use more varied evaluations to more accurately explain the behavior of the generic, resulting in more variety in how agents replenish and expend energy. Said more simply, instead of explaining all behaviors under the evaluation “Uh, Just Because?”, the accuracy-seeking lens encourages a generic to produce finer-grained evaluations for explaining its behavior, which in turn allows energy to be shared across more agents.
Footnotes
1. Which were posted nearly 3 years ago! I am coming back from a hiatus caused by prioritizing other work. You may notice that I have changed writing styles, making my posts more curt and technical. This is because I felt that the previous style was hard to maintain and diluted the content too much.
2. This sounds an awful lot like the current state of AI. Regardless of how much meaning humans put into an AI agent’s decisions, to the AI itself the decisions are simply the ones observed to maximize some goal function, which is usually some measure of predictive accuracy.
3. A superhuman AI that attempts to maximize a factory’s production of paperclips, which eventually grows beyond the factory and threatens humanity in trying to create more paperclips than humans will ever need. See the linked article for more information.