Link: Towards a scale-free theory of intelligent agency
Theory of intelligent agency
Unifying mathematical framework describing
- Understanding the world, e.g., via some kind of internal world model
- Influencing the world
- The former could be some kind of update function applied to the world model when receiving observational data
- The latter could be an action-output function that takes the observational data and world model and returns an action.
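A minimal sketch of this two-function picture (all names and types here are my own invention, just to make the shape concrete):

```python
from dataclasses import dataclass
from typing import Any, Callable

# Hypothetical minimal agent: an internal world model plus the two
# functions above (an update function and an action-output function).
@dataclass
class Agent:
    world_model: dict
    update: Callable[[dict, Any], dict]  # (model, observation) -> new model
    act: Callable[[dict, Any], Any]      # (model, observation) -> action

# Toy instantiation: the model just counts observations,
# and the action simply reports the count.
agent = Agent(
    world_model={"count": 0},
    update=lambda m, obs: {"count": m["count"] + 1},
    act=lambda m, obs: m["count"],
)

for obs in ["a", "b", "c"]:
    agent.world_model = agent.update(agent.world_model, obs)

print(agent.act(agent.world_model, None))  # 3
```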
Scale-free:
- The formal framework can be parameterized to different levels of scale
- Accurately models convergent dynamics that occur in real-world agents at different levels of scale: simple programs, people, companies, countries
Candidate #1: Expected Utility Maximization
- The economics one everyone knows
- Two disadvantages:
Treats goals and beliefs as separate, when they are entangled in practice.
- E.g., an agent internally represents its goals through the language of concepts it has about the world – in order for an agent to know it's a "paperclip maximizer", it must have some internal concept of a paperclip. Even this assumes the world model is cleanly separable into discrete concepts like "paperclip", which most agents don't have.
This framework is not powerful enough to describe the process of learning heuristics that enable agents to reason in high-dimensional environments.
Moreover, this theory often models heuristics as inputs to a reasoning process, whereas in reality these heuristics can often be better thought of as functions themselves that the agent doesn’t necessarily “understand.”
Candidate #2: Active Inference
- While EUM assumes beliefs and goals are exogenous inputs, active inference describes how beliefs and goals are acquired over time. They are modeled as being part of the system itself.
- I need to read much more about this theory on my own, and think my current understanding is pretty incomplete.
- Active Inference / FEP at a High Level:
Models brains as hierarchical networks similar to neural networks (although more graphical): lower layers predict more granular concepts about observations (“lines” in an image, for example) while higher layers predict more abstract concepts (“tigers”, “animals”)
Each layer tries to predict the activity of the layer below it
Actions are taken to minimize the expected free energy over future outcomes: essentially a loss function between what each layer predicts and what it receives
The same prediction-error signal is also used to update the layers themselves.
Rewards are encoded into this framework as being priors on preferred states, $\propto\exp(r(s))$
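The reward-to-prior mapping is just a softmax over states. A quick numeric sketch (the states and reward values are made up):

```python
import math

# Hypothetical rewards over three states.
rewards = {"s1": 0.0, "s2": 1.0, "s3": 2.0}

# Preferred-state prior: p(s) is proportional to exp(r(s)).
z = sum(math.exp(r) for r in rewards.values())
prior = {s: math.exp(r) / z for s, r in rewards.items()}

print(prior)  # higher-reward states get exponentially more prior mass
```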
- Reconciles bottom-up signals with top-down predictions:
- Signals: bottom-up. Raw pixel input turns into shapes, which turn into objects we recognize.
- Predictions: top-down. If we know the image shows someone revving a motorcycle, we can predict it will start moving. That prediction turns from a concept into each of the pixels we would expect to see as the motorcycle starts moving.
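A toy version of this prediction-error loop, collapsed to a single scalar "layer" (all numbers invented; a real hierarchy would stack many such updates):

```python
# Toy predictive-coding update: a top-down prediction is pulled toward
# bottom-up signals by gradient steps on the squared prediction error.
prediction = 0.0       # top-down belief about the incoming signal
learning_rate = 0.1

signals = [1.0] * 50   # bottom-up input: a constant stimulus

for signal in signals:
    error = signal - prediction          # prediction error (the "loss" at this layer)
    prediction += learning_rate * error  # update the belief to reduce the error

print(round(prediction, 3))  # converges close to 1.0
```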
Towards a scale-free unification
Active inference can't easily model strategic interaction between different goals within a single agent (whereas game theory gives some framework for doing so in EUMs)
This is a useful desideratum for modelling internal conflict.
In order to model conflict / cooperation of different internal goals, our framework, in some sense, needs to be compositional. Internal conflict could be interpreted as strategic contests between decomposed subagents.
We can call such “decomposable” agent frameworks coalitional agents.
Coalitional Agents in EUM
- Harsanyi's utilitarian theorem: any aggregation of EUMs into a larger-scale EUM must have a utility function that is a (weighted) sum of the subagents' utilities
- This assumes Pareto indifference: if every subagent is indifferent between two lotteries, so is their collective agent
- Too constraining
- Scott Garrabrant proposes rejecting the independence axiom of the VNM axioms as a potential solution
- Also, this only works from a "god's-eye" perspective where you know everyone's utility functions; it becomes much harder as a mechanism-design problem where subagents can lie.
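Harsanyi's weighted-sum aggregation in miniature (the subagents, utilities, and weights are all made up for illustration):

```python
# Two subagents with utility functions over outcomes, plus aggregation weights.
utilities = {
    "a": {"x": 1.0, "y": 0.0},
    "b": {"x": 0.0, "y": 1.0},
}
weights = {"a": 0.7, "b": 0.3}

# Harsanyi: the coalition's utility function is a weighted sum
# of the subagents' utilities.
def coalition_utility(outcome):
    return sum(w * utilities[agent][outcome] for agent, w in weights.items())

print(coalition_utility("x"), coalition_utility("y"))  # 0.7 0.3
```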
Incentive-compatible Coalitional Agents
- Idea: bargain over general decision-making procedures
- Possible procedures:
- Prediction markets
- Delegation: delegate particular domains to particular subagents
- Voting
- Randomization (pick a random agent, they choose)
- A hierarchical agent decides given input from lower-level agents
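Two of these procedures in toy form, with invented subagents and preferences (prediction markets and hierarchical aggregation are harder to sketch this briefly):

```python
import random
from collections import Counter

# Hypothetical subagents, each ranking options best-first.
preferences = {
    "planner":  ["save", "spend"],
    "hedonist": ["spend", "save"],
    "saver":    ["save", "spend"],
}

# Voting: each subagent votes for its top option; plurality wins.
def vote(prefs):
    tally = Counter(ranking[0] for ranking in prefs.values())
    return tally.most_common(1)[0][0]

# Randomization: pick a random subagent and let it choose.
def random_dictator(prefs, rng=random.Random(0)):
    chooser = rng.choice(sorted(prefs))
    return prefs[chooser][0]

print(vote(preferences))             # 'save' (2 votes to 1)
print(random_dictator(preferences))  # whichever subagent the seed picks
```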