Natural Latents and Aesthetic Categorization
How are we arriving at this project and why is it interesting/useful?
Let me introduce natural latents, a concept from johnswentworth that I still need to learn more about. The rough idea is that any intelligence might converge on similar abstractions about the world, namely the most useful and efficient ones. The minimum required reading on natural latents:
Natural latents are pretty handy. If a variable is a natural latent over some parts of a system, then I know it’s the smallest summary of everything about one part relevant to the others, and I know it’s informationally the largest thing which I can learn from a typical subset of the chunks. That makes such latents natural for agents to structure their cognition and language around.
A quick intuitive check for whether something is a natural latent over some parts of a system consists of two questions:
- Are the parts (approximately) independent given the candidate natural latent?
- Can the candidate natural latent be estimated to reasonable precision from any one part, or any typical subset of the parts?
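To make the two questions concrete for myself, here is a minimal sketch on a toy system of my own choosing (not an example from johnswentworth's posts). Assume a coin whose bias is either 0.2 or 0.8 with equal probability, and two "parts" that are each the number of heads in 30 flips of that coin. The candidate latent is the bias. The first question becomes "is the conditional mutual information I(X1; X2 | Λ) close to zero?" and the second becomes "is the conditional entropy H(Λ | X1) close to zero?". All names and numbers here (`build_joint`, the biases, `n = 30`) are illustrative assumptions.

```python
from itertools import product
from math import comb, log2


def binom_pmf(k, n, p):
    """Probability of k heads in n flips of a coin with heads-probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def build_joint(biases, prior, n):
    """Joint distribution over (latent, x1, x2): two head-counts from the same coin."""
    joint = {}
    for lam, p_lam in zip(biases, prior):
        for x1, x2 in product(range(n + 1), repeat=2):
            joint[(lam, x1, x2)] = p_lam * binom_pmf(x1, n, lam) * binom_pmf(x2, n, lam)
    return joint


def marginal(joint, keep):
    """Sum out every coordinate whose index is not listed in `keep`."""
    out = {}
    for key, p in joint.items():
        sub = tuple(key[i] for i in keep)
        out[sub] = out.get(sub, 0.0) + p
    return out


def entropy(dist):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def parts_dependence_given_latent(joint):
    """Question 1: I(X1; X2 | L) = H(L,X1) + H(L,X2) - H(L,X1,X2) - H(L)."""
    return (
        entropy(marginal(joint, (0, 1)))
        + entropy(marginal(joint, (0, 2)))
        - entropy(joint)
        - entropy(marginal(joint, (0,)))
    )


def parts_dependence_without_latent(joint):
    """For contrast: I(X1; X2) = H(X1) + H(X2) - H(X1,X2)."""
    return (
        entropy(marginal(joint, (1,)))
        + entropy(marginal(joint, (2,)))
        - entropy(marginal(joint, (1, 2)))
    )


def latent_uncertainty_given_one_part(joint):
    """Question 2: H(L | X1) = H(L,X1) - H(X1)."""
    return entropy(marginal(joint, (0, 1))) - entropy(marginal(joint, (1,)))


# Assumed toy numbers: two possible biases, equal prior, 30 flips per part.
joint = build_joint(biases=(0.2, 0.8), prior=(0.5, 0.5), n=30)
print("I(X1; X2 | latent):", parts_dependence_given_latent(joint))
print("I(X1; X2):         ", parts_dependence_without_latent(joint))
print("H(latent | X1):    ", latent_uncertainty_given_one_part(joint))
```

In this toy setup the first quantity is zero up to floating-point error, because the parts are conditionally independent by construction, while the parts on their own share a lot of information. The interesting dial is the second quantity: with fewer flips per part, or biases closer together, a single part pins down the bias less well and the candidate latent passes the second question less convincingly.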
The way I’m going to use the words natural and latent is as follows:
A latent is a variable that represents underlying structure in a system, summarizing relevant information about some aspects of the system in a way that supports prediction or inference. It is not directly observed but inferred from the system’s observable parts.
Different choices of latents may exist for the same system. Some may be closer to mathematically optimal, while others are heuristics that people use in practice.
A natural latent is a latent that arises as the mathematically optimal way to summarize dependencies in a system. It is the most efficient and principled way to structure information about a system’s parts, minimizing redundancy while preserving all relevant details.
So latents are basically abstractions, and a category is a kind of abstraction.
My claim is that people don’t form categories which perfectly correspond to natural latents. Instead, often their aesthetic preferences lead them to prefer a categorization which is different from the ‘natural’ one.
See my observation (which is poorly worded) from two years ago where I say all morality can be reduced to aesthetics. I still believe this. In this light, understanding aesthetic categorization preferences has the potential to be useful in aligning artificial intelligence with human values.
A preferred aesthetic categorization can in principle match a natural latent exactly, and it is usually influenced by natural latents. I certainly do not want to say that what makes a categorization preference an aesthetic preference is that it differs in some way from the natural categorization.
This leads me back around to a tricky question from last time: what makes an aesthetic preference different from any other preference? What does aesthetic mean here? I will come back to this.
I also want to point out that the set of most efficient abstractions is not universal but depends on perspective, context, goals, etc. So it’s important to pinpoint exactly what natural latents are and aren’t, given this variation.