Natural Latents and Aesthetic Categorization

How are we arriving at this project and why is it interesting/useful?

Well, let me introduce natural latents, a concept from johnswentworth that I still need to learn more about. The rough idea is that any intelligence might converge on similar abstractions about the world: the most useful and efficient ones. The minimum required reading on natural latents:

Natural latents are pretty handy. If a variable is a natural latent over some parts of a system, then I know it’s the smallest summary of everything about one part relevant to the others, and I know it’s informationally the largest thing which I can learn from a typical subset of the chunks. That makes such latents natural for agents to structure their cognition and language around.

A quick intuitive check for whether something is a natural latent over some parts of a system consists of two questions:

  • Are the parts (approximately) independent given the candidate natural latent?
  • Can the candidate natural latent be estimated to reasonable precision from any one part, or any typical subset of the parts?
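To make the two questions concrete, here is a minimal sketch on a toy system of my own invention (it is not taken from the quoted post). The candidate latent is a coin's hidden bias, either 0.2 or 0.8, and the three "parts" are separate batches of flips of that coin. The names `sample_system` and `check_natural_latent`, the bias values, and the batch size are all illustrative assumptions. Correlation is used only as a rough stand-in for dependence, so this is an intuition pump rather than a real test of the formal conditions.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, used here as a crude proxy for dependence."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample_system(rng, n_flips=50):
    """Draw a hidden bias (the candidate latent) and three batches of flips (the parts)."""
    bias = rng.choice([0.2, 0.8])  # hypothetical latent values
    parts = [sum(rng.random() < bias for _ in range(n_flips)) for _ in range(3)]
    return bias, parts  # each part is a head count


def check_natural_latent(n_samples=4000, n_flips=50, seed=0):
    rng = random.Random(seed)
    samples = [sample_system(rng, n_flips) for _ in range(n_samples)]

    # Question 1: are the parts (approximately) independent given the latent?
    # Without conditioning, parts 0 and 1 move together because they share a bias.
    marginal = pearson([p[0] for _, p in samples], [p[1] for _, p in samples])
    # Within each bias group, that shared cause is held fixed.
    conditional = {}
    for b in (0.2, 0.8):
        group = [p for bias, p in samples if bias == b]
        conditional[b] = pearson([p[0] for p in group], [p[1] for p in group])

    # Question 2: can the latent be estimated from any one part alone?
    accuracy = []
    for i in range(3):
        hits = sum(
            (0.8 if p[i] / n_flips > 0.5 else 0.2) == bias for bias, p in samples
        )
        accuracy.append(hits / n_samples)

    return marginal, conditional, accuracy


if __name__ == "__main__":
    marginal, conditional, accuracy = check_natural_latent()
    print("correlation between parts, ignoring the latent:", round(marginal, 3))
    print("correlation between parts, given the latent:", conditional)
    print("accuracy of guessing the latent from each single part:", accuracy)
```

In this toy setup the parts should look strongly correlated until you condition on the bias, at which point the correlation should fall to roughly zero, and any single batch should be enough to guess the bias almost every time. That is the "yes, yes" pattern the two questions are looking for.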

The way I’m going to use the words natural and latent is as follows:

A latent is a variable that represents underlying structure in a system, summarizing relevant information about some aspects of the system in a way that supports prediction or inference. It is not directly observed but inferred from the system’s observable parts.

Different choices of latents may exist for the same system. Some might be more mathematically optimal, while others might be heuristically used in practice.

A natural latent is a latent that arises as the mathematically optimal way to summarize dependencies in a system. It is the most efficient and principled way to structure information about a system’s parts, minimizing redundancy while preserving all relevant details.

So latents are basically abstractions, and a category is a kind of abstraction.

My claim is that people don’t form categories which perfectly correspond to natural latents. Instead, their aesthetic preferences often lead them to a categorization that differs from the ‘natural’ one.

See my observation (which is poorly worded) from two years ago where I say all morality can be reduced to aesthetics. I still believe this. In this light, understanding aesthetic categorization preferences has the potential to be useful in aligning artificial intelligence with human values.

A preferred aesthetic categorization can, in theory, match a natural latent exactly, and it is usually influenced by natural latents. I certainly do not want to say that what makes a categorization preference an aesthetic preference is that it differs in some way from the natural categorization.

This leads me back around to a tricky question from last time: what makes an aesthetic preference different from any other preference? What does aesthetic mean here? I will come back to this.

I also want to point out that the set of most efficient abstractions is not universal but depends on perspective, context, goals, etc. So it’s important to pinpoint exactly what natural latents are and aren’t, given this variation.