Are Natural Abstractions Robust to Ontology Shifts?
It’s often said that one of the motivators or desiderata for natural abstraction is robustness to ontology shifts.
Phrased like that, it almost seems impossible—doesn’t an ontology shift by definition entail throwing away your old abstractions and replacing them with new (hopefully better) ones?
Well, Guaranteed Translatability (though it does have some strong assumptions which we need to grapple with) wants to say something like this: if two different agents are both modeling a certain section of the world correctly, then you should be able to translate both ways between their concepts which explain that section.
This works with one agent which is learning, comparing between two different times: if an agent had a good model of a part of the world back in the day with its naive ontology, and now it has a bigger or different ontology which is still making the same predictions over this part of the world, its old concepts should be expressible in terms of the new concepts and vice versa!
In other words, you have guaranteed translatability between the ontologies of past and future versions of the same agent, insofar as both versions of the agent make the same predictions and are using natural abstractions; and you can restrict this to a subset of observables.
So two agents don’t have to agree on absolutely everything in the world, just on one bit of it; and they don’t get translatability everywhere, just with regards to this particular bit of the world which they agree on.
Well that seems really nice, maybe too good to be true—is it true? I said this is what Guaranteed Translatability wants to say—what does it say?
I’ll give the statement in jargon first, and then unpack it. Guaranteed Translatability says that if two agents both have their own (approximate) natural latents over a shared set of observables, and they both have the same probability distributions over those observables, then there is an (approximate) bijection between their natural latents.
Now let’s see how this formal statement could hold without implying the stronger intuitive claim from earlier.
- could be that agents looking at the same part of the world have different observables (understanding what an observable is)
- could be that