Epistemology Sequence, Part 4: Updating Ontologies Based On Values

What’s a good ontology?

Well, the obvious question is, good relative to what?

Relative to your values, of course. In the last post I talked about how, given an ontology for describing the world, you can evaluate a situation. You compute the inner product

$$V \cdot f(N)$$

where V represents a value function on each of the concepts in your ontology, and f(N) is a function of which concept nodes are “active” in a situation, whether by direct perception, logical inference, predictive inference, association, or any other kind of linkage.  For instance, the situation of being a few yards away from a lion will activate nodes for “tan”, “lion”, “danger”, and so on.
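
As a minimal sketch of the evaluation described above, assume each concept node is either active (1) or inactive (0) and that the value function assigns one number per concept. The concept list and the numeric values below are made up for illustration; they are not taken from the earlier post.

```python
# Hypothetical concept nodes and per-concept values (illustrative numbers only).
CONCEPTS = ["tan", "lion", "danger", "food"]
V = {"tan": 0.0, "lion": -1.0, "danger": -5.0, "food": 2.0}


def activation(active_nodes):
    """f(N): 1.0 for each concept node that is active, 0.0 otherwise."""
    return [1.0 if c in active_nodes else 0.0 for c in CONCEPTS]


def evaluate(active_nodes):
    """Inner product of the value vector V with the activation vector f(N)."""
    f = activation(active_nodes)
    return sum(V[c] * a for c, a in zip(CONCEPTS, f))


# Being a few yards from a lion activates "tan", "lion", and "danger".
lion_nearby = {"tan", "lion", "danger"}
score = evaluate(lion_nearby)
print(score)
```

A graded activation (say, 0.3 for a weak association) would work the same way; the binary version is just the simplest assumption.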

If you can evaluate situations, you can choose between actions. Among all actions, pick the one that has the highest expected value.
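
Here is a rough sketch of that decision rule. It assumes each action leads to a handful of possible situations with known probabilities, and that each situation has already been scored. The action names, probabilities, and scores are hypothetical.

```python
# Hypothetical actions: each maps to a list of (probability, situation value) pairs.
ACTIONS = {
    "run_away": [(0.9, -1.0), (0.1, -6.0)],
    "stand_still": [(0.5, 0.0), (0.5, -6.0)],
    "approach": [(1.0, -6.0)],
}


def expected_value(outcomes):
    """Probability-weighted average of the situation values."""
    return sum(p * v for p, v in outcomes)


def best_action(actions):
    """Pick the action whose outcomes have the highest expected value."""
    return max(actions, key=lambda a: expected_value(actions[a]))


print(best_action(ACTIONS))
```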

One particular action that you might take is changing your ontology.  Suppose you add a new node to your network of concepts.  Probably a generalization or a composition of other nodes. Or you subtract a node.  How would you decide whether this is a good idea or not?

Well, you build a model using your current ontology of what would happen if you did that. You’d take different actions.  Those actions would lead to different expected outcomes. You can evaluate how much you like those outcomes using your current ontology and current values.
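
One way to make this concrete, as a toy sketch under strong assumptions: suppose the agent currently has only an "animal" concept and is considering adding a "lion" node. It simulates the policy it would follow under each ontology, then scores the resulting outcomes with its current values. Every name and number here is hypothetical.

```python
# Situations the agent may face, with assumed probabilities.
SITUATIONS = {"lion": 0.2, "house_cat": 0.8}
ACTIONS = ["flee", "approach"]

# Outcomes are described and scored using the CURRENT ontology and values.
CURRENT_VALUES = {"injured": -10.0, "safe": 0.0, "petted_cat": 1.0}


def outcome(situation, action):
    """What actually happens in the world for a situation/action pair."""
    if action == "flee":
        return "safe"
    return "injured" if situation == "lion" else "petted_cat"


def perceive(situation, ontology):
    """Most specific concept the ontology offers for this situation."""
    return situation if situation in ontology else "animal"


def best_policy(ontology):
    """For each percept, the action with the highest expected current value."""
    groups = {}
    for s, p in SITUATIONS.items():
        groups.setdefault(perceive(s, ontology), []).append((s, p))
    return {
        percept: max(
            ACTIONS,
            key=lambda a: sum(p * CURRENT_VALUES[outcome(s, a)] for s, p in members),
        )
        for percept, members in groups.items()
    }


def ontology_score(ontology):
    """Expected current value of acting on the policy this ontology supports."""
    policy = best_policy(ontology)
    return sum(
        p * CURRENT_VALUES[outcome(s, policy[perceive(s, ontology)])]
        for s, p in SITUATIONS.items()
    )


current = {"animal"}
candidate = {"animal", "lion"}  # proposed change: add a "lion" node
print(ontology_score(current), ontology_score(candidate))
```

In this toy case the finer ontology scores higher, because the agent can flee lions and still approach house cats. The comparison is made entirely in terms of outcomes the current values already know how to score.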

For modeling the world, the kinds of things you might optimize for are accuracy (how often your model comes up with correct predictions) and simplicity (how few degrees of freedom are involved). In machine learning this is often implemented with a loss function consisting of an error term and a regularization term; you choose the model that minimizes the loss function.
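
A small sketch of that kind of loss function, with stated simplifications: the error term is mean squared error, and the regularization term is a penalty proportional to the number of parameters, standing in for "degrees of freedom". In practice an L1 or L2 penalty on the weights is more common. The data, candidate models, and penalty weight are invented for illustration.

```python
def predict(weights, x):
    """Polynomial model: weights[i] is the coefficient of x**i."""
    return sum(w * x**i for i, w in enumerate(weights))


def error_term(weights, data):
    """Mean squared prediction error (the accuracy part of the loss)."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def loss(weights, data, lam):
    """Error term plus a simplicity penalty on the number of parameters."""
    return error_term(weights, data) + lam * len(weights)


# Made-up observations that are roughly, but not exactly, linear.
data = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.1), (3.0, 2.9)]

# Hypothetical candidate models, from simplest to most flexible.
candidates = {
    "constant": [1.5],
    "linear": [0.0, 1.0],
    "cubic": [0.1, 1 / 3, 0.6, -2 / 15],  # passes through every data point
}

lam = 0.05  # assumed trade-off between accuracy and simplicity
best = min(candidates, key=lambda name: loss(candidates[name], data, lam))
print(best)
```

The cubic has the smallest error term, but the penalty for its extra parameters outweighs the gain, so the linear model minimizes the total loss at this value of `lam`.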

Notice that, in general, changing your ontology is changing your values. You can’t prioritize “civil rights” if you don’t think they exist.  When you learn that there are other planets besides the Earth, you might prioritize space exploration; before you learned that it was possible, you couldn’t have wanted it.

The question of value stability is an important one. When should you self-modify to become a different kind of person, with different values?  Would you take a pill that turns you into a sociopath?  After all, once you’ve taken the pill, you’ll be happy to be free of all those annoying concerns for other people.  Organizations or computer programs can also self-modify, and those modifications can change their values over time.  “Improvements” meant to increase power or efficacy can cause such agents to change their goals to those that present-day planners would find horrifying.

In the system I’m describing, proposed changes are always evaluated with respect to current values.  You don’t take the sociopath pill, because the present version of you doesn’t want to be a sociopath. The only paths of self-modification open to you are those where future states (and values) are backwards-compatible with earlier states and values.

The view of concepts as clusters in thingspace suggests that the “goodness” of a concept or category is a function of some kind of metric of the “naturalness” of the cluster.  Something like the ratio of between-cluster to within-cluster variance, or the size of the margin to the separating hyperplane.  The issue is that choices of metric matter enormously.  A great deal of research in image recognition, for example, involves competing choices of similarity metrics. The best choice of similarity metric is subjective until people agree on a goal — say, a shared dataset with labeled images to correctly identify — and compete on how well their metrics work at achieving that goal.
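
For instance, the ratio of between-cluster to within-cluster variance could be computed roughly as follows. This is a sketch for one-dimensional points using plain squared distance, which is itself one of the metric choices the paragraph warns about. The points and groupings are made up.

```python
def mean(xs):
    return sum(xs) / len(xs)


def variance_ratio(clusters):
    """Between-cluster variance divided by within-cluster variance (1-D points)."""
    all_points = [x for c in clusters for x in c]
    grand_mean = mean(all_points)
    n = len(all_points)
    # Spread of the cluster centers around the overall mean, weighted by size.
    between = sum(len(c) * (mean(c) - grand_mean) ** 2 for c in clusters) / n
    # Spread of the points around their own cluster's center.
    within = sum((x - mean(c)) ** 2 for c in clusters for x in c) / n
    return between / within


# Two ways of carving up the same six points.
natural = [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]]
contrived = [[1.0, 3.0, 12.0], [2.0, 11.0, 13.0]]

print(variance_ratio(natural), variance_ratio(contrived))
```

The grouping that separates low values from high values gets a much larger ratio than the one that mixes them. Rescaling or re-weighting the features would change both numbers.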

The “goodness” or “aptness” of concepts is a real feature of the world. Some concepts divide reality at the joints better than others. Some concepts are “natural” and some seem contrived. “Grue” and “bleen” are awkward, unnatural concepts that no real human would use, while “blue” and “green” are natural ones. And yet even blue and green are not human universals: the Japanese ao refers to both blue and green, and 17th-century English speakers thought lavender was “blue” but we don’t. The answer to this supposed puzzle is that the “naturalness” of concepts depends on what you want to do with them. It might be more important to have varied color words in a world with bright-colored synthetic dyes, for instance; our pre-industrial ancestors got by with fewer color words. The goodness of concepts is objective, in the sense that there is a checkable, empirical fact of the matter about how good a concept is, but only relative to a goal, which may depend on the individual agent. Goals themselves are relative to ontology. So choosing a good ontology is actually an iterative process; you have to build it up relative to your previous ontology.

(Babies probably have some very simple perceptual concepts hard-coded into their brains, and build up more complexity over time as they learn and explore.)

It’s an interesting research problem to explore, in “toy” computational situations, when major changes in ontology are desirable. The early MIRI paper “Ontological Crises in Artificial Agents’ Value Systems” is a preliminary attempt to look at this problem, and says essentially that small changes in ontologies should yield “near-isomorphisms” between utility functions. But there’s a great deal of work to be done (some of which already exists) on robustness under ontological changes: when is the answer spit out by a model going to remain the same under perturbation of the number of variables of that model? What kinds of perturbations are neutral, and what kinds are beneficial or harmful? Tenenbaum’s work on learning taxonomic structure from statistical correlations is somewhat in this vein, but keeps the measure of “model goodness” separate from the model itself, and doesn’t incorporate the notion of goals. I anticipate that additional work on this topic will have serious practical importance, given that model selection and feature engineering are still labor-intensive, partly subjective activities, and that greater automation of model selection will turn out to be valuable in technological applications.

Most of the ideas here are from ItOE; quantitative interpretations are my own.

3 thoughts on “Epistemology Sequence, Part 4: Updating Ontologies Based On Values”

  1. >An “ideal” ontology would be some kind of a fixed point; a structure that no longer wanted to self-modify.

    Attractors in mind space could be a major risk in the astronomical waste sense. We don’t know if we are right next to one. Strong attractors would be ones surrounded by incentives to self-modify in their direction and to minimize concerns that you are headed for a local maximum that will be hard to escape from.

  2. Hi. I’m enjoying this entire epistemology sequence!

    One small quibble with this entry, though: I think most of the time when considering a new ontology, the previous ontology + values won’t have anything to say about it.

    An example: you are a robot in a corridor. Your starting ontology O1 says that positions in the corridor correspond to integers between 1 and 100, inclusive. Your values say to go to the highest numbered position you can, U: O1 -> R, U(i) = i.

    Then you find out that there are ten times as many positions as you thought: the new ontology O2 corresponds to the integers between 1 and 1000. The investigation that produced O2 also gave you a map f: O2 -> O1, f(i) = ceil(i/10).

    What do O1 and U say about O2? Nothing, as far as I can tell. Even if you grant the robot the ability to run simulations of how it would behave under O2 plus a not yet specified utility function U2, it seems like for any reasonable U2, the robot will go to an O2-position which projects via f to the same O1-position it went to before.

    (I have a post at https://plus.google.com/u/0/+ThomasColthurst/posts/5fcw5wgVrpj which talks about this stuff, and specifically discusses how to get U2 using meta-preferences and ideas from anti-aliasing in image/signal analysis.)

    • I think this is exactly how it’s supposed to work; many new ontologies should spit out the same optimal decision, because they’re describing the same world in different terms. In such a case an agent would be indifferent between choice of ontologies (or you’d set up some kind of rule regarding how to break the symmetry.)

      I like your thoughts on meta-preferences! I may need to talk shop with you some more. (I’m at srconstantin@gmail.com)
