Why go insane? Some preliminary notes


The ability to go insane is certainly something that could evolve, if it had some utility. For a rational agent with a certain utility function, the act of going insane would entail some combination of:

1. Modifying its own beliefs.

2. Modifying its own utility function.

One might suppose that an agent having such an ability would simply decide to believe "whatever happens, I win". However, this is not something that should evolve. A rational agent should only decide to go insane if it believes that doing so is in its best interests from the point of view of its current beliefs and utility function.

Still, on the face of it, both of these are incredibly stupid things to do. If you believe something that is not true, you might act on the basis of that belief and do something that harms your interests. Modifying your utility function is even worse: you are unlikely to maximize your initial "true" utility function by maximizing some different utility function.

However, suppose that this agent is interacting with other agents that have some ability to read its mind. I'm not supposing telepathy, just the ability to infer internal thought processes from the agent's actions. To make this easier, they might first take the agent down to the pub and get it drunk. Humans do seem to have some ability to do this.

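This creates the possibility, which the agent will be aware of, that others respond not only to its actions but also to its beliefs and goals. In that case, changing a belief or a goal can change how others behave towards the agent, and that change may be worth more than the cost of acting on a distorted belief or utility function.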
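To make this concrete, here is a minimal sketch using an ultimatum game, which is my own choice of toy example and not the only way to model the idea. A proposer splits some coins and the agent either accepts the split or rejects it, in which case both get nothing. The names (`sane_utility`, `spiteful_utility`, `true_payoff`), the figure of 10 coins and the size of the spite penalty are all hypothetical. The point is only that the result is always scored with the agent's original utility function, whichever utility function it actually acts on.

```python
TOTAL = 10  # coins to split; a hypothetical figure


def sane_utility(offer, accepted):
    """The agent's original utility: the coins it ends up with."""
    return offer if accepted else 0


def spiteful_utility(offer, accepted):
    """A modified ("insane") utility: accepting less than half
    feels worse than getting nothing at all."""
    if accepted and offer < TOTAL // 2:
        return -1
    return offer if accepted else 0


def accepts(utility, offer):
    """The responder accepts only if accepting is strictly better for it."""
    return utility(offer, True) > utility(offer, False)


def best_offer(believed_utility):
    """The proposer offers whatever leaves it the most coins, given the
    utility function it believes the responder has."""
    def proposer_payoff(offer):
        return TOTAL - offer if accepts(believed_utility, offer) else 0
    return max(range(TOTAL + 1), key=proposer_payoff)


def true_payoff(actual_utility, believed_utility):
    """Score the outcome with the agent's ORIGINAL utility function."""
    offer = best_offer(believed_utility)
    return sane_utility(offer, accepts(actual_utility, offer))


# Mind is readable and the agent stays sane: it is offered the bare minimum.
print(true_payoff(sane_utility, sane_utility))          # 1
# Mind is readable and the agent has gone insane: it is offered half.
print(true_payoff(spiteful_utility, spiteful_utility))  # 5
# Mind is opaque (proposer assumes sanity): insanity just costs the agent.
print(true_payoff(spiteful_utility, sane_utility))      # 0
```

Under these assumptions, going insane is the better choice by the agent's own original standards, but only when the proposer can see the change. If the agent's mind is opaque, the modified utility function simply makes it throw away a coin.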

One nice thing about the ability to go insane is that it provides the start of an explanation of where utility functions come from. When making an AI, one has to specify a utility function, and it is hard to consider an AI with a fixed utility function as being truly intelligent or conscious, even if that utility function has been artificially evolved. However, an AI that can change its goals may be in with a chance.

In a society of such agents, is there a fixed point for the utility function? Will any society of agents with this ability tend to some specific utility function? This would seem to be our society's inescapable fate, so let's hope it is a nice one. I am cautiously optimistic.

By the way, there are a number of basic misconceptions that need to be addressed before having a sensible discussion about this topic:

Finally, there is sometimes a notion that once a human level of intelligence is achieved, we can somehow transcend these mechanistic game-theoretic models of behavior. Peter Singer is one person I've seen play with this opinion. I'm not sure if Dawkins goes that far too. Christians are pretty keen on it. Free will.

A non-mechanistic theory is (I think) necessarily not a comprehensible one, nor one that can be conveyed by language. I can't prove that there is not something like this at work. I'm not even sure how to go about asking the right question about it. What I am hoping to show is that it is not a necessary thing to suppose. We can form an argument that some kind of motivation higher than pure selfishness will bootstrap itself into existence without the help of a by-definition-incomprehensible influence.