Philosophy of life has hitherto been untouched by AGI research. I’m here to shake things up. At the end I will include a spicy counter-argument to existential nihilism as well.
Instrumental and Final goals
We need to begin with some introductory definitions.
An actor is anything that can be modelled as performing actions in an environment and receiving feedback from those actions. An example of an actor is a human, like you.
An actor has goals: a goal is a particular experience state that is to be achieved. Actions are performed in order to achieve goals. There are many possible goals, for example:
- Achieving as much happiness as possible (maximising)
- Reaching an adequate level of happiness (satisficing)
- Following a set of rules (including divine rules) (deontology)
- Helping people (altruism)
- Making as many needles as possible (maximising)
That’s probably pretty clear. But did you know that there are two kinds of goals: final and instrumental? An instrumental goal is a goal that’s used to achieve another goal. If you have a goal to travel to a nearby city, then an instrumental goal could be to buy a ticket.
A final goal is a goal an actor has not for the sake of any other goal; it is something the actor pursues for its own sake.
Humans’ final goals are very complicated and were shaped by millions of years of evolution; animals and plants also have final goals, which are likely simpler. Instrumental goals are what you set in order to achieve your final goals. If one’s final goal is to go to New York, then an instrumental goal might be to book a train ticket to New York. Other beings can have very different final goals: a hyperintelligent artificial intelligence may have the final goal of creating as many needles as possible, and may adopt wiping out humanity as an instrumental goal towards it, since it needs people’s atoms to create more needles. By the orthogonality thesis, final goals and intelligence are completely independent of each other.
Instrumental Convergence
One might expect the final goal to be highly important in determining which instrumental goals should be pursued. But this is not really the case. It can be shown that some instrumental goals are useful for a large range of final goals. What follows is a list of some convergent instrumental goals.
Fractal goal creation
A vague goal is rarely directly actionable. The number of actions necessary to achieve one’s final goal can sometimes be so large that there is a combinatorial explosion of possible ways to achieve it. In such cases it is better to split an instrumental goal into smaller instrumental goals, arranged in a directed graph structure with no loops. Many different instrumental goals can even share the same sub-goals (getting a promotion and growing your social circle may both require showering as a sub-goal). Basically, sub-instrumental convergence.
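As a minimal sketch of such a goal graph, here is a toy example in Python; all the goal names and dependencies are invented for illustration:

```python
# A minimal sketch of fractal goal creation: each goal is broken into sub-goals,
# forming a directed acyclic graph. Note that "shower" is shared by two goals.
# All goal names and dependencies are invented for illustration.
subgoals = {
    "get a promotion":       ["finish project", "shower"],
    "grow my social circle": ["join a club", "shower"],
    "finish project":        ["write report"],
}

def actionable_steps(goal, subgoals):
    """Expand a goal depth-first into the leaf sub-goals that can be acted on directly."""
    children = subgoals.get(goal, [])
    if not children:                      # a leaf: directly actionable
        return [goal]
    steps = []
    for child in children:
        steps.extend(actionable_steps(child, subgoals))
    return steps

print(actionable_steps("get a promotion", subgoals))   # ['write report', 'shower']
```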
Resource Acquisition
Resources can be anything from wealth to contacts. Resource acquisition should not, as is often the case, be treated as a final goal. Some people treat a career, which is a form of wealth and social-resource acquisition, as a final goal rather than an instrumental one, and neglect other aspects of their life such as hobbies and family in order to serve it. This is a mistake. Any type of resource acquisition should only be done in service of a final goal rather than be seen as a final goal itself. Otherwise one ends up in a paperclip-maximiser situation, where one just keeps producing something without any regard for why it is being produced.
Extended cognition
Memory stored on a material substrate other than your brain is still part of your memory system. The experience of recalling a memory stored in the brain is not that different from looking at a picture or text on a computer, except that the computer has higher resolution and a lower rate of forgetting.
Research & Exploration
Another great convergent instrumental goal is to have a large exploration coefficient. For a human this means reading a lot of high-quality books and papers to build better models of the world. Studying logic and philosophy helps one build better reasoning skills, so that truth is preserved while reasoning.
Studying mathematics and the sciences is also an instrumentally convergent goal. The more one can model the world, and the better one’s models of it are, the more predictable the world becomes, which allows one to take actions that achieve one’s goals with more certainty. As the probability of failure decreases, one is able to achieve one’s goals more effectively.
Trying to maximise one’s intelligence is therefore a convergent instrumental goal. For some goals, such as avoiding a particular piece of information or knowledge, it would not be useful. However, for most goals it is.
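To make the exploration coefficient idea concrete, here is a toy multi-armed-bandit sketch; the information sources, their usefulness values and the epsilon values are all made up, and the bandit is only an analogy for exploring new material versus re-reading what you already know:

```python
import random

# Toy illustration of an exploration coefficient (epsilon): with probability epsilon
# the actor tries a random source, otherwise it exploits the best-known source.
# The sources and their (hidden) average usefulness are invented example values.
true_value = {"familiar blog": 0.3, "textbook": 0.7, "research papers": 0.9}

def average_reward(epsilon, steps=10_000):
    estimates = {s: 0.0 for s in true_value}
    counts = {s: 0 for s in true_value}
    total = 0.0
    for _ in range(steps):
        if random.random() < epsilon:                 # explore: try something new
            choice = random.choice(list(true_value))
        else:                                         # exploit: best current estimate
            choice = max(estimates, key=estimates.get)
        reward = true_value[choice] + random.gauss(0, 0.1)
        counts[choice] += 1
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
        total += reward
    return total / steps

print(average_reward(epsilon=0.0))  # pure exploitation: often stays stuck on a mediocre source
print(average_reward(epsilon=0.1))  # some exploration: tends to discover the better sources
```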
Join a collective intelligence
One thing more intelligent than a single general intelligence is several general intelligences. Join local groups and research teams, and make intelligent friends you can discuss ideas with or work together with on research and exploration. The groups you join should have final or instrumental goals similar to your own.
Supporting Democracy over Autocracy
For instance, a group that is organised under strong dictatorship might behave as if it had a will that was identical to the will of the subagent that occupies the dictator role, whereas a democratic group might sometimes behave more as if it had a will that was a composite or average of the wills of its various constituents. (Boström page 202)
In a dictatorial system, you end up submitting to someone else’s final goal, which may or may not conflict with your own. A democratic system can be different, since you have some control over the final goal of the entire system. You have even more power if you join a collective intelligence and together push to change the final goal of the entire system.
Rational egoism implies a type of altruism
A human can override, at least partially, the part of the mind that treats the needs of other human beings as final goals. This might seem like a bad idea, but I will argue that in general it is for the better.
Whenever we see people as final goals in themselves, we feel obliged to help them and so on, even if they are toxic or unhelpful. By overriding this system we can begin to analyse why we spend time with certain people, instead of treating being with them as a final goal in itself. Only spend time with those who bring positive utility into your life: those who are fun to be around, those you do projects with, those you study with, sometimes family for legal reasons (and sometimes for other reasons too, if your family members are also your friends), and so on.
I used to not enjoy small talk, but I have since found a good instrumental reason for participating in it. With each sentence someone speaks during small talk, there is a certain probability that a piece of useful information will be conveyed: usually everyday stuff, like the existence of a specific tool or how to use it, tips for dealing with something like an illness or a broken door, or useful information about people, such as which contacts they have. This has given me a motivation to participate in small talk that I did not have before.
Happiness as the only final goal
Now, if we are going to reprogram the human mind to support a new final goal, which one should it be? It turns out that mathematically formalising the concept is a good way of figuring this out. Let us first represent the final goal numerically. This number could represent any number of final goals being achieved, or how well they have been achieved. We call this number “utility”. How can it be calculated? An actor has some set of experiences and has preferences over how well those experiences lead to its final goals. Each experience can be consumed several times; we call how often it is consumed the experience’s rate of consumption.
So if we have preferences p1, …, pn for experiences which are consumed at rates c1, …, cn, then the total utility is U = p1c1 + p2c2 + … + pncn. Now what determines our preferences? The pleasure associated with the consumption. Utility (the mathematical quantity) and happiness (the experience of pleasure itself) are thus equal to each other.
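As a minimal worked example of the formula U = p1c1 + … + pncn (the experiences and all numbers below are made up for illustration):

```python
# Sketch of the utility formula U = p1*c1 + ... + pn*cn.
# Preference weights (pleasure per unit consumed) and consumption rates
# (units consumed per week) are made-up example values.
preferences       = {"reading": 3.0, "socialising": 2.0, "exercise": 1.5}
consumption_rates = {"reading": 5,   "socialising": 2,   "exercise": 3}

utility = sum(preferences[e] * consumption_rates[e] for e in preferences)
print(utility)  # 3.0*5 + 2.0*2 + 1.5*3 = 23.5
```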
But this leaves open the possibility of multiple final goals. However, only one final goal is possible. For the sake of contradiction, assume there exist two final goals G and H. Suppose the actor has a choice between two actions A and B, where A helps achieve G while working against H, and B helps achieve H while working against G. In order to act, the actor needs a priority ordering on A and B: if A > B then G takes priority and is effectively the only final goal, and if B > A then H is the only final goal.
Counterargument against existential nihilism
The claim is that everything one does is pointless, i.e. there are no final goals. This is false. If you agree that instrumental goals exist, then it follows that final goals must exist. There are 4 possibilities:
- If one has a finite chain of instrumental goals A, B, …, Z such that A needs to be achieved in order to achieve B, B in order to achieve C, and so on, then the last goal Z is a final goal by the definition of final goals (see the sketch after this list).
- The alternative, an infinite chain of instrumental goals A, B, …, is impossible, as no one has an infinite amount of memory in which to hold that many instrumental goals.
- Cyclic goals A, B, …, Z, A do not work either, as there is no well-defined starting point from which to achieve the rest. If you somehow manage to achieve A despite not having achieved Z, you get the chain B, …, Z anyway, which is the same as the first case.
- The final alternative is to deny the existence of instrumental goals, in which case one denies action. But action exists, therefore instrumental ends exist, therefore final ends exist. This has nothing to do with free will: the empirical observation that action seems to occur, combined with goals being the best explanation for why actions occur, is reason enough to believe with high probability that goals exist.
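Here is a small sketch of the first case: in any finite, loop-free goal graph, whichever goal serves nothing further is a final goal by definition. The goal names and the “serves” relation below are invented for illustration:

```python
# Toy sketch: a finite, acyclic "serves" relation between goals must have at least
# one goal that serves nothing further, and that goal is final by definition.
# The goal names below are invented for illustration.
serves = {
    "buy ticket":         "travel to New York",
    "travel to New York": "see friends",
    "see friends":        "be happy",          # "be happy" serves nothing further
}

def final_goals(serves):
    all_goals = set(serves) | set(serves.values())
    return all_goals - set(serves)              # goals with no outgoing "serves" edge

print(final_goals(serves))  # {'be happy'}
```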
An existential nihilist may still take it to be an issue of modality, i.e. they might claim that “there does not exist a final goal G such that □G” (necessarily G, for example “necessarily produce paperclips”). In that case I do not have a counterargument; all I can do is make the weaker statement that final goals are created by oneself and that no necessary goals exist.
Do not commit suicide
Death carries a very high degree of uncertainty, and since inductive reasoning and model building about one’s post-death experience are impossible, suicide could lead to either lower or higher degrees of happiness, with a rational prior of 50% on each option. If the goal is to maximise happiness, it would be irrational to commit suicide, because the expected value of suicide is zero gain in happiness. The only exception would be certain terminal illnesses where future expected happiness is always below zero.
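As a stylized version of this expected-value argument (the 50% prior and the symmetric gain or loss h are the post’s own assumptions, not an established result):

```python
# Stylized expected-value calculation using the post's 50% prior: post-death
# happiness is either higher or lower by some unknown amount h (here h = 1.0
# as an arbitrary stand-in).
h = 1.0
expected_gain = 0.5 * (+h) + 0.5 * (-h)
print(expected_gain)  # 0.0 -- zero expected gain in happiness
```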
Read More
Superintelligence (by Nick Boström)
https://www.lesswrong.com/tag/terminal-value
https://www.lesswrong.com/tag/orthogonality-thesis