© 2002 Cambridge University Press
ALTRUISM AND SELFISHNESS
Howard Rachlin
Psychology Department
State University of New York
Stony Brook, New York, 11794-2500
(212) 632-7807
e-mail: howard.rachlin@sunysb.edu
(13,641 words in all)
Revision: 8/01
Long Abstract: Many situations in human life present choices
between (a) narrowly preferred particular alternatives and (b) narrowly less
preferred (or aversive) particular alternatives that nevertheless form part of
highly preferred abstract behavioral patterns. Such alternatives characterize
problems of self-control. For example, at any given moment, a person may accept
alcoholic drinks yet also prefer being sober to being drunk over the next few
days. Other situations present choices
between (a) alternatives beneficial to an individual and (b) alternatives that
are less beneficial (or harmful) to the individual that would nevertheless be
beneficial if chosen by many individuals. Such alternatives characterize
problems of social cooperation; choices of the latter alternative are generally
considered to be altruistic. Altruism, like self-control, is a valuable
temporally-extended pattern of behavior. Like self-control, altruism may be
learned and maintained over an individual’s lifetime. It needs no special
inherited mechanism. Individual acts of altruism, each of which may be of no
benefit (or of possible harm) to the actor, may nevertheless be beneficial when
repeated over time. However, because each selfish decision is individually
preferred to each altruistic decision, people can benefit from altruistic
behavior only when they are committed to an altruistic pattern of acts and
refuse to make decisions on a case-by-case basis.
Short Abstract: Many situations in human life present choices between particular and abstract
alternatives. Such choices characterize both problems of self-control and
problems of social cooperation. Choices of social good at a cost to the
particular individual are generally considered to be altruistic. Altruism, like
self-control, is a valuable temporally-extended pattern of behavior. Like
self-control, altruism may develop over an individual’s lifetime. It needs no
special inherited mechanism. Individual acts of altruism, each of which may be
costly to the actor, may nevertheless be beneficial when repeated over time.
However, because each selfish decision is individually preferred to each
altruistic decision, people can benefit from altruistic behavior only when they
are committed to an altruistic pattern of acts and refuse to make decisions on
a case-by-case basis.
Key Words: addiction, altruism, commitment, cooperation, defection, egoism, impulsiveness,
patterning, prisoner’s dilemma, reciprocation, reinforcement, selfishness,
self-control
ALTRUISM AND SELFISHNESS
1. Introduction
Biological Compatibility. Altruism and
selfishness, like free will and determinism, seem to be polar opposites. Yet,
as with free will and determinism (Dennett, 1984), the apparent incompatibility
may be challenged by various forms of compatibility. From a biological
viewpoint selfishness translates into survival value. Evolutionary biologists
have been able to reconcile altruism with selfishness by showing how a
biological structure mediating altruistic behavior could have evolved. (The next
section will briefly summarize one such demonstration.) This structure is
assumed to be more complex than ordinary mechanisms that mediate selfish
behavior but in essence is no different from them. The gazelle that moves
toward the lion (putting itself in danger but showing other gazelles where the
lion is) may thus be seen as acting according to the same principles as the
gazelle that takes a drink of water when it is thirsty. The desire to move
toward the lion stands beside the desire to drink.
Evolutionary biologists do not
conceive of behavior itself being passed from generation to generation; rather,
some mechanism, in this case an internal mechanism – a structure of nervous
connections in the brain – is hypothesized to be the evolving entity. Altruism
as it appears in behavior is conceived as the action of that mechanism
developed over the lifetime of the organism. Tooby and Cosmides (1996, p. 125)
compare the structure of the altruism mechanism to that of the eye: “We think
that such adaptations will frequently require complex computations and suspect
that at least some adaptations for altruism may turn out to rival the
complexity of the eye.”
This biological compatibility makes
contact with modern cognitive and physiological psychology (Sober & Wilson,
1998). Cognitive psychology attempts to infer the mechanism’s principles of
action (its software) from behavioral observation and manipulation while
physiological psychology attempts to investigate the mechanism itself (its
hardware).
From the biological viewpoint,
altruistic acts differ from selfish acts by virtue of differing internal
mediating mechanisms; altruism becomes a motive like any other. In this view, a
person leaves a tip in a restaurant to which he will never return because of a
desire for fairness or justice, a desire generated by the restaurant situation
and the altruistic mechanism within him, which is satisfied by the act of
leaving the tip. Similarly, he eats and drinks at the restaurant because of
desires generated by internal mechanisms of hunger and thirst. For the
biologist, Person A’s altruistic behavior (behavior that benefits others at a
cost to A) would be fully explained if Person A were shown to possess the
requisite internal altruistic mechanism. Once the mechanism were understood, no
further explanation would be required.
The problem with this conception,
from a behavioral viewpoint, is not that it postulates an internal mechanism as
such. (After all, no behavior is possible without internal neural structure.)
The problem is that in focusing on an inherited internal mechanism, the role of
learning over an organism’s lifetime tends to get ignored. To develop normally,
eyes have to interact with the environment. But we inherit good eyesight or bad
eyesight. If our altruism mechanisms are like our visual mechanisms we are
doomed to be more or less selfish depending on our genetic inheritance. This is
a sort of genetic version of Calvinism. Experience might aid in the development
of altruistic mechanisms. Environmental constraints imposed by social
institutions – family, religion, government – might act on selfish motives
(like glasses on eyesight) to make them conform to social good. But altruistic
behavior as such, according to biological theory, would depend (as eyesight depends)
much more on genes than on experience. The
present article does not deny the existence of such mechanisms. A large part of human altruism and a still
larger part of nonhuman altruism may well be explained in terms of inherited
mechanisms based on genetic overlap.
However, the mechanisms underlying these behaviors would have evolved
individually. The mechanism responsible
for the ant’s self-sacrifice in defense of a communal nest would differ from
that responsible for a mother bear’s care for her cubs. There remains some fraction of altruistic
action, especially among humans, that cannot be attributed to genetic
overlap. For the remainder of this
article I will symbolize such actions by the example of a woman who runs into a
burning building to save someone else’s child.
I mean this example to stand for altruistic actions not easily
attributed to genetic factors.
Biological compatibility attributes such an act not to a specific
mechanism for running into burning buildings but to a general mechanism for
altruism itself. The present article will argue that it is unnecessary to
postulate the existence of such a general mechanism. I claim, first, that altruism may be learned over an individual’s
lifetime and, second, that it is learned in the same way that self-control is
learned – by forming particular acts
into coherent patterns of acts. The
woman who runs into a burning building to save someone else’s child does so not
by activating an innate self-sacrificing tendency but by virtue of the same
learning process she uses to control her smoking, drinking or weight.
Behavioral Compatibility. For biological compatibility, selfishness translates into survival
value; for behavioral compatibility, selfishness translates into reinforcement.[1] From a behavioral viewpoint, an altruistic act is not motivated, as an
act of drinking is, by the state of an internal mechanism; it is rather a
particular component that fits into an overall pattern of behavior. Given this,
the important question for the behaviorist is not, “What reinforces a
particular act of altruism?” – for this particular act may not be reinforced;
it may never be reinforced; it may be punished – but, “What are the patterns of
behavior that the altruistic act fits into?”
To
explain why a woman might risk her life to save someone else’s child it would
be a mistake to look for current or even future reinforcers of the act itself.
By definition, as an altruistic act, it is not reinforced. In economic terms,
adding up its costs and benefits results in a negative value. Some
behavioristic analyses of altruism have tried to explain particular altruistic
acts in terms of delayed rather than immediate reinforcement (Ainslie, 1992;
Platt, 1973). But delayed reinforcers, after being discounted, may have
significant present value, even for nonhumans (Mazur, 1987). If the present
value of a delayed reward is higher than the cost of the act, it is hard to see
how the act can be altruistic. It is certainly not altruistic of the bank to
lend me money just because I will pay them back later rather than now. If the
woman who risked her life to run into the burning building to save someone
else’s child were counting on some later reward or sequence of rewards to
counterbalance her risk (say ten million dollars, to be paid over the next ten
years, offered by the child’s parents), her action would be no more altruistic
than that of the bank when it lends me money.
This
narrow behavioral view of altruism has been criticized by social psychologists
(Edney, 1980, for example) but the criticism focuses mostly on the behaviorism
rather than on the narrowness of the view. These critics have merely replaced,
as an explanatory device, the present action of delayed rewards with the
present action of internal mechanisms. I argue here that it is a mistake to
look for the cause of a specific altruistic act either in the environment or in
the interior of the organism. Rather, the cause of the altruistic act is to be
found in the high value (the reinforcing value, the survival value, the
function) of the act as part of a pattern of acts, or as a habit (provided habit is seen
as a pattern of overt behavior extended in time rather than, as it is sometimes
seen in psychology, as an internal state). According to the present view, a woman
runs into a burning building to save someone else’s child (without the promise
of money) not because she is compelled to do so by some internal mechanism nor
because she has stopped to calculate all costs and benefits of this particular
act; if she did stop to calculate she would arrive at a negative answer and not
do the act. Rather, this act forms part of a pattern of acts in her life, a
pattern that is valuable in itself, apart from the particular acts that compose
it. The pattern, as a pattern of overt behavior, is worth so much to her that
she would risk dying rather than break it.
Biological compatibility says that a
particular altruistic act is itself of high value by virtue of an inherited
general altruistic mechanism. Learning
would enter into the development of altruism, according to biological
compatibility, only in the minimal sense that a baby has to learn how to
eat. The mechanism is there, the
biologist says; you need only to learn how to use it. Behavioral compatibility says, on the other hand, that the
altruistic act itself is of low value and remains of low value. What is highly valued is a temporally
extended pattern of acts into which the particular act fits. The role of the hypothesized internal
altruistic mechanism in biological compatibility – to provide a motive for
otherwise unreinforced particular acts – is taken, in behavioral compatibility,
by the highly valued pattern of acts.
Learning of altruism, the behavioral compatibilist says, is learning to
perform relatively low valued particular acts as part of a highly valued
pattern. Thus, from a behavioral
viewpoint, particular altruistic acts are not in themselves fundamentally
selfish; rather, an altruistic act is selfish only by virtue of the high value
of the pattern.
Teleological Behaviorism. The kind of behaviorism that this view embodies is called
“teleological behaviorism” (Baum, 1994; Rachlin, 1994; Stout, 1996).
Aristotle’s psychology and ethics are behavioristic in this teleological sense:
for Aristotle, a particular action has no meaning by itself; the meaning of an
action resides in habits of overt behavior as they are played out in time, not
in internal mechanistic or spiritual events; whether a particular act is good
or bad depends on the habit into which it fits. In Aristotle’s conception of
science, habits are final causes of the particular acts that comprise them. While a
particular ethical act may be caused (in the sense of efficient cause) by the action of an internal mechanism,
it is caused (in the sense of final cause)
by an abstract pattern of overt behavior. It is the final cause that determines
whether the particular act is good or bad, altruistic or selfish.
Teleological behaviorism retains
Aristotle’s final-cause system of explanation in psychology. For example, it
explains motives in terms of habits rather than habits in terms of motives. It
is at least arguable that we will not be able to uncover the mechanisms
underlying altruistic behavior until we gain a clear idea of what altruistic
behavior is in its own terms – as a kind of habit. That is the purpose of this
target article.
Outline. Altruism and selfishness were introduced in Section 1 as apparently
contradictory but nevertheless compatible behaviors. Particular altruistic acts
are compatible with a larger selfishness – selfishness on a more abstract
level. The introduction is followed in Section 2 by a discussion of group
selection, a biological compatibility between altruism of the individual
relative to other members of a group and selfishness (increased survival) of
group members relative to those of other groups. Section 3 draws an analogy
between group selection and self-control; just as particular acts of
self-sacrifice are compatible with a more abstract benefit to a group of individuals,
so particular unreinforced acts are compatible with a more abstract long-term
benefit to the individual. Section 4 tightens the analogy with more formal
definitions of both self-control and altruism. Whether a person acts
impulsively or selfishly on the one hand versus temperately or altruistically
on the other depends on the degree to which that person structures particular
acts in patterns. Such structuring is discussed in Section 5 on commitment. If
the analogy between self-control and altruism reflects a fundamental
correspondence, altruism may be explained as self-control has been explained –
as a choice between high valued particular acts and higher valued patterns of
acts. Section 6 describes how the principles of reinforcement and punishment, which
have been used to determine the value of self-control alternatives, may apply
to social cooperation. Section 7 presents an experiment showing that behavior
in a laboratory social-cooperation game depends strongly on the game’s context. Sections 8 and 9 deal with potential
objections. Section 8 claims that
altruism cannot be fully explained in biological terms, without the concept of
reinforcement. Section 9 claims that altruism
cannot be fully explained in Skinnerian terms, without the concept of intrinsic
reinforcement of behavioral patterns. Section 10 concludes that altruism as
well as self-control involves organization of behavior in patterns and choosing
among patterns as wholes.
2. Group Selection
Biologists have speculated that the degree of common
interest between organisms is fundamentally reflected in their shared genes
(Dawkins, 1976). The innate tendency of
any organism to sacrifice its own interests for those of another organism would
then depend on the degree to which their genes overlapped. To the degree that closeness of familial
relationship correlates with genetic overlap, innate altruism should be
greatest within families and decrease as overlap decreases in the population. The behavior of a mother who ran into a
burning building to save her own child would thus be
explained. But the many documented cases of altruism with respect to strangers
(that of saints, heroes, and the like) would not be explained. Why would a mother ever run into a burning
building, risking her own life (100% genetic overlap with herself), to save someone else’s child?
Some principle other than genetic
overlap seems to be necessary to explain the inheritance of an altruism that
goes beyond the family. Recently, Sober & Wilson (1998) described such a
principle – “group selection of altruism.”
To understand group selection you first have to understand a kind of
social contingency called, “The Prisoner’s Dilemma.” An example of a prisoner’s
dilemma game (in this case, a multi-person prisoner’s dilemma) is a game that I
have, for the last ten years or so, been playing with the audience whenever I
present the results of my research at university colloquia or conferences. I
begin by saying that I want to give the audience a phenomenal experience of
ambivalence. Index cards are then
handed to 10 randomly selected people and the others are asked to imagine that
they had gotten one of the cards. They
choose among hypothetical monetary prizes by writing either Y or X on the card. The rules of the
game (projected on a screen behind me while I talk) are as follows:
1. If you choose Y
you get $100 times N.
2. If you choose X
you get $100 times N plus a bonus of $300.
3. N equals the
number of people (of the 10) who choose Y.
Then
I point out the consequences of each choice as follows: “You will always get
$200 more by choosing X than by choosing Y. Choosing X
rather than Y decreases N by 1 (Rule #3), costing you $100; but if you choose X you also gain the $300 bonus (Rule #2). This results in a $200 gain for choosing X. Logic therefore says that you
should choose X, and any lawyer would advise you to do so. The problem is that if you all followed the advice of your
lawyers and chose X, N = 0, and each of you would get $300; while if you all ignored the
advice of your lawyers and chose Y, N = 10 and each of you
would get $1,000.” Sometimes, depending
on the audience, I illustrate these observations with a diagram like Figure 1
(bold labels).
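The arithmetic of the lecture game can be checked with a short sketch. The function name `payoff` and the constant names are illustrative choices, not part of the original demonstration; the dollar amounts and the definition of N come from Rules 1 to 3 above.

```python
# Payoff rules of the 10-person game as stated in Rules 1-3.
GROUP_SIZE = 10
PER_COOPERATOR = 100   # dollars earned per person who chooses Y
DEFECTION_BONUS = 300  # extra dollars for choosing X (Rule 2)


def payoff(choice: str, others_choosing_y: int) -> int:
    """Return one player's payoff given how many of the other 9 chose Y."""
    if choice not in ("X", "Y"):
        raise ValueError("choice must be 'X' or 'Y'")
    # N counts every Y chooser, including this player if they chose Y (Rule 3).
    n = others_choosing_y + (1 if choice == "Y" else 0)
    base = PER_COOPERATOR * n
    return base + DEFECTION_BONUS if choice == "X" else base


# Whatever the other nine players do, X pays $200 more than Y.
advantages = [payoff("X", k) - payoff("Y", k) for k in range(GROUP_SIZE)]

all_defect = payoff("X", 0)                   # everyone chooses X, so N = 0
all_cooperate = payoff("Y", GROUP_SIZE - 1)   # everyone chooses Y, so N = 10

print(advantages, all_defect, all_cooperate)
```

The list `advantages` holds the individual gain from defecting for each possible number of other cooperators, while the last two values compare universal defection with universal cooperation.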
Then I ask the 10 people holding
cards to make their choices, imagining as best they can what they would choose
if the money were real, and letting no one else see what they have chosen. Then I collect the cards and hold them until
I finish my lecture. I have done this
demonstration or its equivalent dozens of times with audiences ranging from
Japanese psychologists to Italian economists.
The result is about an even split between cooperation
(choosing Y) and defection (choosing X), indicating
that the game does create ambivalence. Although the money won by members of my
audiences is entirely hypothetical, significant numbers of subjects in similar
experiments in my laboratory, with real albeit lesser amounts of money, have
also chosen Y.[2]
Figure
1 (labels in bold typeface) represents the contingencies of the prisoner’s
dilemma game that I ask my audience to play. Point A represents the
condition where everyone cooperates. Point C represents the condition where everyone defects. The line from A to C represents the average (hypothetical) earnings per person at each
value of N (the inverse of the
x-axis). Clearly, the more people who
cooperate, the greater the average earnings. But, as is shown by the two lines, ABC (representing the
return to each player who defects) and ADC (representing the
return to each player who cooperates), an individual always earns more by
defecting than cooperating.
Suppose,
instead of hypothetically giving money to each player, I instead pooled the
money each player earned (still hypothetical) and donated it to the
entertainment fund of whatever institution I was lecturing at. Given this common interest it would now pay
for every individual to choose Y; a choice of Y by any individual would increase N by 1 for all 10 players, gaining $1,000 at a
cost of the individual player’s $300 bonus, for a net gain to the pool of
$700. A common interest thus tends to
reinforce cooperation in prisoner’s dilemma games.
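The $700 figure for the pooled version can be verified in a few lines. This is a minimal sketch: the helper name `pooled_total` is hypothetical, and the only inputs are the payoff rules of the lecture game.

```python
GROUP_SIZE = 10


def pooled_total(n_cooperators: int) -> int:
    """Total hypothetical dollars in the pool when n players choose Y."""
    # Each Y chooser earns $100 * N; each X chooser earns $100 * N + $300.
    cooperators_earn = n_cooperators * (100 * n_cooperators)
    defectors_earn = (GROUP_SIZE - n_cooperators) * (100 * n_cooperators + 300)
    return cooperators_earn + defectors_earn


# Change in the pool when one more player switches from X to Y.
marginal_gains = [pooled_total(n + 1) - pooled_total(n) for n in range(GROUP_SIZE)]
print(marginal_gains)
```

Each switch from X to Y raises N by 1 for all 10 players (adding $1,000) and forfeits one $300 bonus, so every entry of `marginal_gains` is the same net gain to the pool.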
Figure 1. Contingencies of 10-person prisoner's dilemma experiment. Italic typeface: Contingencies of self-control, "primrose path", experiment (1 player, successive choices) to be described later. [In brackets]: Contingencies faced by alcoholic. In all three cases, particular choices of X [having a single drink] are always worth more than particular choices of Y [refusing a single drink] yet on the average it is better to choose Y [to drink at a low rate].
Group
selection relies on common interest. A
highly simplified version of group selection runs as follows: Consider a
population of organisms divided into several relatively isolated groups
(tribes, for example). Within each tribe there are some altruists and some
selfish individuals (“egoists”) interacting with each other repeatedly in
multi-person prisoner’s-dilemma-like games such as the one with which I
introduce my lectures, except instead of monetary reward the players receive
more or less fitness – ability to reproduce.
In these games the altruists tend to cooperate while the egoists tend to
defect. Within each group (as in the
prisoner’s dilemma) altruists always lose out to egoists. However, those groups originally containing
many altruists grow much faster than those originally containing many egoists –
because cooperation benefits the group more than defection does.
Consider
the case of teams, such as basketball teams, playing in a league. It is commonly accepted that, all else being
equal, teams with individual players who play unselfishly will beat teams with
individual players who play selfishly; however, within each team, the most
selfish players will score the most points.
Imagine now, instead of scoring points and winning or losing games, the
teams competed for reproductive fitness.
Then the number of players on teams with a predominance of unselfish
players would grow rapidly while that of teams with a predominance of selfish
players would grow slowly or (in competition for scarce resources) shrink – the
group effect. Although, within each team,
selfish players would still increase faster than unselfish ones (the individual
effect), this growth could well be overwhelmed by the group effect. As time goes on, the absolute number of unselfish individuals (altruists) could increase faster
across the whole population than the absolute
number of egoists even though within each group the relative number of altruists
decreases. If the groups remained
rigidly divided, eventually, because the relative number of altruists is always
decreasing within each group, the absolute number would begin to decrease as
well. However, if, before this point is
reached, the groups mixed with each other and then re-formed, the process would
begin all over again and altruists might maintain or increase their gains. Again, this is a highly simplified version of
the argument. But the essential point
is that while individual altruists may always be at a disadvantage relative to
egoists, groups of altruists may be at an advantage relative to groups of
egoists.
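The interplay of the individual effect and the group effect can be illustrated with one generation of a toy model. Everything numerical here is an assumption made for illustration (two groups of 100, a fitness benefit proportional to the share of altruists in the group, and a fixed cost paid only by altruists); none of these values comes from Sober and Wilson or from the text.

```python
def next_generation(altruists, egoists, benefit=1.0, cost=0.1):
    """Return (altruists, egoists) after one round of reproduction.

    Assumed fitness rule: everyone gains `benefit` times the group's
    altruist share; only altruists pay `cost`.
    """
    share_altruists = altruists / (altruists + egoists)
    altruist_fitness = 1.0 + benefit * share_altruists - cost
    egoist_fitness = 1.0 + benefit * share_altruists
    return altruists * altruist_fitness, egoists * egoist_fitness


def share(altruists, egoists):
    """Fraction of altruists in a group or population."""
    return altruists / (altruists + egoists)


# Hypothetical starting groups: one mostly altruist, one mostly egoist.
groups = [(80.0, 20.0), (20.0, 80.0)]
after = [next_generation(a, e) for a, e in groups]

within_before = [share(a, e) for a, e in groups]
within_after = [share(a, e) for a, e in after]

overall_before = share(sum(a for a, _ in groups), sum(e for _, e in groups))
overall_after = share(sum(a for a, _ in after), sum(e for _, e in after))

print(within_before, within_after)
print(overall_before, overall_after)
```

Under these assumed parameters the altruist share falls inside each group (the individual effect) yet rises across the whole population, because the mostly altruist group grows much faster (the group effect).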
Nothing in the present article
argues against group selection.
Organisms may be born with greater or lesser biological tendencies to be
altruistic. But, it does not follow
from group selection that altruistic behavior is incompatible with a larger
individual selfishness. Sober and
Wilson consider only two forms of human selfishness: that selfishness which
desires maximization of consumer goods and that which desires (immediate)
“internal, psychological benefits” (p. 2).
They do not consider individual behavior in the long run and in the
abstract. They leapfrog over behavioral
contingencies that may cause behavioral change (contingencies analogous to the
group selection processes they have just developed) and proceed directly to
“delve below the level of behavior” (p. 194) to an internal cognitive mechanism
hypothesized to mediate between the biological selective process and altruistic
behavior. Their cognitive psychology
may well be correct but it is not clear how (or even whether), according to
their psychology, altruism might emerge from selfishness over an organism’s
lifetime. If it implies that we are
born with fixed proportions of selfish and altruistic motives and that
experience cannot teach us to alter those proportions then their theory is not
as optimistic as Sober and Wilson seem to think; it will not be of much use to
those of us trying, despite our weaknesses, to live a better life.
3. Altruism and Self-Control
The
contingencies of my lecture demonstration of ambivalence in a social prisoner’s
dilemma situation correspond to those of
“primrose-path” experiments with individual subjects facing an
intertemporal dilemma (Herrnstein 1991; Herrnstein & Prelec 1992;
Herrnstein et al. 1986; Heyman 1996; Kudadjie-Gyamfi 1998; Kudadjie-Gyamfi
& Rachlin 1996). In the prisoner’s
dilemma situation illustrated in Figure 1 (bold typeface) many subjects each
make a single choice between X and Y. In primrose path experiments,
on the other hand, a single subject makes repeated choices between X and Y. The rules of the primrose
path experiment, usually not told to the subjects, parallel those of the social
cooperation experiments. A typical set
of rules follows:
1. Each choice of Y gains N
points (convertible to money at the experiment’s end).
2. Each choice of X
gains N points plus a bonus of 3 points.
3. N equals the
number of Y choices in the last 10 trials.[3]
Figure 1 (labels in italic typeface) illustrates
these contingencies in a corresponding way to social cooperation. The reward for choosing X is always greater than that for choosing Y but overall reward (proportional to the ordinate of line AC) would be maximized by repeatedly choosing Y. Ambivalence (reflected in
social cooperation dilemmas as non-exclusive choice between X and Y
across subjects) would be reflected, in primrose path
experiments, as non-exclusive choice by individual subjects across trials. Indeed, in these experiments, subjects
generally distribute choices non-exclusively across X and Y.
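A small simulation makes the primrose-path contingencies concrete. It is a sketch under stated assumptions: N is taken to count Y choices among the 10 trials preceding the current one (so a choice affects only later trials), the session starts with an empty history, and the function name `run` and the three policies are hypothetical.

```python
from collections import deque

WINDOW = 10  # N counts Y choices over the last 10 trials (Rule 3)
BONUS = 3    # extra points for choosing X (Rule 2)


def run(policy, trials=100):
    """Total points earned by a policy mapping trial index to 'X' or 'Y'."""
    history = deque(maxlen=WINDOW)  # keeps only the most recent choices
    total = 0
    for t in range(trials):
        # Assumption: N is computed from the trials before the current one.
        n = sum(1 for c in history if c == "Y")
        choice = policy(t)
        total += n + (BONUS if choice == "X" else 0)
        history.append(choice)
    return total


always_y = run(lambda t: "Y")
always_x = run(lambda t: "X")
alternate = run(lambda t: "Y" if t % 2 == 0 else "X")

print(always_y, alternate, always_x)
```

On any single trial X earns 3 points more than Y, yet the policy of always choosing Y earns the most overall, with mixed policies falling in between.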
Complex
as it is, Figure 1 is a highly simplified picture of real-world
complexity. Lines AD and BC need not be parallel or straight or even monotonic (Rachlin, 1997,
2000). High rates of consumption,
harmful in one context, may be not harmful, or may be beneficial, in
others. Nevertheless, the ambivalence
represented by Figure 1 is real and captures everyday-life problems of
self-control as well as everyday social dilemmas.
The
labels in brackets in Figure 1 illustrate the application of this model to
alcoholism. Let us say that point A represents a low rate of drinking (one or two glasses of wine with
dinner). Dinner would be more
enjoyable, however, with three glasses of wine and perhaps a cocktail
beforehand (point B). But this much drinking every
evening might interfere with sleep, or cause a hangover the next morning, or be
slightly damaging to health. That is,
notwithstanding the distinct pleasure of the extra drinking, the average value
of the drinker’s state over time (line AC) would be ever
so slightly lower as rate of drinking moves one unit to the right. Further
increases in the number of drinks before, during, or after dinner (or instead
of dinner) would always be immediately preferable to continuing at the lower
rate but, if repeated day after day, would bring average value over time lower
and lower (moving to the right on line AC). Eventually, at point C, drinking would serve only to prevent the misery of descent to point D. In other words, positive reinforcement,
in going from point A to B by the social drinker having an extra drink, would have been replaced
by negative reinforcement (avoidance of point D) in staying at point C by the
alcoholic continuing to drink at a high rate.
The
model of alcoholism as represented in Figure 1 is highly simplistic. Social drinking
may be more valuable than teetotaling even in the long run. As noted above,
lines AD and BC may not be parallel or even straight (see Herrnstein & Prelec,
1992; Rachlin, 1997, 2000, for discussion of more complex cases). Nevertheless,
the model has suggested several methods of bringing behavior back from
addiction (from point C to A). These include formation of
temporally extended behavior patterns (Rachlin, 1995a; 1995b), substitution of
a “positive addiction” such as social activity for a negative addiction
(Rachlin, 1997), and manipulation of discriminative stimuli so as to signal
changes in overall value (Heyman, 1996; Rachlin, 2000).
The
existence of conflicting reinforcement at the level of particular acts versus
that of patterns of acts makes it at least conceivable that a particular
unreinforced act such as a mother’s running into a burning building to save
someone else’s child may nevertheless be reinforced as part of a pattern of
acts. A group of such acts, every one of them unreinforced (altruistic in the
strict sense), may nevertheless form a highly reinforced – a maximally
reinforced – pattern.
Just
as group selection theory postulates more than one level of selection so there
may be more than one level of reinforcement – reinforcement of particular acts
and reinforcement of groups, or patterns, of acts. Just as the behavior
maximizing benefit to the individual may conflict with the behavior maximizing
benefit to the group (which is what generates ambivalence in prisoner’s dilemma
situations) so a maximally reinforced act may conflict with a maximally
reinforced pattern of acts. I have argued (Rachlin 1995a; 2000) that this
latter type of conflict epitomizes many problems of self-control. I call this
conflict complex ambivalence, as opposed to simple ambivalence,
in which one response leads to a smaller more immediate reward while an
alternative response leads to a larger more delayed reward.[4]
Platt (1973) pointed out the
relation between “temporal traps” and “social traps.” Temporal traps are
conflicts in the individual between smaller-sooner and larger-later rewards –
situations of simple ambivalence. Social traps are conflicts between rewards
beneficial to the individual and rewards beneficial to the group. Platt
speculated that social traps could be understood as a subclass of temporal
traps. But the correspondence between the two kinds of traps breaks down when
attention is focused on particular choices (Dawes, 1980; Messick and
McClelland, 1983). These authors point out that prisoner’s dilemma problems
such as the one in my class demonstration involve immediate conflicting
consequences for the individual versus the group. The people in the audience
are faced with only one momentary choice. Where is the temporal trap? The
answer is that there is no temporal trap as long as temporal traps are limited
to conditions of simple ambivalence. However, the correspondence of altruism
and self-control is based not on simple ambivalence but on complex ambivalence;
no single choice exists in a vacuum. Assuming that their hypothetical choices are
those they would make in a real situation, the members of my audience are
making only one in a series of choices extending to their lives outside of the
lecture hall. Messick and McClelland
say (footnote 1, p. 110), “Obviously, a repeated Prisoner’s Dilemma game
requires a temporal component [that is, it can be explained in terms of
self-control] but the opposition that characterizes a social trap exists
without such repetition.” This assertion highlights a crucial difference
between teleological behaviorism and cognitive psychology. For the teleological
behaviorist there can be no social trap without repetition. All prisoner’s
dilemmas are repeated. If a person were born yesterday, played one prisoner’s
dilemma game, cooperated in that game, and then died today, it would be
impossible to say whether the person’s cooperation was truly altruistic or
just an accident or really, in some other conceivable game, a defection.
4. Definitions of Self-Control and Altruism
Moral
philosophers at least since Plato have claimed that there is a relationship
between self-control and altruism.[5] The fundamental issue
addressed by ancient Greek philosophy was the relation between particular
objects and abstract entities: abstract ideals for Plato; abstract categories
for Aristotle (Rachlin 1994; Stout 1996). The problem of self-control in cases
of complex ambivalence is a conflict between particular acts such as eating a
caloric dessert, taking an alcoholic drink, or getting high on drugs, and
abstract patterns of acts strung out in time such as living a healthy life,
functioning in a family, or getting along with friends.
Neither
self-control nor altruism is a class of particular movements, operants, or acts. Moreover, while self-control and altruism
are both relative terms, depending on alternatives rejected as well as
alternatives chosen, neither term refers to a particular choice independent of
its context. For example, an
alcoholic’s particular choice of ginger ale over scotch and soda cannot be
self-controlled unless it is embedded in a context of similar choices; if a
person chooses scotch and soda 99 times to each choice of ginger ale, the
choice of ginger ale is in no way self-controlled. The person might have been extremely thirsty at the moment when
ginger ale was chosen, or might have been trying to hide his alcoholism at that
moment, or might have made a mistake in his choice. The alcoholic’s verbal claim that he intended to control his
drinking at that moment would be taken as valid by the behaviorist only in the
light of consistent future choices of ginger ales over scotch and sodas. And
this criterion would hold regardless of the state of his nervous system,
regardless of the activity or lack of activity of any internal mechanism. For the behaviorist, self-control as such
has to lie wholly in choice behavior – but need not lie in any particular act
of choice.
Similarly,
no particular act is altruistic in itself – even a woman’s running into a
burning building and saving a child. If
the woman were normally selfish we would look for other explanations (perhaps
she was just trying to save her jewelry and only incidentally picked up the
child). A truly altruistic act
is always part of a pattern of acts (highly valued by both the actor and the
community) particular components of which are dispreferred by the actor to
their immediate alternatives. Altruistic patterns of acts are thus subsets
of self-controlled patterns. The
particular components of an altruistic pattern, like those of a self-controlled
pattern, are less valuable to the actor than are their immediate alternatives;
however, in the case of altruistic acts, they are also more valuable to the
community than are their immediate alternatives.
Self-control
may be defined more formally as follows: If two alternative activities are
available, a relatively brief activity lasting t units of time, and a longer activity lasting T units of time, where T = nt and n is a positive number greater than one, a self-control problem occurs
when two conditions are satisfied:
1. The whole longer activity is preferred to n repetitions of the brief activity, and
2. The brief activity is preferred to a t-length fraction of the longer activity.
By “brief activity” and “long
activity” I mean classes of activities perhaps not identical in topography but
classified functionally, as Skinner (1938) defined operant class. For example,
eating a steak dinner at a restaurant and drinking a malted at a lunch counter
might be counted as repetitions of the same brief activity – eating
high-calorie food. The long activity
would be going through a period of time (a day, a month, a year) without eating
high-calorie foods. The choice of the longer activity over a series of choices
of the shorter activity is self-control.[6]
According
to this definition, the “self” underlying self-control is not an internal
entity, spiritual or mechanistic, containing a person’s mental life (including
a more or less powerful “will”). Such an entity would imply what Parfit (1971)
calls “personal continuity,” a concept he believes we would be better off
abandoning. Rather, the self is conceived as existing contingently in a series
of overlapping temporal intervals during which behavior occurs in patterns
(what Parfit calls “contingent personal interactions”). People’s “selves”
would thus evolve and change over their lifetimes, as these patterns evolved
and changed, as a function of social and non-social reinforcement.
Social
cooperation situations may now be seen as a subcategory of self-control
situations. A social cooperation situation exists when, in addition to
Conditions 1 and 2:
3. A group benefits more when an
individual member chooses a t-length fraction of the longer activity than it
does when the individual chooses the brief activity.
An
altruistic act is defined as a choice of the t-length fraction of the longer
activity over the brief activity under Conditions 1, 2, and 3. The size of the
group may range from only two people to the population of the world. The cost
of the altruistic act may be a true cost, as when one anonymously donates money
to charity, or an opportunity cost – the loss of the preferred brief
alternative. Note that in this definition a particular altruistic act need not
be reinforced, either presently or in the future. Reinforcement of altruism is
obtained only when such acts are grouped in patterns that are, as a whole,
intrinsically valuable. Thus, the
woman’s act of running into a burning building to save someone else’s child is
reinforced only insofar as it is part of a highly valued pattern. It may not
itself ever be reinforced and may be punished by injury or death. If the woman
died in the attempt, the act may still have been worth doing since not doing it
would have broken a highly valued pattern.[7]
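Conditions 1, 2, and 3 can be restated as a simple check. The following Python sketch is only an illustration: the class name, the field names, and all of the numerical values are hypothetical, chosen to show how the conditions combine, and are not measurements of any kind. The value of the whole longer activity is entered separately from the value of its t-length fraction, since nothing in the definition requires the whole to equal the sum of its parts.

```python
from dataclasses import dataclass


@dataclass
class ChoiceSituation:
    """Hypothetical values, in arbitrary units, for one choice situation."""
    whole_long: float         # value to the actor of the whole longer activity (T = n*t)
    n_brief: float            # value to the actor of n repetitions of the brief activity
    one_brief: float          # value to the actor of one brief activity (duration t)
    long_fraction: float      # value to the actor of a t-length fraction of the longer activity
    group_if_fraction: float  # benefit to the group when the actor chooses the fraction
    group_if_brief: float     # benefit to the group when the actor chooses the brief activity


def condition_1(s: ChoiceSituation) -> bool:
    # The whole longer activity is preferred to n repetitions of the brief activity.
    return s.whole_long > s.n_brief


def condition_2(s: ChoiceSituation) -> bool:
    # The brief activity is preferred to a t-length fraction of the longer activity.
    return s.one_brief > s.long_fraction


def condition_3(s: ChoiceSituation) -> bool:
    # The group benefits more from the fraction than from the brief activity.
    return s.group_if_fraction > s.group_if_brief


def is_self_control_problem(s: ChoiceSituation) -> bool:
    return condition_1(s) and condition_2(s)


def is_social_cooperation_situation(s: ChoiceSituation) -> bool:
    return is_self_control_problem(s) and condition_3(s)


# Made-up numbers: a private habit with no difference in group benefit,
# and an anonymous donation that costs the actor but helps the group.
private_habit = ChoiceSituation(100, 40, 5, 1, 0, 0)
donation = ChoiceSituation(80, 50, 3, -1, 10, 0)

print(is_self_control_problem(private_habit), is_social_cooperation_situation(private_habit))
print(is_self_control_problem(donation), is_social_cooperation_situation(donation))
```

With these invented values the first situation satisfies Conditions 1 and 2 only, while the second satisfies all three, so a choice of the t-length fraction in the second situation would count as an altruistic act under the definition.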
This
way of thinking about altruism and self-control may seem strange but it is not
at all unusual. It is what Plato meant when he held Socrates’ life (and death)
to be both good (ethical) and happy. It is what many thinkers about ethics,
before and since, have been saying. In 20th century psychology, the gestalt psychologists
emphasized that the whole could be greater than the sum of its parts. They
intended this maxim to apply to motivation or value as much as to perception
(Lewin, 1936). Consider listening to a symphony (assuming you enjoy this
activity) on a CD that you just bought. Your enjoyment apparently begins when
the music begins and ends when the music ends. Now suppose, after listening to
the first 57 minutes of the symphony, you discover that the final three minutes
of the 60-minute piece are missing from the CD. Is your enjoyment of the music
just reduced by 3/60 of what it would have been if the whole symphony were
played? Or is the breaking of the pattern so costly that the missing three
minutes ruins the whole experience? In my own case the latter would be true.
Readers who do not agree may imagine some other temporally extended activity
that would be ruined for them by interruption late in the sequence.
The meaning of a single instrumental
act can be found only in a context of other acts. Conditions 1, 2, and 3 place
the act in such a context. For the
cognitive psychologist, on the other hand, the meaning of a single act is to be
found in the mechanism that immediately and efficiently caused the act. Thus,
for the cognitive psychologist, a single act may be altruistic or not
independent of other acts. Obviously, both cognitive and behavioral
investigations need to be pursued. I am not saying that one is any more
valuable or important than the other. But I do believe that it makes more sense
to say that the behaviorist studies altruism itself while the cognitive
psychologist studies the mechanisms behind it than it does to say that the cognitive
psychologist studies altruism itself while the behaviorist studies only its
behavioral effects.
It seems clear that a person may be
self-controlled without being altruistic. That is, Conditions 1 and 2 may obtain
while Condition 3 does not. Although, given our strong social dependencies,
there is usually some social benefit when a person stops drinking or smoking or
overeating or gambling, such benefits are arguably incidental. The opposite
question, whether a person may be altruistic without being self-controlled,
however, is the one that concerns us here. This question is important because
its answer determines whether people need a special mechanism for altruism,
aside from whatever mechanism mediates self-control. Most demonstrations of
altruistic behavior without egoistic incentives have focused on particular acts
(Caporael et al., 1989). But it is not possible to determine that a separate
altruism mechanism exists by the absence of reinforcement (immediate or
delayed) of particular altruistic acts. The question is rather: Are there
altruistic acts under Conditions 2 and 3 above where Condition 1 does not
obtain? This is a difficult question to answer because Condition 1 does not
specify the appropriate context (the longer activity, T) for a particular act. Is there any context (any relatively long-duration activity, T)
in which a given altruistic act would also be a self-controlled act? I believe that
it will always be possible to find such a context. This makes altruism a
relative concept; in some contexts a given act will be altruistic and in some
contexts, not. Where it is altruistic it will also be self-controlled (although
the reverse may not be true).
The
relativity of the concept of altruism should not be disturbing. First, it does
not imply a moral relativism. Many Nazi soldiers behaved altruistically in the
context of their military units but immorally in a larger context. Morality
does not depend on altruism any more strictly than it depends on self-control.
A moral code may approve of some kinds of altruism but disapprove of others
just as it may approve of some kinds of self-control and disapprove of others.
Secondly,
whether an act is self-controlled or impulsive is no less contextually
dependent than whether it is altruistic or selfish. Even a hungry rat rewarded by food for pressing a
lever is to an extent controlling itself.
The pattern of pressing the lever and eating takes longer (necessarily)
than the act of pressing the lever alone.
Pressing the lever, considered alone, is dispreferred to just sniffing
in the corner of the cage; hence pressing the lever for food to be delivered
within a fraction of a second is an instance of self-control. Correspondingly,
even a slug may be said to exhibit self-control – on a microscopic level. At the other extreme, strict sobriety may be
narrow relative to a still more complex pattern of social drinking.
There is a sense in which all acts
(of choice) are selfish; the same sense in which all instrumental acts are
reinforced and, for the economist, all behavior maximizes utility. These are assumptions of theory, or rather
methods of procedure, not empirical findings.
But this does not mean that selfishness is a meaningless concept (any
more than reinforcement or utility maximization is). The sense in which an altruistic act is selfish (as part of an
ultimately selfish pattern) differs from that in which a non-altruistic act is
selfish. And this distinction is an
empirical one.
Behavioral
psychology has not been able to trace every particular act to a particular
reinforcer – immediate or in the future. Organized patterns of acts occur
despite the existence within them of unreinforced particular acts. What then
reinforces the patterns? In psychology, theories of reinforcement based on
“pleasure” or “need” or “drive” have not been able to explain particular acts.
Such theories have proved to be circular – “pleasures,” “needs,” and “drives”
proliferated about as fast as the behaviors they were supposed to explain. It
is often not possible to use these concepts to predict behavior in one choice
situation from behavior in another. But Premack’s (1965) wholly behavioral
theory and the economic theories based on it (Rachlin et al., 1981) are
predictive and non-circular. These theories use the choices under one set of
behavioral contingencies or constraints to estimate the values of the
alternatives (or the parameters of a utility function) and then use those
values or parameters to predict choice under other sets of contingencies or
constraints.
This method serves to explain
choices among patterns of acts as well as particular acts. And, it answers the
social-cooperation question, “Why is friendship rewarding?” as well as the
self-control question, “Why is sobriety rewarding?” The answer in both cases,
for the behavioral psychologist, is that in a choice test between each of these
patterns as a whole and their respective alternative patterns as a whole,
friendship would (at least in some cases) be chosen over loneliness and sobriety
would (at least in some cases) be chosen over drunkenness.[8]
5. Commitment
No amount of calculation by the
mother who runs into a burning building to save someone else’s child will bring
the benefits-minus-risks of this activity considered by itself into positive
territory. But over a series of actions, a series of opportunities to sacrifice
her own benefit for
the benefit of others, the weightings may change. As we have seen (Figure 1)
social and individual decisions may individually be completely negative, their
only value appearing when they are grouped.[9] The problem
is that life ordinarily faces us not with groups of decisions but with
particular decisions that must be made. It is up to us to group decisions
together, and we do this by means of various commitment devices – contracts,
agreements, buying tickets to a series of concerts or plays, joining a health
club, and so forth.
These commitments may work by
instituting some punishment (such as loss of money or social support) should we
fail to carry them through. Green &
Rachlin (1996) have shown that pigeons prefer, A: a future choice between 1) a
small, immediate reward followed by punishment and 2) a larger, delayed reward
to, B: the same future pair of alternatives but without the punishment. Only by
present choice of the future pair of alternatives involving punishment will
they avoid being tempted later by the smaller immediate reward and will they
obtain the larger reward that they prefer at the present time. Another kind of
commitment shown by pigeons (Siegel & Rachlin, 1996) is “soft commitment.” At an
earlier time the pigeon begins a
pattern of behavior, such as rapidly pecking a fixed number of times on a lit button.
This pattern is difficult for the pigeon to interrupt. Then, in the midst of
this pattern, the tempting alternative (the smaller, immediate reward) is
presented. Only by continuing and completing the previously begun pattern of
behavior will the larger reward be obtained.
By beginning and continuing the pattern the pigeon avoids the temptation
and obtains the larger reward. The further along the pigeon is into the
pattern, the more likely it is that the tempting small reward will be avoided.
In a primrose-path experiment
(italicized labels of Figure 1) in my laboratory (Kudadjie-Gyamfi &
Rachlin, 1996) human subjects chose the self-control option (Y)
more when choices were clustered in threes (patterned) than when they were
evenly spaced. Within a group of three choices, the probability of self-control
on the first choice was high but, given self-control on the first choice, the
conditional probability of self-control on the second choice was higher and,
given self-control on the first two choices, the probability of self-control on
the third choice was higher still. Similarly, in a repeated prisoner’s dilemma
situation, playing against tit-for-tat (a strategy that mimicked, on a given
trial, the subject’s choice to cooperate or defect on the previous trial),
human subjects cooperated more when trials were clustered in fours than when
they were evenly spaced out; moreover, as in the self-control experiment,
conditional probability of cooperation increased as the sequence progressed
(Brown, 2000).
Soft
commitment with pigeons is a model, on a narrow temporal scale, for successful
self-control by humans, on a much wider temporal scale (Rachlin, 2000). The
alcoholic, for example, resolves to stop drinking, and refuses one drink. At
that point he is vulnerable to the offer of another drink. But if he refuses 10
drinks he is less vulnerable and if he refuses 100 drinks he is still less
vulnerable. He refuses the later drinks not because their value is reduced
(their value is actually enhanced as deprivation increases) but because he has
already begun a pattern of refusal that involves some cost to break. As he repeatedly refuses drinks (climbs up
line DA in Figure 1) the long term rewards that sobriety entails – better
health, social support, better job performance – grow apace.
In
experiments on repeated prisoner’s dilemmas some subjects cooperate and
continue to cooperate regardless of whether other subjects cooperate with them
(Brann & Foddy 1988). These people may be said to cooperate out of a sense
of moral duty or for ethical reasons or because they are more altruistic than
others. But these sorts of explanations do not say why such people behave as
they do. To understand their behavior, the laboratory experiment has to be seen
not as an isolated situation but in the context of everyday life. Many
experimental subjects are willing and able to separate decisions made in a
psychology experiment from those they make in everyday life. But others are not
able or not willing to do so. They have decided to cooperate in life and
continue to do so in the experiment, not necessarily because of some innate
tendency to be altruistic, but because altruism is generally valuable and they
would not act altruistically if they made decisions on a case-by-case basis. The
experiment is merely one case, one situation out of many in their lives. Moral
duty, ethical concerns, and altruism are apt descriptions of their behavior.
But these qualities do not come from nowhere. They are highly valued patterns
of behavior – just as moderation in eating, moderation in drinking, and
moderation in sexual activity are highly valued patterns.
6. Reinforcement and Punishment in the Prisoner’s Dilemma
Current discussions of altruism and selfishness in philosophy, biology, economics, and psychology are generally united by reference to strategies of play in prisoner’s dilemma situations. The present analysis does not deny the interest or importance of strategies. Rather, as patterns of behavior, it sees them as crucial. The difference between the present behavioral analysis and cognitive analyses is that, in determining what underlies a strategy, the behaviorist looks for contingencies of reinforcement and punishment rather than internal mechanisms. Thus, it is important to show that the prisoner’s dilemma incorporates reinforcement and punishment contingencies and that prisoner’s-dilemma behavior is sensitive to those contingencies.
Consider
the contingencies of the 2-person prisoner’s dilemma diagramed in Figure
2a. If both players cooperate, each
gets 5 points (convertible to money at the experiment’s end); if both defect,
each gets 2 points; if one cooperates while the other defects, the cooperator
gets 1 point while the defector gets 6 points.
Figure 2b diagrams the game in a corresponding way to Figure 1,
revealing the ambivalence. As in Figure
1, defection results in a higher immediate reward and a lower long-run reward
while cooperation results in the reverse.
Regardless of the other player’s choice, it is always immediately better
to defect than to cooperate; if the other player has cooperated then a player
will gain 6 points by defecting and 5 points by cooperating; if the other
player has defected then a player will gain 2 points by defecting and only 1
point by cooperating. If communication
between players is against the rules, if the game could be played only once
(and no similar cooperative tasks were ever expected to be undertaken with the
other player), then the motive to defect should predominate. However, if there were some way to get the
other player to cooperate, then whatever it takes to do this should predominate
over defection because the gain from the right to the left vertical line in
Figure 2b averages 4 points while the gain from the lower to the upper line
(from cooperation to defection) averages 1 point. The best set of circumstances would be to defect while the other
player cooperates, earning 6 points.
This is an unlikely scenario since the other player would then earn only
1 point. However, if communication were
within the rules, it would be possible to compromise by agreeing to mutual
cooperation, earning 5 points each (the highest pooled score). Or, if the game
were to be played many
times, it would be possible to reinforce the other player’s cooperation by
cooperating, and to punish the other player’s defection by defecting. This strategy is called “tit-for-tat.” The dashed line shows average points gained
in repeated trials against tit-for-tat with a distribution of choices proportional
to the distance between the vertical lines.
For example, alternation of cooperation and defection (halfway between
the vertical lines) yields 6 points and 1 point alternately for an average of
3.5 points per trial against tit-for-tat.
The highest point on the dashed line (hence the best strategy against
tit-for-tat) is to cooperate on all trials.
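The arithmetic in this paragraph can be checked directly from the payoffs given for Figure 2a. The Python sketch below is an illustration only. The function names are hypothetical, and the long-run average is computed under the assumption that the player repeats a fixed cycle of choices indefinitely while tit-for-tat copies the player's choice from the previous trial.

```python
from fractions import Fraction

# Points to a player, indexed by (own choice, other player's choice),
# using the values the text gives for Figure 2a. "C" = cooperate, "D" = defect.
PAYOFF = {
    ("C", "C"): 5,
    ("C", "D"): 1,
    ("D", "C"): 6,
    ("D", "D"): 2,
}


def gain_from_defecting(other: str) -> int:
    """Immediate gain from defecting rather than cooperating, given the other's choice."""
    return PAYOFF[("D", other)] - PAYOFF[("C", other)]


def gain_from_others_cooperation(own: str) -> int:
    """Gain when the other player cooperates rather than defects, given one's own choice."""
    return PAYOFF[(own, "C")] - PAYOFF[(own, "D")]


def average_vs_tit_for_tat(cycle: str) -> Fraction:
    """Long-run average points per trial for a player who repeats `cycle` forever.

    Tit-for-tat plays whatever the player chose on the previous trial, so on
    trial i the other player's choice is cycle[i - 1]; index -1 wraps around
    to the end of the cycle, which models indefinite repetition.
    """
    total = sum(PAYOFF[(own, cycle[i - 1])] for i, own in enumerate(cycle))
    return Fraction(total, len(cycle))


print([gain_from_defecting(o) for o in "CD"])           # [1, 1]
print([gain_from_others_cooperation(o) for o in "CD"])  # [4, 4]
print(float(average_vs_tit_for_tat("D")))               # 2.0
print(float(average_vs_tit_for_tat("CD")))              # 3.5
print(float(average_vs_tit_for_tat("C")))               # 5.0
```

Under this assumption the three averages (2, 3.5, and 5 points per trial) rise in a straight line with the proportion of cooperative choices, which is the pattern the dashed line is described as showing.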
Tit-for-tat has indeed been
highly effective in generating cooperation and maximizing pooled scores in
several situations: computer simulations of prisoner’s dilemma games (Axelrod
1997); 2-person games with human subjects (Rapoport & Chammah 1965;
Silverstein et al 1998; Brown & Rachlin 1999); with a single subject
playing against a computer programmed to play tit-for-tat (Komorita & Parks
1994).
Figure 2. (a) Payoff matrix of 2-person prisoner's dilemma game. (b) Same game. Player A's earnings for cooperation (lower black dot) and defection (upper black dot) as a function of Player B's choice.
The
crucial variable influencing cooperation in 2-person games seems to be
reciprocation (Komorita & Parks 1994; Silverstein et al. 1998). This is also true in games with more than 2
players such as illustrated in Figure 1 (Komorita et al. 1993). The tit-for-tat strategy imposes a strict
reciprocation and thus engenders cooperation.
Prior communication enhances reciprocation and thus has the same
effect. On the other hand, when
reciprocation is low or nonexistent, as when the other player plays randomly or
always cooperates or always defects, cooperation deteriorates (Silverstein et
al. 1998). Baker and Rachlin (2001)
found that a player’s probability of cooperation in a 2-person prisoner’s
dilemma game varied directly with the other player’s probability of
reciprocation.
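One way to see why reciprocation should matter, given the payoffs of Figure 2a, is to compute expected earnings at different probabilities of reciprocation. The Python sketch below is a back-of-the-envelope illustration and does not reproduce the procedure or analysis of any of the studies cited. It assumes, hypothetically, that the other player matches the subject's choice with probability r and otherwise makes the opposite choice, and that the subject always makes the same choice.

```python
from fractions import Fraction

# Points to a player, indexed by (own choice, other player's choice),
# using the values the text gives for Figure 2a.
PAYOFF = {
    ("C", "C"): 5,
    ("C", "D"): 1,
    ("D", "C"): 6,
    ("D", "D"): 2,
}


def expected_points(own: str, r: Fraction) -> Fraction:
    """Expected points per trial for a player who always chooses `own`.

    Assumption (not from the source): the other player reciprocates, that is,
    matches `own`, with probability r and makes the opposite choice otherwise.
    """
    opposite = "D" if own == "C" else "C"
    return r * PAYOFF[(own, own)] + (1 - r) * PAYOFF[(own, opposite)]


for r in (Fraction(1, 2), Fraction(5, 8), Fraction(1)):
    coop = expected_points("C", r)
    defect = expected_points("D", r)
    print(f"r = {float(r):.3f}: cooperate {float(coop):.2f}, defect {float(defect):.2f}")
```

Under these assumptions consistent cooperation earns 1 + 4r points per trial and consistent defection earns 6 - 4r, so cooperation pays better only when r exceeds 5/8. With strict reciprocation (r = 1) the values are 5 and 2; with random play by the other player (r = 1/2) they are 3 and 4.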
7. Context
As
Tversky and Kahneman (1981) showed, context, or “framing,” strongly influences
probabilistic choice behavior. Context is
likewise a strong determinant of self-control. Heyman (1996) cites a study by
Robins (1974) of American soldiers who became addicted to heroin in Vietnam.
The majority of these addicts easily gave up their addiction when they came
home to a different environment. Heyman argues that the boundary line
separating local from non-local events (the duration of the chosen activity) may vary over a wide range (depending on the
salience and relevance of environmental stimuli), thereby explaining how humans
and nonhumans may act impulsively in one situation and with self-control in
another. A second experiment by Baker and Rachlin (in press) demonstrates a
similarly strong influence of context in a social cooperation experiment with
human subjects.
Tit-for-tat is a teaching
strategy. A computer, playing
tit-for-tat against a player, invariably follows the player’s cooperation by cooperating
on the next trial and invariably follows the player’s defection by defecting on
the next trial. Since the computer’s
cooperation is much more valuable to the player than its defection, the
computer’s cooperation reinforces the player’s cooperation and its defection
punishes the player’s defection. Thus
the computer “teaches” the player to cooperate.
Another
strategy that has been successful in computer tournaments (dominating
tit-for-tat) is called Pavlov (Fudenberg & Maskin, 1990; Nowak & Sigmund,
1993). Pavlov is a learning strategy. Using Pavlov, the computer’s
choice on the present trial, whether cooperation or defection, is repeated on
the next trial if the player cooperates and changed on the next trial if the
player defects. Against tit-for-tat, the player cannot successfully punish the
computer’s defection; the computer would respond to defection by defecting
itself. Using Pavlov, however, the computer would respond to defection by
changing its choice on the next trial: if it had defected, it would now
cooperate; if it had cooperated, it would now defect. The computer using Pavlov
would respond to cooperation by repeating its choice on the next trial; if it
had defected, it would defect again; if it had cooperated, it would cooperate again. That is, the computer would behave as if its
choice were reinforced by the player’s cooperation and punished by the player’s
defection. Thus the computer, playing
Pavlov, “learns” from the player.
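The two rules just described can be written out compactly. The Python sketch below follows the verbal descriptions above in their plain, unmodified form. The function names and the computer's cooperative first move are assumptions made for illustration; the text does not specify how either strategy opens.

```python
def tit_for_tat(computer_prev: str, player_prev: str) -> str:
    """Teaching strategy: copy the player's previous choice."""
    return player_prev


def pavlov(computer_prev: str, player_prev: str) -> str:
    """Learning strategy: repeat own previous choice if the player cooperated,
    change it if the player defected."""
    if player_prev == "C":
        return computer_prev
    return "D" if computer_prev == "C" else "C"


def computer_moves(strategy, player_choices: str, first_move: str = "C") -> list:
    """Computer's choice on each trial, given the player's sequence of choices.

    `first_move` is an assumed opening; each later move depends on the
    previous trial's choices, as in the descriptions in the text.
    """
    moves = [first_move]
    for player_prev in player_choices[:-1]:
        moves.append(strategy(moves[-1], player_prev))
    return moves


player = "CDDC"  # "C" = cooperate, "D" = defect
print(computer_moves(tit_for_tat, player))  # ['C', 'C', 'D', 'D']
print(computer_moves(pavlov, player))       # ['C', 'C', 'D', 'C']
```

The two strategies diverge on the fourth trial. After the player's second defection, tit-for-tat defects again, whereas Pavlov, having defected on the third trial and met defection, switches back to cooperation.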
Figure 3. Results of Baker and Rachlin’s (in press a) experiment. Average of last 15 of 100 trials.
In this experiment, four groups of
subjects (Stony Brook undergraduates) played 100 trials of a prisoner’s dilemma
game. Against each subject in two groups, the computer played a modified form
of tit-for-tat. Against each subject in the other two groups, the computer
played a modified form of Pavlov.[10] One of the
tit-for-tat groups and one of the Pavlov groups saw a spinner on the computer
screen and were correctly informed that the computer’s responses were determined
by that spinner. The other two groups believed that they were playing the game
against another player rather than against a computer. They did not see a
spinner but they did see the “other player’s” reward matrix (and reward
presumably received) as well as their own.[11]
The results of the experiment are shown in Figure 3. The context of the game – whether or not the subjects were led to believe that they were playing against another subject – had a strong effect on their behavior, but the context effect was opposite for the two computer strategies. When subjects believed that they were playing against a computer, they cooperated more against tit-for-tat (where the computer reinforced and punished the players’ cooperation and defection) than they did against Pavlov (where the computer’s choices were reinforced and punished by the players’ cooperation and defection). On the other hand, when subjects believed that they were playing against a human being, they cooperated more against Pavlov than against tit-for-tat. This result may be attributed to the fact that subjects’ histories of interacting with machines (unlikely to be responsive to reinforcement and punishment) differed from their histories of interacting with other people (more likely to be responsive). When the relatively global histories matched the relatively local set of contingencies (the computer’s strategies) subjects cooperated; when the global histories contradicted local contingencies they defected. (In all cases, however, under the most narrowly local contingencies, defection was immediately reinforced.) Choice in prisoner’s dilemma situations, therefore, like choice in self-control situations, may be understood in terms of global as well as local reinforcement.
Taken
together with the previously discussed experiments of Kudadjie-Gyamfi &
Rachlin (1996) and Brown (2000), in which patterning choices over time
increased human subjects’ self-control and prisoner’s-dilemma cooperation, the
experiment described above shows that, at least in laboratory studies,
self-control and social cooperation are similarly responsive to reinforcement
contingencies and similarly sensitive to context.
Laboratory
models, however, are necessarily diminished representations of everyday-life
processes. The reinforcers in all of these experiments – points convertible to
money – were extrinsic to the subjects’ choices. If, as argued here, the
reinforcers of real-life self-controlled and altruistic behavior are intrinsic
in the patterns of those behaviors and if those patterns are extended over long
durations – months and years – real-life rewards will never be duplicated in a
30-minute laboratory experiment.
The
experiment described above partially gets around this limitation by varying
verbal instructions so as to bring the brief laboratory experiment into
differing long-term, real-life contexts. Moreover, an economic extension of
Premack’s conception of reinforcement (Rachlin et al., 1981) sees all
reinforcement as intrinsic (even that of a rat’s lever press reinforced by
food; the rat is seen as choosing the pattern of lever pressing plus eating
over not lever pressing plus not eating).
Nevertheless,
there remains a vast difference in scale between laboratory experiments and
real life. The point of the experiments is to show that, on a small scale,
self-control and altruism are sensitive to reinforcement and punishment. In the
case of self-control there is ample evidence that large-scale, real-life
behavior is similarly sensitive (Bickel & Vuchinich, 2000). If, as is
argued here, there is no essential difference between self-control and
altruism, the same behavioral laboratory studies that have proved useful in
developing real-life self-control techniques may be equally useful in
developing real-life altruistic behavior.
8. Can Altruism Be Explained Without Reinforcement?
Does
this way of thinking put more weight on reinforcement than it can bear? Can the job be done entirely by internal
mechanisms with reinforcement playing no part whatsoever? The issue is this: There are some particular
acts, especially by humans, that we normally classify as done through a sense
of altruism, of duty, of principle. No
biologist claims that a separate inherited mechanism exists for each of the
infinitude of possible acts that fall within these categories. To explain such actions as inherited, the
biologist must hypothesize the existence of a general mechanism for altruism
which is somehow aroused by situations such as the game I play with my
audiences illustrated in Figure 1 (bold labels). It seems to me that the postulation of such a mechanism as
inherited – like blue or brown eyes – puts far too heavy a load on inheritance;
we have no idea how such a mechanism could work.
On
the other hand, it is generally agreed that self-control may be taught at some
level even to nonhumans. The crucial
issue then is whether or not altruism is a subcategory of self-control. If it
is, there is no need to postulate an innate altruistic mechanism; the job can
be done by whatever mechanism we use to learn self-control – an innate
mechanism to be sure, but an innate learning mechanism.
This
is hardly an original idea. Plato and
Aristotle both claimed that self-control and altruism were related
concepts. The experiments described in
this article illustrate the correspondence.
However, perhaps the argument is ultimately not empirical. It rests on two assumptions: (1) habitual
altruism is a happier mode of existence than habitual selfishness, and (2)
particular altruistic acts (together with their consequences) are less
pleasurable (even for saints) than particular selfish acts (together with their
consequences). If you accept both of
these propositions, altruism must be seen as a kind of self-control.
9. Can Altruism Be Explained Wholly in Terms of Extrinsic Reinforcement?
How
are patterns of behavior learned and how are they maintained? Consider the following set of cases. Four soldiers are ordered to advance on the
enemy. The first and second advance;
the third and fourth do not. Of the two
who advance, the first is just obeying orders; he advances because he fears the
consequences of disobedience more than he fears the enemy. The second is not just obeying orders; he
advances because he believes it is his patriotic duty to advance. Of the two who do not advance, the third
soldier remains in his foxhole out of fear of the enemy; he weighs the aversive
consequences of disobeying orders less than the aversive consequences of
advancing. The fourth soldier does not
advance because he believes that the orders are immoral.
No one, neither the biologist, the
cognitivist, the Skinnerian behaviorist, nor the teleological behaviorist,
denies that there are important differences between the two soldiers who
advance and between the two soldiers who do not advance. But the biologist and cognitivist alike see
all the differences in the thoughts, feelings, and moral sentiments of the soldiers as
contemporary with their current behavior.
Behaviorists do not disagree that internal differences exist, but their
focus is rather on non-contemporary events; the Skinnerian behaviorist is
concerned to discover crucial differences in the soldiers’ extrinsic reinforcement histories. The teleological behaviorist is concerned to discover the
patterns of behavior of which each soldier’s present act forms a part (intrinsic reinforcement). Note, however, that even the concept
of extrinsic reinforcement must rely at some point on intrinsic
reinforcement. According to Premack’s
theory, for example, eating reinforces lever pressing because eating is
(intrinsically) of high value and lever pressing is (intrinsically) of lower
value. I am claiming here that an
abstract pattern of behavior may be (intrinsically) of high value while the sum
of the values of its particular components is (intrinsically)
lower. Value, in either case, would be
determined by a choice test.
Let us first consider extrinsic
reinforcement. By careful selection,
with humans, it is possible to reinforce members of a set of particular acts
belonging to a wide or abstractly defined class of acts (a rule) so that
particular acts that have never been reinforced, but that obey the rule, are
performed. That is, humans are able to generalize across instances of complex
rules and, with simple rules, nonhumans are also able to do so. Behavior thus learned is said to be rule-governed. Imitation
(of certain people) and following orders (in certain circumstances) are two
such kinds of rules. There is no space here to discuss the several techniques
developed for generating rule-governed behavior with extrinsic
reinforcement (see Hayes, 1989, for a
collection of articles on the subject), nor to discuss current disputes about
whether language precedes complex rule-following or whether rule-following
precedes language (Sidman, 1997).
The behavior of the first soldier,
who advances because he fears the consequences of disobeying orders more than
he fears the enemy, and that of the third soldier, who fails to advance because
he fears the enemy more than the consequences of disobeying orders, may be
explained in terms of conflicting rules.
Regardless of the complexity of the relation between the consequences of
the present act and those of past acts, it is the weighting of the extrinsic
consequences of the present act (the magnitudes, probabilities, and delays of
enemy fire versus those of punishment for disobedience) that determines the
behavior of these two soldiers.
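The weighting described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each consequence is discounted hyperbolically by its delay (the form associated with Mazur, 1987) and by its odds against occurring, and that the two discounts combine multiplicatively. The function names, the parameters k and h, and all of the numbers are hypothetical; nothing here is drawn from data.

```python
def discounted_value(magnitude, probability, delay, k=1.0, h=1.0):
    """Aversive value of one consequence after discounting.

    Assumption: hyperbolic discounting by delay, 1 / (1 + k * delay), and by
    odds against, 1 / (1 + h * odds_against), combined by multiplication.
    """
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    odds_against = (1.0 - probability) / probability
    return magnitude / ((1.0 + k * delay) * (1.0 + h * odds_against))


def advances(enemy_fire, punishment, k=1.0, h=1.0):
    """True if punishment for disobedience is weighted as worse than enemy fire."""
    return (discounted_value(*punishment, k=k, h=h)
            > discounted_value(*enemy_fire, k=k, h=h))


# Hypothetical (magnitude, probability, delay) triples; larger magnitude = more aversive.
# First soldier: enemy fire seems unlikely, punishment nearly certain and soon.
first_soldier = advances(enemy_fire=(100.0, 0.2, 0.0), punishment=(60.0, 0.9, 1.0))
# Third soldier: enemy fire seems likely, punishment improbable and remote.
third_soldier = advances(enemy_fire=(100.0, 0.8, 0.0), punishment=(60.0, 0.3, 5.0))
```

With these invented values the first soldier advances and the third does not; the two differ only in how the same kinds of extrinsic consequences are weighted, which is the sense in which their behavior is governed by the present act's consequences.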
Moreover, it may be possible to
account for the initial learning of ethical rules and principles, such as those that
govern the altruistic behavior of the second and fourth soldiers, in terms of
extrinsic social reinforcement at home or school or church. But extrinsic
reinforcement cannot account for the maintenance
of altruistic behavior. An altruistic act
may never be reinforced. The second and
fourth soldiers (as well as the woman who runs into the burning building to
save someone else’s child) are as capable of weighing the immediate
consequences of their acts as are the first and third soldiers. But those consequences are ignored by these
two soldiers. The second and fourth
soldiers, both of whose behavior has been brought under the control of highly
abstract principles (we are assuming), are surely capable of discriminating
between the extrinsic consequences of their present acts and the extrinsic
social approval or disapproval of their past behavior at home, school or church
where the principles were learned. A
person capable of bringing his or her behavior into conformance with an
abstract principle by means of extrinsic reinforcement, and of transferring the
application of that rule across situations, could not fail to discriminate the
present context (where social approval is dwarfed by the possibility of death)
from situations where the rule-governance may have been initially learned. Yet the altruistic act is performed anyway.
Such
acts must be maintained not by extrinsic reinforcement but by intrinsic
reinforcement. The patterns of those
acts (patriotic, ethical, altruistic), perhaps supported during their formation
by a scaffold of extrinsic reinforcement, must be highly valuable in
themselves. If they depended on
extrinsic reinforcement for maintenance they would not be maintained.
In
Premack’s terms, valuable patterns would be chosen if offered as whole patterns
in a free choice situation. In cases
such as the patriotic and ethical soldiers and the woman saving a child,
imagine a giant concurrent-chain schedule with years-long terminal link alternatives:
heroism versus timidity, reverence for life versus toleration of killing,
kindness versus cruelty. Because of
their intrinsic value the chosen patterns are final causes of their component
acts and may themselves be effects of still wider final causes: a coherent
concept of self, living a happier life, living a better life.
Most of us would indeed choose to be heroes rather than cowards, to revere life
rather than to kill, to be kind rather than cruel. We realize that the former alternatives of each pair are actually
patterns of happy lives and the latter, of unhappy lives. But these alternatives are rarely offered to
us as wholes. Rather, we are faced with
a series of particular choices with outcomes of limited temporal extent. The altruists among us, however, have chosen
such more extended patterns as wholes; they are the patterns most of us would
choose if we could choose them as wholes.
But to do this we would need to evaluate particular alternatives not by
their particular consequences but rather by whether or not they fit into the
larger patterns. This of course is a
problem of self-control.
10. Conclusions.
Some particular altruistic acts are profitable some of the time. Giving to
charity is often observed and frequently rewarded by society. But patterns of
behavior may be maintained without extrinsic rewards. For example, on a
relatively small scale, activities such as solving jigsaw or crossword puzzles
are valuable in themselves. People like me, who like to do crossword puzzles,
find value in the whole act of doing the puzzle. When I sit down on a Sunday
morning to do the puzzle I am not beginning a laborious act that will be
rewarded only when it is completed. Yet, despite the lack of extrinsic and
intrinsic reward for putting in that last particular letter, completing the
puzzle is, for me, a necessary part of its value. Like listening to symphonies,
the pattern is valuable only as a whole. Extrinsic rewards may initially put
together the elements of these patterns but the patterns, once formed, are
maintained by their intrinsic
value. The cost of breaking the pattern is the loss of this value – even that
of the parts already performed. On an infinitely larger scale, living a good
life is such a pattern. This is why the woman runs into the burning building to
save someone else’s child without stopping to calculate the cost of this
particular act, and why Socrates chose to die rather than violate the sentence of
the Athenian court.
It is not possible to tease apart the individual and social benefits of such acts.
High degrees of altruism are infrequent, not because most people lack an
internal altruism mechanism, not because they are selected by evolution to be
egoists rather than altruists, but because of the highly abstract nature of the
valuable patterns. The relation between particular acts of altruism and the
intrinsic reward of the pattern is vague and indistinct. Altruism for most of us (like sobriety for
the alcoholic) is not profitable and would not be chosen considering only its
case-by-case, extrinsic reinforcement.
Consequently the way for most of us to profit from altruism (and the way
for an alcoholic to profit from sobriety) is to pattern our behavior abstractly
– to choose to be an altruistic (or a sober) person. But in order to pattern
our behavior in this way (and reap the rewards for so doing) we must forgo
making decisions on a case-by-case basis.
Once we abandon case-by-case decisions, there will come times in
choosing between selfishness and altruism when we will be altruistic even at
the risk of death.
REFERENCES
Ainslie, G. (1992) Picoeconomics, Cambridge
University Press.
Axelrod, R. (1997) The complexity of
cooperation: Agent based models of competition and collaboration, Princeton
University Press.
Baker, F. & Rachlin, H. (2001) Probability
of reciprocation in prisoner’s dilemma games.
Journal of Behavioral Decision Making 14: 51-67.
Baker, F. & Rachlin, H. (in press) Teaching
and learning in a probabilistic prisoner’s dilemma. Behavioural Processes.
Baum, W. (1994) Understanding behaviorism:
Science, behavior, and culture, Harper Collins.
Bickel, W.K. & Vuchinich, R.E. (2000)
Reframing health behavior change with behavioral economics, Lawrence Erlbaum
Associates.
Brann, P. & Foddy, M. (1988) Trust and the
consumption of a deteriorating common resource. Journal of Conflict Resolution
31: 615-630.
Brown, J. (2000) Delay discounting of multiple
reinforcers following a single choice. Thesis, Psychology Department, State
University of New York at Stony Brook.
Brown, J. & Rachlin, H. (1999) Self-control
and social cooperation. Behavioural Processes 47: 65-72.
Caporael, L.R., Dawes, R.M., Orbell, J.M. &
van de Kragt, A.J.C. (1989) Selfishness examined: Cooperation in the absence of
egoistic incentives. Behavioral and Brain Sciences 12: 683-739.
Dawes, R. (1980) Social dilemmas. Annual Review
of Psychology 31: 169-193.
Dawkins, R. (1976) The selfish gene, Oxford
University Press.
Dennett, D.C. (1984) Elbow room: The varieties of
free will worth wanting, MIT Press.
Edney, J.J. (1980) The commons problem:
Alternative perspectives. American Psychologist 35: 131-150.
Fudenberg, D. & Maskin, E. (1990) Evolution
and cooperation in noisy repeated games. New Developments in Economic Theory
80: 274-279.
Green, L. & Rachlin, H. (1996) Commitment
using punishment. Journal of the Experimental Analysis of Behavior 65: 593-601.
Hayes, S.C., ed. (1989) Rule-governed behavior: Cognition, contingencies, and
instructional control, Plenum Press.
Herrnstein, R.J. (1991) Experiments on stable suboptimality in
individual behavior. American Economic
Review 81: 360-364.
Herrnstein, R.J. & Prelec, D. (1992) A theory of addiction. In: Choice over time, eds. G. Loewenstein
& J. Elster, Russell Sage Foundation.
Herrnstein, R.J., Prelec, D. & Vaughan, W.
Jr. (1986) An intra-personal prisoners’ dilemma. Paper presented at the IX
Symposium on Quantitative Analysis of Behavior: Behavioral Economics, Harvard
University.
Heyman, G.M. (1996) Resolving the contradictions of addiction. Behavioral and Brain Sciences 19: 561-610.
Kahneman, D. & Tversky, A. (1979) Prospect
theory: An analysis of decisions under risk. Econometrica 47: 263-291.
Komorita, S.S. & Parks, C.D. (1994) Social
dilemmas, Brown & Benchmark.
Komorita, S.S., Chan, D. K-S. & Parks, C.D.
(1993) The effects of reward structure and reciprocity in social dilemmas.
Journal of Experimental Social Psychology 29: 252-267.
Kudadjie-Gyamfi, E. (1998) Patterns of behavior:
Self-control choices among risky alternatives. Thesis, Psychology Department,
State University of New York at Stony Brook.
Kudadjie-Gyamfi, E. & Rachlin, H.
(1996) Temporal patterning in choice
among delayed outcomes. Organizational
Behavior and Human Decision Processes 65: 61-67.
Lewin, K. (1936) Principles of topological
psychology, McGraw-Hill.
Mazur, J.E. (1987) An adjusting procedure for
studying delayed reinforcement. In: Quantitative analysis of behavior, 5: The
effects of delay and of intervening events on reinforcement value, eds. M.L.
Commons, J.E. Mazur, J.A. Nevin & H. Rachlin, Lawrence Erlbaum Associates.
Messick, D.M. & McClelland, C.L. (1983)
Social traps and temporal traps. Personality & Social Psychology Bulletin 9:
105-110.
Nowak, M. & Sigmund, K. (1993) A strategy of
win-stay-lose-shift that outperforms tit-for-tat in the prisoner’s dilemma
game. Nature 364: 56-58.
Parfit, D. (1971) Personal identity.
Philosophical Review 80: 3-27.
Platt, J. (1973) Social traps. American
Psychologist 28: 641-651.
Premack, D. (1965) Reinforcement theory. In:
Nebraska symposium on motivation, ed. D. Levine, University of Nebraska Press.
Rachlin, H. (1994) Behavior and mind: The roots of modern psychology, Oxford
University Press.
Rachlin, H. (1995a) Self-control: Beyond
commitment. Behavioral and Brain Sciences 18: 109-159.
Rachlin, H. (1995b) The value of temporal
patterns in behavior. Current Directions in Psychological Science 4: 188-191.
Rachlin, H. (1997) Four teleological theories of
addiction. Psychonomic Bulletin & Review 4: 462-473.
Rachlin, H. (2000) The science of self-control,
Harvard University Press.
Rachlin, H., Battalio, R., Kagel, J. &
Green, L. (1981) Maximization theory in behavioral psychology. Behavioral and
Brain Sciences 4: 371-417.
Rapoport, A. & Chammah, A.M. (1965)
Prisoner’s dilemma, University of Michigan Press.
Robins, L.N. (1974) The Vietnam drug user
returns. Special Action Office Monograph, Series A, Number 2, United States
Government Printing Office.
Schelling, T. (1971) The ecology of micromotives.
Public Interest 25: 61-98.
Sidman, M. (1997) Equivalence relations. Journal of the Experimental Analysis of
Behavior 68: 258-266.
Siegel, E. & Rachlin, H. (1996) Soft commitment: Self-control achieved by
response persistence. Journal of the
Experimental Analysis of Behavior 64: 117-128.
Silverstein, A., Cross, D., Brown, J., &
Rachlin, H. (1998) Prior experience and patterning in a prisoner’s dilemma
game. Journal of Behavioral Decision Making 11: 123-138.
Skinner, B.F. (1938) The behavior of organisms:
An experimental analysis, Appleton-Century-Crofts.
Sober, E. & Wilson, D.S. (1998) Unto others:
The evolution and psychology of unselfish behavior, Harvard University Press.
Stout, R. (1996) Things that happen because they
should, Oxford University Press.
Tooby, J. & Cosmides, L. (1996) Friendship
and the banker’s paradox: Other pathways to the evolution of adaptations for
altruism. In: Evolution of social behavior patterns in primates and man.
Proceedings of The British Academy 88, Oxford University Press.
Tversky, A. & Kahneman, D. (1981) The
framing of decisions and the rationality of choice. Science 211: 453-458.
ACKNOWLEDGMENTS
The research reported in this article and the preparation of the article were
supported by grants from the National Institute of Mental Health and the
National Institute on Drug Abuse. Some sections of the article are rewritten
versions of sections of the author’s book, The Science of Self-Control,
published by Harvard University Press (Rachlin, 2000).
[1] These are very wide conceptions of selfishness. Usually, by “selfishness,” we mean explicit rejection of a clearly altruistic alternative; so the word has a socially negative connotation. However, in popular explanations of biology, “selfishness” has lost its negative sense. It just stands for survival value (as in “selfish gene”). Similarly I use the term here to stand for reinforcement value.
[2] What counts seems to be how the problem is presented – whether I emphasize the group or the individual benefit – rather than who the players are (Italian economists, Japanese psychologists, Stony Brook undergraduates, and so forth) or whether the amounts of money won are large and hypothetical or small and real.
[3] Other versions of the primrose path manipulate delays rather than amounts (with inverse contingencies). In some experiments subjects are given more or less explicit instructions about the contingencies in effect. In others, the base number of trials determining N (rule #3) is varied. In still others, trials are grouped in temporal patterns. These manipulations have systematic effects on the proportion of X’s and Y’s chosen (over a typical session of about 100 trials), but none results in exclusive choice of X or Y, showing that the contingencies retain their essential ambivalence.
[4] The social prisoner’s dilemma, in which a single person’s interests conflict with the common interests of a group, is analogous to a single person’s intertemporal dilemma, in which the person’s interests over a narrow time range conflict with the common interests of that same person over a wide time range. Ainslie (1992) pointed out that the prisoner’s dilemma among groups of individuals corresponds to that within an individual at different times. The difference between Ainslie’s view of self-control and mine is my conception of common interests reinforcing behavioral patterns (analogous to group selection) versus Ainslie’s conception of internal bargaining among a person’s temporally distant interests. Underlying this is a difference in our conceptions of simple versus complex ambivalence. Ainslie believes that complex ambivalence – where abstract rewards such as good health reinforce behavioral patterns such as daily exercise – may be reduced to the sum of discounted values of particular rewards acting on each particular act of exercise. That is, Ainslie believes that complex ambivalence may be reduced to multiple cases of simple ambivalence. I believe that complex and simple ambivalence are essentially different. Where simple ambivalence opposes larger but more delayed rewards to smaller but less delayed rewards, complex ambivalence opposes larger but more abstract (and temporally extended) rewards to smaller, particular rewards.
[5] And many times since. Ainslie (1992), Platt (1973), and Schelling (1971) have recently stressed this correspondence.
[6] It is sometimes supposed that in a perfect world there would be no conflict between immediate desires and long-term values. The image of a natural human being living a natural life has this sort of framework – a place where our immediate desires are in harmony with our long-term best interests. But, as Plato pointed out (Philebos, 21c), life in such a world would be the life of a slug. In such a world we would have no need to behave in conformance with more abstract environmental contingencies; therefore we would have no ability to do so.
[7] As previously noted, however, people often ignore valuable long-term patterns and focus on particular present costs and benefits. In economic terms, this implies that you need to be very careful in determining which previously incurred costs are really “sunk costs” and which are investments that if pursued (at a present additional cost) may still pay off.
[8] This is as far as the behavioral psychologist can go. For the evolutionary biologist, the answer to, “Why is this pattern valuable?” is that it has contributed to survival in the past. I am not arguing that the behavioral psychologist’s answer is better than the evolutionary biologist’s answer but rather that a correspondence between self-control and social-cooperation is no less consistent with an evolutionary biological approach to behavior than it is with a teleological behavioral approach.
[9] As the Gestalt psychologists pointed out, we perceive patterns (like melodies) directly rather than as the sum of their parts. Similarly, the value of a pattern (like the enjoyment of listening to a melody) may be far greater than the sum of the values of its parts (the enjoyment of listening to particular notes).
[10] The game was modified to make the computer’s responses probabilistic rather than all-or-none. When a strategy would ordinarily dictate cooperation, the computer increased its probability of cooperation, p, by .25 (and decreased its probability of defection by .25), within the bounds 0 ≤ p ≤ 1. When a strategy would ordinarily dictate defection, the computer increased its probability of defection by .25 (and decreased its probability of cooperation by .25).
[11] There were two other groups whose results are not presented here. Those groups did not see a spinner on the computer screen but neither did they see another reward matrix and they were not led to believe that they were playing against another subject.
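The probabilistic modification described in note 10 can be sketched roughly as follows. The step of .25 and the bounds on the probability come from the note; the starting probability of .5, the use of tit-for-tat as the underlying strategy, the function names, and the sequence of player choices are assumptions made only for illustration and are not the experimental program.

```python
import random


def update_p_cooperate(p, strategy_dictates_cooperation, step=0.25):
    """Shift the computer's cooperation probability by `step`, kept within [0, 1]."""
    shifted = p + step if strategy_dictates_cooperation else p - step
    return min(1.0, max(0.0, shifted))


def computer_move(p, rng=random):
    """Sample the computer's response given cooperation probability p."""
    return "cooperate" if rng.random() < p else "defect"


# Illustrative run. Assumption: the underlying strategy is tit-for-tat, so it
# "dictates" cooperation whenever the player cooperated on the preceding trial.
p = 0.5  # assumed starting probability, not stated in the note
history = []
for player_choice in ["cooperate", "cooperate", "cooperate", "defect"]:
    p = update_p_cooperate(p, player_choice == "cooperate")
    history.append(p)
```

In this run the probability of cooperation rises to its ceiling after two cooperative choices and falls by one step after a single defection, so the computer's behavior tracks the player's pattern of choices rather than any single choice.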
FIGURE LEGENDS
1. Bold typeface: Contingencies of
10-person prisoner’s dilemma experiment. Italic typeface: Contingencies of
self-control, “primrose path,” experiment (1 player, successive choices) to be
described later. [In brackets]: Contingencies faced by alcoholic. In all three
cases, particular choices of X [having a single drink] are always worth more
than particular choices of Y [refusing a single drink] yet on the average it is
better to choose Y [to drink at a low rate].
2. (a) Payoff
matrix of 2-person prisoner’s dilemma game. (b) Same game. Player A’s earnings
for cooperation (lower black dot) and defection (upper black dot) as a function
of Player B’s choice.
3. Results of
Baker and Rachlin’s (in press) experiment. Average of last 15 of 100 trials.