© 2002 Cambridge University Press


Below is the unedited, uncorrected preprint of an accepted target article previously published in Behavioral and Brain Sciences, Volume 25, Number 2: 239-250 (April 2002). Please visit the Cambridge Journals Online BBS Home Page to order the full published treatment.


ALTRUISM AND SELFISHNESS

 

Howard Rachlin

Psychology Department

State University of New York

Stony Brook, New York, 11794-2500

 

(212) 632-7807

e-mail: howard.rachlin@sunysb.edu

 

(13,641 words in all)

Revision: 8/01

 


Long Abstract: Many situations in human life present choices between (a) narrowly preferred particular alternatives and (b) narrowly less preferred (or aversive) particular alternatives that nevertheless form part of highly preferred abstract behavioral patterns. Such alternatives characterize problems of self-control. For example, at any given moment, a person may accept alcoholic drinks yet also prefer being sober to being drunk over the next few days.  Other situations present choices between (a) alternatives beneficial to an individual and (b) alternatives that are less beneficial (or harmful) to the individual that would nevertheless be beneficial if chosen by many individuals. Such alternatives characterize problems of social cooperation; choices of the latter alternative are generally considered to be altruistic. Altruism, like self-control, is a valuable temporally-extended pattern of behavior. Like self-control, altruism may be learned and maintained over an individual’s lifetime. It needs no special inherited mechanism. Individual acts of altruism, each of which may be of no benefit (or of possible harm) to the actor, may nevertheless be beneficial when repeated over time. However, because each selfish decision is individually preferred to each altruistic decision, people can benefit from altruistic behavior only when they are committed to an altruistic pattern of acts and refuse to make decisions on a case-by-case basis.

 


Short Abstract: Many situations in human life present choices between particular and abstract alternatives. Such choices characterize both problems of self-control and problems of social cooperation. Choices of social good at a cost to the particular individual are generally considered to be altruistic. Altruism, like self-control, is a valuable temporally-extended pattern of behavior. Like self-control, altruism may develop over an individual’s lifetime. It needs no special inherited mechanism. Individual acts of altruism, each of which may be costly to the actor, may nevertheless be beneficial when repeated over time. However, because each selfish decision is individually preferred to each altruistic decision, people can benefit from altruistic behavior only when they are committed to an altruistic pattern of acts and refuse to make decisions on a case-by-case basis.

 

Key Words:

addiction, altruism, commitment, cooperation, defection, egoism, impulsiveness, patterning, prisoner’s dilemma, reciprocation, reinforcement, selfishness, self-control

 

 


 

 


 

 

1. Introduction

            Biological Compatibility.  Altruism and selfishness, like free will and determinism, seem to be polar opposites. Yet, as with free will and determinism (Dennett, 1984), the apparent incompatibility may be challenged by various forms of compatibility. From a biological viewpoint selfishness translates into survival value. Evolutionary biologists have been able to reconcile altruism with selfishness by showing how a biological structure mediating altruistic behavior could have evolved. (The next section will briefly summarize one such demonstration.) This structure is assumed to be more complex than ordinary mechanisms that mediate selfish behavior but in essence is no different from them. The gazelle that moves toward the lion (putting itself in danger but showing other gazelles where the lion is) may thus be seen as acting according to the same principles as the gazelle that takes a drink of water when it is thirsty. The desire to move toward the lion stands beside the desire to drink.

            Evolutionary biologists do not conceive of behavior itself being passed from generation to generation; rather, some mechanism, in this case an internal mechanism – a structure of nervous connections in the brain – is hypothesized to be the evolving entity. Altruism as it appears in behavior is conceived as the action of that mechanism developed over the lifetime of the organism. Tooby and Cosmides (1996, p. 125) compare the structure of the altruism mechanism to that of the eye: “We think that such adaptations will frequently require complex computations and suspect that at least some adaptations for altruism may turn out to rival the complexity of the eye.”

            This biological compatibility makes contact with modern cognitive and physiological psychology (Sober & Wilson, 1998). Cognitive psychology attempts to infer the mechanism’s principles of action (its software) from behavioral observation and manipulation while physiological psychology attempts to investigate the mechanism itself (its hardware).

            From the biological viewpoint, altruistic acts differ from selfish acts by virtue of differing internal mediating mechanisms; altruism becomes a motive like any other. In this view, a person leaves a tip in a restaurant to which he will never return because of a desire for fairness or justice, a desire generated by the restaurant situation and the altruistic mechanism within him, which is satisfied by the act of leaving the tip. Similarly, he eats and drinks at the restaurant because of desires generated by internal mechanisms of hunger and thirst. For the biologist, Person A’s altruistic behavior (behavior that benefits others at a cost to A) would be fully explained if Person A were shown to possess the requisite internal altruistic mechanism. Once the mechanism were understood, no further explanation would be required.

            The problem with this conception, from a behavioral viewpoint, is not that it postulates an internal mechanism as such. (After all, no behavior is possible without internal neural structure.) The problem is that in focusing on an inherited internal mechanism, the role of learning over an organism’s lifetime tends to get ignored. To develop normally, eyes have to interact with the environment. But we inherit good eyesight or bad eyesight. If our altruism mechanisms are like our visual mechanisms we are doomed to be more or less selfish depending on our genetic inheritance. This is a sort of genetic version of Calvinism. Experience might aid in the development of altruistic mechanisms. Environmental constraints imposed by social institutions – family, religion, government – might act on selfish motives (like glasses on eyesight) to make them conform to social good. But altruistic behavior as such, according to biological theory, would depend (as eyesight depends) much more on genes than on experience.

            The present article does not deny the existence of such mechanisms.  A large part of human altruism and a still larger part of nonhuman altruism may well be explained in terms of inherited mechanisms based on genetic overlap.  However, the mechanisms underlying these behaviors would have evolved individually.  The mechanism responsible for the ant’s self-sacrifice in defense of a communal nest would differ from that responsible for a mother bear’s care for her cubs.  There remains some fraction of altruistic action, especially among humans, that cannot be attributed to genetic overlap.  For the remainder of this article I will symbolize such actions by the example of a woman who runs into a burning building to save someone else’s child.  I mean this example to stand for altruistic actions not easily attributed to genetic factors.  Biological compatibility attributes such an act not to a specific mechanism for running into burning buildings but to a general mechanism for altruism itself. The present article will argue that it is unnecessary to postulate the existence of such a general mechanism.  I claim, first, that altruism may be learned over an individual’s lifetime and, second, that it is learned in the same way that self-control is learned – by forming particular acts into coherent patterns of acts.  The woman who runs into a burning building to save someone else’s child does so not by activating an innate self-sacrificing tendency but by virtue of the same learning process she uses to control her smoking, drinking or weight.

 

            Behavioral Compatibility.  For biological compatibility, selfishness translates into survival value; for behavioral compatibility, selfishness translates into reinforcement.[1] From a behavioral viewpoint, an altruistic act is not motivated, as an act of drinking is, by the state of an internal mechanism; it is rather a particular component that fits into an overall pattern of behavior. Given this, the important question for the behaviorist is not, “What reinforces a particular act of altruism?” – for this particular act may not be reinforced; it may never be reinforced; it may be punished – but, “What are the patterns of behavior that the altruistic act fits into?”

            To explain why a woman might risk her life to save someone else’s child it would be a mistake to look for current or even future reinforcers of the act itself. By definition, as an altruistic act, it is not reinforced. In economic terms, adding up its costs and benefits results in a negative value. Some behavioristic analyses of altruism have tried to explain particular altruistic acts in terms of delayed rather than immediate reinforcement (Ainslie, 1992; Platt, 1973). But delayed reinforcers, after being discounted, may have significant present value, even for nonhumans (Mazur, 1987). If the present value of a delayed reward is higher than the cost of the act, it is hard to see how the act can be altruistic. It is certainly not altruistic of the bank to lend me money just because I will pay them back later rather than now. If the woman who risked her life to run into the burning building to save someone else’s child were counting on some later reward or sequence of rewards to counterbalance her risk (say ten million dollars, to be paid over the next ten years, offered by the child’s parents), her action would be no more altruistic than that of the bank when it lends me money.
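            The point about discounted delayed rewards can be made concrete with a rough sketch. It assumes the hyperbolic discount function commonly associated with Mazur (1987), V = A / (1 + kD), which is not spelled out in the text above. The discount parameter and the equal yearly installments are hypothetical values chosen only for illustration.

```python
# Present value of a delayed reward under hyperbolic discounting.
# Assumed form: V = A / (1 + k * D). The value of k and the payment
# schedule are hypothetical, chosen only to illustrate the argument.

def present_value(amount: float, delay_years: float, k: float) -> float:
    """Discounted value now of `amount` received after `delay_years`."""
    return amount / (1.0 + k * delay_years)


K_PER_YEAR = 0.5  # hypothetical discount parameter

# Ten million dollars, assumed paid as $1 million at the end of each
# of the next ten years.
installments = [(1_000_000, year) for year in range(1, 11)]
reward_now = sum(present_value(a, d, K_PER_YEAR) for a, d in installments)

print(round(reward_now))
```

            Under these assumed values the promised ten million dollars is worth roughly 3.2 million dollars at the moment of choice. If that figure exceeded the woman's subjective cost of the risk, her act would be reinforced, and therefore not altruistic in the strict sense used here.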

            This narrow behavioral view of altruism has been criticized by social psychologists (Edney, 1980, for example) but the criticism focuses mostly on the behaviorism rather than on the narrowness of the view. These critics have merely replaced, as an explanatory device, the present action of delayed rewards with the present action of internal mechanisms. I argue here that it is a mistake to look for the cause of a specific altruistic act either in the environment or in the interior of the organism. Rather, the cause of the altruistic act is to be found in the high value (the reinforcing value, the survival value, the function) of the act as part of a pattern of acts, or as a habit (provided habit is seen as a pattern of overt behavior extended in time rather than, as it is sometimes seen in psychology, as an internal state). According to the present view, a woman runs into a burning building to save someone else’s child (without the promise of money) not because she is compelled to do so by some internal mechanism nor because she has stopped to calculate all costs and benefits of this particular act; if she did stop to calculate she would arrive at a negative answer and not do the act. Rather, this act forms part of a pattern of acts in her life, a pattern that is valuable in itself, apart from the particular acts that compose it. The pattern, as a pattern of overt behavior, is worth so much to her that she would risk dying rather than break it.

            Biological compatibility says that a particular altruistic act is itself of high value by virtue of an inherited general altruistic mechanism.  Learning would enter into the development of altruism, according to biological compatibility, only in the minimal sense that a baby has to learn how to eat.  The mechanism is there, the biologist says; you need only to learn how to use it.  Behavioral compatibility says, on the other hand, that the altruistic act itself is of low value and remains of low value.  What is highly valued is a temporally extended pattern of acts into which the particular act fits.  The role of the hypothesized internal altruistic mechanism in biological compatibility – to provide a motive for otherwise unreinforced particular acts – is taken, in behavioral compatibility, by the highly valued pattern of acts.  Learning of altruism, the behavioral compatibilist says, is learning to perform relatively low valued particular acts as part of a highly valued pattern.  Thus, from a behavioral viewpoint, particular altruistic acts are not in themselves fundamentally selfish; rather, an altruistic act is selfish only by virtue of the high value of the pattern.

 

            Teleological Behaviorism.  The kind of behaviorism that this view embodies is called “teleological behaviorism” (Baum, 1994; Rachlin, 1994; Stout, 1996). Aristotle’s psychology and ethics are behavioristic in this teleological sense: for Aristotle, a particular action has no meaning by itself; the meaning of an action resides in habits of overt behavior as they are played out in time, not in internal mechanistic or spiritual events; whether a particular act is good or bad depends on the habit into which it fits. In Aristotle’s conception of science, habits are final causes of the particular acts that compose them. While a particular ethical act may be caused (in the sense of efficient cause) by the action of an internal mechanism, it is caused (in the sense of final cause) by an abstract pattern of overt behavior. It is the final cause that determines whether the particular act is good or bad, altruistic or selfish.

            Teleological behaviorism retains Aristotle’s final-cause system of explanation in psychology. For example, it explains motives in terms of habits rather than habits in terms of motives. It is at least arguable that we will not be able to uncover the mechanisms underlying altruistic behavior until we gain a clear idea of what altruistic behavior is in its own terms – as a kind of habit. That is the purpose of this target article.

 

            Outline. Altruism and selfishness were introduced in Section 1 as apparently contradictory but nevertheless compatible behaviors. Particular altruistic acts are compatible with a larger selfishness – selfishness on a more abstract level. The introduction is followed in Section 2 by a discussion of group selection, a biological compatibility between altruism of the individual relative to other members of a group and selfishness (increased survival) of group members relative to those of other groups. Section 3 draws an analogy between group selection and self-control; just as particular acts of self-sacrifice are compatible with a more abstract benefit to a group of individuals, so particular unreinforced acts are compatible with a more abstract long-term benefit to the individual. Section 4 tightens the analogy with more formal definitions of both self-control and altruism. Whether a person acts impulsively or selfishly on the one hand versus temperately or altruistically on the other depends on the degree to which that person structures particular acts in patterns. Such structuring is discussed in Section 5 on commitment. If the analogy between self-control and altruism reflects a fundamental correspondence, altruism may be explained as self-control has been explained – as a choice between high valued particular acts and higher valued patterns of acts. Section 6 describes how the principles of reinforcement and punishment, which have been used to determine the value of self-control alternatives, may apply to social cooperation. Section 7 presents an experiment showing that behavior in a laboratory social-cooperation game depends strongly on the game’s context.  Sections 8 and 9 deal with potential objections.  Section 8 claims that altruism cannot be fully explained in biological terms, without the concept of reinforcement.  Section 9 claims that altruism cannot be fully explained in Skinnerian terms, without the concept of intrinsic reinforcement of behavioral patterns. Section 10 concludes that altruism as well as self-control involves organization of behavior in patterns and choosing among patterns as wholes.

           

2. Group Selection

            Biologists have speculated that the degree of common interest between organisms is fundamentally reflected in their shared genes (Dawkins, 1976).  The innate tendency of any organism to sacrifice its own interests for those of another organism would then depend on the degree to which their genes overlapped.  To the degree that closeness of familial relationship correlates with genetic overlap, innate altruism should be greatest within families and decrease as overlap decreases in the population.  The behavior of a mother who ran into a burning building to save her own child would thus be explained. But the many documented cases of altruism with respect to strangers (that of saints, heroes, and the like) would not be explained.  Why would a mother ever run into a burning building, risking her own life (100% genetic overlap with herself), to save someone else’s child?

            Some principle other than genetic overlap seems to be necessary to explain the inheritance of an altruism that goes beyond the family. Recently, Sober & Wilson (1998) described such a principle – “group selection of altruism.”  To understand group selection you first have to understand a kind of social contingency called, “The Prisoner’s Dilemma.” An example of a prisoner’s dilemma game (in this case, a multi-person prisoner’s dilemma) is a game that I have, for the last ten years or so, been playing with the audience whenever I present the results of my research at university colloquia or conferences. I begin by saying that I want to give the audience a phenomenal experience of ambivalence.  Index cards are then handed to 10 randomly selected people and the others are asked to imagine that they had gotten one of the cards.  They choose among hypothetical monetary prizes by writing either Y or X on the card.  The rules of the game (projected on a screen behind me while I talk) are as follows:


1. If you choose Y you get $100 times N.

2. If you choose X you get $100 times N plus a bonus of $300.

3. N equals the number of people (of the 10) who choose Y.


            Then I point out the consequences of each choice as follows: “You will always get $200 more by choosing X than by choosing Y.  Choosing X rather than Y decreases N by 1 (Rule #3), costing you $100; but if you chose X you also gain the $300 bonus (Rule # 2).  This results in a $200 gain for choosing X.  Logic therefore says that you should choose X, and any lawyer would advise you to do so.  The problem is that if you all followed the advice of your lawyers and chose X, N = 0, and each of you would get $300; while if you all ignored the advice of your lawyers and chose Y, N = 10 and each of you would get $1,000.”  Sometimes, depending on the audience, I illustrate these observations with a diagram like Figure 1 (bold labels).
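            The arithmetic behind these observations can be checked with a short sketch. The dollar amounts come directly from the three rules; the function and variable names are illustrative only and are not part of the original demonstration.

```python
# Payoffs in the 10-person lecture game (hypothetical dollars).
# Names such as `payoff` are illustrative, not taken from the article.

PLAYERS = 10
PER_COOPERATOR = 100   # Rules 1 and 2: everyone gets $100 times N
DEFECTION_BONUS = 300  # Rule 2: extra amount for choosing X


def payoff(choice: str, n_cooperators: int) -> int:
    """One player's earnings, given that player's choice and N, the
    total number of players (this one included) who chose Y."""
    base = PER_COOPERATOR * n_cooperators
    return base + DEFECTION_BONUS if choice == "X" else base


def gain_from_defecting(others_choosing_y: int) -> int:
    """Extra earnings from choosing X rather than Y, holding the
    other nine players' choices fixed."""
    as_cooperator = payoff("Y", others_choosing_y + 1)  # own Y raises N by 1
    as_defector = payoff("X", others_choosing_y)
    return as_defector - as_cooperator


# The advantage of X for every possible number of other cooperators (0..9).
gains = [gain_from_defecting(k) for k in range(PLAYERS)]
all_defect = payoff("X", 0)             # everyone chooses X, so N = 0
all_cooperate = payoff("Y", PLAYERS)    # everyone chooses Y, so N = 10

print(gains)
print(all_defect, all_cooperate)
```

            Whatever the other nine players do, choosing X is worth $200 more to the chooser; yet universal X yields $300 each and universal Y yields $1,000 each.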

            Then I ask the 10 people holding cards to make their choices, imagining as best they can what they would choose if the money were real, and letting no one else see what they have chosen.  Then I collect the cards and hold them until I finish my lecture.  I have done this demonstration or its equivalent dozens of times with audiences ranging from Japanese psychologists to Italian economists.  The result is about an even split between cooperation (choosing Y) and defection (choosing X), indicating that the game does create ambivalence. Although the money won by members of my audiences is entirely hypothetical, significant numbers of subjects in similar experiments in my laboratory, with real albeit lesser amounts of money, have also chosen Y.[2]

            Figure 1 (labels in bold typeface) represents the contingencies of the prisoner’s dilemma game that I ask my audience to play. Point A represents the condition where everyone cooperates.  Point C represents the condition where everyone defects.  The line from A to C represents the average (hypothetical) earnings per person at each value of N (the inverse of the x-axis).  Clearly, the more people who cooperate, the greater the average earnings.  But, as is shown by the two lines, ABC (representing the return to each player who defects) and ADC (representing the return to each player who cooperates), an individual always earns more by defecting than cooperating.

            Suppose, instead of hypothetically giving money to each player, I pooled the money each player earned (still hypothetical) and donated it to the entertainment fund of whatever institution I was lecturing at.  Given this common interest it would now pay for every individual to choose Y; a choice of Y by any individual would increase N by 1 for all 10 players, gaining $1,000 at a cost of the individual player’s $300 bonus, for a net gain to the pool of $700.  A common interest thus tends to reinforce cooperation in prisoner’s dilemma games.
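            Under the pooled arrangement the relevant quantity is the total rather than the individual payoff. A minimal sketch, assuming the same three rules and a pool that simply sums the ten players' hypothetical earnings (the function name is illustrative):

```python
def pool_total(n_cooperators: int, players: int = 10) -> int:
    """Sum of all players' earnings when n_cooperators choose Y."""
    from_cooperators = n_cooperators * (100 * n_cooperators)
    from_defectors = (players - n_cooperators) * (100 * n_cooperators + 300)
    return from_cooperators + from_defectors


# Change in the pool when one player switches from X to Y, for every
# possible number of other cooperators (0 through 9).
pool_gains = [pool_total(k + 1) - pool_total(k) for k in range(10)]

print(pool_gains)
print(pool_total(0), pool_total(10))
```

            Each switch from X to Y adds $700 to the pool regardless of what the other players choose, which is why a common interest reverses the individual incentive.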

Figure 1. Contingencies of 10-person prisoner's dilemma experiment. Italic typeface: Contingencies of self-control, "primrose path", experiment (1 player, successive choices) to be described later. [In brackets]: Contingencies faced by alcoholic. In all three cases, particular choices of X [having a single drink] are always worth more than particular choices of Y [refusing a single drink] yet on the average it is better to choose Y [to drink at a low rate].


 

            Group selection relies on common interest.  A highly simplified version of group selection runs as follows: Consider a population of organisms divided into several relatively isolated groups (tribes, for example). Within each tribe there are some altruists and some selfish individuals (“egoists”) interacting with each other repeatedly in multi-person prisoner’s-dilemma-like games such as the one with which I introduce my lectures, except instead of monetary reward the players receive more or less fitness – ability to reproduce.  In these games the altruists tend to cooperate while the egoists tend to defect.  Within each group (as in the prisoner’s dilemma) altruists always lose out to egoists.  However, those groups originally containing many altruists grow much faster than those originally containing many egoists – because cooperation benefits the group more than defection does. 

            Consider the case of teams, such as basketball teams, playing in a league.  It is commonly accepted that, all else being equal, teams with individual players who play unselfishly will beat teams with individual players who play selfishly; however, within each team, the most selfish players will score the most points.  Imagine now, instead of scoring points and winning or losing games, the teams competed for reproductive fitness.  Then the number of players on teams with a predominance of unselfish players would grow rapidly while that of teams with a predominance of selfish players would grow slowly or (in competition for scarce resources) shrink – the group effect.  Although, within each team, selfish players would still increase faster than unselfish ones (the individual effect), this growth could well be overwhelmed by the group effect.  As time goes on, the absolute number of unselfish individuals (altruists) could increase faster across the whole population than the absolute number of egoists even though within each group the relative number of altruists decreases.  If the groups remained rigidly divided, eventually, because the relative number of altruists is always decreasing within each group, the absolute number would begin to decrease as well.  However, if, before this point is reached, the groups mixed with each other and then re-formed, the process would begin all over again and altruists might maintain or increase their gains.  Again, this is a highly simplified version of the argument.  But the essential point is that while individual altruists may always be at a disadvantage relative to egoists, groups of altruists may be at an advantage relative to groups of egoists. 
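            The interplay of the individual effect and the group effect can be illustrated with a toy calculation. This is a sketch under invented assumptions (two isolated groups, a single generation, and made-up fitness parameters), not a model taken from Sober and Wilson or from this article.

```python
# Toy two-group illustration of the group effect versus the individual
# effect. All parameters and group sizes are hypothetical.

BASE = 1.0      # offspring per individual in a group with no altruists
BENEFIT = 2.0   # extra offspring per member if the whole group is altruistic
COST = 0.2      # offspring an altruist gives up by cooperating


def reproduce(altruists: float, egoists: float) -> tuple[float, float]:
    """One generation within a single isolated group."""
    share = altruists / (altruists + egoists)
    shared_fitness = BASE + BENEFIT * share   # enjoyed by every member
    return altruists * (shared_fitness - COST), egoists * shared_fitness


def altruist_share(altruists: float, egoists: float) -> float:
    """Fraction of altruists within one group."""
    return altruists / (altruists + egoists)


def overall_share(groups: list[tuple[float, float]]) -> float:
    """Fraction of altruists across the whole population."""
    total_a = sum(a for a, _ in groups)
    total_e = sum(e for _, e in groups)
    return total_a / (total_a + total_e)


groups_before = [(80.0, 20.0), (20.0, 80.0)]   # (altruists, egoists)
groups_after = [reproduce(a, e) for a, e in groups_before]

within_before = [altruist_share(a, e) for a, e in groups_before]
within_after = [altruist_share(a, e) for a, e in groups_after]

print(within_before, within_after)
print(overall_share(groups_before), overall_share(groups_after))
```

            With these invented numbers the share of altruists falls inside each group, yet rises in the population as a whole, because the group that began with more altruists grows much faster.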

            Nothing in the present article argues against group selection.  Organisms may be born with greater or lesser biological tendencies to be altruistic.  But it does not follow from group selection that altruistic behavior is incompatible with a larger individual selfishness.  Sober and Wilson consider only two forms of human selfishness: that selfishness which desires maximization of consumer goods and that which desires (immediate) “internal, psychological benefits” (p. 2).  They do not consider individual behavior in the long run and in the abstract.  They leapfrog over behavioral contingencies that may cause behavioral change (contingencies analogous to the group selection processes they have just developed) and proceed directly to “delve below the level of behavior” (p. 194) to an internal cognitive mechanism hypothesized to mediate between the biological selective process and altruistic behavior.  Their cognitive psychology may well be correct but it is not clear how (or even whether), according to their psychology, altruism might emerge from selfishness over an organism’s lifetime.  If it implies that we are born with fixed proportions of selfish and altruistic motives and that experience cannot teach us to alter those proportions then their theory is not as optimistic as Sober and Wilson seem to think; it will not be of much use to those of us trying, despite our weaknesses, to live a better life.

 

3. Altruism and Self-Control

            The contingencies of my lecture demonstration of ambivalence in a social prisoner’s dilemma situation correspond to those of  “primrose-path” experiments with individual subjects facing an intertemporal dilemma (Herrnstein 1991; Herrnstein & Prelec 1992; Herrnstein et al. 1986; Heyman 1996; Kudadjie-Gyamfi 1998; Kudadjie-Gyamfi & Rachlin 1996).  In the prisoner’s dilemma situation illustrated in Figure 1 (bold typeface) many subjects each make a single choice between X and Y.  In primrose path experiments, on the other hand, a single subject makes repeated choices between X and Y.  The rules of the primrose path experiment, usually not told to the subjects, parallel those of the social cooperation experiments.  A typical set of rules follows:


1. Each choice of Y gains N points (convertible to money at the experiment’s end).

2. Each choice of X gains N points plus a bonus of 3 points.

3. N equals the number of Y choices in the last 10 trials.[3]


Figure 1 (labels in italic typeface) illustrates these contingencies in a corresponding way to social cooperation.  The reward for choosing X is always greater than that for choosing Y but overall reward (proportional to the ordinate of line AC) would be maximized by repeatedly choosing Y.  Ambivalence (reflected in social cooperation dilemmas as non-exclusive choice between X and Y across subjects) would be reflected, in primrose path experiments, as non-exclusive choice by individual subjects across trials.  Indeed, in these experiments, subjects generally distribute choices non-exclusively across X and Y.
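            These contingencies can be simulated in a few lines. The sketch below makes two assumptions that the rules as quoted leave open: N is computed from the trials preceding the current one, and the session begins with an empty history (N = 0). Function and variable names are illustrative.

```python
from collections import deque

WINDOW = 10   # Rule 3: N counts Y choices over the last 10 trials
BONUS = 3     # Rule 2: extra points for choosing X


def run(choices: str) -> int:
    """Total points earned over a sequence of 'X'/'Y' choices.

    Assumptions: N is based on the trials before the current one, and
    the history is empty at the start of the session.
    """
    history = deque(maxlen=WINDOW)   # keeps only the last 10 choices
    total = 0
    for choice in choices:
        n = history.count("Y")
        total += n + (BONUS if choice == "X" else 0)
        history.append(choice)
    return total


always_y = run("Y" * 100)
always_x = run("X" * 100)
one_lapse = run("Y" * 50 + "X" + "Y" * 49)   # a single X amid steady Y

print(always_y, always_x, one_lapse)
```

            Under these assumptions a single X earns 3 extra points on its own trial but lowers N by 1 on each of the next 10 trials, a net loss of 7 points. The immediate advantage of X and the long-run advantage of Y coexist, which is the source of the ambivalence.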

            Complex as it is, Figure 1 is a highly simplified picture of real-world complexity.  Lines AD and BC need not be parallel or straight or even monotonic (Rachlin, 1997, 2000).  High rates of consumption, harmful in one context, may be not harmful, or may be beneficial, in others.  Nevertheless, the ambivalence represented by Figure 1 is real and captures everyday-life problems of self-control as well as everyday social dilemmas. 

            The labels in brackets in Figure 1 illustrate the application of this model to alcoholism.  Let us say that point A represents a low rate of drinking (one or two glasses of wine with dinner).  Dinner would be more enjoyable, however, with three glasses of wine and perhaps a cocktail beforehand (point B).  But this much drinking every evening might interfere with sleep, or cause a hangover the next morning, or be slightly damaging to health.  That is, notwithstanding the distinct pleasure of the extra drinking, the average value of the drinker’s state over time (line AC) would be ever so slightly lower as rate of drinking moves one unit to the right. Further increases in the number of drinks before, during, or after dinner (or instead of dinner) would always be immediately preferable to continuing at the lower rate but, if repeated day after day, would bring average value over time lower and lower (moving to the right on line AC).  Eventually, at point C, drinking would serve only to prevent the misery of descent to point D. In other words, positive reinforcement, in going from point A to B by the social drinker having an extra drink, would have been replaced by negative reinforcement (avoidance of point D) in staying at point C by the alcoholic continuing to drink at a high rate.

            The model of alcoholism as represented in Figure 1 is highly simplistic. Social drinking may be more valuable than teetotaling even in the long run. As noted above, lines AD and BC may not be parallel or even straight (see Herrnstein & Prelec, 1992; Rachlin, 1997, 2000, for discussion of more complex cases). Nevertheless, the model has suggested several methods of bringing behavior back from addiction (from point C to A).  These include formation of temporally extended behavior patterns (Rachlin, 1995a; 1995b), substitution of a “positive addiction” such as social activity for a negative addiction (Rachlin, 1997), and manipulation of discriminative stimuli so as to signal changes in overall value (Heyman, 1996; Rachlin, 2000).

            The existence of conflicting reinforcement at the level of particular acts versus that of patterns of acts makes it at least conceivable that a particular unreinforced act such as a mother’s running into a burning building to save someone else’s child may nevertheless be reinforced as part of a pattern of acts. A group of such acts, every one of them unreinforced (altruistic in the strict sense), may nevertheless form a highly reinforced – a maximally reinforced – pattern.

            Just as group selection theory postulates more than one level of selection so there may be more than one level of reinforcement – reinforcement of particular acts and reinforcement of groups, or patterns, of acts. Just as the behavior maximizing benefit to the individual may conflict with the behavior maximizing benefit to the group (which is what generates ambivalence in prisoner’s dilemma situations) so a maximally reinforced act may conflict with a maximally reinforced pattern of acts. I have argued (Rachlin 1995a; 2000) that this latter type of conflict epitomizes many problems of self-control. I call this conflict complex ambivalence, as opposed to simple ambivalence in which one response leads to a smaller more immediate reward while an alternative response leads to a larger more delayed reward.[4]           

            Platt (1973) pointed out the relation between “temporal traps” and “social traps.” Temporal traps are conflicts in the individual between smaller-sooner and larger-later rewards – situations of simple ambivalence. Social traps are conflicts between rewards beneficial to the individual and rewards beneficial to the group. Platt speculated that social traps could be understood as a subclass of temporal traps. But the correspondence between the two kinds of traps breaks down when attention is focused on particular choices (Dawes, 1980; Messick and McClelland, 1983). These authors point out that prisoner’s dilemma problems such as the one in my class demonstration involve immediate conflicting consequences for the individual versus the group. The people in the audience are faced with only one momentary choice. Where is the temporal trap? The answer is that there is no temporal trap as long as temporal traps are limited to conditions of simple ambivalence. However, the correspondence of altruism and self-control is based not on simple ambivalence but on complex ambivalence; no single choice exists in a vacuum. Assuming that their hypothetical choices are those they would make in a real situation, the members of my audience are making only one in a series of choices extending to their lives outside of the lecture hall.  Messick and McClelland say (footnote 1, p. 110), “Obviously, a repeated Prisoner’s Dilemma game requires a temporal component [that is, it can be explained in terms of self-control] but the opposition that characterizes a social trap exists without such repetition.” This assertion highlights a crucial difference between teleological behaviorism and cognitive psychology. For the teleological behaviorist there can be no social trap without repetition. All prisoner’s dilemmas are repeated. If a person were born yesterday, played one prisoner’s dilemma game, cooperated in that game, and then died today, it would be impossible to say whether the person’s cooperation was truly altruistic or just an accident or really, in some other conceivable game, a defection.

                                                           

4. Definitions of Self-Control and Altruism

            Moral philosophers at least since Plato have claimed that there is a relationship between self-control and altruism.[5]  The fundamental issue addressed by ancient Greek philosophy was the relation between particular objects and abstract entities: abstract ideals for Plato; abstract categories for Aristotle (Rachlin 1994; Stout 1996). The problem of self-control in cases of complex ambivalence is a conflict between particular acts such as eating a caloric dessert, taking an alcoholic drink, or getting high on drugs, and abstract patterns of acts strung out in time such as living a healthy life, functioning in a family, or getting along with friends.

            Neither self-control nor altruism is a class of particular movements, operants, or acts.  Moreover, while self-control and altruism are both relative terms, depending on alternatives rejected as well as alternatives chosen, neither term refers to a particular choice independent of its context.  For example, an alcoholic’s particular choice of ginger ale over scotch and soda cannot be self-controlled unless it is embedded in a context of similar choices; if a person chooses scotch and soda 99 times to each choice of ginger ale, the choice of ginger ale is in no way self-controlled.  The person might have been extremely thirsty at the moment when ginger ale was chosen, or might have been trying to hide his alcoholism at that moment, or might have made a mistake in his choice.  The alcoholic’s verbal claim that he intended to control his drinking at that moment would be taken as valid by the behaviorist only in the light of consistent future choices of ginger ales over scotch and sodas. And this criterion would hold regardless of the state of his nervous system, regardless of the activity or lack of activity of any internal mechanism.  For the behaviorist, self-control as such has to lie wholly in choice behavior – but need not lie in any particular act of choice.

            Similarly, no particular act is altruistic in itself – even a woman’s running into a burning building and saving a child.  If the woman were normally selfish we would look for other explanations (perhaps she was just trying to save her jewelry and only incidentally picked up the child).  A truly altruistic act is always part of a pattern of acts (highly valued by both the actor and the community) particular components of which are dispreferred by the actor to their immediate alternatives.  Altruistic patterns of acts are thus subsets of self-controlled patterns.  The particular components of an altruistic pattern, like those of a self-controlled pattern, are less valuable to the actor than are their immediate alternatives; however, in the case of altruistic acts, they are also more valuable to the community than are their immediate alternatives.

            Self-control may be defined more formally as follows: If two alternative activities are available, a relatively brief activity lasting t units of time, and a longer activity lasting T units of time, where T = nt and n is a positive number greater than one, a self-control problem occurs when two conditions are satisfied:

 

            1. The whole longer activity is preferred to n repetitions of the brief activity, and

            2. The brief activity is preferred to a t-length fraction of the longer activity.

 

By “brief activity” and “long activity” I mean classes of activities perhaps not identical in topography but classified functionally, as Skinner (1938) defined operant class. For example, eating a steak dinner at a restaurant and drinking a malted at a lunch counter might be counted as repetitions of the same brief activity – eating high-calorie food.  The long activity would be going through a period of time (a day, a month, a year) without eating high-calorie foods. The choice of the longer activity over a series of choices of the shorter activity is self-control.[6]
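Conditions 1 and 2 can be written compactly. The following is only a notational sketch: the value function V (value as revealed by choice) and the symbols b, L, and L_t are assumptions introduced here for illustration, not notation from the original text. Let b be the brief activity of duration t, L the whole longer activity of duration T = nt, and L_t a t-length fraction of L.

```latex
\begin{align*}
\text{Condition 1:}\quad & V(L) \;>\; V(\underbrace{b, b, \ldots, b}_{n\ \text{repetitions}}) \\
\text{Condition 2:}\quad & V(b) \;>\; V(L_t), \qquad T = nt,\quad n > 1
\end{align*}
```

Written this way, it is easier to see that both conditions cannot hold if values are assumed to be simply additive over time, that is, if V(L) = n V(L_t) and the value of n repetitions of b equals n V(b). Under that assumption Condition 2 would give n V(b) > n V(L_t) = V(L), contradicting Condition 1. The definition therefore presupposes that a pattern as a whole can be worth more than the sum of its parts.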

            According to this definition, the “self” underlying self-control is not an internal entity, spiritual or mechanistic, containing a person’s mental life (including a more or less powerful “will”). Such an entity would imply what Parfit (1971) calls “personal continuity,” a concept he believes we would be better off abandoning. Rather, the self is conceived as existing contingently in a series of overlapping temporal intervals during which behavior occurs in patterns (what Parfit calls “contingent personal interactions”). People’s “selves” would thus evolve and change over their lifetimes, as these patterns evolved and changed, as a function of social and non-social reinforcement.

            Social cooperation situations may now be seen as a subcategory of self-control situations. A social cooperation situation exists when, in addition to Conditions 1 and 2:

 

            3. A group benefits more when an individual member chooses a t-length fraction of the longer activity than it does when the individual chooses the brief activity.

 

            An altruistic act is defined as a choice of the t-length fraction of the longer activity over the brief activity under Conditions 1, 2, and 3. The size of the group may range from only two people to the population of the world. The cost of the altruistic act may be a true cost, as when one anonymously donates money to charity, or an opportunity cost – the loss of the preferred brief alternative. Note that in this definition a particular altruistic act need not be reinforced, either presently or in the future. Reinforcement of altruism is obtained only when such acts are grouped in patterns that are, as a whole, intrinsically valuable.  Thus, the woman’s act of running into a burning building to save someone else’s child is reinforced only insofar as it is part of a highly valued pattern. It may not itself ever be reinforced and may be punished by injury or death. If the woman died in the attempt, the act may still have been worth doing since not doing it would have broken a highly valued pattern.[7]
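The three conditions and the definition of an altruistic act can be checked mechanically once values are assigned to the alternatives. The sketch below is illustrative only: the class name, field names, and the numeric values in the usage note are hypothetical, and in practice the values would have to be estimated from observed choices rather than stipulated.

```python
from dataclasses import dataclass


@dataclass
class ChoiceSituation:
    """Hypothetical values that an actor and a group assign to the alternatives."""
    n: float                     # T = n * t, with n > 1
    value_long_whole: float      # actor's value of the whole longer activity
    value_brief_repeated: float  # actor's value of n repetitions of the brief activity
    value_brief: float           # actor's value of one brief activity
    value_long_fraction: float   # actor's value of one t-length fraction of the longer activity
    group_value_fraction: float  # group's benefit when the actor chooses the fraction
    group_value_brief: float     # group's benefit when the actor chooses the brief activity

    def is_self_control_problem(self) -> bool:
        # Condition 1: the whole pattern beats n repetitions of the brief activity.
        cond1 = self.value_long_whole > self.value_brief_repeated
        # Condition 2: the brief activity beats a t-length fraction of the pattern.
        cond2 = self.value_brief > self.value_long_fraction
        return self.n > 1 and cond1 and cond2

    def is_social_cooperation_situation(self) -> bool:
        # Condition 3: the group gains more from the fraction than from the brief activity.
        cond3 = self.group_value_fraction > self.group_value_brief
        return self.is_self_control_problem() and cond3

    def is_altruistic_act(self, chose_fraction: bool) -> bool:
        # An altruistic act is a choice of the fraction under Conditions 1, 2, and 3.
        return chose_fraction and self.is_social_cooperation_situation()
```

For instance, with made-up values, `ChoiceSituation(30, 100, 60, 5, 1, 0, 0)` is a self-control problem but not a social cooperation situation, because the group is indifferent; raising `group_value_fraction` to 10 satisfies Condition 3 as well, so a choice of the fraction would count as altruistic.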

            This way of thinking about altruism and self-control may seem strange but it is not at all unusual. It is what Plato meant when he held Socrates’ life (and death) to be both good (ethical) and happy. It is what many thinkers about ethics, before and since, have been saying. In 20th century psychology, the gestalt psychologists emphasized that the whole could be greater than the sum of its parts. They intended this maxim to apply to motivation or value as much as to perception (Lewin, 1936). Consider listening to a symphony (assuming you enjoy this activity) on a CD that you just bought. Your enjoyment apparently begins when the music begins and ends when the music ends. Now suppose, after listening to the first 57 minutes of the symphony, you discover that the final three minutes of the 60-minute piece are missing from the CD. Is your enjoyment of the music just reduced by 3/60 of what it would have been if the whole symphony were played? Or is the breaking of the pattern so costly that the missing three minutes ruins the whole experience? In my own case the latter would be true. Readers who do not agree may imagine some other temporally extended activity that would be ruined for them by interruption late in the sequence.

            The meaning of a single instrumental act can be found only in a context of other acts. Conditions 1, 2, and 3 place the act in such a  context. For the cognitive psychologist, on the other hand, the meaning of a single act is to be found in the mechanism that immediately and efficiently caused the act. Thus, for the cognitive psychologist, a single act may be altruistic or not independent of other acts. Obviously, both cognitive and behavioral investigations need to be pursued. I am not saying that one is any more valuable or important than the other. But I do believe that it makes more sense to say that the behaviorist studies altruism itself while the cognitive psychologist studies the mechanisms behind it than it does to say that the cognitive psychologist studies altruism itself while the behaviorist studies only its behavioral effects.

            It seems clear that a person may be self-controlled without being altruistic. That is, Conditions 1 and 2 may obtain while Condition 3 does not. Although, given our strong social dependencies, there is usually some social benefit when a person stops drinking or smoking or overeating or gambling, such benefits are arguably incidental. The opposite question, whether a person may be altruistic without being self-controlled, however, is the one that concerns us here. This question is important because its answer determines whether people need a special mechanism for altruism, aside from whatever mechanism mediates self-control. Most demonstrations of altruistic behavior without egoistic incentives have focused on particular acts (Caporael et al., 1989). But it is not possible to determine that a separate altruism mechanism exists by the absence of reinforcement (immediate or delayed) of particular altruistic acts. The question is rather: Are there altruistic acts under Conditions 2 and 3 above where Condition 1 does not obtain? This is a difficult question to answer because Condition 1 does not specify the appropriate context (the longer activity, T) for a particular act. Is there any context (any relatively long-duration activity, T) in which a given altruistic act would also be a self-controlled act? I believe that it will always be possible to find such a context. This makes altruism a relative concept; in some contexts a given act will be altruistic and in some contexts, not. Where it is altruistic it will also be self-controlled (although the reverse may not be true).

            The relativity of the concept of altruism should not be disturbing. First, it does not imply a moral relativism. Many Nazi soldiers behaved altruistically in the context of their military units but immorally in a larger context. Morality does not depend on altruism any more strictly than it depends on self-control. A moral code may approve of some kinds of altruism but disapprove of others just as it may approve of some kinds of self-control and disapprove of others.

            Secondly, whether an act is self-controlled or impulsive is no less contextually dependent than whether it is altruistic or selfish. Even a hungry rat rewarded by food for pressing a lever is to an extent controlling itself.  The pattern of pressing the lever and eating takes longer (necessarily) than the act of pressing the lever alone.  Pressing the lever, considered alone, is dispreferred to just sniffing in the corner of the cage; hence pressing the lever for food to be delivered within a fraction of a second is an instance of self-control. Correspondingly, even a slug may be said to exhibit self-control – on a microscopic level.  At the other extreme, strict sobriety may be narrow relative to a still more complex pattern of social drinking.

            There is a sense in which all acts (of choice) are selfish; the same sense in which all instrumental acts are reinforced and, for the economist, all behavior maximizes utility.  These are assumptions of theory, or rather methods of procedure, not empirical findings.  But this does not mean that selfishness is a meaningless concept (any more than reinforcement or utility maximization is).  The sense in which an altruistic act is selfish (as part of an ultimately selfish pattern) differs from that in which a non-altruistic act is selfish.  And this distinction is an empirical one.

            Behavioral psychology has not been able to trace every particular act to a particular reinforcer – immediate or in the future. Organized patterns of acts occur despite the existence within them of unreinforced particular acts. What then reinforces the patterns? In psychology, theories of reinforcement based on “pleasure” or “need” or “drive” have not been able to explain particular acts. Such theories have proved to be circular – “pleasures,” “needs,” and “drives” proliferated about as fast as the behaviors they were supposed to explain. It is often not possible to use these concepts to predict behavior in one choice situation from behavior in another. But Premack’s (1965) wholly behavioral theory and the economic theories based on it (Rachlin et al., 1981) are predictive and non-circular. These theories use the choices under one set of behavioral contingencies or constraints to estimate the values of the alternatives (or the parameters of a utility function) and then use those values or parameters to predict choice under other sets of contingencies or constraints.

            This method serves to explain choices among patterns of acts as well as particular acts. And, it answers the social-cooperation question, “Why is friendship rewarding?” as well as the self-control question, “Why is sobriety rewarding?” The answer in both cases, for the behavioral psychologist, is that in a choice test between each of these patterns as a whole and their respective alternative patterns as a whole, friendship would (at least in some cases) be chosen over loneliness and sobriety would (at least in some cases) be chosen over drunkenness.[8]

 

5. Commitment

            No amount of calculation by the mother who runs into a burning building to save someone else’s child will bring the benefits-minus-risks of this activity considered by itself into positive territory. But over a series of actions, a series of opportunities to sacrifice her own benefit for the benefit of others, the weightings may change. As we have seen (Figure 1) social and individual decisions may individually be completely negative, their only value appearing when they are grouped.[9]  The problem is that life ordinarily faces us not with groups of decisions but with particular decisions that must be made. It is up to us to group decisions together, and we do this by means of various commitment devices – contracts, agreements, buying tickets to a series of concerts or plays, joining a health club, and so forth.

            These commitments may work by instituting some punishment (such as loss of money or social support) should we fail to carry them through.  Green & Rachlin (1996) have shown that pigeons prefer, A: a future choice between 1) a small, immediate reward followed by punishment and 2) a larger, delayed reward to, B: the same future pair of alternatives but without the punishment. Only by present choice of the future pair of alternatives involving punishment will they avoid being tempted later by the smaller immediate reward and will they obtain the larger reward that they prefer at the present time. Another kind of commitment shown by pigeons (Siegel & Rachlin, 1996) is “soft commitment.” At an earlier time the pigeon begins a pattern of behavior, such as rapidly pecking a fixed number of times on a lit button. This pattern is difficult for the pigeon to interrupt. Then, in the midst of this pattern, the tempting alternative (the smaller, immediate reward) is presented. Only by continuing and completing the previously begun pattern of behavior will the larger reward be obtained.  By beginning and continuing the pattern the pigeon avoids the temptation and obtains the larger reward. The further along the pigeon is into the pattern, the more likely it is that the tempting small reward will be avoided.

            In a primrose-path experiment (italicized labels of Figure 1) in my laboratory (Kudadjie-Gyamfi & Rachlin, 1996) human subjects chose the self-control option (Y) more when choices were clustered in threes (patterned) than when they were evenly spaced. Within a group of three choices, the probability of self-control on the first choice was high but, given self-control on the first choice, the conditional probability of self-control on the second choice was higher and, given self-control on the first two choices, the probability of self-control on the third choice was higher still. Similarly, in a repeated prisoner’s dilemma situation, playing against tit-for-tat (a strategy that mimicked, on a given trial, the subject’s choice to cooperate or defect on the previous trial), human subjects cooperated more when trials were clustered in fours than when they were evenly spaced out; moreover, as in the self-control experiment, conditional probability of cooperation increased as the sequence progressed (Brown, 2000).

            Soft commitment with pigeons is a model, on a narrow temporal scale, for successful self-control by humans, on a much wider temporal scale (Rachlin, 2000). The alcoholic, for example, resolves to stop drinking, and refuses one drink. At that point he is vulnerable to the offer of another drink. But if he refuses 10 drinks he is less vulnerable and if he refuses 100 drinks he is still less vulnerable. He refuses the later drinks not because their value is reduced (their value is actually enhanced as deprivation increases) but because he has already begun a pattern of refusal that involves some cost to break.  As he repeatedly refuses drinks (climbs up line DA in Figure 1) the long term rewards that sobriety entails – better health, social support, better job performance – grow apace.

            In experiments on repeated prisoner’s dilemmas some subjects cooperate and continue to cooperate regardless of whether other subjects cooperate with them (Brann & Foddy 1988). These people may be said to cooperate out of a sense of moral duty or for ethical reasons or because they are more altruistic than others. But these sorts of explanations do not say why such people behave as they do. To understand their behavior, the laboratory experiment has to be seen not as an isolated situation but in the context of everyday life. Many experimental subjects are willing and able to separate decisions made in a psychology experiment from those they make in everyday life. But others are not able or not willing to do so. They have decided to cooperate in life and continue to do so in the experiment, not necessarily because of some innate tendency to be altruistic, but because altruism is generally valuable and they would not act altruistically if they made decisions on a case-by-case basis. The experiment is merely one case, one situation out of many in their lives. Moral duty, ethical concerns, and altruism are apt descriptions of their behavior. But these qualities do not come from nowhere. They are highly valued patterns of behavior – just as moderation in eating, moderation in drinking, and moderation in sexual activity are highly valued patterns.

 

6. Reinforcement and Punishment in the Prisoner’s Dilemma

            Current discussions of altruism and selfishness in philosophy, biology, economics, and psychology are generally united by reference to strategies of play in prisoner’s dilemma situations. The present analysis does not deny the interest or importance of strategies. Rather, as patterns of behavior, it sees them as crucial. The difference between the present behavioral analysis and cognitive analyses is that, in determining what underlies a strategy, the behaviorist looks for contingencies of reinforcement and punishment rather than internal mechanisms. Thus, it is important to show that the prisoner’s dilemma incorporates reinforcement and punishment contingencies and that prisoner’s-dilemma behavior is sensitive to those contingencies.

            Consider the contingencies of the 2-person prisoner’s dilemma diagramed in Figure 2a.  If both players cooperate, each gets 5 points (convertible to money at the experiment’s end); if both defect, each gets 2 points; if one cooperates while the other defects, the cooperator gets 1 point while the defector gets 6 points.  Figure 2b diagrams the game in a corresponding way to Figure 1, revealing the ambivalence.  As in Figure 1, defection results in a higher immediate reward and a lower long-run reward while cooperation results in the reverse.  Regardless of the other player’s choice, it is always immediately better to defect than to cooperate; if the other player has cooperated then a player will gain 6 points by defecting and 5 points by cooperating; if the other player has defected then a player will gain 2 points by defecting and only 1 point by cooperating.  If communication between players is against the rules, if the game could be played only once (and no similar cooperative tasks were ever expected to be undertaken with the other player), then the motive to defect should predominate.  However, if there were some way to get the other player to cooperate, then whatever it takes to do this should predominate over defection because the gain from the right to the left vertical line in Figure 2b averages 4 points while the gain from the lower to the upper line (from cooperation to defection) averages 1 point.  The best set of circumstances would be to defect while the other player cooperates, earning 6 points.  This is an unlikely scenario since the other player would then earn only 1 point.  However, if communication were within the rules, it would be possible to compromise by agreeing to mutual cooperation, earning 5 points each (the highest pooled score).  Or, if the game were to be played many times, it would be possible to reinforce the other player’s cooperation by cooperating, and to punish the other player’s defection by defecting.  This strategy is called “tit-for-tat.”  The dashed line shows average points gained in repeated trials against tit-for-tat with a distribution of choices proportional to the distance between the vertical lines.  For example, alternation of cooperation and defection (halfway between the vertical lines) yields 6 points and 1 point alternately for an average of 3.5 points per trial against tit-for-tat.  The highest point on the dashed line (hence the best strategy against tit-for-tat) is to cooperate on all trials.  Tit-for-tat has indeed been highly effective in generating cooperation and maximizing pooled scores in several situations: computer simulations of prisoner’s dilemma games (Axelrod 1997); 2-person games with human subjects (Rapoport & Chammah 1965; Silverstein et al. 1998; Brown & Rachlin 1999); with a single subject playing against a computer programmed to play tit-for-tat (Komorita & Parks 1994).

Figure 2. (a) Payoff matrix of 2-person prisoner's dilemma game. (b) Same game. Player A's earnings for cooperation (lower black dot) and defection (upper black dot) as a function of Player B's choice.
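The arithmetic behind Figure 2 can be reproduced with a few lines of code. The sketch below uses the point values given in the text (5, 2, 1, and 6); the function names and the `first_reply` parameter (tit-for-tat's opening move, which the text does not specify) are assumptions made for illustration.

```python
# Payoffs from the text: (my choice, other's choice) -> my points.
C, D = "C", "D"
PAYOFF = {(C, C): 5, (C, D): 1, (D, C): 6, (D, D): 2}


def defection_dominates():
    """True if defecting pays more than cooperating whatever the other player does."""
    return all(PAYOFF[(D, other)] > PAYOFF[(C, other)] for other in (C, D))


def average_gain_from_others_cooperation():
    """Mean gain to a player when the other player cooperates rather than defects."""
    gains = [PAYOFF[(mine, C)] - PAYOFF[(mine, D)] for mine in (C, D)]
    return sum(gains) / len(gains)


def average_against_tit_for_tat(choices, first_reply=C):
    """Average points per trial for a fixed sequence of choices against tit-for-tat.

    Tit-for-tat answers each trial with the player's previous choice;
    first_reply is an assumed opening move.
    """
    total = 0
    reply = first_reply
    for choice in choices:
        total += PAYOFF[(choice, reply)]
        reply = choice  # tit-for-tat mirrors this choice on the next trial
    return total / len(choices)


if __name__ == "__main__":
    print(defection_dominates())                           # True
    print(average_gain_from_others_cooperation())          # 4.0
    print(average_against_tit_for_tat([C] * 100))          # 5.0
    print(average_against_tit_for_tat([C, D] * 50, D))     # 3.5
```

The last two lines correspond to the right end and the midpoint of the dashed line: constant cooperation averages 5 points per trial, while strict alternation averages 3.5.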


            The crucial variable influencing cooperation in 2-person games seems to be reciprocation (Komorita & Parks 1994; Silverstein et al. 1998).  This is also true in games with more than 2 players such as illustrated in Figure 1 (Komorita et al. 1993).  The tit-for-tat strategy imposes a strict reciprocation and thus engenders cooperation.  Prior communication enhances reciprocation and thus has the same effect.  On the other hand, when reciprocation is low or nonexistent, as when the other player plays randomly or always cooperates or always defects, cooperation deteriorates (Silverstein et al. 1998).  Baker and Rachlin (2001) found that a player’s probability of cooperation in a 2-person prisoner’s dilemma game varied directly with the other player’s probability of reciprocation.

 

7. Context

            As Tversky and Kahneman (1981) showed, context, or “framing,” strongly influences probabilistic choice behavior. Context is likewise a strong determinant of self-control. Heyman (1996) cites a study by Robins (1974) of American soldiers who became addicted to heroin in Vietnam. The majority of these addicts easily gave up their addiction when they came home to a different environment. Heyman argues that the boundary line separating local from non-local events (the duration of the chosen activity) may vary over a wide range (depending on the salience and relevance of environmental stimuli), thereby explaining how humans and nonhumans may act impulsively in one situation and self-controlled in another. A second experiment by Baker and Rachlin (in press) demonstrates a similarly strong influence of context in a social cooperation experiment with human subjects.

            Tit-for-tat is a teaching strategy.  A computer, playing tit-for-tat against a player, invariably follows the player’s cooperation by cooperating on the next trial and invariably follows the player’s defection by defecting on the next trial.  Since the computer’s cooperation is much more valuable to the player than its defection, the computer’s cooperation reinforces the player’s cooperation and its defection punishes the player’s defection.  Thus the computer “teaches” the player to cooperate.

            Another strategy that has been successful in computer tournaments (dominating tit-for-tat) is called Pavlov (Fudenberg & Maskin, 1990; Nowak & Sigmund, 1993). Pavlov is a learning strategy. Using Pavlov, the computer’s choice on the present trial, whether cooperation or defection, is repeated on the next trial if the player cooperates and changed on the next trial if the player defects. Against tit-for-tat, the player cannot successfully punish the computer’s defection; the computer would respond to defection by defecting itself. Using Pavlov, however, the computer would respond to defection by changing its choice on the next trial: if it had defected, it would now cooperate; if it had cooperated, it would now defect. The computer using Pavlov would respond to cooperation by repeating its choice on the next trial; if it had defected, it would defect again; if it had cooperated, it would cooperate again.  That is, the computer would behave as if its choice were reinforced by the player’s cooperation and punished by the player’s defection.  Thus the computer, playing Pavlov, “learns” from the player.
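The difference between the "teaching" and "learning" strategies comes down to which choice determines the computer's next move. The following minimal sketch implements the two rules as described above in their unmodified form (the experiment reported below used modified versions); the function names and the assumed opening move `first_move` are hypothetical.

```python
C, D = "C", "D"


def flip(choice):
    """Return the opposite choice."""
    return D if choice == C else C


def tit_for_tat_next(computer_choice, player_choice):
    """Tit-for-tat: copy the player's choice on the next trial."""
    return player_choice


def pavlov_next(computer_choice, player_choice):
    """Pavlov: repeat own choice after the player cooperates,
    switch own choice after the player defects."""
    return computer_choice if player_choice == C else flip(computer_choice)


def computer_moves(strategy, player_choices, first_move=C):
    """Computer's move on each trial, given the player's sequence of choices."""
    moves = [first_move]
    for player_choice in player_choices[:-1]:
        moves.append(strategy(moves[-1], player_choice))
    return moves


if __name__ == "__main__":
    player = [C, D, D, C]
    print(computer_moves(tit_for_tat_next, player))  # ['C', 'C', 'D', 'D']
    print(computer_moves(pavlov_next, player))       # ['C', 'C', 'D', 'C']
```

In the example, the player's second consecutive defection leaves tit-for-tat defecting but moves Pavlov back to cooperation, which is the sense in which the player's defection "punishes" the Pavlov computer's defection.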


Figure 3. Results of Baker and Rachlin’s (in press a) experiment. Average of last 15 of 100 trials.


            In this experiment, four groups of subjects (Stony Brook undergraduates) played 100 trials of a prisoner’s dilemma game. Against each subject in two groups, the computer played a modified form of tit-for-tat. Against each subject in the other two groups, the computer played a modified form of Pavlov.[10]  One of the tit-for-tat groups and one of the Pavlov groups saw a spinner on the computer screen and were correctly informed that the computer’s responses were determined by that spinner. The other two groups believed that they were playing the game against another player rather than against a computer. They did not see a spinner but they did see the “other player’s” reward matrix (and reward presumably received) as well as their own.[11]

 

            The results of the experiment are shown in Figure 3. The context of the game – whether or not the subjects were led to believe that they were playing against another subject – had a strong effect on their behavior, but the context effect was opposite for the two computer strategies. When subjects believed that they were playing against a computer, they cooperated more against tit-for-tat (where the computer reinforced and punished the players’ cooperation and defection) than they did against Pavlov (where the computer’s choices were reinforced and punished by the players’ cooperation and defection). On the other hand, when subjects believed that they were playing against a human being, they cooperated more against Pavlov than against tit-for-tat. This result may be attributed to the fact that subjects’ histories of interacting with machines (unlikely to be responsive to reinforcement and punishment) differed from their histories of interacting with other people (more likely to be responsive). When the relatively global histories matched the relatively local set of contingencies (the computer’s strategies) subjects cooperated; when the global histories contradicted local contingencies they defected. (In all cases, however, under the most narrowly local contingencies, defection was immediately reinforced.)  Choice in prisoner’s dilemma situations, therefore, like choice in self-control situations, may be understood in terms of global as well as local reinforcement.

            Taken together with the previously discussed experiments of Kudadjie-Gyamfi & Rachlin (1996) and Brown (2000), in which patterning choices over time increased human subjects’ self-control and prisoner’s-dilemma cooperation, the experiment described above shows that, at least in laboratory studies, self-control and social cooperation are similarly responsive to reinforcement contingencies and similarly sensitive to context.

            Laboratory models, however, are necessarily diminished representations of everyday-life processes. The reinforcers in all of these experiments – points convertible to money – were extrinsic to the subjects’ choices. If, as argued here, the reinforcers of real-life self-controlled and altruistic behavior are intrinsic in the patterns of those behaviors and if those patterns are extended over long durations – months and years – real-life rewards will never be duplicated in a 30-minute laboratory experiment.

            The experiment described above partially gets around this limitation by varying verbal instructions so as to bring the brief laboratory experiment into differing long-term, real-life contexts. Moreover, an economic extension of Premack’s conception of reinforcement (Rachlin et al., 1981) sees all reinforcement as intrinsic (even that of a rat’s lever press reinforced by food; the rat is seen as choosing the pattern of lever pressing plus eating over not lever pressing plus not eating).

            Nevertheless, there remains a vast difference in scale between laboratory experiments and real life. The point of the experiments is to s