© 2002 Cambridge University Press


Below is the unedited, uncorrected preprint of an accepted target article previously published in Behavioral and Brain Sciences, Volume 25, Number 2: 239-250 (April 2002). Please visit the Cambridge Journals Online BBS Home Page to order the full published treatment.


ALTRUISM AND SELFISHNESS

 

Howard Rachlin

Psychology Department

State University of New York

Stony Brook, New York, 11794-2500

 

(212) 632-7807

e-mail: howard.rachlin@sunysb.edu

 

(13,641 words in all)

Revision: 8/01

 


Long Abstract: Many situations in human life present choices between (a) narrowly preferred particular alternatives and (b) narrowly less preferred (or aversive) particular alternatives that nevertheless form part of highly preferred abstract behavioral patterns. Such alternatives characterize problems of self-control. For example, at any given moment, a person may accept alcoholic drinks yet also prefer being sober to being drunk over the next few days.  Other situations present choices between (a) alternatives beneficial to an individual and (b) alternatives that are less beneficial (or harmful) to the individual that would nevertheless be beneficial if chosen by many individuals. Such alternatives characterize problems of social cooperation; choices of the latter alternative are generally considered to be altruistic. Altruism, like self-control, is a valuable temporally-extended pattern of behavior. Like self-control, altruism may be learned and maintained over an individual’s lifetime. It needs no special inherited mechanism. Individual acts of altruism, each of which may be of no benefit (or of possible harm) to the actor, may nevertheless be beneficial when repeated over time. However, because each selfish decision is individually preferred to each altruistic decision, people can benefit from altruistic behavior only when they are committed to an altruistic pattern of acts and refuse to make decisions on a case-by-case basis.

 


Short Abstract: Many situations in human life present choices between particular and abstract alternatives. Such choices characterize both problems of self-control and problems of social cooperation. Choices of social good at a cost to the particular individual are generally considered to be altruistic. Altruism, like self-control, is a valuable temporally-extended pattern of behavior. Like self-control, altruism may develop over an individual’s lifetime. It needs no special inherited mechanism. Individual acts of altruism, each of which may be costly to the actor, may nevertheless be beneficial when repeated over time. However, because each selfish decision is individually preferred to each altruistic decision, people can benefit from altruistic behavior only when they are committed to an altruistic pattern of acts and refuse to make decisions on a case-by-case basis.

 

Key Words:

addiction, altruism, commitment, cooperation, defection, egoism, impulsiveness, patterning, prisoner’s dilemma, reciprocation, reinforcement, selfishness, self-control

 

 


 

 


 

 

1. Introduction

            Biological Compatibility.  Altruism and selfishness, like free will and determinism, seem to be polar opposites. Yet, as with free will and determinism (Dennett, 1984), the apparent incompatibility may be challenged by various forms of compatibility. From a biological viewpoint selfishness translates into survival value. Evolutionary biologists have been able to reconcile altruism with selfishness by showing how a biological structure mediating altruistic behavior could have evolved. (The next section will briefly summarize one such demonstration.) This structure is assumed to be more complex than ordinary mechanisms that mediate selfish behavior but in essence is no different from them. The gazelle that moves toward the lion (putting itself in danger but showing other gazelles where the lion is) may thus be seen as acting according to the same principles as the gazelle that takes a drink of water when it is thirsty. The desire to move toward the lion stands beside the desire to drink.

            Evolutionary biologists do not conceive of behavior itself being passed from generation to generation; rather, some mechanism, in this case an internal mechanism – a structure of nervous connections in the brain – is hypothesized to be the evolving entity. Altruism as it appears in behavior is conceived as the action of that mechanism developed over the lifetime of the organism. Tooby and Cosmides (1996, p. 125) compare the structure of the altruism mechanism to that of the eye: “We think that such adaptations will frequently require complex computations and suspect that at least some adaptations for altruism may turn out to rival the complexity of the eye.”

            This biological compatibility makes contact with modern cognitive and physiological psychology (Sober & Wilson, 1998). Cognitive psychology attempts to infer the mechanism’s principles of action (its software) from behavioral observation and manipulation while physiological psychology attempts to investigate the mechanism itself (its hardware).

            From the biological viewpoint, altruistic acts differ from selfish acts by virtue of differing internal mediating mechanisms; altruism becomes a motive like any other. In this view, a person leaves a tip in a restaurant to which he will never return because of a desire for fairness or justice, a desire generated by the restaurant situation and the altruistic mechanism within him, which is satisfied by the act of leaving the tip. Similarly, he eats and drinks at the restaurant because of desires generated by internal mechanisms of hunger and thirst. For the biologist, Person A’s altruistic behavior (behavior that benefits others at a cost to A) would be fully explained if Person A were shown to possess the requisite internal altruistic mechanism. Once the mechanism were understood, no further explanation would be required.

            The problem with this conception, from a behavioral viewpoint, is not that it postulates an internal mechanism as such. (After all, no behavior is possible without internal neural structure.) The problem is that in focusing on an inherited internal mechanism, the role of learning over an organism’s lifetime tends to get ignored. To develop normally, eyes have to interact with the environment. But we inherit good eyesight or bad eyesight. If our altruism mechanisms are like our visual mechanisms we are doomed to be more or less selfish depending on our genetic inheritance. This is a sort of genetic version of Calvinism. Experience might aid in the development of altruistic mechanisms. Environmental constraints imposed by social institutions – family, religion, government – might act on selfish motives (like glasses on eyesight) to make them conform to social good. But altruistic behavior as such, according to biological theory, would depend (as eyesight depends) much more on genes than on experience.

            The present article does not deny the existence of such mechanisms. A large part of human altruism and a still larger part of nonhuman altruism may well be explained in terms of inherited mechanisms based on genetic overlap. However, the mechanisms underlying these behaviors would have evolved individually. The mechanism responsible for the ant’s self-sacrifice in defense of a communal nest would differ from that responsible for a mother bear’s care for her cubs. There remains some fraction of altruistic action, especially among humans, that cannot be attributed to genetic overlap. For the remainder of this article I will symbolize such actions by the example of a woman who runs into a burning building to save someone else’s child. I mean this example to stand for altruistic actions not easily attributed to genetic factors. Biological compatibility attributes such an act not to a specific mechanism for running into burning buildings but to a general mechanism for altruism itself. The present article will argue that it is unnecessary to postulate the existence of such a general mechanism. I claim, first, that altruism may be learned over an individual’s lifetime and, second, that it is learned in the same way that self-control is learned – by forming particular acts into coherent patterns of acts. The woman who runs into a burning building to save someone else’s child does so not by activating an innate self-sacrificing tendency but by virtue of the same learning process she uses to control her smoking, drinking or weight.

 

            Behavioral Compatibility.  For biological compatibility, selfishness translates into survival value; for behavioral compatibility, selfishness translates into reinforcement.[1] From a behavioral viewpoint, an altruistic act is not motivated, as an act of drinking is, by the state of an internal mechanism; it is rather a particular component that fits into an overall pattern of behavior. Given this, the important question for the behaviorist is not, “What reinforces a particular act of altruism?” – for this particular act may not be reinforced; it may never be reinforced; it may be punished – but, “What are the patterns of behavior that the altruistic act fits into?”

            To explain why a woman might risk her life to save someone else’s child it would be a mistake to look for current or even future reinforcers of the act itself. By definition, as an altruistic act, it is not reinforced. In economic terms, adding up its costs and benefits results in a negative value. Some behavioristic analyses of altruism have tried to explain particular altruistic acts in terms of delayed rather than immediate reinforcement (Ainslie, 1992; Platt, 1973). But delayed reinforcers, after being discounted, may have significant present value, even for nonhumans (Mazur, 1987). If the present value of a delayed reward is higher than the cost of the act, it is hard to see how the act can be altruistic. It is certainly not altruistic of the bank to lend me money just because I will pay them back later rather than now. If the woman who risked her life to run into the burning building to save someone else’s child were counting on some later reward or sequence of rewards to counterbalance her risk (say ten million dollars, to be paid over the next ten years, offered by the child’s parents), her action would be no more altruistic than that of the bank when it lends me money.

            This narrow behavioral view of altruism has been criticized by social psychologists (Edney, 1980, for example) but the criticism focuses mostly on the behaviorism rather than on the narrowness of the view. These critics have merely replaced, as an explanatory device, the present action of delayed rewards with the present action of internal mechanisms. I argue here that it is a mistake to look for the cause of a specific altruistic act either in the environment or in the interior of the organism. Rather, the cause of the altruistic act is to be found in the high value (the reinforcing value, the survival value, the function) of the act as part of a pattern of acts, or as a habit (provided habit is seen as a pattern of overt behavior extended in time rather than, as it is sometimes seen in psychology, as an internal state). According to the present view, a woman runs into a burning building to save someone else’s child (without the promise of money) not because she is compelled to do so by some internal mechanism nor because she has stopped to calculate all costs and benefits of this particular act; if she did stop to calculate she would arrive at a negative answer and not do the act. Rather, this act forms part of a pattern of acts in her life, a pattern that is valuable in itself, apart from the particular acts that compose it. The pattern, as a pattern of overt behavior, is worth so much to her that she would risk dying rather than break it.

            Biological compatibility says that a particular altruistic act is itself of high value by virtue of an inherited general altruistic mechanism.  Learning would enter into the development of altruism, according to biological compatibility, only in the minimal sense that a baby has to learn how to eat.  The mechanism is there, the biologist says; you need only to learn how to use it.  Behavioral compatibility says, on the other hand, that the altruistic act itself is of low value and remains of low value.  What is highly valued is a temporally extended pattern of acts into which the particular act fits.  The role of the hypothesized internal altruistic mechanism in biological compatibility – to provide a motive for otherwise unreinforced particular acts – is taken, in behavioral compatibility, by the highly valued pattern of acts.  Learning of altruism, the behavioral compatibilist says, is learning to perform relatively low valued particular acts as part of a highly valued pattern.  Thus, from a behavioral viewpoint, particular altruistic acts are not in themselves fundamentally selfish; rather, an altruistic act is selfish only by virtue of the high value of the pattern.

 

            Teleological Behaviorism.  The kind of behaviorism that this view embodies is called “teleological behaviorism” (Baum, 1994; Rachlin, 1994; Stout, 1996). Aristotle’s psychology and ethics are behavioristic in this teleological sense: for Aristotle, a particular action has no meaning by itself; the meaning of an action resides in habits of overt behavior as they are played out in time, not in internal mechanistic or spiritual events; whether a particular act is good or bad depends on the habit into which it fits. In Aristotle’s conception of science, habits are final causes of the particular acts that compose them. While a particular ethical act may be caused (in the sense of efficient cause) by the action of an internal mechanism, it is caused (in the sense of final cause) by an abstract pattern of overt behavior. It is the final cause that determines whether the particular act is good or bad, altruistic or selfish.

            Teleological behaviorism retains Aristotle’s final-cause system of explanation in psychology. For example, it explains motives in terms of habits rather than habits in terms of motives. It is at least arguable that we will not be able to uncover the mechanisms underlying altruistic behavior until we gain a clear idea of what altruistic behavior is in its own terms – as a kind of habit. That is the purpose of this target article.

 

            Outline. Altruism and selfishness were introduced in Section 1 as apparently contradictory but nevertheless compatible behaviors. Particular altruistic acts are compatible with a larger selfishness – selfishness on a more abstract level. The introduction is followed in Section 2 by a discussion of group selection, a biological compatibility between altruism of the individual relative to other members of a group and selfishness (increased survival) of group members relative to those of other groups. Section 3 draws an analogy between group selection and self-control; just as particular acts of self-sacrifice are compatible with a more abstract benefit to a group of individuals, so particular unreinforced acts are compatible with a more abstract long-term benefit to the individual. Section 4 tightens the analogy with more formal definitions of both self-control and altruism. Whether a person acts impulsively or selfishly on the one hand versus temperately or altruistically on the other depends on the degree to which that person structures particular acts in patterns. Such structuring is discussed in Section 5 on commitment. If the analogy between self-control and altruism reflects a fundamental correspondence, altruism may be explained as self-control has been explained – as a choice between high valued particular acts and higher valued patterns of acts. Section 6 describes how the principles of reinforcement and punishment, which have been used to determine the value of self-control alternatives, may apply to social cooperation. Section 7 presents an experiment showing that behavior in a laboratory social-cooperation game depends strongly on the game’s context. Sections 8 and 9 deal with potential objections. Section 8 claims that altruism cannot be fully explained in biological terms, without the concept of reinforcement. Section 9 claims that altruism cannot be fully explained in Skinnerian terms, without the concept of intrinsic reinforcement of behavioral patterns. Section 10 concludes that altruism as well as self-control involves organization of behavior in patterns and choosing among patterns as wholes.

           

2. Group Selection

            Biologists have speculated that the degree of common interest between organisms is fundamentally reflected in their shared genes (Dawkins, 1976).  The innate tendency of any organism to sacrifice its own interests for those of another organism would then depend on the degree to which their genes overlapped.  To the degree that closeness of familial relationship correlates with genetic overlap, innate altruism should be greatest within families and decrease as overlap decreases in the population.  The behavior of a mother who ran into a burning building to save her own child would thus be explained. But the many documented cases of altruism with respect to strangers (that of saints, heroes, and the like) would not be explained.  Why would a mother ever run into a burning building, risking her own life (100% genetic overlap with herself), to save someone else’s child?

            Some principle other than genetic overlap seems to be necessary to explain the inheritance of an altruism that goes beyond the family. Recently, Sober & Wilson (1998) described such a principle – “group selection of altruism.”  To understand group selection you first have to understand a kind of social contingency called, “The Prisoner’s Dilemma.” An example of a prisoner’s dilemma game (in this case, a multi-person prisoner’s dilemma) is a game that I have, for the last ten years or so, been playing with the audience whenever I present the results of my research at university colloquia or conferences. I begin by saying that I want to give the audience a phenomenal experience of ambivalence.  Index cards are then handed to 10 randomly selected people and the others are asked to imagine that they had gotten one of the cards.  They choose among hypothetical monetary prizes by writing either Y or X on the card.  The rules of the game (projected on a screen behind me while I talk) are as follows:


1. If you choose Y you get $100 times N.

2. If you choose X you get $100 times N plus a bonus of $300.

3. N equals the number of people (of the 10) who choose Y.


            Then I point out the consequences of each choice as follows: “You will always get $200 more by choosing X than by choosing Y. Choosing X rather than Y decreases N by 1 (Rule #3), costing you $100; but if you choose X you also gain the $300 bonus (Rule #2). This results in a net gain of $200 for choosing X. Logic therefore says that you should choose X, and any lawyer would advise you to do so. The problem is that if you all followed the advice of your lawyers and chose X, N would equal 0 and each of you would get $300; while if you all ignored the advice of your lawyers and chose Y, N would equal 10 and each of you would get $1,000.” Sometimes, depending on the audience, I illustrate these observations with a diagram like Figure 1 (bold labels).
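            The arithmetic behind these three rules can be checked with a short sketch. It is only an illustration of the rules as stated above; the function names are hypothetical and are not part of the original demonstration. It assumes that N counts the player's own choice when that choice is Y.

```python
def payoff(choice, n_cooperators):
    """Hypothetical dollar payoff for one player under the three rules.

    choice: "Y" (cooperate) or "X" (defect).
    n_cooperators: N, the number of the 10 players who chose Y,
    counting this player if this player chose Y (an assumption).
    """
    base = 100 * n_cooperators
    return base + 300 if choice == "X" else base


def gain_from_defecting(others_choosing_y):
    """Payoff for X minus payoff for Y, holding the other nine choices fixed."""
    as_defector = payoff("X", others_choosing_y)        # N excludes this player
    as_cooperator = payoff("Y", others_choosing_y + 1)  # N includes this player
    return as_defector - as_cooperator


# Gain from choosing X for every possible number (0..9) of other cooperators.
gains = [gain_from_defecting(k) for k in range(10)]
all_defect = payoff("X", 0)       # everyone chooses X, so N = 0
all_cooperate = payoff("Y", 10)   # everyone chooses Y, so N = 10
```

            Under these assumptions every entry of `gains` is the same $200 advantage for X, while unanimous defection pays each player less than unanimous cooperation, which is the source of the ambivalence.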

            Then I ask the 10 people holding cards to make their choices, imagining as best they can what they would choose if the money were real, and letting no one else see what they have chosen.  Then I collect the cards and hold them until I finish my lecture.  I have done this demonstration or its equivalent dozens of times with audiences ranging from Japanese psychologists to Italian economists.  The result is about an even split between cooperation (choosing Y) and defection (choosing X), indicating that the game does create ambivalence. Although the money won by members of my audiences is entirely hypothetical, significant numbers of subjects in similar experiments in my laboratory, with real albeit lesser amounts of money, have also chosen Y.[2]

            Figure 1 (labels in bold typeface) represents the contingencies of the prisoner’s dilemma game that I ask my audience to play. Point A represents the condition where everyone cooperates.  Point C represents the condition where everyone defects.  The line from A to C represents the average (hypothetical) earnings per person at each value of N (the inverse of the x-axis).  Clearly, the more people who cooperate, the greater the average earnings.  But, as is shown by the two lines, ABC (representing the return to each player who defects) and ADC (representing the return to each player who cooperates), an individual always earns more by defecting than cooperating.

            Suppose that, instead of hypothetically giving money to each player, I pooled the money each player earned (still hypothetical) and donated it to the entertainment fund of whatever institution I was lecturing at.  Given this common interest it would now pay for every individual to choose Y; a choice of Y by any individual would increase N by 1 for all 10 players, gaining $1,000 at a cost of the individual player’s $300 bonus, for a net gain to the pool of $700.  A common interest thus tends to reinforce cooperation in prisoner’s dilemma games.
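            The pooled version of the game can be sketched in the same way. The function name and the constant below are illustrative assumptions; the payoffs are those of the three rules stated earlier, summed over all 10 players.

```python
PLAYERS = 10  # size of the group holding cards


def pool_total(n_cooperators):
    """Total hypothetical dollars earned by all players when N of them choose Y."""
    n_defectors = PLAYERS - n_cooperators
    cooperator_payoff = 100 * n_cooperators        # Rule 1
    defector_payoff = 100 * n_cooperators + 300    # Rule 2
    return n_cooperators * cooperator_payoff + n_defectors * defector_payoff


# Change in the pool when one more player switches from X to Y.
marginal_gains = [pool_total(n + 1) - pool_total(n) for n in range(PLAYERS)]
```

            Each switch from X to Y adds the same amount to the pool, whatever the other players do, so once earnings are shared the individual incentive and the group incentive point the same way.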

Figure 1. Bold typeface: Contingencies of 10-person prisoner's dilemma experiment. Italic typeface: Contingencies of self-control, "primrose path", experiment (1 player, successive choices) to be described later. [In brackets]: Contingencies faced by alcoholic. In all three cases, particular choices of X [having a single drink] are always worth more than particular choices of Y [refusing a single drink] yet on the average it is better to choose Y [to drink at a low rate].


 

            Group selection relies on common interest.  A highly simplified version of group selection runs as follows: Consider a population of organisms divided into several relatively isolated groups (tribes, for example). Within each tribe there are some altruists and some selfish individuals (“egoists”) interacting with each other repeatedly in multi-person prisoner’s-dilemma-like games such as the one with which I introduce my lectures, except instead of monetary reward the players receive more or less fitness – ability to reproduce.  In these games the altruists tend to cooperate while the egoists tend to defect.  Within each group (as in the prisoner’s dilemma) altruists always lose out to egoists.  However, those groups originally containing many altruists grow much faster than those originally containing many egoists – because cooperation benefits the group more than defection does. 

            Consider the case of teams, such as basketball teams, playing in a league.  It is commonly accepted that, all else being equal, teams with individual players who play unselfishly will beat teams with individual players who play selfishly; however, within each team, the most selfish players will score the most points.  Imagine now, instead of scoring points and winning or losing games, the teams competed for reproductive fitness.  Then the number of players on teams with a predominance of unselfish players would grow rapidly while that of teams with a predominance of selfish players would grow slowly or (in competition for scarce resources) shrink – the group effect.  Although, within each team, selfish players would still increase faster than unselfish ones (the individual effect), this growth could well be overwhelmed by the group effect.  As time goes on, the absolute number of unselfish individuals (altruists) could increase faster across the whole population than the absolute number of egoists even though within each group the relative number of altruists decreases.  If the groups remained rigidly divided, eventually, because the relative number of altruists is always decreasing within each group, the absolute number would begin to decrease as well.  However, if, before this point is reached, the groups mixed with each other and then re-formed, the process would begin all over again and altruists might maintain or increase their gains.  Again, this is a highly simplified version of the argument.  But the essential point is that while individual altruists may always be at a disadvantage relative to egoists, groups of altruists may be at an advantage relative to groups of egoists. 
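            The opposition between the individual effect and the group effect can be made concrete with a toy calculation. This is a minimal sketch under assumed numbers, not a model taken from Sober and Wilson: the fitness formula, the `benefit` and `cost` parameters, and the two starting groups are all hypothetical choices made only for illustration.

```python
def next_generation(altruists, egoists, benefit=5.0, cost=1.0):
    """One round of reproduction inside a single isolated group.

    Every member gains `benefit` times the group's altruist fraction
    (the group effect); each altruist also pays `cost` (the individual
    effect). All parameter values are illustrative assumptions.
    """
    fraction = altruists / (altruists + egoists)
    egoist_fitness = 1.0 + benefit * fraction
    altruist_fitness = egoist_fitness - cost
    return altruists * altruist_fitness, egoists * egoist_fitness


def altruist_share(groups):
    """Fraction of altruists across the whole population."""
    total_altruists = sum(a for a, _ in groups)
    total_members = sum(a + e for a, e in groups)
    return total_altruists / total_members


# Two groups: one mostly altruists, one mostly egoists (assumed sizes).
groups_before = [(80, 20), (20, 80)]
groups_after = [next_generation(a, e) for a, e in groups_before]

within_before = [a / (a + e) for a, e in groups_before]
within_after = [a / (a + e) for a, e in groups_after]
overall_before = altruist_share(groups_before)
overall_after = altruist_share(groups_after)
```

            With these assumed values the share of altruists falls inside each group after one generation, yet rises across the population as a whole, because the group that began with more altruists grows much faster.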

            Nothing in the present article argues against group selection.  Organisms may be born with greater or lesser biological tendencies to be altruistic.  But it does not follow from group selection that altruistic behavior is incompatible with a larger individual selfishness.  Sober and Wilson consider only two forms of human selfishness: that selfishness which desires maximization of consumer goods and that which desires (immediate) “internal, psychological benefits” (p. 2).  They do not consider individual behavior in the long run and in the abstract.  They leapfrog over behavioral contingencies that may cause behavioral change (contingencies analogous to the group selection processes they have just developed) and proceed directly to “delve below the level of behavior” (p. 194) to an internal cognitive mechanism hypothesized to mediate between the biological selective process and altruistic behavior.  Their cognitive psychology may well be correct but it is not clear how (or even whether), according to their psychology, altruism might emerge from selfishness over an organism’s lifetime.  If it implies that we are born with fixed proportions of selfish and altruistic motives and that experience cannot teach us to alter those proportions then their theory is not as optimistic as Sober and Wilson seem to think; it will not be of much use to those of us trying, despite our weaknesses, to live a better life.

 

3. Altruism and Self-Control

            The contingencies of my lecture demonstration of ambivalence in a social prisoner’s dilemma situation correspond to those of  “primrose-path” experiments with individual subjects facing an intertemporal dilemma (Herrnstein 1991; Herrnstein & Prelec 1992; Herrnstein et al. 1986; Heyman 1996; Kudadjie-Gyamfi 1998; Kudadjie-Gyamfi & Rachlin 1996).  In the prisoner’s dilemma situation illustrated in Figure 1 (bold typeface) many subjects each make a single choice between X and Y.  In primrose path experiments, on the other hand, a single subject makes repeated choices between X and Y.  The rules of the primrose path experiment, usually not told to the subjects, parallel those of the social cooperation experiments.  A typical set of rules follows:


1. Each choice of Y gains N points (convertible to money at the experiment’s end).

2. Each choice of X gains N points plus a bonus of 3 points.

3. N equals the number of Y choices in the last 10 trials.[3]


Figure 1 (labels in italic typeface) illustrates these contingencies in a corresponding way to social cooperation.  The reward for choosing X is always greater than that for choosing Y but overall reward (proportional to the ordinate of line AC) would be maximized by repeatedly choosing Y.  Ambivalence (reflected in social cooperation dilemmas as non-exclusive choice between X and Y across subjects) would be reflected, in primrose path experiments, as non-exclusive choice by individual subjects across trials.  Indeed, in these experiments, subjects generally distribute choices non-exclusively across X and Y.
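            The primrose-path rules can be simulated in a few lines. This is a sketch under stated assumptions: N is taken to count Y choices in the 10 trials preceding the current one, and the starting history is supplied by hand; neither detail is specified in the rules above, and the function name is hypothetical.

```python
from collections import deque


def run(choices, initial_window):
    """Total points earned over a sequence of X/Y choices.

    choices: string of "X" and "Y", one character per trial.
    initial_window: the 10 choices assumed to precede the first trial.
    N is computed from the 10 most recent earlier trials (an assumption).
    """
    window = deque(initial_window, maxlen=10)
    total = 0
    for choice in choices:
        n = sum(1 for c in window if c == "Y")
        total += n + 3 if choice == "X" else n   # X earns the 3-point bonus
        window.append(choice)                    # oldest trial drops out
    return total


steady_y = run("Y" * 100, "Y" * 10)         # always Y, from an all-Y history
steady_x = run("X" * 100, "X" * 10)         # always X, from an all-X history
one_lapse = run("X" + "Y" * 10, "Y" * 10)   # a single X, then ten Y choices
no_lapse = run("Y" * 11, "Y" * 10)          # the same eleven trials, all Y
```

            Under these assumptions a single choice of X earns 3 extra points on its own trial but lowers N by 1 on each of the next 10 trials, so the lapse costs more than it gains; this mirrors the $300 bonus against the $1,000 group loss in the social version.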

            Complex as it is, Figure 1 is a highly simplified picture of real-world complexity.  Lines AD and BC need not be parallel or straight or even monotonic (Rachlin, 1997, 2000).  High rates of consumption, harmful in one context, may be not harmful, or may be beneficial, in others.  Nevertheless, the ambivalence represented by Figure 1 is real and captures everyday-life problems of self-control as well as everyday social dilemmas. 

            The labels in brackets in Figure 1 illustrate the application of this model to alcoholism.  Let us say that point A represents a low rate of drinking (one or two glasses of wine with dinner).  Dinner would be more enjoyable, however, with three glasses of wine and perhaps a cocktail beforehand (point B).  But this much drinking every evening might interfere with sleep, or cause a hangover the next morning, or be slightly damaging to health.  That is, notwithstanding the distinct pleasure of the extra drinking, the average value of the drinker’s state over time (line AC) would be ever so slightly lower as rate of drinking moves one unit to the right. Further increases in the number of drinks before, during, or after dinner (or instead of dinner) would always be immediately preferable to continuing at the lower rate but, if repeated day after day, would bring average value over time lower and lower (moving to the right on line AC).  Eventually, at point C, drinking would serve only to prevent the misery of descent to point D. In other words, positive reinforcement, in going from point A to B by the social drinker having an extra drink, would have been replaced by negative reinforcement (avoidance of point D) in staying at point C by the alcoholic continuing to drink at a high rate.

            The model of alcoholism as represented in Figure 1 is highly simplistic. Social drinking may be more valuable than teetotaling even in the long run. As noted above, lines AD and BC may not be parallel or even straight (see Herrnstein & Prelec, 1992; Rachlin, 1997, 2000, for discussion of more complex cases). Nevertheless, the model has suggested several methods of bringing behavior back from addiction (from point C to A).  These include formation of temporally extended behavior patterns (Rachlin, 1995a; 1995b), substitution of a “positive addiction” such as social activity for a negative addiction (Rachlin, 1997), and manipulation of discriminative stimuli so as to signal changes in overall value (Heyman, 1996; Rachlin, 2000).

            The existence of conflicting reinforcement at the level of particular acts versus that of patterns of acts makes it at least conceivable that a particular unreinforced act such as a mother’s running into a burning building to save someone else’s child may nevertheless be reinforced as part of a pattern of acts. A group of such acts, every one of them unreinforced (altruistic in the strict sense), may nevertheless form a highly reinforced – a maximally reinforced – pattern.

            Just as group selection theory postulates more than one level of selection so there may be more than one level of reinforcement – reinforcement of particular acts and reinforcement of groups, or patterns, of acts. Just as the behavior maximizing benefit to the individual may conflict with the behavior maximizing benefit to the group (which is what generates ambivalence in prisoner’s dilemma situations) so a maximally reinforced act may conflict with a maximally reinforced pattern of acts. I have argued (Rachlin 1995a; 2000) that this latter type of conflict epitomizes many problems of self-control. I call this conflict complex ambivalence, as opposed to simple ambivalence in which one response leads to a smaller more immediate reward while an alternative response leads to a larger more delayed reward.[4]           

            Platt (1973) pointed out the relation between “temporal traps” and “social traps.” Temporal traps are conflicts in the individual between smaller-sooner and larger-later rewards – situations of simple ambivalence. Social traps are conflicts between rewards beneficial to the individual and rewards beneficial to the group. Platt speculated that social traps could be understood as a subclass of temporal traps. But the correspondence between the two kinds of traps breaks down when attention is focused on particular choices (Dawes, 1980; Messick and McClelland, 1983). These authors point out that prisoner’s dilemma problems such as the one in my class demonstration involve immediate conflicting consequences for the individual versus the group. The people in the audience are faced with only one momentary choice. Where is the temporal trap? The answer is that there is no temporal trap as long as temporal traps are limited to conditions of simple ambivalence. However, the correspondence of altruism and self-control is based not on simple ambivalence but on complex ambivalence; no single choice exists in a vacuum. Assuming that their hypothetical choices are those they would make in a real situation, the members of my audience are making only one in a series of choices extending to their lives outside of the lecture hall. Messick and McClelland say (footnote 1, p. 110), “Obviously, a repeated Prisoner’s Dilemma game requires a temporal component [that is, it can be explained in terms of self-control] but the opposition that characterizes a social trap exists without such repetition.” This assertion highlights a crucial difference between teleological behaviorism and cognitive psychology. For the teleological behaviorist there can be no social trap without repetition. All prisoner’s dilemmas are repeated. If a person were born yesterday, played one prisoner’s dilemma game, cooperated in that game, and then died today, it would be impossible to say whether the person’s cooperation was truly altruistic or just an accident or really, in some other conceivable game, a defection.

                                                           

4. Definitions of Self-Control and Altruism

            Moral philosophers at least since Plato have claimed that there is a relationship between self-control and altruism.[5]  The fundamental issue addressed by ancient Greek philosophy was the relation between particular objects and abstract entities: abstract ideals for Plato; abstract categories for Aristotle (Rachlin 1994; Stout 1996). The problem of self-control in cases of complex ambivalence is a conflict between particular acts such as eating a caloric dessert, taking an alcoholic drink, or getting high on drugs, and abstract patterns of acts strung out in time such as living a healthy life, functioning in a family, or getting along with friends.

            Neither self-control nor altruism is a class of particular movements, operants, or acts.  Moreover, while self-control and altruism are both relative terms, depending on alternatives rejected as well as alternatives chosen, neither term refers to a particular choice independent of its context.  For example, an alcoholic’s particular choice of ginger ale over scotch and soda cannot be self-controlled unless it is embedded in a context of similar choices; if a person chooses scotch and soda 99 times to each choice of ginger ale, the choice of ginger ale is in no way self-controlled.  The person might have been extremely thirsty at the moment when ginger ale was chosen, or might have been trying to hide his alcoholism at that moment, or might have made a mistake in his choice.  The alcoholic’s verbal claim that he intended to control his drinking at that moment would be taken as valid by the behaviorist only in the light of consistent future choices of ginger ales over scotch and sodas. And this criterion would hold regardless of the state of his nervous system, regardless of the activity or lack of activity of any internal mechanism.  For the behaviorist, self-control as such has to lie wholly in choice behavior – but need not lie in any particular act of choice.

            Similarly, no particular act is altruistic in itself – even a woman’s running into a burning building and saving a child.  If the woman were normally selfish we would look for other explanations (perhaps she was just trying to save her jewelry and only incidentally picked up the child).  A truly altruistic act is always part of a pattern of acts (highly valued by both the actor and the community) particular components of which are dispreferred by the actor to their immediate alternatives.  Altruistic patterns of acts are thus subsets of self-controlled patterns.  The particular components of an altruistic pattern, like those of a self-controlled pattern, are less valuable to the actor than are their immediate alternatives; however, in the case of altruistic acts, they are also more valuable to the community than are their immediate alternatives.

            Self-control may be defined more formally as follows: If two alternative activities are available, a relatively brief activity lasting t units of time, and a longer activity lasting T units of time, where T = nt and n is a positive number greater than one, a self-control problem occurs when two conditions are satisfied:

 

            1. The whole longer activity is preferred to n repetitions of the brief activity, and

            2. The brief activity is preferred to a t-length fraction of the longer activity.

 

By “brief activity” and “long activity” I mean classes of activities perhaps not identical in topography but classified functionally, as Skinner (1938) defined operant class. For example, eating a steak dinner at a restaurant and drinking a malted at a lunch counter might be counted as repetitions of the same brief activity – eating high-calorie food.  The long activity would be going through a period of time (a day, a month, a year) without eating high-calorie foods. The choice of the longer activity over a series of choices of the shorter activity is self-control.[6]
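            The two conditions can be stated compactly. The notation below is an assumption introduced here for illustration only: V stands for value as revealed by a choice test, b for the brief activity, b^(n) for n repetitions of the brief activity, L for the whole longer activity, and L_t for a t-length fraction of the longer activity.

```latex
\begin{align*}
&T = nt, \qquad n > 1 \\
&\text{Condition 1:} \quad V(L) > V\bigl(b^{(n)}\bigr) \\
&\text{Condition 2:} \quad V(b) > V(L_t)
\end{align*}
```

            Under this notation, the two conditions cannot both hold if value is simply additive over time, that is, if V(L) = nV(L_t) and V(b^(n)) = nV(b). A self-control problem in this sense therefore requires that at least one whole pattern differ in value from the sum of its parts.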

            According to this definition, the “self” underlying self-control is not an internal entity, spiritual or mechanistic, containing a person’s mental life (including a more or less powerful “will”). Such an entity would imply what Parfit (1971) calls “personal continuity,” a concept he believes we would be better off abandoning. Rather, the self is conceived as existing contingently in a series of overlapping temporal intervals during which behavior occurs in patterns (what Parfit calls, “contingent personal interactions”). People’s “selves” would thus evolve and change over their lifetimes, as these patterns evolved and changed, as a function of social and non-social reinforcement.

            Social cooperation situations may now be seen as a subcategory of self-control situations. A social cooperation situation exists when, in addition to Conditions 1 and 2:

 

3. A group benefits more when an individual member chooses a t-length fraction of the longer activity than it does when the individual chooses the brief activity.

 

            An altruistic act is defined as a choice of the t-length fraction of the longer activity over the brief activity under Conditions 1, 2, and 3. The size of the group may range from only two people to the population of the world. The cost of the altruistic act may be a true cost, as when one anonymously donates money to charity, or an opportunity cost – the loss of the preferred brief alternative. Note that in this definition a particular altruistic act need not be reinforced, either presently or in the future. Reinforcement of altruism is obtained only when such acts are grouped in patterns that are, as a whole, intrinsically valuable.  Thus, the woman’s act of running into a burning building to save someone else’s child is reinforced only insofar as it is part of a highly valued pattern. It may not itself ever be reinforced and may be punished by injury or death. If the woman died in the attempt, the act may still have been worth doing since not doing it would have broken a highly valued pattern.[7]
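            Conditions 1, 2, and 3 can be read as a simple classification rule. The following Python sketch is only an illustration of that reading: the class, the function names, and the numerical values are hypothetical, and the values stand in for whatever a choice test would reveal.

```python
from dataclasses import dataclass


@dataclass
class ChoiceValues:
    """Hypothetical values, as might be estimated from choice tests."""
    whole_long: float           # value to the actor of the whole longer activity (duration T)
    n_brief: float              # value to the actor of n repetitions of the brief activity
    brief: float                # value to the actor of one brief activity (duration t)
    long_fraction: float        # value to the actor of a t-length fraction of the longer activity
    group_long_fraction: float  # benefit to the group when the actor chooses the fraction
    group_brief: float          # benefit to the group when the actor chooses the brief activity


def is_self_control_problem(v):
    """Conditions 1 and 2."""
    cond1 = v.whole_long > v.n_brief
    cond2 = v.brief > v.long_fraction
    return cond1 and cond2


def is_social_cooperation_situation(v):
    """Conditions 1, 2, and 3."""
    cond3 = v.group_long_fraction > v.group_brief
    return is_self_control_problem(v) and cond3


def is_altruistic_act(v, chose_long_fraction):
    """A choice of the t-length fraction under Conditions 1, 2, and 3."""
    return chose_long_fraction and is_social_cooperation_situation(v)


# Illustrative numbers only; they are not taken from any experiment.
example = ChoiceValues(whole_long=10, n_brief=6, brief=2, long_fraction=1,
                       group_long_fraction=3, group_brief=0)
print(is_self_control_problem(example))          # True
print(is_social_cooperation_situation(example))  # True
print(is_altruistic_act(example, True))          # True
```

            Note that the sketch keeps the value of the whole pattern separate from the values of its parts, since the definition does not assume that the one is the sum of the others.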

            This way of thinking about altruism and self-control may seem strange but it is not at all unusual. It is what Plato meant when he held Socrates’ life (and death) to be both good (ethical) and happy. It is what many thinkers about ethics, before and since, have been saying. In 20th century psychology, the gestalt psychologists emphasized that the whole could be greater than the sum of its parts. They intended this maxim to apply to motivation or value as much as to perception (Lewin, 1936). Consider listening to a symphony (assuming you enjoy this activity) on a CD that you just bought. Your enjoyment apparently begins when the music begins and ends when the music ends. Now suppose, after listening to the first 57 minutes of the symphony, you discover that the final three minutes of the 60-minute piece are missing from the CD. Is your enjoyment of the music just reduced by 3/60 of what it would have been if the whole symphony were played? Or is the breaking of the pattern so costly that the missing three minutes ruins the whole experience? In my own case the latter would be true. Readers who do not agree may imagine some other temporally extended activity that would be ruined for them by interruption late in the sequence.

            The meaning of a single instrumental act can be found only in a context of other acts. Conditions 1, 2, and 3 place the act in such a  context. For the cognitive psychologist, on the other hand, the meaning of a single act is to be found in the mechanism that immediately and efficiently caused the act. Thus, for the cognitive psychologist, a single act may be altruistic or not independent of other acts. Obviously, both cognitive and behavioral investigations need to be pursued. I am not saying that one is any more valuable or important than the other. But I do believe that it makes more sense to say that the behaviorist studies altruism itself while the cognitive psychologist studies the mechanisms behind it than it does to say that the cognitive psychologist studies altruism itself while the behaviorist studies only its behavioral effects.

            It seems clear that a person may be self-controlled without being altruistic. That is, Conditions 1 and 2 may obtain while Condition 3 does not. Although, given our strong social dependencies, there is usually some social benefit when a person stops drinking or smoking or overeating or gambling, such benefits are arguably incidental. The opposite question, whether a person may be altruistic without being self-controlled, however, is the one that concerns us here. This question is important because its answer determines whether people need a special mechanism for altruism, aside from whatever mechanism mediates self-control. Most demonstrations of altruistic behavior without egoistic incentives have focused on particular acts (Caporael et al., 1989). But it is not possible to determine that a separate altruism mechanism exists by the absence of reinforcement (immediate or delayed) of particular altruistic acts. The question is rather: Are there altruistic acts under Conditions 2 and 3 above where Condition 1 does not obtain? This is a difficult question to answer because Condition 1 does not specify the appropriate context (the longer activity, T) for a particular act. Is there any context (any relatively long-duration activity, T) in which a given altruistic act would also be a self-controlled act? I believe that it will always be possible to find such a context. This makes altruism a relative concept; in some contexts a given act will be altruistic and in some contexts, not. Where it is altruistic it will also be self-controlled (although the reverse may not be true).

            The relativity of the concept of altruism should not be disturbing. First, it does not imply a moral relativism. Many Nazi soldiers behaved altruistically in the context of their military units but immorally in a larger context. Morality does not depend on altruism any more strictly than it depends on self-control. A moral code may approve of some kinds of altruism but disapprove of others just as it may approve of some kinds of self-control and disapprove of others.

            Secondly, whether an act is self-controlled or impulsive is no less contextually dependent than whether it is altruistic or selfish. Even a hungry rat rewarded by food for pressing a lever is to an extent controlling itself.  The pattern of pressing the lever and eating takes longer (necessarily) than the act of pressing the lever alone.  Pressing the lever, considered alone, is dispreferred to just sniffing in the corner of the cage; hence pressing the lever for food to be delivered within a fraction of a second is an instance of self-control. Correspondingly, even a slug may be said to exhibit self-control – on a microscopic level.  At the other extreme, strict sobriety may be narrow relative to a still more complex pattern of social drinking.

            There is a sense in which all acts (of choice) are selfish; the same sense in which all instrumental acts are reinforced and, for the economist, all behavior maximizes utility.  These are assumptions of theory, or rather methods of procedure, not empirical findings.  But this does not mean that selfishness is a meaningless concept (any more than reinforcement or utility maximization is).  The sense in which an altruistic act is selfish (as part of an ultimately selfish pattern) differs from that in which a non-altruistic act is selfish.  And this distinction is an empirical one.

            Behavioral psychology has not been able to trace every particular act to a particular reinforcer – immediate or in the future. Organized patterns of acts occur despite the existence within them of unreinforced particular acts. What then reinforces the patterns? In psychology, theories of reinforcement based on “pleasure” or “need” or “drive” have not been able to explain particular acts. Such theories have proved to be circular – “pleasures,” “needs,” and “drives” proliferated about as fast as the behaviors they were supposed to explain. It is often not possible to use these concepts to predict behavior in one choice situation from behavior in another. But Premack’s (1965) wholly behavioral theory and the economic theories based on it (Rachlin et al., 1981) are predictive and non-circular. These theories use the choices under one set of behavioral contingencies or constraints to estimate the values of the alternatives (or the parameters of a utility function) and then use those values or parameters to predict choice under other sets of contingencies or constraints.

            This method serves to explain choices among patterns of acts as well as particular acts. And, it answers the social-cooperation question, “Why is friendship rewarding?” as well as the self-control question, “Why is sobriety rewarding?” The answer in both cases, for the behavioral psychologist, is that in a choice test between each of these patterns as a whole and their respective alternative patterns as a whole, friendship would (at least in some cases) be chosen over loneliness and sobriety would (at least in some cases) be chosen over drunkenness.[8]

 

5. Commitment

            No amount of calculation by the mother who runs into a burning building to save someone else’s child will bring the benefits-minus-risks of this activity considered by itself into positive territory. But over a series of actions, a series of opportunities to sacrifice her own benefit for the benefit of others, the weightings may change. As we have seen (Figure 1), social and individual decisions may individually be completely negative, their only value appearing when they are grouped.[9]  The problem is that life ordinarily faces us not with groups of decisions but with particular decisions that must be made. It is up to us to group decisions together, and we do this by means of various commitment devices – contracts, agreements, buying tickets to a series of concerts or plays, joining a health club, and so forth.

            These commitments may work by instituting some punishment (such as loss of money or social support) should we fail to carry them through.  Green & Rachlin (1996) have shown that pigeons prefer, A: a future choice between 1) a small, immediate reward followed by punishment and 2) a larger, delayed reward to, B: the same future pair of alternatives but without the punishment. Only by present choice of the future pair of alternatives involving punishment will they avoid being tempted later by the smaller immediate reward and will they obtain the larger reward that they prefer at the present time. Another kind of commitment shown by pigeons (Siegel & Rachlin, 1996) is “soft commitment.” At an earlier time the pigeon begins a pattern of behavior, such as rapidly pecking a fixed number of times on a lit button. This pattern is difficult for the pigeon to interrupt. Then, in the midst of this pattern, the tempting alternative (the smaller, immediate reward) is presented. Only by continuing and completing the previously begun pattern of behavior will the larger reward be obtained.  By beginning and continuing the pattern the pigeon avoids the temptation and obtains the larger reward. The further along the pigeon is into the pattern, the more likely it is that the tempting small reward will be avoided.

            In a primrose-path experiment (italicized labels of Figure 1) in my laboratory (Kudadjie-Gyamfi & Rachlin, 1996) human subjects chose the self-control option (Y) more when choices were clustered in threes (patterned) than when they were evenly spaced. Within a group of three choices, the probability of self-control on the first choice was high but, given self-control on the first choice, the conditional probability of self-control on the second choice was higher and, given self-control on the first two choices, the probability of self-control on the third choice was higher still. Similarly, in a repeated prisoner’s dilemma situation, playing against tit-for-tat (a strategy that mimicked, on a given trial, the subject’s choice to cooperate or defect on the previous trial), human subjects cooperated more when trials were clustered in fours than when they were evenly spaced out; moreover, as in the self-control experiment, conditional probability of cooperation increased as the sequence progressed (Brown, 2000).

            Soft commitment with pigeons is a model, on a narrow temporal scale, for successful self-control by humans, on a much wider temporal scale (Rachlin, 2000). The alcoholic, for example, resolves to stop drinking, and refuses one drink. At that point he is vulnerable to the offer of another drink. But if he refuses 10 drinks he is less vulnerable and if he refuses 100 drinks he is still less vulnerable. He refuses the later drinks not because their value is reduced (their value is actually enhanced as deprivation increases) but because he has already begun a pattern of refusal that involves some cost to break.  As he repeatedly refuses drinks (climbs up line DA in Figure 1) the long-term rewards that sobriety entails – better health, social support, better job performance – grow apace.

            In experiments on repeated prisoner’s dilemmas some subjects cooperate and continue to cooperate regardless of whether other subjects cooperate with them (Brann & Foddy 1988). These people may be said to cooperate out of a sense of moral duty or for ethical reasons or because they are more altruistic than others. But these sorts of explanations do not say why such people behave as they do. To understand their behavior, the laboratory experiment has to be seen not as an isolated situation but in the context of everyday life. Many experimental subjects are willing and able to separate decisions made in a psychology experiment from those they make in everyday life. But others are not able or not willing to do so. They have decided to cooperate in life and continue to do so in the experiment, not necessarily because of some innate tendency to be altruistic, but because altruism is generally valuable and they would not act altruistically if they made decisions on a case-by-case basis. The experiment is merely one case, one situation out of many in their lives. Moral duty, ethical concerns, and altruism are apt descriptions of their behavior. But these qualities do not come from nowhere. They are highly valued patterns of behavior – just as moderation in eating, moderation in drinking, and moderation in sexual activity are highly valued patterns.

 

6. Reinforcement and Punishment in the Prisoner’s Dilemma

            Current discussions of altruism and selfishness in philosophy, biology, economics, and psychology are generally united by reference to strategies of play in prisoner’s dilemma situations. The present analysis does not deny the interest or importance of strategies. Rather, as patterns of behavior, it sees them as crucial. The difference between the present behavioral analysis and cognitive analyses is that, in determining what underlies a strategy, the behaviorist looks for contingencies of reinforcement and punishment rather than internal mechanisms. Thus, it is important to show that the prisoner’s dilemma incorporates reinforcement and punishment contingencies and that prisoner’s-dilemma behavior is sensitive to those contingencies.

            Consider the contingencies of the 2-person prisoner’s dilemma diagramed in Figure 2a.  If both players cooperate, each gets 5 points (convertible to money at the experiment’s end); if both defect, each gets 2 points; if one cooperates while the other defects, the cooperator gets 1 point while the defector gets 6 points.  Figure 2b diagrams the game in a corresponding way to Figure 1, revealing the ambivalence.  As in Figure 1, defection results in a higher immediate reward and a lower long-run reward while cooperation results in the reverse.  Regardless of the other player’s choice, it is always immediately better to defect than to cooperate; if the other player has cooperated then a player will gain 6 points by defecting and 5 points by cooperating; if the other player has defected then a player will gain 2 points by defecting and only 1 point by cooperating.  If communication between players is against the rules, if the game could be played only once (and no similar cooperative tasks were ever expected to be undertaken with the other player), then the motive to defect should predominate.  However, if there were some way to get the other player to cooperate, then whatever it takes to do this should predominate over defection because the gain from the right to the left vertical line in Figure 2b averages 4 points while the gain from the lower to the upper line (from cooperation to defection) averages 1 point.  The best set of circumstances would be to defect while the other player cooperates, earning 6 points.  This is an unlikely scenario since the other player would then earn only 1 point.  However, if communication were within the rules, it would be possible to compromise by agreeing to mutual cooperation, earning 5 points each (the highest pooled score).  Or, if the game were to be played many times, it would be possible to reinforce the other player’s cooperation by cooperating, and to punish the other player’s defection by defecting.  This strategy is called “tit-for-tat.”  The dashed line shows average points gained in repeated trials against tit-for-tat with a distribution of choices proportional to the distance between the vertical lines.  For example, alternation of cooperation and defection (halfway between the vertical lines) yields 6 points and 1 point alternately for an average of 3.5 points per trial against tit-for-tat.  The highest point on the dashed line (hence the best strategy against tit-for-tat) is to cooperate on all trials.  Tit-for-tat has indeed been highly effective in generating cooperation and maximizing pooled scores in several situations: computer simulations of prisoner’s dilemma games (Axelrod 1997); 2-person games with human subjects (Rapoport & Chammah 1965; Silverstein et al. 1998; Brown & Rachlin 1999); with a single subject playing against a computer programmed to play tit-for-tat (Komorita & Parks 1994).

Figure 2. (a) Payoff matrix of 2-person prisoner’s dilemma game. (b) Same game. Player A’s earnings for cooperation (lower black dot) and defection (upper black dot) as a function of Player B’s choice.
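            The arithmetic behind Figure 2 can be checked directly from the point values given in the text. The Python sketch below is only an illustration; the function name is hypothetical, and the treatment of a choice sequence as a repeating cycle (so that the first trial follows the last) is a simplifying assumption made here to obtain steady-state averages.

```python
# Payoffs to a player, indexed by (own choice, other's choice); values from the text.
PAYOFF = {("C", "C"): 5, ("C", "D"): 1, ("D", "C"): 6, ("D", "D"): 2}


def average_against_tit_for_tat(choices):
    """Mean points per trial for a cyclically repeated choice sequence
    played against tit-for-tat, which copies the player's previous choice."""
    total = 0
    for i, own in enumerate(choices):
        other = choices[i - 1]  # for i == 0 this wraps around to the last choice
        total += PAYOFF[(own, other)]
    return total / len(choices)


# Defection is immediately better by 1 point whatever the other player does.
gain_from_defecting = [PAYOFF[("D", o)] - PAYOFF[("C", o)] for o in "CD"]
# The other player's cooperation is worth 4 points whatever one's own choice.
gain_from_others_cooperation = [PAYOFF[(s, "C")] - PAYOFF[(s, "D")] for s in "CD"]

print(gain_from_defecting)                # [1, 1]
print(gain_from_others_cooperation)       # [4, 4]
print(average_against_tit_for_tat("CD"))  # 3.5 (strict alternation)
print(average_against_tit_for_tat("C"))   # 5.0 (always cooperate)
print(average_against_tit_for_tat("D"))   # 2.0 (always defect)
```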


            The crucial variable influencing cooperation in 2-person games seems to be reciprocation (Komorita & Parks 1994; Silverstein et al. 1998).  This is also true in games with more than 2 players such as the one illustrated in Figure 1 (Komorita et al. 1993).  The tit-for-tat strategy imposes a strict reciprocation and thus engenders cooperation.  Prior communication enhances reciprocation and thus has the same effect.  On the other hand, when reciprocation is low or nonexistent, as when the other player plays randomly or always cooperates or always defects, cooperation deteriorates (Silverstein et al. 1998).  Baker and Rachlin (2001) found that a player’s probability of cooperation in a 2-person prisoner’s dilemma game varied directly with the other player’s probability of reciprocation.

 

7. Context

            As Tversky and Kahneman (1981) showed, context, or “framing,” strongly influences probabilistic choice behavior. Context is likewise a strong determinant of self-control. Heyman (1996) cites a study by Robins (1974) of American soldiers who became addicted to heroin in Vietnam. The majority of these addicts easily gave up their addiction when they came home to a different environment. Heyman argues that the boundary line separating local from non-local events (the duration of the chosen activity) may vary over a wide range (depending on the salience and relevance of environmental stimuli), thereby explaining how humans and nonhumans may act impulsively in one situation and with self-control in another. A second experiment by Baker and Rachlin (in press) demonstrates a similarly strong influence of context in a social cooperation experiment with human subjects.

            Tit-for-tat is a teaching strategy.  A computer, playing tit-for-tat against a player, invariably follows the player’s cooperation by cooperating on the next trial and invariably follows the player’s defection by defecting on the next trial.  Since the computer’s cooperation is much more valuable to the player than its defection, the computer’s cooperation reinforces the player’s cooperation and its defection punishes the player’s defection.  Thus the computer “teaches” the player to cooperate.

            Another strategy that has been successful in computer tournaments (dominating tit-for-tat) is called Pavlov (Fudenberg & Maskin, 1990; Nowak & Sigmund, 1993). Pavlov is a learning strategy. Using Pavlov, the computer’s choice on the present trial, whether cooperation or defection, is repeated on the next trial if the player cooperates and changed on the next trial if the player defects. Against tit-for-tat, the player cannot successfully punish the computer’s defection; the computer would respond to defection by defecting itself. Using Pavlov, however, the computer would respond to defection by changing its choice on the next trial: if it had defected, it would now cooperate; if it had cooperated, it would now defect. The computer using Pavlov would respond to cooperation by repeating its choice on the next trial; if it had defected, it would defect again; if it had cooperated, it would cooperate again.  That is, the computer would behave as if its choice were reinforced by the player’s cooperation and punished by the player’s defection.  Thus the computer, playing Pavlov, “learns” from the player.
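            The two strategies as described above can be written as simple rules. The Python sketch below follows only the verbal descriptions in the text; the function names and the computer’s opening choice are assumptions, and the modified forms of the strategies used in the experiment are not reproduced.

```python
def tit_for_tat(computer_prev, player_prev):
    """Copy the player's previous choice."""
    return player_prev


def pavlov(computer_prev, player_prev):
    """Repeat own previous choice if the player cooperated; switch if the player defected."""
    if player_prev == "C":
        return computer_prev
    return "D" if computer_prev == "C" else "C"


def play(strategy, player_choices, computer_first="C"):
    """Return the computer's choice on each trial, given the player's choices.

    The opening choice (computer_first) is an assumption; the text does not specify it.
    """
    computer = [computer_first]
    for player_prev in player_choices[:-1]:
        computer.append(strategy(computer[-1], player_prev))
    return computer


player = list("CDDCC")
print(play(tit_for_tat, player))  # ['C', 'C', 'D', 'D', 'C']
print(play(pavlov, player))       # ['C', 'C', 'D', 'C', 'C']
```

            On the fourth trial the two rules diverge: after two successive defections by the player, tit-for-tat is still defecting while Pavlov has switched back to cooperation.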


Figure 3. Results of Baker and Rachlin’s (in press a) experiment. Average of last 15 of 100 trials.


            In this experiment, four groups of subjects (Stony Brook undergraduates) played 100 trials of a prisoner’s dilemma game. Against each subject in two groups, the computer played a modified form of tit-for-tat. Against each subject in the other two groups, the computer played a modified form of Pavlov.[10]  One of the tit-for-tat groups and one of the Pavlov groups saw a spinner on the computer screen and were correctly informed that the computer’s responses were determined by that spinner. The other two groups believed that they were playing the game against another player rather than against a computer. They did not see a spinner but they did see the “other player’s” reward matrix (and reward presumably received) as well as their own.[11]

 

            The results of the experiment are shown in Figure 3. The context of the game – whether or not the subjects were led to believe that they were playing against another subject – had a strong effect on their behavior, but the context effect was opposite for the two computer strategies. When subjects believed that they were playing against a computer, they cooperated more against tit-for-tat (where the computer reinforced and punished the players’ cooperation and defection) than they did against Pavlov (where the computer’s choices were reinforced and punished by the players’ cooperation and defection). On the other hand, when subjects believed that they were playing against a human being, they cooperated more against Pavlov than against tit-for-tat. This result may be attributed to the fact that subjects’ histories of interacting with machines (unlikely to be responsive to reinforcement and punishment) differed from their histories of interacting with other people (more likely to be responsive). When the relatively global histories matched the relatively local set of contingencies (the computer’s strategies) subjects cooperated; when the global histories contradicted local contingencies they defected. (In all cases, however, under the most narrowly local contingencies, defection was immediately reinforced.)  Choice in prisoner’s dilemma situations, therefore, like choice in self-control situations, may be understood in terms of global as well as local reinforcement.

            Taken together with the previously discussed experiments of Kudadjie-Gyamfi & Rachlin (1996) and Brown (2000), in which patterning choices over time increased human subjects’ self-control and prisoner’s-dilemma cooperation, the experiment described above shows that, at least in laboratory studies, self-control and social cooperation are similarly responsive to reinforcement contingencies and similarly sensitive to context.

            Laboratory models, however, are necessarily diminished representations of everyday-life processes. The reinforcers in all of these experiments – points convertible to money – were extrinsic to the subjects’ choices. If, as argued here, the reinforcers of real-life self-controlled and altruistic behavior are intrinsic in the patterns of those behaviors and if those patterns are extended over long durations – months and years – real-life rewards will never be duplicated in a 30-minute laboratory experiment.

            The experiment described above partially gets around this limitation by varying verbal instructions so as to bring the brief laboratory experiment into differing long-term, real-life contexts. Moreover, an economic extension of Premack’s conception of reinforcement (Rachlin et al., 1981) sees all reinforcement as intrinsic (even that of a rat’s lever press reinforced by food; the rat is seen as choosing the pattern of lever pressing plus eating over not lever pressing plus not eating).

            Nevertheless, there remains a vast difference in scale between laboratory experiments and real life. The point of the experiments is to show that, on a small scale, self-control and altruism are sensitive to reinforcement and punishment. In the case of self-control there is ample evidence that large-scale, real-life behavior is similarly sensitive (Bickel & Vuchinich, 2000). If, as is argued here, there is no essential difference between self-control and altruism, the same behavioral laboratory studies that have proved useful in developing real-life self-control techniques may be equally useful in developing real-life altruistic behavior.

 

8.  Can Altruism Be Explained Without Reinforcement?

            Does this way of thinking put more weight on reinforcement than it can bear?  Can the job be done entirely by internal mechanisms with reinforcement playing no part whatsoever?  The issue is this: There are some particular acts, especially by humans, that we normally classify as done through a sense of altruism, of duty, of principle.  No biologist claims that a separate inherited mechanism exists for each of the infinitude of possible acts that fall within these categories.  To explain such actions as inherited, the biologist must hypothesize the existence of a general mechanism for altruism which is somehow aroused by situations such as the game I play with my audiences illustrated in Figure 1 (bold labels).  It seems to me that the postulation of such a mechanism as inherited – like blue or brown eyes – puts far too heavy a load on inheritance; we have no idea how such a mechanism could work.

            On the other hand, it is generally agreed that self-control may be taught at some level even to nonhumans.  The crucial issue then is whether or not altruism is a subcategory of self-control. If it is, there is no need to postulate an innate altruistic mechanism; the job can be done by whatever mechanism we use to learn self-control – an innate mechanism to be sure, but an innate learning mechanism.

            This is hardly an original idea.  Plato and Aristotle both claimed that self-control and altruism were related concepts.  The experiments described in this article illustrate the correspondence.  However, perhaps the argument is ultimately not empirical.  It rests on two assumptions: (1) habitual altruism is a happier mode of existence than habitual selfishness, and (2) particular altruistic acts (together with their consequences) are less pleasurable (even for saints) than particular selfish acts (together with their consequences).  If you accept both of these propositions, altruism must be seen as a kind of self-control.

 

9.  Can Altruism Be Explained Wholly in Terms of Extrinsic Reinforcement?

            How are patterns of behavior learned and how are they maintained?  Consider the following set of cases.  Four soldiers are ordered to advance on the enemy.  The first and second advance; the third and fourth do not.  Of the two who advance, the first is just obeying orders; he advances because he fears the consequences of disobedience more than he fears the enemy.  The second is not just obeying orders; he advances because he believes it is his patriotic duty to advance.  Of the two who do not advance, the third soldier remains in his foxhole out of fear of the enemy; he weighs the aversive consequences of disobeying orders less than the aversive consequences of advancing.  The fourth soldier does not advance because he believes that the orders are immoral.

            No one, neither the biologist, the cognitivist, the Skinnerian behaviorist, nor the teleological behaviorist, denies that there are important differences between the two soldiers who advance and between the two soldiers who do not advance.  But the biologist and cognitivist alike see all the differences in the thought, feeling, and moral sentiment of the soldiers as contemporary with their current behavior.  Behaviorists do not disagree that internal differences exist, but their focus is rather on non-contemporary events; the Skinnerian behaviorist is concerned to discover crucial differences in the soldiers’ extrinsic reinforcement histories.  The teleological behaviorist is concerned to discover the patterns of behavior of which each soldier’s present act forms a part (intrinsic reinforcement). Note, however, that even the concept of extrinsic reinforcement must rely at some point on intrinsic reinforcement.  According to Premack’s theory, for example, eating reinforces lever pressing because eating is (intrinsically) of high value and lever pressing is (intrinsically) of lower value.  I am claiming here that an abstract pattern of behavior may be (intrinsically) of high value while the sum of the values of its particular components is (intrinsically) of lower value.  Value, in either case, would be determined by a choice test.

            Let us first consider extrinsic reinforcement.  By careful selection, with humans, it is possible to reinforce members of a set of particular acts belonging to a wide or abstractly defined class of acts (a rule) so that particular acts that have never been reinforced, but that obey the rule, are performed. That is, humans are able to generalize across instances of complex rules and, with simple rules, nonhumans are also able to do so.  Behavior thus learned is said to be rule-governed.  Imitation (of certain people) and following orders (in certain circumstances) are two such kinds of rules. There is no space here to discuss the several techniques developed for generating rule-governed behavior with extrinsic reinforcement  (see Hayes, 1989, for a collection of articles on the subject), nor to discuss current disputes about whether language precedes complex rule-following or whether rule-following precedes language (Sidman, 1997).

            The behavior of the first soldier, who advances because he fears the consequences of disobeying orders more than he fears the enemy, and that of the third soldier, who fails to advance because he fears the enemy more than the consequences of disobeying orders, may be explained in terms of conflicting rules.  Regardless of the complexity of the relation between the consequences of the present act and those of past acts, it is the weighting of the extrinsic consequences of the present act (the magnitudes, probabilities, and delays of enemy fire versus those of punishment for disobedience) that determines the behavior of these two soldiers.

            Moreover, it may be possible to account for the initial learning of ethical rules and principles, such as those that govern the altruistic behavior of the second and fourth soldiers, in terms of extrinsic social reinforcement at home or school or church. But extrinsic reinforcement cannot account for the maintenance of altruistic behavior.  An altruistic act may never be reinforced.  The second and fourth soldiers (as well as the woman who runs into the burning building to save someone else’s child) are as capable of weighing the immediate consequences of their acts as are the first and third soldiers.  But those consequences are ignored by these two soldiers.  The second and fourth soldiers, both of whose behavior has been brought under the control of highly abstract principles (we are assuming), are surely capable of discriminating between the extrinsic consequences of their present acts and the extrinsic social approval or disapproval of their past behavior at home, school or church where the principles were learned.  A person capable of bringing his or her behavior into conformance with an abstract principle by means of extrinsic reinforcement, and of transferring the application of that rule across situations, could not fail to discriminate the present context (where social approval is dwarfed by the possibility of death) from situations where the rule-governance may have been initially learned.  Yet the altruistic act is performed anyway.

            Such acts must be maintained not by extrinsic reinforcement but by intrinsic reinforcement.  The patterns of those acts (patriotic, ethical, altruistic), perhaps supported during their formation by a scaffold of extrinsic reinforcement, must be highly valuable in themselves.  If they depended on extrinsic reinforcement for maintenance they would not be maintained.

            In Premack’s terms, valuable patterns would be chosen if offered as whole patterns in a free choice situation.  In cases such as the patriotic and ethical soldiers and the woman saving a child, imagine a giant concurrent-chain schedule with years-long terminal link alternatives: heroism versus timidity, reverence for life versus toleration of killing, kindness versus cruelty.  Because of their intrinsic value the chosen patterns are final causes of their component acts and may themselves be effects of still wider final causes: a coherent concept of self; living a happier life, living a better life.

            Most of us would indeed choose to be heroes rather than cowards, to revere life rather than to kill, to be kind rather than cruel.  We realize that the former alternatives of each pair are actually patterns of happy lives and the latter, of unhappy lives.  But these alternatives are rarely offered to us as wholes.  Rather, we are faced with a series of particular choices with outcomes of limited temporal extent.  The altruists among us, however, have chosen such more extended patterns as wholes; they are the patterns most of us would choose if we could choose them as wholes.  But to do this we would need to evaluate particular alternatives not by their particular consequences but rather by whether or not they fit into the larger patterns.  This of course is a problem of self-control.

 

10. Conclusions.

            Some particular altruistic acts are profitable some of the time.  Giving to charity is often observed and frequently rewarded by society.  But patterns of behavior may be maintained without extrinsic rewards. For example, on a relatively small scale, activities such as solving jigsaw or crossword puzzles are valuable in themselves. People, like me, who like to do crossword puzzles, find value in the whole act of doing the puzzle. When I sit down on a Sunday morning to do the puzzle I am not beginning a laborious act that will be rewarded only when it is completed. Yet, despite the lack of extrinsic and intrinsic reward for putting in that last particular letter, completing the puzzle is, for me, a necessary part of its value. Like listening to symphonies, the pattern is valuable only as a whole. Extrinsic rewards may initially put together the elements of these patterns but the patterns, once formed, are maintained by their intrinsic value. The cost of breaking the pattern is the loss of this value – even that of the parts already performed. On an infinitely larger scale, living a good life is such a pattern. This is why the woman runs into the burning building to save someone else’s child without stopping to calculate the cost of this particular act, why Socrates chose to die rather than violate the sentence of the Athenian court.

            It is not possible to tease apart the individual and social benefits of such acts. High degrees of altruism are infrequent, not because most people lack an internal altruism mechanism, not because they are selected by evolution to be egoists rather than altruists, but because of the highly abstract nature of the valuable patterns. The relation between particular acts of altruism and the intrinsic reward of the pattern is vague and indistinct.  Altruism for most of us (like sobriety for the alcoholic) is not profitable and would not be chosen considering only its case-by-case, extrinsic reinforcement.  Consequently, the way for most of us to profit from altruism (and the way for an alcoholic to profit from sobriety) is to pattern our behavior abstractly – to choose to be an altruistic (or a sober) person. But in order to pattern our behavior in this way (and reap the rewards for so doing) we must forgo making decisions on a case-by-case basis.  Once we abandon case-by-case decisions, there will come times in choosing between selfishness and altruism when we will be altruistic even at the risk of death.

 

 


REFERENCES

 

Ainslie, G. (1992) Picoeconomics, Cambridge University Press.

 

Axelrod, R. (1997) The complexity of cooperation: Agent based models of competition and collaboration, Princeton University Press.

 

Baker, F. & Rachlin, H. (2001) Probability of reciprocation in prisoner’s dilemma games.  Journal of Behavioral Decision Making 14: 51-67.

 

Baker, F. & Rachlin, H. (in press) Teaching and learning in a probabilistic prisoner’s dilemma.  Behavioural Processes.

 

Baum, W. (1994) Understanding behaviorism: Science, behavior, and culture, Harper Collins.

 

Bickel, W.K. & Vuchinich, R.E. (2000) Reframing health behavior change with behavioral economics, Lawrence Erlbaum Associates.

 

Brann, P. & Foddy, M. (1988) Trust and the consumption of a deteriorating common resource. Journal of Conflict Resolution 31: 615-630.

 

Brown, J. (2000) Delay discounting of multiple reinforcers following a single choice. Thesis, Psychology Department, State University of New York at Stony Brook.

 

Brown, J. & Rachlin, H. (1999) Self-control and social cooperation. Behavioural Processes 47: 65-72.

 

Caporael, L.R., Dawes, R.M., Orbell, J.M. & van de Kragt, A.J.C. (1989) Selfishness examined: Cooperation in the absence of egoistic incentives. Behavioral and Brain Sciences 12: 683-739.

 

Dawes, R. (1980) Social dilemmas. Annual Review of Psychology 31: 169-193.

 

Dawkins, R. (1976) The selfish gene, Oxford University Press.

 

Dennett, D.C. (1984) Elbow room: The varieties of free will worth wanting, MIT Press.

 

Edney, J.J. (1980) The commons problem: Alternative perspectives. American Psychologist 35: 131-150.

 

Fudenberg, D. & Maskin, E. (1990) Evolution and cooperation in noisy repeated games. New Developments in Economic Theory 80: 274-279.

 

Green, L. & Rachlin, H. (1996) Commitment using punishment. Journal of the Experimental Analysis of Behavior 65: 593-601.

 

 

Hayes, S.C., ed. (1989) Rule-governed behavior: Cognition, contingencies, and instructional control, Plenum Press.

 

Herrnstein, R.J. (1991)  Experiments on stable suboptimality in individual behavior.  American Economic Review 81: 360-364.

 

Herrnstein, R.J. & Prelec, D. (1992)  A theory of addiction.  In: Choice over time, eds. G. Loewenstein & J. Elster, Russell Sage Foundation.

 

Herrnstein, R.J., Prelec, D. & Vaughan, W. Jr. (1986) An intra-personal prisoners’ dilemma. Paper presented at the IX Symposium on Quantitative Analysis of Behavior: Behavioral Economics, Harvard University.

 

Heyman, G.M. (1996)  Resolving the contradictions of addiction.  Behavioral and Brain Sciences 19: 561-610.

 

Kahneman, D. & Tversky, A. (1979) Prospect theory: An analysis of decisions under risk. Econometrica 47: 263-291.

 

Komorita, S.S. & Parks, C.D. (1994) Social dilemmas, Brown & Benchmark.

 

Komorita, S.S., Chan, D. K-S. & Parks, C.D. (1993) The effects of reward structure and reciprocity in social dilemmas. Journal of Experimental Social Psychology 29: 252-267.

 

Kudadjie-Gyamfi, E. (1998) Patterns of behavior: Self-control choices among risky alternatives. Thesis. Psychology Department. State University of New York at Stony Brook.

 

Kudadjie-Gyamfi, E. & Rachlin, H. (1996)  Temporal patterning in choice among delayed outcomes.  Organizational Behavior and Human Decision Processes 65: 61-67.

 

Lewin, K. (1936) Principles of topological psychology, McGraw-Hill.

 

Mazur, J.E. (1987) An adjusting procedure for studying delayed reinforcement. In: Quantitative analysis of behavior, 5: The effects of delay and of intervening events on reinforcement value, eds. M.L. Commons, J.E. Mazur, J.A. Nevin & H. Rachlin, Lawrence Erlbaum Associates.

 

Messick, D.M. & McClelland, C.L. (1983) Social traps and temporal traps. Personality & Social Psychology Bulletin 9: 105-110.

 

Nowak, M. & Sigmund, K. (1993) A strategy of win-stay-lose-shift that outperforms tit-for-tat in the prisoner’s dilemma game. Nature 364: 56-58.

 

Parfit, D. (1971) Personal identity. Philosophical Review 80: 3-27.

 

Platt, J. (1973) Social traps. American Psychologist 28: 641-651.

 

Premack, D. (1965) Reinforcement theory. In: Nebraska symposium on motivation, ed. D. Levine, University of Nebraska Press.

 

Rachlin, H. (1994) Behavior and mind: The roots of modern psychology, Oxford University Press.

 

Rachlin, H. (1995a) Self-control: Beyond commitment. Behavioral and Brain Sciences 18: 109-159.

 

Rachlin, H. (1995b) The value of temporal patterns in behavior. Current Directions 4: 188-191.

 

Rachlin, H. (1997) Four teleological theories of addiction. Psychonomic Bulletin & Review 4: 462-473.

 

Rachlin, H. (2000) The science of self-control, Harvard University Press.

 

Rachlin, H., Battalio, R., Kagel, J. & Green, L. (1981) Maximization theory in behavioral psychology. Behavioral and Brain Sciences 4: 371-417.

 

Rapoport, A. & Chammah, A.M. (1965) Prisoner’s dilemma, University of Michigan Press.

 

Robins, L.N. (1974) The Vietnam drug user returns. Special Action Office Monograph, Series A, Number 2, United States Government Printing Office.

 

Schelling, T. (1971) The ecology of micromotives. Public Interest 25: 61-98.

 

Sidman, M.  (1997) Equivalence relations. Journal of the Experimental Analysis of Behavior 68: 258-266.

 

Siegel, E. & Rachlin, H. (1996)  Soft commitment: Self-control achieved by response persistence.  Journal of the Experimental Analysis of Behavior 64: 117-128.

 

Silverstein, A., Cross, D., Brown, J., & Rachlin, H. (1998) Prior experience and patterning in a prisoner’s dilemma game. Journal of Behavioral Decision Making 11: 123-138.

 

Skinner, B.F. (1938) The behavior of organisms: An experimental analysis, Appleton-Century-Crofts.

Sober, E. & Wilson, D.S. (1998) Unto others: The evolution and psychology of unselfish behavior, Harvard University Press.

 

Stout, R. (1996) Things that happen because they should, Oxford University Press.

 

Tooby, J. & Cosmides, L. (1996) Friendship and the banker’s paradox: Other pathways to the evolution of adaptations for altruism. In: Evolution of social behavior patterns in primates and man. Proceedings of The British Academy 88, Oxford University Press.

 

Tversky, A. & Kahneman, D. (1981) The framing of decisions and the rationality of choice. Science 211: 453-458.

 


ACKNOWLEDGMENTS

 

            The research reported in this article and the preparation of the article were supported by grants from the National Institute of Mental Health and the National Institute on Drug Abuse. Some sections of the article are rewritten versions of sections of the author’s book, The Science of Self-Control, published by Harvard University Press (Rachlin, 2000).

 


ENDNOTES

 

 



[1] These are very wide conceptions of selfishness.  Usually, by “selfishness,” we mean explicit rejection of a clearly altruistic alternative; so the word has a socially negative connotation.  However, in popular explanations of biology, “selfishness” has lost its negative sense.  It just stands for survival value (as in “selfish gene”).  Similarly I use the term here to stand for reinforcement value.

 

[2] What counts seems to be how the problem is presented – whether I emphasize the group or the individual benefit – rather than who the players are (Italian economists, Japanese psychologists, Stony Brook undergraduates, and so forth) or whether the amounts of money won are large and hypothetical or small and real.

 

[3]  Other versions of the primrose path manipulate delays rather than amounts (with inverse contingencies).  In some experiments subjects are given more or less explicit instructions about the contingencies in effect.  In others, the base number of trials determining N (rule #3) is varied.  In still others, trials are grouped in temporal patterns.  These manipulations have systematic effects on the proportion of X’s and Y’s chosen (over a typical session of about 100 trials), but none results in exclusive choice of X or Y, showing that the contingencies retain their essential ambivalence.

 

[4] The social prisoner’s dilemma, in which a single person’s interests conflict with the common interests of a group, is analogous to a single person’s intertemporal dilemma, in which the person’s interests over a narrow time range conflict with the common interests of that same person over a wide time range. Ainslie (1992) pointed out that the prisoner’s dilemma among groups of individuals corresponds to that within an individual at different times. The difference between Ainslie’s view of self-control and mine is my conception of common interests reinforcing behavioral patterns (analogous to group selection) versus Ainslie’s conception of internal bargaining among a person’s temporally distant interests. Underlying this is a difference in our conceptions of simple versus complex ambivalence. Ainslie believes that complex ambivalence – where abstract rewards such as good health reinforce behavioral patterns such as daily exercise – may be reduced to the sum of discounted values of particular rewards acting on each particular act of exercise. That is, Ainslie believes that complex ambivalence may be reduced to multiple cases of simple ambivalence. I believe that complex and simple ambivalence are essentially different. Where simple ambivalence opposes larger but more delayed rewards to smaller but less delayed rewards, complex ambivalence opposes larger but more abstract (and temporally extended) rewards to smaller, particular rewards.
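The reductive position described in this note can be sketched in a few lines. This is a minimal illustration under stated assumptions: it uses a hyperbolic discount function of the form V = A/(1 + kD), a form commonly associated with Mazur (1987), although the note itself does not commit to any functional form. The reward amounts, the delays, the value of k, and all function names are hypothetical.

```python
def discounted_value(amount, delay, k=1.0):
    """Assumed hyperbolic form: value falls as amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def prefers_larger_later(time_until_smaller):
    """Simple ambivalence: a smaller reward (5) versus a larger one (10)
    that arrives 5 time units after the smaller one would."""
    smaller = discounted_value(5.0, time_until_smaller)
    larger = discounted_value(10.0, time_until_smaller + 5.0)
    return larger > smaller


def summed_value(amounts_and_delays, k=1.0):
    """Reductive view: a pattern is worth the sum of its discounted
    particular rewards, each given as an (amount, delay) pair."""
    return sum(discounted_value(a, d, k) for a, d in amounts_and_delays)


print(prefers_larger_later(10.0))  # both rewards still distant
print(prefers_larger_later(0.0))   # smaller reward available now
```

With these assumed numbers the larger reward is preferred while both rewards are distant, and preference reverses once the smaller reward is immediately available. The sketch illustrates only the summation account that the note attributes to Ainslie; it does not capture the claim that the value of an abstract pattern differs from any such sum.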

 

[5] And many times since. Ainslie (1992), Platt (1973), and Schelling (1971) have recently stressed this correspondence.

 

[6] It is sometimes supposed that in a perfect world there would be no conflict between immediate desires and long-term values.  The image of a natural human being living a natural life has this sort of framework – a place where our immediate desires are in harmony with our long-term best interests.  But, as Plato pointed out (Philebos, 21c), life in such a world would be the life of a slug.  In such a world we would have no need to behave in conformance with more abstract environmental contingencies; therefore we would have no ability to do so.

 

[7] As previously noted, however, people often ignore valuable long-term patterns and focus on particular present costs and benefits. In economic terms, this implies that you need to be very careful in determining which previously incurred costs are really “sunk costs” and which are investments that if pursued (at a present additional cost) may still pay off.

 

[8] This is as far as the behavioral psychologist can go. For the evolutionary biologist, the answer to, “Why is this pattern valuable?”  is that it has contributed to survival in the past. I am not arguing that the behavioral psychologist’s answer is better than the evolutionary biologist’s answer but rather that a correspondence between self-control and social-cooperation is no less consistent with an evolutionary biological approach to behavior than it is with a teleological behavioral approach.

 

[9] As the Gestalt psychologists pointed out, we perceive patterns (like melodies) directly rather than as the sum of their parts. Similarly, the value of a pattern (like the enjoyment of listening to a melody) may be far greater than the sum of the values of its parts (the enjoyment of listening to particular notes).

 

[10] The game was modified to make the computer’s responses probabilistic rather than all-or-none. When a strategy would ordinarily dictate cooperation, the computer increased its probability of cooperation by .25 (and decreased its probability of defection by .25), within the limits 0 ≤ p ≤ 1. When a strategy would ordinarily dictate defection, the computer increased its probability of defection by .25 (and decreased its probability of cooperation by .25).
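The adjustment rule in this note might be sketched as follows. This is an illustration rather than the experimental software: it assumes that the adjustments accumulate from trial to trial and that the probability is simply held within the range from 0 to 1; the function names and the starting probability of .5 are hypothetical.

```python
import random

STEP = 0.25  # size of the adjustment stated in the note


def adjust(p_cooperate, strategy_says_cooperate):
    """Shift the probability of cooperation by STEP, kept within [0, 1]."""
    if strategy_says_cooperate:
        return min(1.0, p_cooperate + STEP)
    return max(0.0, p_cooperate - STEP)


def computer_move(p_cooperate, rng=random):
    """Sample a move: 'C' with probability p_cooperate, otherwise 'D'."""
    return "C" if rng.random() < p_cooperate else "D"


p = 0.5                  # hypothetical starting probability
p = adjust(p, True)      # strategy dictates cooperation
print(p, computer_move(p))
```

Because the probability of defection is one minus the probability of cooperation, a single number is enough to represent both of the changes the note describes.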

 

[11] There were two other groups whose results are not presented here. Those groups did not see a spinner on the computer screen but neither did they see another reward matrix and they were not led to believe that they were playing against another subject.

 

 

FIGURE LEGENDS

 

1.  Bold typeface: Contingencies of 10-person prisoner’s dilemma experiment. Italic typeface: Contingencies of self-control, “primrose path,” experiment (1 player, successive choices) to be described later. [In brackets]: Contingencies faced by alcoholic. In all three cases, particular choices of X [having a single drink] are always worth more than particular choices of Y [refusing a single drink] yet on the average it is better to choose Y [to drink at a low rate].
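The structure of the bold-typeface contingencies can be sketched numerically. The following is an illustration only: the payoff per Y-chooser, the size of the bonus for choosing X, and the function name are assumptions chosen for the example, not values read from the figure.

```python
# Illustrative 10-person prisoner's dilemma (all payoff numbers hypothetical).
PLAYERS = 10   # group size named in the legend
UNIT = 1       # assumed payoff per player who chooses Y
BONUS = 3      # assumed extra payoff for choosing X; must exceed UNIT


def payoff(choice, others_choosing_y):
    """Payoff to one player, given how many of the other players choose Y."""
    n = others_choosing_y + (1 if choice == "Y" else 0)  # total Y-choosers
    return n * UNIT + (BONUS if choice == "X" else 0)


# Whatever the others do, a particular choice of X pays more than Y...
for k in range(PLAYERS):
    assert payoff("X", k) > payoff("Y", k)

# ...yet everyone choosing Y does better than everyone choosing X.
all_y = payoff("Y", PLAYERS - 1)
all_x = payoff("X", 0)
print(all_y, all_x)
```

Under these assumed numbers each player earns 10 when all choose Y and 3 when all choose X, even though switching from Y to X always gains the individual 2 on that particular choice.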

 

2. (a) Payoff matrix of 2-person prisoner’s dilemma game. (b) Same game. Player A’s earnings for cooperation (lower black dot) and defection (upper black dot) as a function of Player B’s choice.

 

3. Results of Baker and Rachlin’s (in press) experiment. Average of last 15 of 100 trials.