Below is the unedited preprint (not a quotable final draft) of:
Rachlin, H. (1995). Self-control: Beyond commitment.
Behavioral and Brain Sciences 18 (1): 109-159.
The final published draft of the target article, commentaries and
Author's Response are currently available only in paper.
Self control has been defined historically as an intrapersonal conflict: between spirit and body; between reason and passion; between cognition and motivation; between higher and lower centers of the nervous system and, according to two contemporary economists (Shefrin & Thaler, 1992), between an internal "planner" and an internal "doer." Self control is said to be the dominance of the former member of each of these pairs over the latter; impulsiveness, the dominance of the latter over the former. The Oxford Encyclopedic English Dictionary defines self control as "the power of controlling one's external reactions, emotions, etc." The key word in the definition is external. If something external is being controlled, the controlling forces by implication must be internal. Self control is commonly understood as control from inside. The mechanism by which internal control is said to be achieved has varied in the various theories from spiritual factors, like the human will, which, in Descartes' model, moves the pineal gland in the brain so as to supplement or overcome externally controlled reflexes, to strictly neurological "higher centers" (the prefrontal cortex for instance) inhibiting or exciting "lower centers" (the limbic system for instance). The higher neural centers are said to be centers of "action" while the lower centers are centers of "passion;" they are passive in the face of perhaps chaotic environmental or bodily forces.
Without denying the validity or applicability of any of these non-behavioral models, the present article takes a strictly behavioral stance from which to view self control. The internal cognitive or physiological mechanisms that mediate between environmental forces and behavior are ignored. The article focuses instead on the environmental contingencies that characterize self controlled and impulsive behavior.
1. A Behavioral Stance
I call the particular behavioral stance from which this article views self control "teleological behaviorism" (Rachlin, 1992, 1994, in press a). Before proceeding to discuss self-control as such it is necessary to characterize teleological behaviorism and to distinguish it from other views of human and nonhuman choice.
Like physiological and cognitive (or "intentional") stances (Dennett, 1978) teleological behaviorism denies the prima facie validity of introspection as psychological observation. In common with physiologism and cognitivism, teleological behaviorism considers introspective reports as data, but not as privileged data. Thus a person's statement, "I am anxious", would be no more and no less a reflection of true anxiety than her facial expression, her blood pressure, or the tremor of her hand. More to the point, your own introspections are equally fallible. Given that a successful physiological, cognitive, or behavioral model of anxiety could be formulated, and given a contradiction between the state of such a model and your own introspection, your introspection would be wrong. Physiologists, cognitivists, and behaviorists alike claim that it is possible to be anxious and not know it or not anxious and not know it.
Some introspectively based models concede that introspections may be wrong -- but the error is said to be due to an impediment to deeper introspection (or true insight); introspection, for an introspectionist, may be faulty just to the extent that it does not go deep enough. Cognitivists, physiologists and behaviorists alike deny this; they say that depth of introspection is irrelevant to genuineness of mental state.
The physiologist may note that we have no organs for introspection. Our visual systems for instance are designed to observe external objects, but not themselves. We see the chair in the world, not the chair as represented (upside down) on our retinas or in our brains. The representation of a chair on our retinas or in our brains may be approached by physiological investigation, but not by better or deeper introspection. Anxiety, and even much more complex mental states, may be studied by physiological means, says the physiologist, but not by introspection.
Similarly for the cognitivist. Modern cognitivism generally relies on behavioral rather than physiological data but the nature of cognitive theory is very much like the nature of physiological theory. Information processing systems containing internal representations of objects and events are constructed on the basis of behavioral data and used to predict future behavior. The cognitive models (the information processing systems) are essentially ways of organizing (understanding and predicting) observed behavior, usually human verbal behavior, but often non-verbal behavior, of humans and nonhumans as well. Cognitive models may be presented in the form of diffuse networks or coherent units but in either case they are held to be abstract formulations of neural processes internal to the organism.
In physiological theories, the meaning of a mental term (like anxiety, or love, or thought) is a state of the central nervous system conceived directly in neurological terms; for cognitive theories the meaning of a mental term is a state of an information processing system, perhaps eventually to be instantiated in neurological terms. Between cognitive and physiological theories the location of mental events is not in dispute -- mental events occur inside an animal, particularly inside the brain. Mental events are said to occur in an animal in the same sense as a play is said to be performed in a theater.
Teleological behaviorism differs from physiologism and cognitivism by virtue of its "whole organism" view of the mind. Teleological behaviorism is a "personal level theory" (Dennett, 1978, p. 154, footnote) that looks for order in the relationship between organisms and environment rather than for mechanisms within the organism. A teleological organization of behavior relies not on common antecedents (efficient causes) but on common ends or outcomes (final causes); an intention or purpose is conceived not as an internal state but as an overt pattern of behavior of a whole person. Thus, for a teleological behaviorist, a person's mind would not be in his body as a play is in a theater but would be a mode of functioning of the whole body -- like good acceleration may be said to be in a car. From this viewpoint the meaning of a mental term, its relevant context, is not another entity spatially located in a person but a pattern of particular overt acts temporally extended, of the whole person. [For simplicity from here on I will occasionally drop the tag, "teleological", and just say, "behaviorism". But the reader should keep in mind that teleological behaviorism differs drastically from other forms. Skinner's (1953) radical behaviorism accepts inner causes (Zuriff, 1979) and rejects mental terms while teleological behaviorism does the reverse.]
Economic utility functions (as applied to individuals in microeconomic theory) are examples of how abstract conceptions of external goals may organize behavior (Rachlin, Battalio, Kagel & Green, 1981). From a series of observations of choices in controlled situations, an animal's utility function (among a set of activities) may be constructed. Then the utility function may be used to predict behavior in other situations. If predictions fail, the utility function is adjusted until it accounts for all behavior so far observed; then it is tested again in still other situations. Thus, in behavioral theory, utility functions may operate just like information processing systems do in cognitive theory. Successful outcome of a behavioral investigation would be better and better specification of a utility function just as successful outcome of a cognitive investigation would be better and better specification of an information processing system.
Success of this sort of behavioral model would not constitute evidence against any particular cognitive or physiological theory (or vice versa). Although I will argue in the present article against a particular quasi-cognitive model of self control (the internalization of commitment) I will attempt to do so on its own terms.
2. The Discount Reversal Effect
In order to obtain a behavioral focus the concept of self-control has been redefined (in behavioral literature) as control of behavior by delayed events as opposed to control of behavior by immediate events. For instance, in Mischel's delay-of-gratification paradigm (Mischel, Shoda & Rodriguez, 1989) a child, sitting alone in a room, is faced with a choice between a less-valued reward available immediately and a more-valued reward available only after waiting (for the experimenter to enter the room). Children who wait longer are said to have demonstrated more self control. Although Mischel's ultimate object is to uncover the internal cognitive mechanisms underlying a child's ability to wait, the defining property of self control or ability to delay gratification in these experiments is the purely behavioral operation described above.
Behaviorally oriented studies (Ainslie, 1974; Logue, 1988; Rachlin, 1974) have redefined self control similarly. Here self-control, whether by humans or nonhumans, has been defined as choice of a more valuable but more delayed reinforcer over a less valuable but less delayed reinforcer (a "temptation"). For example, the student who on a given evening chooses to study rather than to go to a party is said to be controlling herself because she has chosen a more valuable but later reward (good grades, a degree, a better job) over a sooner but less valuable reward (having fun at the party). Conversely, a person who chooses a less aversive but less delayed punisher over a more aversive but more delayed punisher would also be demonstrating self control.
Failure of self control is often accompanied by inconsistency between verbal behavior and overt choices. Remaining with the example of studying versus going to the party, students at least occasionally declare their intention to study on the morning in question but nevertheless go to the party that evening. Similarly, those of us who are alcoholics, smokers, drug addicts, overweight, lazy, and so forth, are notorious for inconsistencies between what we say we intend to do and what we do.
This inconsistency has been modeled in economic and behavioral theory by crossing discount functions (derivable from utility functions) as illustrated in Figure 1 (Ainslie, 1974; Rachlin & Green, 1972; Strotz, 1956).
-------------------
Figure 1 about here
-------------------
The figure illustrates two rewards, a smaller-sooner reward (SS) available at t2 and a larger-later reward (LL) available at t3. The discount functions (the thin lines subtended downward to the left in Figure 1) indicate how each of these rewards decreases in value with increasing delay (t2-t1 or t3-t1). The crossing of the functions indicates inconsistency in preference with passage of time. At t1 (the beginning of the school term) the discounted value of LL (good grades) is higher than the discounted value of SS (the party). The student is therefore honestly stating her true preferences when she says she intends to study later on. But at t2 (the evening of the party) the value of SS is higher than that of LL (and the student goes to the party).
Studies with nonhuman and human subjects with real and hypothetical rewards have found individual delay discount functions of the following hyperbolic form (Mazur, 1987; Rachlin, Raineri & Cross, 1991):
$$v = \frac{V}{1 + kd} \tag{1}$$
where V is undiscounted value of a reward (the height of the line SS or LL in Figure 1), v is current discounted value (the height of the discount function), d is the delay of the reward (time of the reward minus time now) and k is a parameter measuring degree of discounting. When k = 0 (no discounting) v = V at all delays; as k increases, discount functions decline more steeply.
As opposed to the exponential functions used by banks to calculate continuously compounded interest, hyperbolic discount functions of SS and LL rewards (with the same k values) may cross as illustrated in Figure 1. Relative to exponential functions, hyperbolic discount functions are steeper at short delays and shallower at long delays.
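The contrast between the two forms of discounting can be checked numerically. The following Python sketch is illustrative only: the reward magnitudes, delivery times, and the value of k are hypothetical numbers chosen to reproduce the qualitative pattern of Figure 1, and the function names are not taken from the cited studies.

```python
import math

def hyperbolic(V, d, k):
    """Equation 1: discounted value v = V / (1 + k*d)."""
    return V / (1.0 + k * d)

def exponential(V, d, k):
    """Exponential discounting, v = V * exp(-k*d), shown for comparison."""
    return V * math.exp(-k * d)

# Hypothetical rewards: SS is worth 2 units at t2, LL is worth 4 units at t3.
V_SS, V_LL = 2.0, 4.0
T2, T3 = 10.0, 14.0   # delivery times in arbitrary units, with t1 = 0
K = 1.0               # the same discount parameter for both rewards

def prefers_LL(discount, now, k=K):
    """True if the discounted value of LL exceeds that of SS at time `now`."""
    return discount(V_LL, T3 - now, k) > discount(V_SS, T2 - now, k)

# Hyperbolic discounting: preference at t1 differs from preference at t2.
hyper_early = prefers_LL(hyperbolic, now=0.0)
hyper_late = prefers_LL(hyperbolic, now=T2)

# Exponential discounting with a shared k: the ratio of the two discounted
# values does not change with time, so the preference cannot reverse.
expo_early = prefers_LL(exponential, now=0.0)
expo_late = prefers_LL(exponential, now=T2)

print("hyperbolic :", hyper_early, hyper_late)
print("exponential:", expo_early, expo_late)
```

Under these assumed numbers the hyperbolic chooser favors LL at t1 and SS at t2, while the exponential chooser gives the same answer at both times, because with a common k the exponential ratio of LL to SS stays fixed as time passes.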
Discount reversals like that diagrammed in Figure 1 are common in all of nature (Rachlin & Raineri, 1992). They appear in memory; with time my vivid memory of this morning's breakfast and my dim memory of my bar-mitzvah dinner will both become dimmer but the memory of today's breakfast will fade to oblivion while that of my bar-mitzvah dinner will dimly remain. They appear in perception; the relative size of the moon, seen through the branches of a nearby tree, will increase, and the moon will eventually appear larger than the tree as I walk a common distance away from both. They appear in economics; a new economy car may cost more today than an older luxury car but as they both age their relative values reverse. They even appear in applied physics; according to the inverse-square law, an intense radiant energy source at a distance decays slower with more distance than does a dim radiant energy source initially close up; if a meter currently measures the dim source's energy as higher than that of a distant but more intense source, the measured intensities of the two sources will approach each other and cross as the meter is drawn away from both. There is nothing irrational about any of these discounting phenomena. The obtained reversals are simply a consequence of the non-exponential form of the respective discount functions (Prelec, 1989).
3. Commitment
Given a set of stable discount functions, how is self-control possible? The standard answer is, through external commitment -- the self-imposition of behavioral constraints. The classic example is Homer's story of Odysseus tying himself to the mast of his ship and stopping-up the ears of his crew with wax in order to sail past the Sirens, hear their song, and yet avoid sailing his ship onto the rocks. In terms of Figure 1, at t1, while the value of SS is still low relative to LL, Odysseus acts so as to deny himself the choice of SS later at t2, thus obtaining LL.
An experiment by Rachlin & Green (1972) showed that commitment of the kind exhibited by Odysseus may also be exhibited by pigeons provided the alternatives are clear and distinct. The experiment is diagrammed in Figure 2.
-------------------
Figure 2 about here
-------------------
For pigeons, at 80% of their free-feeding weight, Choice-X at t2 was between a smaller-sooner (SS) reinforcer (2s availability of food followed by a 6s blackout in the chamber) obtainable by pecking a red key (a small illuminated disk) and a larger-later (LL) reinforcer (4s blackout followed by 4s of food availability) obtainable by pecking a green key. Given Choice-X, pigeons strongly preferred SS. However, prior to Choice-X (at t1 in Figure 2), the pigeons were offered Choice-Y (pecking the left or the right of two white keys) between the upper branch of Figure 2 -- availability of Choice-X 10 seconds later (t2-t1 = 10s) -- and the lower branch, a take-it-or-leave-it choice of LL, also 10 seconds later. When offered Choice-Y the pigeons preferred the lower branch; in other words, the pigeons chose at t1 to restrict their choice at t2; they committed themselves to LL. Furthermore, we found that as t2-t1 increased or decreased, degree of preference for the commitment alternative increased or decreased correspondingly (as predicted by Equation 1). Other commitment experiments with pigeon subjects, by Ainslie & Herrnstein (1981), Green, Fisher, Perlow, & Sherman (1981), and Navarick & Fantino (1976) have obtained corresponding results. With appropriate modification of procedure and rewards, corresponding results have also been found with human subjects (Millar & Navarick, 1982; Solnick, Kannenberg, Eckerman & Waller, 1980).
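The direction of these results can be illustrated by applying Equation 1 to the delays of the procedure. The Python sketch below rests on two assumptions that are not part of the original report: undiscounted value is taken to be proportional to seconds of food access (2 for SS, 4 for LL), and k is set to 1 per second. Because the pigeons chose SS when they reached Choice-X, the upper branch of Choice-Y is treated here as leading to SS.

```python
def discounted_value(V, d, k):
    """Equation 1: v = V / (1 + k*d)."""
    return V / (1.0 + k * d)

# Assumed parameters (hypothetical, for illustration only).
K = 1.0                  # discount parameter, per second
V_SS, V_LL = 2.0, 4.0    # seconds of food access, used as undiscounted value
DELAY_SS_AT_T2 = 0.0     # SS food is available immediately at Choice-X
DELAY_LL_AT_T2 = 4.0     # LL food follows a 4 s blackout

def values_at(lead_time):
    """Discounted values of SS and LL, `lead_time` seconds before Choice-X."""
    v_ss = discounted_value(V_SS, lead_time + DELAY_SS_AT_T2, K)
    v_ll = discounted_value(V_LL, lead_time + DELAY_LL_AT_T2, K)
    return v_ss, v_ll

for lead in (0.0, 1.0, 2.0, 5.0, 10.0):
    v_ss, v_ll = values_at(lead)
    choice = "LL" if v_ll > v_ss else "SS"
    print(f"t2-t1 = {lead:4.1f} s: v_SS = {v_ss:.3f}, v_LL = {v_ll:.3f} -> {choice}")
```

With these assumed values SS is worth more at Choice-X itself (t2-t1 = 0), LL is worth more at the 10-second lead used in the experiment, and the ratio of v_LL to v_SS grows as the lead time lengthens, which is the direction of the reported change in preference for the commitment alternative.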
Commitment is a very powerful method of self control. People voluntarily commit themselves to drug and alcohol clinics; they wire their jaws shut or have parts of their intestines removed to reduce weight; they deliberately get caught committing crimes in order to be sent to jail. These are examples of commitment of the kind illustrated in Figure 2 -- by physical restraint.
Powerful as it is, commitment by physical restraint entails many problems in practice. One is that if the restraint is removed the original behavior is likely to return. Another problem with physical restraint in real-life human self control is that it does not adapt itself to changing conditions. Once a pigeon has chosen the lower path in Figure 2 there is nothing it can do, no matter what unforeseen event happens, to restore Choice-X. Of course, in the Skinner box, unforeseen events are rare. But human life is different. Hence real-life commitment devices, unlike Odysseus' device, almost always offer a way out. Instead of physical restraint, commitment is much more frequently enforced by punishment; at t1 the person puts into place a punishment (P) contingency for choosing SS at t2; Choice-X is still available but (relatively immediate) punishment of choice of SS at t2 lowers its net value (VSS-VP) below that of the discounted LL (vLL). If the punishment is severe enough, the imposed commitment approaches that of physical restraint. Consider an example reported by Schelling (1992):
In a cocaine addiction center in Denver, patients are
offered an opportunity to submit to extortion. They may
write a self-incriminating letter, preferably a letter
confessing their drug addiction, deposit the letter with the
clinic, and submit to a randomized schedule of laboratory
tests. If the laboratory finds evidence of cocaine use, the
clinic sends the letter to the addressee. An example is a
physician who addresses a letter to the State Board of
Medical Examiners confessing that he has administered
cocaine to himself in violation of the laws of Colorado and
requests that his license to practice be revoked. Faced
with the prospect of losing career, livelihood, and social
standing, the physician has a powerful incentive to stay
clean. (p. 167)
The drug, antabuse, which causes severe pain if alcohol is drunk after taking it, is (when taken) another example of self-imposed severe punishment.
Most everyday-life commitment devices provide much less severe penalties for defection. One reason for taking out subscriptions to a concert or a theater series or joining a health club for an extended period (in addition to saving money on the subscription price) is that, although we want to engage in these activities, when the time for them comes we might well feel tired or lazy. Losing the price of the tickets is usually enough of a penalty to get us going (and being glad of it once there). Interpersonal contracts such as marriages, adoptions, membership agreements, all involve punishment for defection extrinsic to the rewards and punishers contingent on the activities themselves. The extrinsic punishers form a sort of artificial "invisible fence" around the activity being controlled.
The practical use of commitment by self-imposed punishment involves a tradeoff between effectiveness and adaptability. When punishment is severe, as in Schelling's cocaine-clinic example, effectiveness is high but behavior is rigidly controlled and unadaptive, and freedom (or the feeling of freedom) is lost. On the other hand, when punishment is mild, effectiveness is lost. There may be very little or no room between loss of adaptability and freedom by overly severe commitment and loss of effectiveness by overly mild commitment.
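The condition for effective commitment by punishment, that the net value of SS (VSS-VP) fall below the discounted value of LL (vLL), can be given a rough numerical form. The Python sketch below is a hypothetical illustration: it assumes that SS and the punishment are both immediate at t2, that LL is discounted by Equation 1, and that all numbers are arbitrary; the helper names are introduced here and do not come from the source.

```python
def discounted_value(V, d, k):
    """Equation 1: v = V / (1 + k*d)."""
    return V / (1.0 + k * d)

def punishment_threshold(V_SS, V_LL, delay_LL, k):
    """Punishment size at which V_SS - V_P equals v_LL at t2.

    Any punishment larger than this makes the punished SS worth less
    than the discounted LL. Returns 0 if LL is already preferred.
    """
    v_LL = discounted_value(V_LL, delay_LL, k)
    return max(0.0, V_SS - v_LL)

def chooses_LL(V_SS, V_LL, delay_LL, k, V_P):
    """True if the punished SS is worth less than the discounted LL."""
    return V_SS - V_P < discounted_value(V_LL, delay_LL, k)

# Arbitrary example: SS worth 2 now, LL worth 4 after a delay of 4, k = 1.
threshold = punishment_threshold(2.0, 4.0, 4.0, 1.0)
print("threshold punishment:", round(threshold, 3))
print("mild punishment (1.0) works:", chooses_LL(2.0, 4.0, 4.0, 1.0, 1.0))
print("severe punishment (1.5) works:", chooses_LL(2.0, 4.0, 4.0, 1.0, 1.5))
```

On these assumptions a punishment below the threshold leaves SS preferred, which corresponds to the loss of effectiveness described above, while a punishment far above it behaves much like physical restraint.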
The main problem with commitment by self-imposed restraint and self-imposed punishment as the final explanation of self-control in everyday life is that most instances of self control in everyday life seem to occur without any extrinsic commitment at all. Most people control their drinking without entering detoxification clinics or going to jail. The same goes for smoking and overeating. Most of us do manage to turn down an extra dessert without wiring our jaws shut. Even pigeons, with gradual extension of t3-t2, may eventually come to choose LL over SS without explicit commitment (Mazur & Logue, 1978).
In general, as we get older, we seem to become better and better able to control our own behavior; we come to choose later-larger over sooner-smaller rewards without either self-imposed physical restraint or punishment. How do we do it? One answer is that as we grow older we come to internalize commitment. For example, according to Frank (1992), many of our negative emotions (guilt, fear, disgust, and so forth) function as internal tools for self control. The reward for refusing an extra dessert is presumably large (good health or social desirability) but not immediately attainable. When we spontaneously do refuse the dessert we do so because acceptance would be punished by (for instance) guilt; the expectation of guilt subtracts enough from the value of the dessert (SS - P) to diminish the resultant immediate value of accepting the dessert below the discounted value of refusing it. Thus, some of us sometimes do refuse the dessert and eventually obtain the larger reward.
Let us now examine this conception of internalized commitment; we will find that it fails on both theoretical and empirical grounds.1
4. Problems With Internal Commitment
From the viewpoint expressed at the beginning of this article the concept of internal commitment is a cognitive concept. But it is an odd kind of cognitive concept, not an information processing system but simply an internal metaphor for a process identifiable only externally. It is as if one were to say that an automobile works by having little automobiles (rather than a carburetor, steering systems, engine, and so forth) inside it.
While external commitment is identifiable (by overt physical constraints or punishment contingencies), internal commitment must be inferred from perhaps subtle behavioral signals. It is not easy to tell whether a person is refusing a second dessert through fear of guilt or simply because its value is low. Still, although difficult, such detection would not in principle be impossible. Verbal report or physiological measurement might indicate the presence or absence of an internal commitment response.
But, beyond the difficulty of identifying internal commitment devices, lies another conceptual problem. Given a fixed set of alternatives, like that of Figure 1, external commitment may operate either by the device of self-imposed physical restraint or the device of self-imposed punishment. Internal commitment must therefore operate in such cases by either internal self-imposed physical restraint or internal self-imposed punishment. (Of course other internal mechanisms may govern self control, but they could not be internalized versions of the above-described commitment mechanism.)
Self-imposed physical restraint without an intermediate environmental mechanism is a difficult process to envision. It is not possible, when reaching for the refrigerator door with one hand, to hold that hand back with the other; the contradiction inherent in this method of self control becomes apparent to us as soon as we try it. The same goes for internally restraining one set of muscles by action of another set. Movements are indeed controlled by opposing muscle groups. But the actions subject to self control, ranging from reaching for a refrigerator door to injecting heroin into one's bloodstream, are the resultants of that opposition. We need to control our behavior as it appears in the world, not the flexing of our biceps.
If internal self-imposed physical restraint is not possible, internal commitment could be mediated by self-imposed internal punishment. The problem is that internal punishment is just as self contradictory, just as entangled in logical difficulties, as internal self-imposed physical restraint.
One original purpose of the commitment model illustrated in Figure 2 was to explain self control in Skinnerian terms (stimulus, response, reinforcement) and without the use of mental terms. The price for this rigidity of terminology, in complex cases where external stimuli, responses, and reinforcers, were not immediately present, was the internalization of conditioning processes (reinforcement and punishment) hence the use of concepts of self-reward and self-punishment. As early as the turn of the last century Dewey (1896) warned against the internalization of concepts like stimulus and response. Reflexes, Dewey argued, had originally been defined with respect to the whole organism and would lose meaning once internalized. Given the complex neural networks that even then were known to mediate reflex behavior, internal stimuli could not be distinguished from internal responses. An internal neural event could be conceived as a stimulus to the next neural event or a response to the previous one. In other words, stimulus and response had been developed as psychological concepts (relative to the whole organism) and it was at best confusing to use them as physiological concepts, relative to its parts. (It would be like talking about steering the carburetor.) Internal stimuli and responses seemed to Dewey to require an internal homunculus (a little automobile, as it were) to receive the stimulus and emit the response (and an infinite regress as one comes to explain how the homunculus works). The same argument applies with even more force to the concepts of internal reinforcement and punishment (Catania, 1975). Later in the century Ryle (1949) and Wittgenstein (1958) were to echo Dewey's arguments.
But philosophical injunctions have rarely been heeded by psychologists. Much of the history of 20th-century behaviorism may be seen as an attempt to internalize reflexes. In the context of Hull's (1943) system, Mowrer (1960) put forward a "two-factor" account of avoidance learning. Since two-factor theory uses the concept of internal reinforcement to explain avoidance in the same way that internal punishment is used to explain self control, it may be worthwhile to briefly indicate here what this theory is, and what happened to it (Rachlin, 1976, contains a more complete account).
Dogs can learn to avoid electric shock by jumping over a hurdle, from one side of a box to another. But then what reinforces the jump? By definition, after the jump, the shock does not occur. What does occur? Since dogs show considerable fear, initially, in the avoidance situation (as measured by increased heart rate and other physiological tests) it was postulated that the stimuli preceding the jump became associated with fear by classical conditioning (the first of the two factors) and that internal fear-reduction then instrumentally reinforced the jump (the second factor). In the late 1960's two influential articles appeared attacking the theory that avoidance was mediated by fear reduction. One study (Rescorla and Solomon, 1967) using physiological measures of fear showed that the relation between fear-reduction and the avoidance response (say jumping) was not as two-factor theory claimed. For its jump to be reinforced, the dog must be afraid when it jumps and fear reduction must immediately follow the jump. But Rescorla and Solomon found that dogs often jumped before fear even appeared. How could fear-reduction be responsible for jumping if a dog could learn to jump before it became afraid? Certainly fear occurs during avoidance conditioning, Rescorla and Solomon found, and fear may have a strong effect on avoidance responding. Nevertheless, they argued, fear is neither necessary nor sufficient to cause avoidance responding. What then could motivate and reinforce avoidance?
This question was answered by Herrnstein (1969) in a way parallel to the explanation of self control to be advanced here. Herrnstein claimed that individual avoidance responses are not individually reinforced. Rather, a high rate of avoidance (jumps in general) is reinforced by a low rate of shock (shocks in general). Herrnstein trained rats to press levers where a negative correlation over time between lever presses and shocks was programmed with neither discrete contiguities between individual lever presses and shocks, nor any external conditional stimulus paired with shock reduction. Herrnstein's interpretation of the rats' avoidance learning is the beginning of the teleological viewpoint in behaviorism (Lacey & Rachlin, 1978). Although, in Herrnstein's experiments, no individual lever press was reinforced, a negative correlation was imposed between lever-pressing rate and shock rate. The most parsimonious description of the rats' avoidance learning is that they simply chose the most valuable (least aversive) combination of available lever-pressing and shock rates.
Of course all behavior, including avoidance, must be physiologically mediated, and it is impossible to prove that there is no point inside a rat's brain where an internal representation of a lever press is reinforced by an internal representation of fear reduction. But, so far, this internal metaphor of externally-defined escape has proved to be theoretically and empirically unwieldy and has generally been abandoned by behaviorists.
If escape from fear, where fear is conceived as an internal emotional state with at least some nonverbal external measures, has proven to be excess theoretical baggage, then the concept of avoidance of guilt as an explanation of self control must be still more gratuitous. Catania (1975) has argued more generally that the concepts of internal reinforcement and internal punishment (upon which the concept of internal commitment rests) are not viable.2 An attempted demonstration of the efficacy of self-reinforcement, by having subjects "reward" themselves for losing weight by taking money from a dish next to a scale (Mahoney, 1974) was shown to depend not on the so called reward but on the saliency of the money. Castro and Rachlin (1980) found that subjects lost more weight if they put money in the dish proportional to weight loss than if they took money out.
Strong negative emotions like strong positive emotions may act as discriminative stimuli to highlight an activity, to mark it off from others, to keep score so to speak (like paying small amounts of money or biting your tongue or saying, "How could I have done that stupid thing", and hitting yourself on the head with the heel of your hand) but cannot actually punish an activity. If we have indeed done or said something stupid and feel guilty afterward our act may thereby be more effectively punished by its external consequences but it is not punished by our guilt.
The remainder of this article will present and defend another approach to the problem of self control, one based not on an attempt to uncover an internal mechanism but on an attempt to characterize self controlled behavior externally -- to discriminate from the outside between self controlled and impulsive behavior and to consider how self controlled behavior might be brought about.
The first step in this approach is to note that the SS and LL alternatives, as depicted in Figure 1, do not adequately characterize the alternatives in most real-life self control situations. Consider again the choice between the extra dessert and being thin (or healthy or socially approved). First, real-life alternatives are rarely mutually exclusive. We can eat this particular second dessert and still be thin, healthy and socially approved as long as we do not generally eat too many second (or third) desserts. Furthermore, while the reward for accepting the extra dessert truly arrives soon, the reward for refusing the dessert does not clearly arrive later. It is not as if, two weeks after our refusal of the dessert, we wake up one morning significantly thinner, healthier and more socially approved than we were the night before. While the reward for accepting the dessert may be depicted more or less accurately by a vertical line like that labelled SS in Figure 1, the reward for refusal does not seem to be depicted at all accurately by the vertical line LL. Thinness, good health and social approval do not arrive or depart at any particular instant.
Ainslie (1992) attempts to deal with this problem by characterizing self control not as the choice of a single larger- later (LL) reward over a single smaller-sooner reward (SS) but as a series of choices of LL over SS rewards. Although this characterization does more closely approximate the repeated rather than one-shot nature of real-life self control decisions, it still does not alter the depiction of the LL alternative. Good health is not a punctate pleasure that occurs, say, when we visit our doctor annually, nor does the reward of being thin arrive just as we step on the scale and depart when we step off. Even social approval is something other than just the sum of a series of discounted pats on the back.
A characterization of the difference between the rewards for impulsiveness versus self control, better than smaller-sooner versus larger-later, is act versus pattern. The distinction between act and pattern retains the temporal character of the SS versus LL distinction but recasts it in terms of relatively brief versus relatively extended intervals where the extended interval embraces the brief one on both sides. Acts are to patterns as (brief) notes are to (extended) melodies or as (brief) steps are to (extended) dances; a particular (brief) event may fit into a larger (extended) pattern or may not fit in. Patterns as such take time to unfold. Aristotle (Nicomachaen Ethics, Bk.I, Ch.7, Para.15) said: "One swallow does not make a summer, nor does one day; and so too one day, or a short time does not make a person blessed and happy." A swallow and a day's activity are elements of summers and happiness. No event that could occur within a day could constitute human happiness. Happiness for Aristotle was not a brief private emotion but an abstract pattern of overt behavior, running through a whole lifetime. (See Rachlin, 1994, for an extended discussion of Aristotle's psychology.) A pattern may be an element of a wider pattern and an act may be a pattern of narrower elements. Consumption of an individual reinforcer, say a 2-second delivery of grain, by a hungry pigeon, is here considered to be an act, yet it consists of a pattern of individual pecks at grains of food which may be divided in turn into patterns of discrete movements. Thus as the next section indicates there is a continuum between happiness considered as a pattern of life and a relatively elementary act such as hammering a nail.3
The relation between particular acts and abstract patterns of behavior then is not one of mutually exclusive choice but of inclusion or exclusion. For the most part, in nature, particular acts of organisms fit nicely into abstract patterns. For instance, rats deprived of some nutrients develop specific hungers for those nutrients (Rozin & Kalat, 1971); the rats' tastes for particular foods fit in with a healthy pattern of eating. But these harmonious act-pattern relationships may be disturbed. Rats allowed to eat as much as they want of human "junk foods" with a lot of sugar, salt and carbohydrates, become overweight and eventually grossly obese (Brownell, Greenwood, Stellar & Schrager, 1986); here the rats' particular tastes do not fit into a healthy pattern of eating.
5. Act versus Pattern4
Perhaps due to the influence of gestalt psychology on American psychology we are familiar with the concept of direct perception of patterns. We know that even the perception of a simple light is an integration of complex neural effects caused by the light itself, lights elsewhere in the visual field and lights experienced in the past (Land, 1964). We are familiar moreover with the shift of our perceptions between levels. A movie is in a sense nothing but colored lights and shadows and it is possible to see and hear a movie exactly that way. Directors and cinematographers must train themselves to take this molecular viewpoint at certain times. But most of us, most of the time, see movies more abstractly, in terms of characters and events. Occasionally we "see through" those characters and events into the characters' motivations and their very thoughts. Still more rarely we may see still further -- to the director's style or even further, to the place of this film in movie history. The following list goes from particular to wider and wider patterns. Consider our direct perceptions of:
a) lights and shadows
b) characters and events
c) motivation and emotions of characters
d) director's style
e) place in history of movies
As we go down the list, the incorporation of context expands until, at the bottom, we need to have seen and discussed hundreds of movies before making such a perception.
Level a may be said to be "transparent" to level b, level b to level c, c to d, and so forth. As we gain more experience we see through one level to the next; for instance, we see "through" the characters to their emotions and motivations; when we come out of the movie our conversation may refer to abstract aspects of the movie, to a character's motives, for instance, rather than to the character's actual behavior or to lights and shadows. The direction of our shift in reference when we shift from behavior to motives -- from act to pattern -- is not inward, into ourselves, but outward, into the movie and its context (not "insight" but something that would be better called, "outsight"). True, there must exist some internal mechanism that enables us to do this. But the contingencies that characterize one or another overt behavioral pattern may be characterized independently of internal mechanisms; indeed, study of such contingencies may be propaedeutic to study of how internal mechanisms work (Skinner, 1953).
Suppose we see a man swinging a hammer, briefly as in a film clip or through a train window. He might be murdering someone or just hammering a nail into a board. What he is doing would be known only by expanding our frame of view, perhaps seeing more of the film, going into the past or the future. Let us say he is hammering a nail. Further examination of the film would tell us perhaps that he is joining one piece of wood to another, and so on down the following list until we come to perceptions attainable only by the man's intimates, or a diligent biographer -- people who see his behavior in many contexts:
a. swinging hammer
b. hammering nail
c. joining one piece of wood to another
d. building floor
e. building house
f. providing shelter for his family
g. supporting his family
h. being a good husband and father
i. being a good person
Each item on the list requires more and more context, more and more time, to perceive. To perform the first few actions on the list, the man merely has to hammer the nail. For the last few, the man has to do many things over a long time period. What these things are is difficult to specify but the man's intimate friends and relations can perceive items f, g, and h (or their lack) as easily and naturally as an experienced radiologist perceives a healthy heart in an x-ray or as an experienced music critic identifies a sonata she has never before heard as belonging to Mozart's later period.5
The particular acts, high on the list, have as objects, the abstract patterns of acts, lower on the list. In Aristotle's terminology, particular acts may be done for the sake of more abstract patterns of acts (as well as for their own sake); a more particular pattern stands to a more abstract act as an effect to a final (rather than an efficient) cause (Rachlin, 1992, 1994, in press). The more abstract patterns on the above list have no meaning whatsoever over a brief period of time. It is not possible, for instance, for a man to support his family, or to fail to support it, in one day. Supporting one's family, being a good person, or, as Aristotle said, being a happy person, are categories of behavior over long spans of time.
As behavioral categories, a person's motives are long-term events. Suppose we observe a butcher giving a free pound of hamburger to a poor old lady. He might be giving the lady the meat out of a spirit of generosity or altruism or as part of a publicity campaign or to impress his other customers (or for all of these reasons). The way to tell is to observe the butcher at other times, when no other customers are present, and in other contexts. If the butcher is a generally generous person -- if this particular act fits into that larger pattern -- then we (and the butcher himself) will categorize his act as generous. If the butcher is not habitually generous, neither this particular act, nor any conceivable particular state of his internal cognitive/physiological apparatus (certainly not his verbal report of such a state) will make him so. The same argument may be made with respect to emotions such as love. (If throughout his marriage a man has been unfaithful, cruel and selfish to his family and on his deathbed says, "I always loved you all deep down," what present or past internal state could conceivably verify his statement? "None!" his wife and children would surely say.)
This mode of behavioral classification (on an act-pattern dimension) is applicable to one's own behavior as well as to the behavior of others. Our own motives become clearer and clearer as we gather more and more information about ourselves. The butcher himself would not be aware of (assuming he is interested in) his own motives any sooner than we are (except to the extent that he typically knows before we do about publicity campaigns, etc.). He has more information than we do but ours is potentially better information because we see his behavior directly while he sees it through reflection in the environment. (Of course, he also feels his particular acts through proprioception and can see his hands and bodily parts move. But we see his whole body move and it is the whole body that is motivated one way or another, not its parts.)
What sort of language may be used to describe these wide-scale patterns? For Skinner (1938) the concept of the operant was sufficient to classify particular movements. A rat's lever press, for instance, is an operant in the sense that it treats all the ways that the rat may press the lever (with its left paw, its right paw, its tail, its nose, and so forth) as one class of act. A common characteristic of operants is that the very last event in the sequence, the movement of the lever, is very brief (relative to the time between lever presses) and identical in all cases.
More complex behavioral patterns may not have this characteristic. For instance, "building a house" (by oneself) defines a class of behavioral patterns that may have no common final event -- or no common movement of any kind. Two people may each build a house yet physically each may make a set of entirely different movements. Nevertheless, building a house is a perceptible, if fuzzy, behavioral classification, one that we often make in everyday life. For the teleological behaviorist, in contrast to Skinner, such wider behavioral patterns may legitimately be described with mental or emotional predicates. Just as "building a house" would be almost impossible to define in terms of specific movements, so would "loving your spouse". The defining characteristic of building a house or loving your spouse is that they each constitute a perceptible behavioral pattern. Perception is defined in turn as a discriminative pattern of movement of an observer including the housebuilder or lover (or butcher) as observers of their own behavior. The behavioral definition in each case rests on common consequences of the behavior -- common contingencies.
As opposed to this behavioral conception, cognitive and physiological theories alike posit common efficient causes -- common internal representations, mechanisms, intentional systems of the housebuilder or lover -- rather than common overt discriminative acts of an observer. (See Rachlin, in press b, for a brief discussion of perception as overt discrimination.)
How then do we gain a perception of our own extended patterns of behavior? To see how we might, consider a purely perceptual act -- judgment of the length of one rod relative to that of another. In a dark room with monocular vision and the head held fixed, it is not possible to judge the true length of a luminous rod. The observer is dependent on visual angle alone. A large rod at a distance may appear smaller than a small rod close up. But as the observer moves away from both rods, their relative sizes reverse and the large rod again appears larger; this is a spatial version of the temporal reversal effect diagrammed in Figure 1.
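The reversal described in this paragraph can be illustrated numerically. The following is a minimal sketch, not part of the original study. It assumes the standard visual-angle formula, 2*atan(length / (2*distance)), and hypothetical rod lengths and starting distances (a 36-inch rod 240 inches away and a 12-inch rod 60 inches away) chosen only for illustration.

```python
import math

def visual_angle(length, distance):
    """Visual angle (radians) subtended by a rod of the given length
    viewed head-on from the given distance (same units for both)."""
    return 2 * math.atan(length / (2 * distance))

# Hypothetical values, in inches (assumptions for illustration only).
LARGE_ROD, LARGE_DIST = 36, 240
SMALL_ROD, SMALL_DIST = 12, 60

def apparent_sizes(step_back):
    """Visual angles of the (large, small) rods after the observer
    moves `step_back` inches farther away from both of them."""
    return (visual_angle(LARGE_ROD, LARGE_DIST + step_back),
            visual_angle(SMALL_ROD, SMALL_DIST + step_back))

for step_back in (0, 30, 100):
    large, small = apparent_sizes(step_back)
    print(step_back, round(large, 4), round(small, 4))
```

With these assumed values the small rod subtends the larger angle at the starting position, the two angles are equal after a 30-inch step back, and the large rod subtends the larger angle beyond that point. This is the spatial counterpart of crossing discount functions.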
Now consider what happens when the lights are turned on in the room and the two rods are seen in their context (walls, window, door, and the view through the other eye). Now the discount functions do not cross; the rods are judged at their true size. If we conceive the context as added bit by bit, say by gradually increasing the room's illumination, the effect is to shift the context of the observer's perception of the rods outward from the observer's own body (visual angle) to the immediate environment of the rods. If we add previous contexts (say the two rods become identifiable as a yardstick and a 12-inch ruler) constancy of judgment increases still further.
A corresponding effect of context on discounting in choice was obtained by Mischel and his colleagues in studies of children's delay of gratification (Mischel, et al., 1989). Children would wait much longer for a larger reward, two pretzels, in the presence of a smaller reward, a single pretzel, when they were told to think of the pretzels as toy logs than when they were told to think of how crunchy and tasty pretzels are. One conceivable interpretation of these results is that the verbal instructions shifted the context of the pretzels outward from the child's own body to the environment. It may be speculated that if the smaller reward were something with only a social use (say, a game that required two people to play) it would not be tempting at all and children would wait still longer in its presence for the larger reward.
The essential conflict underlying problems of self control is one between particular acts and patterns of acts. When the particular act fits neatly into the pattern -- when we like what we believe is good for us -- self control does not apply. The squirrel saving nuts is not controlling itself because the particular acts it does are exactly what it momentarily prefers to do; the squirrel engages in nut-saving for its own sake and not for its larger end (providing food for the winter). We know this not from the squirrel's verbal report or by empathy with the squirrel but by observing the squirrel's behavior under various controlled conditions (just as we would do with the butcher). For instance, squirrels continue to save nuts even when the nuts are systematically removed from their caches (Hinde, 1970).6 But when there is a conflict between an act and a pattern -- when we like what we believe is not good for us or do not like what is -- then our behavior is either self controlled or impulsive, depending on our choice.
In summary, a problem of self control arises where we find a pattern in behavior, particular components of which are dispreferred relative to alternatives inconsistent with the pattern. Figure 3 illustrates the self control conflict.
--------------------
Insert Figure 3 here
--------------------
The wavy line high up on the value scale represents a complex behavioral pattern. A component of that high-valued pattern (X) is, taken by itself, low valued. An alternative to the component (Y) is more valuable than X but less valuable than the pattern.
The components of a healthy breakfast are, let us say, juice, cereal, bran muffin, and skim milk. If a person prefers each of these elements to available alternatives, then eating such a breakfast involves no self control. But if, all else being equal, the person prefers champagne to juice, or bacon and eggs to cereal, or apple pie to bran muffin, or coffee to skim milk, and nevertheless (in the face of available alternatives) eats the healthy breakfast, that person is controlling himself. At the other extreme, choice between two competing abstract patterns (whether to become a doctor or lawyer, for instance) would not by this definition involve self control.
6. Development of Self-Control
Self control is developed, therefore, by restructuring behavior into wider patterns. This process of course requires the operation of the nervous system, presumably the higher levels of the nervous system. The defining attributes of self control, however, may be found not within the nervous system but at the borderline between the nervous system and the environment -- in the behavior of an intact organism. The effect of restructuring behavior is to create a pattern; once the pattern begins, it becomes costly to interrupt; the further the pattern proceeds in time, the more costly it is to interrupt. In other words, instead of internalizing commitment, behavioral restructuring retains external commitment but not in the form of physical restraint or punishment. The form of commitment involved in restructuring is a pattern or trajectory of behavior the interruption of which involves a cost. Again, the process of perception most clearly illustrates the point. If a symphony consisting of 5,000 notes, say, were played with only the last note omitted, its value for listeners might be destroyed far beyond the 1/5,000 of the piece that the last note represents. One might wonder why this is so since it is the actual act of listening to the symphony that is said to be valuable and, at the point when the last note is omitted, 4,999 other notes have already been listened to. Nevertheless, as the gestalt psychologists often pointed out, there is a cost involved in the destruction of a whole pattern out of proportion to its particular parts. The same phenomenon (as is found in the act of perception) occurs, even more strongly, in the more complex act of performance -- of a symphony, a song, a dance or any other complex act. Following along with the example of dieting, interruption of a healthy pattern of eating also involves a cost -- not just in terms of health but in terms of the interruption itself. 
Whereas health is abstract and has no particular point of onset, an interruption is a particular event occurring at a particular point in time. Thus, the cost of interruption can directly oppose and subtract from the value of a particular temptation (like SS in Figure 1).
Interruption of an habitual behavioral pattern, although a negative act, is a wholly overt act. It requires internal mediation no more and no less than does performance of the pattern itself. Interruption involves costs in addition to the cost of giving up the pattern; in other words, a pattern of behavior once begun and then given up, prior to extrinsic reward, is more costly than one not begun at all. This extra cost occurs at a particular time -- the point of interruption -- and may counteract and subtract from the value of particular alternatives to the pattern (of "temptations").
The value of continuing an act once it has begun has been studied in the animal laboratory by J. A. Nevin and colleagues under the rubric of "behavioral momentum" (Nevin, Mandell & Atak, 1983). The direct relevance of behavioral momentum to self-control may be illustrated by the following experiment by Eric Siegel from the laboratory at Stony Brook.
In control conditions, four pigeons (at 80% of their free-feeding weight) were given repeated choices, identical to Choice-X depicted in Figure 2, between SS (an immediate 2s food delivery followed by a 6s blackout) and LL (a 4s blackout followed by a 4s food delivery). For two pigeons, each choice was a peck at a green key (SS) or a red key (LL). (Colors were reversed for the other two pigeons.) The keys were translucent plastic disks illuminated from behind by colored light. The keys were presented simultaneously, sides randomly alternated. After initial training to peck the keys, a control condition was imposed for (at least) 15 sessions for each pigeon at the beginning of the experiment (initial control condition) and for (at least) 15 sessions at the end of the experiment (terminal control condition). Within a given daily control session each pigeon received 45 exposures to Choice-X, separated by 30s intertrial intervals during which both keys were darkened and pecking had no programmed consequences. Using C to symbolize choice and O to symbolize outcome, the sequence was: ...30s CO 30s CO 30s CO.... As expected, the pigeons on the average strongly preferred the smaller more immediate reward (SS). Percent choices of SS over the last 5 sessions of the initial control condition averaged 80% for the four pigeons and, for the last 5 sessions of the terminal control condition, averaged 78%.
In both initial and terminal control conditions, the reinforcement schedule in effect was continuous reinforcement (crf); only a single peck on either the red or green key produced its respective outcome. In the experimental condition, performed between the two control conditions, the schedule imposed was fixed-ratio 31 (FR 31); 30 responses on either key were required for the 31st response on the red or green key to produce its respective outcome. Except for the time taken to complete the 30 responses no further intertrial interval was imposed. The experimental sequence was: ...FR30 CO FR30 CO FR30 CO.... In the experimental condition all pigeons reduced their preference for SS significantly below that for the control condition. Average percent choice of SS over the last 5 sessions of the experimental condition was 35% for the four pigeons, a reduction of 44 percentage points below the average of the initial and terminal control conditions.
Once a pigeon began a ratio by pecking at the LL key, there was a very strong tendency to stay with that choice throughout the sequence. For instance, over the last 5 sessions of the experimental condition, the probability of an SS peck on the very first of the 31 pecks in a ratio was, for the four pigeons, .25, .47, .41, and .29. But once the first peck was made on the LL key, the probability of an SS peck on the second of the 31 pecks (conditional on an LL peck on the first) was, respectively, .01, .00, .02, and .00. Given that a pigeon had made 6 successive pecks on the LL key, the probability of an SS peck on the 7th peck was zero for all four pigeons. Thereafter only an occasional "defection" (an SS peck after a series of LL pecks) was observed. Only one pigeon, on one occasion, switched to the SS key on the 31st peck, after 30 successive LL pecks.
In the experimental condition, at the moment of the 31st peck, the choice facing all pigeons was essentially the same as it was during the control condition: peck the SS key and obtain a small-immediate reward or peck the LL key and obtain a large-delayed reward; the previous 30 pecks were essentially "sunk costs." A pigeon that had pecked 30 times on the LL key would have to move its beak a few inches more in order to switch than to stay but previous research with pigeons' unpatterned choices (Herrnstein, 1961) has found this cost to be very slight, in fact insufficient to counteract a tendency to alternate. Thus, for most of the pigeons, most of the time, in this experiment, restructuring behavior into a relatively simple pattern was sufficient to bridge over the availability of SS; interruption of the pattern was sufficiently costly to make the small immediate (SS) reward less preferred than the larger (LL) reward and to obtain the LL reward.7
It may be argued that the pigeons in the experimental condition were "fooled" into choosing LL because they could not discriminate between the 31st peck and earlier pecks and so did not "know" when the reinforcement would occur. Therefore, after the above experiment, all four pigeons were exposed to a second experimental condition, identical to the first, except that after the 30th peck and before the 31st a brief (1s) blackout was imposed. After the blackout a single peck on either disk produced its outcome (SS or LL). In the second experimental condition all pigeons chose the LL alternative on a significantly greater percentage of trials than they did in the control conditions (the average difference for the four pigeons was 30%). This was true even though the blackout complicated the behavioral pattern (from "break-and-run" to break-run-pause-peck). "Knowing" that the next peck would produce an outcome (signalled by the blackout) did not significantly alter any pigeon's choice of the larger-delayed reward.
--------------------
Insert Figure 4 here
--------------------
Figure 4 shows, for one (typical) pigeon, the pattern of choice (averaged over the last 5 sessions of each condition) between the two alternatives as the ratio progressed in the simple FR schedule and the FR schedule with a signal prior to the 31st peck (Sig FR 31). With both FR schedules, once the pigeon began pecking on the LL key it kept pecking on that key throughout, with almost no defections.8
It is at least conceivable, therefore, that in everyday human self control problems, the cost of interruption of preestablished patterns, rather than any internalization of commitment, is what enables us to control ourselves -- when we do so without external commitment. In Mischel's delay of gratification experiments, children who waited longer for larger rewards frequently engaged in spontaneous obsessive-like patterns during the waiting period -- some sang songs, some played with their hands, some turned their backs to the table, some made faces (Mischel, et al., 1989). Hungry pigeons waiting for food reward also exhibit spontaneous patterns of behavior ("interim" and "terminal" responding) which, Staddon and Simmelhag (1971) claimed, are the basis for "superstitious" behavior. Patterns aid self control by bridging (over a "temptation") between early and later choices of a larger reward. Evidence from the present experiments indicates that such patterns, once begun, have a life of their own in the sense that their interruption is costly. The pigeon experiment indicates that when we appear to spontaneously refuse a second dessert we have indeed committed ourselves, not internally but overtly, in the sense that a kind of "soft" commitment is embodied in the cost of interrupting an overarching pattern of behavior.
However, regardless of the value of the patterns in the pigeon experiment, the rewards were extrinsic to the pattern and explicit: LL versus SS. It was argued previously that a better analogy to everyday-life self control is the case where rewards are intrinsic in the behavioral patterns themselves and more abstract. We turn now to an experiment where acts and patterns of acts are directly manipulated.
7. Act and Pattern In Human Choice
A human choice procedure devised by Herrnstein, Prelec and Vaughan (described in Herrnstein & Prelec, 1992, and in Herrnstein, Prelec, Loewenstein & Vaughan, in press) directly opposes particular acts to patterns of acts. With this procedure undergraduate subjects are each faced with repeated choices between pressing one of two concurrently available buttons (A and B). In the simplest version of the game, points, exchangeable for money, are given to the subjects according to the following rules:
i. Each choice of A adds N points;
ii. Each choice of B adds N + 3 points;
iii. N at each choice is equal to the number of A's in the previous 10 choices.9
Usually, subjects are not told the rules but just to get as many points as possible over the course of the experimental session (50-100 trials).
Any particular choice of B is clearly better than A since 3 more points are thereby added to the subject's score. However, in general, it is better to choose A since (by rule iii) each choice of A adds one unit to N over the next 10 trials; the 3-point gain of a B-choice is more than offset, eventually, by the 10-point loss involved in having a B-choice remembered in N for the next 10 trials. A B-choice "poisons" N for 10 trials. In this version a subject would maximize earnings by always choosing A (except on the very last 3 trials) averaging 10 points per trial. The worst possible performance would be to always choose B, averaging 3 points per trial. In the long run, the more A-choices, the more points earned.
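The arithmetic of rules i-iii can be checked with a short simulation. The following is a minimal sketch, not the software used in the original experiments. The function name `play` and its parameters are hypothetical, and the way the 10-choice window is seeded before the first trial (the `initial` argument) is an assumption, since the rules above do not say how N is set at the start of a session.

```python
from collections import deque

def play(choices, window=10, bonus=3, initial="B"):
    """Total points for a sequence of 'A'/'B' choices under rules i-iii.

    The window of previous choices is seeded with `initial` (an
    assumption; the source does not specify the starting value of N).
    """
    history = deque([initial] * window, maxlen=window)
    total = 0
    for choice in choices:
        n = sum(1 for past in history if past == "A")  # rule iii
        total += n if choice == "A" else n + bonus     # rules i and ii
        history.append(choice)
    return total

all_a = play("A" * 100, initial="A")                       # 1000 points
all_b = play("B" * 100, initial="B")                       # 300 points
one_b = play("A" * 50 + "B" + "A" * 49, initial="A")       # 993 points
late_b = play("A" * 97 + "B" * 3, initial="A")             # 1006 points
print(all_a, all_b, one_b, late_b)
```

Under these assumptions, a single B-choice in mid-session gains 3 points at once but costs 1 point on each of the next 10 trials, a net loss of 7. Switching to B on only the last 3 trials gains a few points because the session ends before the cost is paid in full.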
Most (undergraduate) subjects playing the game described above choose B most of the time thereby failing to maximize earnings. Very few subjects spontaneously learn the rules sufficiently to verbalize them. When subjects are given more or less broad hints about the rules, they tend to choose A more often immediately after the hint but gradually drift back to their original sub-maximal performance.
This laboratory procedure bears a formal relationship to typical everyday-life self control situations. Eating a piece of cake may always be better than not eating it in terms of pleasure at the present moment but eating the cake in a sense "poisons" the body's memory. Eating a piece of cake has an effect on the future analogous to that of choosing B in this experiment. The immediate benefit is high (bodily pleasure in the case of the cake, 3 points in the case of a B-choice) and the eventual cost of this individual act is relatively low (a few more calories in the case of the cake, a small subtraction from N in the case of a B-choice) but if the act were repeated over and over again, the result would be disastrous (bad health in the case of the cake and minimization of earnings in the case of a B-choice). Thus, the analogy of this procedure to everyday-life self control is much closer than that of the simple SS versus LL choice illustrated in Figure 1.
The following experiment by Elizabeth Kudadjie-Gyamfi at Stony Brook is an attempt to test the effect of patterning on "self control" (maximization) with human subjects. The version of the game used, differing somewhat from the one described above, gave a single point (convertible to cash at the rate of 10 cents per point) for A-choices and B-choices alike but varied delays of point-gain by choice of different alternatives. The total cumulative delay time was fixed (at 325 seconds). Subjects chose by pressing buttons marked A and B. A computer screen displayed delay-time left and total points earned. After pressing either button, the subject waited for a certain delay period while the delay timer on the computer screen counted down. At the end of the delay the timer stopped counting and one point was added to the subject's displayed score. When the delay timer reached zero, the experiment ended. The subjects would maximize total reward by minimizing average delay. The rules (not told to the subjects) were:
i. Each choice of A yields 1 point delayed by N + 3 seconds;
ii. Each choice of B yields 1 point delayed by N seconds;
iii. N at each choice is equal to the number of B's in the previous 10 choices.
Again, the particular consequences of choosing B were better than those of choosing A (less delay) while the general consequence of choosing B was to increase N, thereby increasing average delay. Of the 60 subjects in this experiment, no subject could verbalize the rules after the experiment although almost all subjects understood that it was a good idea to at least occasionally choose A. Most subjects distributed their choices between A and B.
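The trade-off in the delay version of the game can also be sketched in code. The following is a minimal illustration, not the program used in the experiment. The names `run_session` and `policy` are hypothetical. Two details are assumptions: the seeding of the 10-choice window before the first trial, and the treatment of a final trial whose delay exceeds the time remaining (here that trial is simply not run).

```python
from collections import deque

def run_session(policy, budget=325, window=10, penalty=3, initial="A"):
    """Points earned in one session of the delay game.

    `policy(trial)` returns 'A' or 'B' for the given trial index.
    `budget` is the fixed total delay time in seconds. The seeding of
    the window and the handling of the last partial trial are
    assumptions not specified in the source.
    """
    history = deque([initial] * window, maxlen=window)
    remaining, points, trial = budget, 0, 0
    while True:
        choice = policy(trial)
        n = sum(1 for past in history if past == "B")      # rule iii
        delay = n + penalty if choice == "A" else n        # rules i and ii
        if delay > remaining:
            break  # not enough delay time left for this trial
        remaining -= delay
        points += 1
        history.append(choice)
        trial += 1
    return points

always_a = run_session(lambda trial: "A", initial="A")     # 108 points
always_b = run_session(lambda trial: "B", initial="B")     # 32 points
print(always_a, always_b)
```

Under these assumptions, exclusive choice of A holds each delay at 3 seconds and exclusive choice of B drives it to 10 seconds, so the locally worse alternative earns more than three times as many points within the same 325 seconds.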
There were 4 groups of subjects that differed only with respect to the patterning of trials. All subjects played under the same set of rules (as stated above). The trial patterns of the 4 groups were as follows:
Control group 1. ...COCOCO...
Control group 2. ...10sCO 10sCO 10sCO...
Control group 3. ...30sCO 30sCO 30sCO...
Experimental group ...30sCOCOCO 30sCOCOCO 30sCOCOCO...
where C = choice, O = outcome, and 10s and 30s represent intertrial intervals. The delay timer did not count down during the intertrial intervals. The experimental group is the only one with patterned trials -- triples of rapid trials separated by 30s intervals. The patterning, it was hypothesized, would group trials into threes and emphasize the consequences of groups of trials instead of particular trials. Relative to the experimental group, Control group 1 had the same local rate of trials, Control group 2 had the same overall rate of trials, and Control group 3 had the same intertrial interval. Because total delay time was held constant, number of trials (hence, number of points) depended on percent A-choices -- the greater the percent A-choices the shorter the average delay, the greater the number of trials. The lowest average number of trials for any group was 42, for Control group 1; the greatest average number of trials was 58, for the experimental group. Figure 5 shows the percent A-choices within the first 40 trials for all groups. The experimental group chose A significantly more times than any of the control groups while Control group 1 chose A significantly fewer times.
--------------------
Insert Figure 5 here
--------------------
Analysis of choices within the grouped triples of the experimental group indicates that the probability of a B choice (a "defection") in the first of the three grouped trials was .51; given that an A-choice was made on the first, the probability of a defection on the second was .46; given that the first two were A-choices, the probability of a defection on the third was .20. Thus, the same descending pattern of defections within a pattern of choices appears with humans as it did with pigeons even though with this procedure (as opposed to the pigeon experiments) patterns of choices were interrupted by outcomes after each choice.
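The conditional defection probabilities reported for the grouped triples can be chained to estimate how often a whole triple consisted of A-choices. The following sketch is arithmetic on the three reported figures only. The function name is hypothetical, and the resulting products are derived here, not reported in the experiment.

```python
def prob_all_cooperate(conditional_defect_probs):
    """Multiply out the chance of no defection across successive trials.

    Each entry is the probability of a B-choice (defection) on a trial,
    given that every earlier trial in the group was an A-choice.
    """
    p = 1.0
    for q in conditional_defect_probs:
        if not 0.0 <= q <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
        p *= 1.0 - q
    return p


# Reported values: .51 on the first trial, .46 on the second given an
# A-choice on the first, .20 on the third given A-choices on the first two.
REPORTED = [0.51, 0.46, 0.20]

p_first_two_a = prob_all_cooperate(REPORTED[:2])  # about 0.26
p_all_three_a = prob_all_cooperate(REPORTED)      # about 0.21
```

On these figures roughly a quarter of triples began with two A-choices, and about four fifths of those were completed with a third.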
It may be argued that although the problem posed in this experiment is formally analogous to everyday self control problems, it is fundamentally different from a self control problem because it is a cognitive problem whereas self control in everyday life is fundamentally a motivational problem. We often seem to "know what is good for us" but nonetheless do otherwise. The distinction typically made between cognition and motivation is a distinction between higher and lower internal mechanisms. To grant that distinction (as it is usually understood) would be to abandon the behavioral analysis with which we began. Nevertheless, it is true that if the subjects of this experiment were explicitly taught the rules and rigorously tested for their understanding, none would have ever chosen B (none would have ever defected). The delays in the experiment were for points exchangeable for money. Eventual consumption could not occur in any case until after the experiment was over. To defect and choose fewer total points would be tantamount to failure to understand the rules. However, we know from the experiments with various degrees of "hints" (Herrnstein et al., in press) that subjects who merely repeated the rules verbally would not necessarily choose A exclusively (as they should if they truly "knew" the rules).
What it means to "truly know" a rule is a deep philosophical problem that we will not solve here. But true knowledge cannot just be repetition of the rule verbally -- like an actor on a stage. To know a rule must mean at least to behave consistently with the rule (and perhaps, in addition, to verbalize the rule). But people who behave consistently with a general rule are, to that extent, controlling themselves. Verbal agreement or disagreement would seem to be irrelevant in the case of self-control -- however relevant it may be in the case of knowledge. Although the distinction between knowledge and self control may well be worth preserving, it is not a behavioral distinction and it will not be preserved here. Instead, note the formal correspondence between the present intra-personal prisoner's dilemma game and everyday problems of self control, and note the common effect of patterning on self control by human subjects in the present experiment and on self control by pigeon subjects choosing between smaller sooner (SS) and larger later (LL) rewards. The operations that foster self control in the pigeon experiment are the same ones that foster better knowledge in the human experiment.
8. Discussion
According to Thorndike's (1911) "law of exercise," a habit once begun tends to be repeated independent of extrinsic reinforcement. Thorndike "repealed" the law of exercise on the basis of an experiment in which human subjects trying to draw a 2-inch line did not become more accurate over trials without feedback. But, while the subjects' accuracy did not increase, their precision did increase (the variance of the lines decreased) -- that is, subjects tended to do the same thing they had done before (accurate or not). In everyday life, it seems, habits do have a life of their own. The behavior of people labeled "obsessive-compulsive" is just a more vivid version of everyone's behavior. We bite our nails, step on cracks or avoid cracks, prefer our meals at fixed periods, enjoy certain dishes at certain meals and others at others; we like our established routines in general and the older we get, the more we like them. From the present viewpoint the difference between "good" and "bad" habits has to do with the balance between an act and a pattern of acts. A good habit (such as a healthy diet) is one with a (perceived) long-term beneficial effect for the individual or the community; if, in addition, alternatives to its component parts are immediately reinforced, the habit's maintenance is said to require self control.
A bad habit on the other hand is one with a (perceived) long-term harmful effect for the individual. Although the individual components of bad habits may be extrinsically reinforced, it is important to recognize that even if extrinsic reinforcement were removed or minimized, such habits could be maintained because of the cost of their interruption. Highly abstract behavioral patterns, not supported by clear discriminative stimuli, may be costly to interrupt. Particular rewarded acts (such as taking one's umbrella to work, and avoiding getting wet) can be turned on and off in response to clearly defined signals (such as the weather report or the view outside the window). But where the pattern is not easily discriminable (umbrella plus dry clothes versus no umbrella and wet clothes) but more vague (like social approval consequent on being well dressed as is the case among some British businessmen), a habit (taking an umbrella to work every day) might be difficult to break -- even when the extrinsic reward is removed.
From the present viewpoint the distinction between intrinsic and extrinsic rewards is arbitrary rather than substantial. A rat's lever press may be said to be rewarded extrinsically when the rat eats a food pellet. But a more precise picture of the process may be obtained by considering the pattern, pressing-plus-eating, as intrinsically more valuable than not-pressing-plus-not-eating. (As the autoshaping literature makes clear, pressing-plus-eating is not necessarily less valuable than eating alone.) Thus, eating may be an extrinsic reward when considered separately from the lever press on which it is contingent, or an intrinsic reward when considered together with the lever press as part of a pattern. Economic theory has been used to quantitatively evaluate such patterns and their combinations (Rachlin et al., 1981). Contingencies of extrinsic reinforcement may be translated to economic terms by considering them as constraints on available patterns.
A theory of self control (as opposed to an ethical system) is not obligated to say why one perceivable pattern is better than another. All that is being argued here is that a pattern is perceived. Say student A perceives a pattern in student B's behavior as better than her own. She sees the rewards that student B gets or she just sees student B as happier. A problem of self control arises when student A actually begins to emulate B and finds that she prefers the particular acts she has always been doing to those she now begins (or does not begin) to do.
For the most part, however, the present article argues, we keep ourselves behaving well through expansion of our behavioral units to more and more abstract patterns, thus exposing ourselves to the intrinsic and extrinsic rewards contingent on those patterns. The rewards of sobriety for instance are slow, mild and abstract compared to the rewards contingent on a single glass of whiskey. If an alcoholic perceives his own behavior in terms of particular acts of acceptance or refusal of drinks, the rewards for acceptance will always outweigh those for refusal. However, an alcoholic who perceives his own behavior in terms of a whole day's activities (generated perhaps by planning in advance or keeping records) is more likely to choose a day of sobriety over a day of drunkenness. Commitment in such a case is not so much a matter of internal constraint or punishment as it is of external organization. Similarly, the student comes to study by organizing her activities in terms of weeks and months rather than days. Development of self control comes through development of wider patterns of overt behavior (mediated of course by development of whatever internal neurological mechanisms are required to support those patterns).10 It would seem that the (behavioral) study of how such patterns develop would be of at least as much interest to psychologists as the (cognitive/physiological) study of the mediating mechanisms.
Note
The research described in this article was supported by a grant from the National Institute of Mental Health. The author thanks Eric Siegel and Elizabeth Kudadjie-Gyamfi, graduate students at Stony Brook, for permission to describe their unpublished experiments. Reprints may be obtained from Howard Rachlin, Department of Psychology, SUNY at Stony Brook, Stony Brook, NY 11794-2500.
Footnotes
1. The present article thus disagrees with Frank's (1992)
mostly implicit arguments for the function of emotions as
internal controlling devices. However, Frank's theory about
the function of emotions in social situations ("irrational"
anger as a threat, for instance) is not disputed here.
2. Premack's (1971) theory of reward as the contingency of a
higher-valued activity on a lower-valued activity and
punishment as the contingency of a lower-valued activity on
a higher-valued activity highlights the complete circularity
of the concept of self-punishment. For self-punishment to
occur an animal must choose a lower-valued over a higher-
valued activity even though the higher-valued activity is
available. But value itself is determined in Premack's
theory by consistent choice. If self-punishment were
possible, there would be no way in Premack's theory to
distinguish it from simple reward.
3. There may seem to be a confusion of terminology here between
apparently environmental events (stimuli and reinforcers)
and behavioral events (responses). However, in all cases,
Premack's (1971) identification of environmental events with
actions of an organism is intended here. (The apple is not
the reinforcer; eating the apple is the reinforcer.) Even
such apparently perceptual events as hearing a sound or
seeing a light may be conceived as patterns of overt
discriminative responding. [See Rachlin (1985) for a
discussion of pain as behavior.]
4. This terminology is a transposition of Guthrie's (1935)
distinction between (particular) movements and (more
abstractly conceived) acts. The present distinction however
emphasizes, more than Guthrie did, the relativity of the
distinction and its temporal character.
5. Kantor (1963) argued that the characterization and
development of these perceptions, rather than their
attribution to one or another internal mechanism, is the
true "folk psychology".
6. Just as the squirrel is not controlling itself when it saves
nuts, the child, spanked for crossing the street against the
light, is not necessarily controlling herself when she obeys
the light later.
7. Robert Eisenberger and associates studying choices between
small low-effort and large high-effort rewards have found
analogous results with children (Eisenberger, Mitchell &
Masterson, 1985) and rats (Eisenberger, Carlson & Frank,
1979).
8. It is not clear why an interruption at the end of the ratio
should have produced its effect at the beginning of the
ratio. Perhaps with the one-second blackout the beginning of
the ratio was occasionally confused with the crf condition.
9. This game is a within-subject version of a between-subjects
prisoners'-dilemma type game. In the between-subjects
version, instead of one subject making many choices, each of
a group of subjects (say, 10 subjects) makes one choice.
The rules are the same in both versions except that, in the
between-subjects version, N is equal to the number of
subjects choosing B (rather than to the number of times a
single subject had chosen B in the past), rules are
explicitly described to all subjects, and no subject is
informed of the choices of any other subject.
10. However, control of compulsive gambling, unlike other areas
of self control, may be impervious to expansion of the
behavior unit. Due to the stochastic nature of gambling
games, organization of gambles into larger and larger units
(sequences of roulette bets, for instance, rather than
individual bets) retains the negative objective value of the
gamble and whatever positive subjective value the gamble may
have had in its particular form (Rachlin, 1990).
References
Ainslie, G. (1974). Impulse control in pigeons. Journal of the Experimental Analysis of Behavior, 21, 485-489.
Ainslie, G. (1992). Picoeconomics. New York: Cambridge University Press.
Ainslie, G. & Herrnstein, R. J. (1981). Preference reversal and delayed reinforcement. Animal Learning and Behavior, 9, 476-482.
Brownell, K. D., Greenwood, M. R. C., Stellar, E. & Shrager, E. E. (1986). The effects of repeated cycles of weight loss and regain in rats. Physiology and Behavior, 38, 459-464.
Castro, L. & Rachlin, H. (1980). Self-reward, self-monitoring, and self-punishment as feedback in weight control. Behavior Therapy, 11, 38-48.
Catania, A. C. (1975). The myth of self-reinforcement. Behaviorism, 3, 192-199.
Dennett, D. (1978). Brainstorms: Philosophical essays on mind and psychology. Montgomery, VT: Bradford Books.
Dewey, J. (1896). The reflex arc concept in psychology. Psychological Review, 3, 357-370.
Eisenberger, R., Carlson, J. & Frank, M. (1979). Transfer of effort across behavior. Quarterly Journal of Experimental Psychology, 31, 679-700.
Eisenberger, R., Mitchell, M. & Masterson, F. A. (1985). Effort training increases generalized self control. Journal of Personality and Social Psychology, 49, 1294-1301.
Ferster, C. B. & Skinner, B. F. (1957). Schedules of reinforcement. Englewood Cliffs, NJ: Prentice-Hall.
Frank, R. H. (1992). The role of moral sentiments in the theory of intertemporal choice (pp. 265-286). In G. F. Loewenstein and J. Elster (Eds.), Choice over time. New York: Russell Sage Foundation.
Green, L., Fisher, E. B., Jr., Perlow, S. & Sherman, L. (1981). Preference reversal and self control: Choice as a function of reward amount and delay. Behaviour Analysis Letters, 1, 43-51.
Guthrie, E. R. (1935). The psychology of learning. New York: Harper.
Herrnstein, R. J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267-272.
Herrnstein, R. J. (1969). Method and theory in the study of avoidance. Psychological Review, 76, 49-70.
Herrnstein, R. J., Loewenstein, G. F., Prelec, D. & Vaughan, W. Jr. (in press). Behavioral Decision Making.
Herrnstein, R. J. & Prelec, D. (1992). Melioration. In G. F. Loewenstein and J. Elster (Eds.), Choice over time. New York: Russell Sage Foundation.
Hinde, R. A. (1970). Animal behaviour: A synthesis of ethology and comparative psychology. New York: McGraw-Hill.
Hull, C. L. (1943). Principles of behavior. New York: Appleton-Century.
Kantor, J. R. (1963). The scientific evolution of psychology, Vol. 2. Chicago: Principia Press.
Lacey, H. M. & Rachlin, H. (1978). Behavior, cognition, and theories of choice. Behaviorism, 6, 177-202.
Land, E. H. (1964). The retinex. American Scientist, 52, 247-264.
Lewin, K. (1938). The conceptual representation and the measurement of psychological forces. Durham, NC: Duke University Press.
Logue, A. W. (1988). Research on self control: An integrating framework. The Behavioral and Brain Sciences, 11, 665-679.
Mahoney, M. J. (1974). Self-reward and self-monitoring techniques for weight control. Behavior Therapy, 5, 48-57.
Mazur, J. E. (1987). An adjusting procedure for studying delayed reinforcement. In M. L. Commons, J. E. Mazur, J. A. Nevin, and H. Rachlin (Eds.), Quantitative analyses of behavior: Vol. 5. The effect of delay and of intervening events on reinforcement value (pp. 55-73). Hillsdale, NJ: Erlbaum.
Mazur, J. E. & Logue, A. W. (1978). Choice in a "self control" paradigm: Effects of a fading procedure. Journal of the Experimental Analysis of Behavior, 30, 11-17.
Millar, A. & Navarick, D. J. (1984). Self control and choice in humans: Effects of video game playing as a positive reinforcer. Learning and Motivation, 15, 203-218.
Mischel, W., Shoda, Y. & Rodriguez, M. (1989). Delay of gratification in children. Science, 244, 933-938.
Mowrer, O. H. (1960). Learning theory and behavior. New York: John Wiley.
Navarick, D. J. & Fantino, E. (1976). Self control and general models of choice. Journal of Experimental Psychology: Animal Behavior Processes, 2, 75-87.
Nevin, J. A., Mandell, C. & Atak, J. R. (1983). The analysis of behavioral momentum. Journal of the Experimental Analysis of Behavior, 39, 49-60.
Prelec, D. (1989). Decreasing impatience: Definition and consequences. Manuscript. New York: Russell Sage Foundation.
Premack, D. (1971). Catching up with common sense or two sides of a generalization: Reinforcement and punishment. In R. Glaser (Ed.), The nature of reinforcement. New York: Academic Press.
Rachlin, H. (1974). Self control. Behaviorism, 2, 94-107.
Rachlin, H. (1976). Behavior and learning. San Francisco: W. H. Freeman.
Rachlin, H. (1985). Pain and behavior. Behavioral and Brain Sciences, 8, 43-83.
Rachlin, H. (1990). Why do people gamble and keep gambling despite heavy losses? Psychological Science, 1, 294-297.
Rachlin, H. (1992). Teleological behaviorism. American Psychologist, 47, 1371-1382.
Rachlin, H. (1994). Behavior and mind: The roots of modern psychology. New York: Oxford University Press.
Rachlin, H. (in press a). The context of pigeon and human choice. Behavior and Philosophy.
Rachlin, H. (in press b). From overt behavior to hypothetical behavior to memory: Inference in the wrong direction. Comments on an article by Peter Killeen. Behavioral and Brain Sciences.
Rachlin, H., Battalio, R., Kagel, J., & Green, L. (1981). Maximization theory in behavioral psychology. Behavioral and Brain Sciences, 4, 371-388.
Rachlin, H. & Green, L. (1972). Commitment, choice and self-control. Journal of the Experimental Analysis of Behavior, 17, 15-22.
Rachlin, H. & Raineri, A. (1992). Irrationality, impulsiveness and selfishness as discount reversal effects (pp. 93-118). In G. F. Loewenstein and J. Elster (Eds.), Choice over time. New York: Russell Sage Foundation.
Rachlin, H., Raineri, A. & Cross, D. (1991). Subjective probability and delay. Journal of the Experimental Analysis of Behavior, 55, 233-244.
Rescorla, R. A. & Solomon, R. L. (1967). Two-process learning theory: Relationships between Pavlovian conditioning and instrumental learning. Psychological Review, 74, 151-182.
Rozin, P. & Kalat, J. W. (1971). Specific hungers and poison avoidance as adaptive specializations of learning. Psychological Review, 78, 459-486.
Ryle, G. (1949). The concept of mind. London: Hutchinson House.
Schelling, T. C. (1992). Self-command: A new discipline (pp. 167-176). In G. F. Loewenstein and J. Elster (Eds.), Choice over time. New York: Russell Sage Foundation.
Shefrin, H. M. & Thaler, R. H. (1992). Mental accounting, saving, and self control (pp. 287-330). In G. F. Loewenstein and J. Elster (Eds.), Choice over time. New York: Russell Sage Foundation.
Skinner, B. F. (1953). Science and human behavior. New York: Macmillan.
Solnick, J. W., Kannenberg, C., Eckerman, D. A. & Waller, M. B. (1980). An experimental analysis of impulsivity and impulse control in humans. Learning and Motivation, 11, 61-77.
Solomon, R. L. & Wynne, L. C. (1954). Traumatic avoidance learning: The principles of anxiety conservation and partial irreversibility. Psychological Review, 61, 353-385.
Staddon, J. E. R. & Simmelhag, V. L. (1971). The "superstition" experiment: A reexamination of its implications for the principles of adaptive behavior. Psychological Review, 78, 3-43.
Strotz, R. H. (1956). Myopia and inconsistency in dynamic utility maximization. Review of Economic Studies, 23, 165-180.
Thorndike, E. L. (1911). Animal intelligence. New York: Macmillan.
Wittgenstein, L. (1958). Philosophical investigations. Translated by G. E. M. Anscombe. New York: Macmillan.
Zuriff, G. E. (1979). Ten inner causes. Behaviorism, 7, 1-8.
Figure Legends
1. Illustration of a choice between a smaller-sooner (SS)
reinforcer, available at time t2, and a larger-later (LL)
reinforcer available at t3. The thin lines subtended from
points SS and LL are temporal discount functions indicating
how the value of the reinforcers diminishes with delay. The
crossing of the discount functions indicates a reversal of
value. At time t1, for instance, vLL > vSS. However, at t2,
when SS would be immediately available, vSS > vLL.
2. Alternatives in an experiment by Rachlin and Green (1972) on
commitment by pigeons. Pigeons (Choice-X at time t2)
choosing between a small-immediate reinforcer, SS (2s grain
availability), and a large-delayed reinforcer, LL (4s delay
followed by 4s grain availability) strongly preferred the
small-immediate reinforcer. However, at a prior point in
time (Choice-Y at t1) pigeons preferred an alternative (the
lower arm) that restricted their choice to LL only.
3. Symbolic representation of a self control problem. The
vertical line (V) represents a section of a value scale. The
wavy line represents a pattern of behavior of high value.
The circled X represents an element of the pattern, an act,
of low value by itself. The circled Y represents an
alternative act (a temptation) of higher value than X but of
lower value than the pattern as a whole.
4. Choices of pigeons in an experiment by Eric Siegel. Percent
choice of the SS alternative (defections, or lack of self-
control) versus the LL alternative (self control) as a
function of responses with the FR 30-responses and signalled
FR 30-responses schedules of reinforcement and as a function
of time with the FI 30-second schedule for a typical subject
(# 32). The square at the upper right indicates percent SS
choices in the control condition (crf) where a single
response (with an intertrial interval of 30 seconds during
which no responses were available) was used to choose between
SS and LL alternatives. Data are averages of the last 5
sessions at each condition.
5. Percent choices of alternative A by human subjects in an
experiment by Elizabeth Kudadjie-Gyamfi. Choices by control groups
1, 2, and 3, differing in intertrial interval and an
experimental group in which trials were patterned are shown
on the figure. The outcome of Choice-A was always more
delayed than the outcome of Choice-B (the two outcomes being
equal in amount) but the delay of both outcomes increased
proportionally to the number of B-choices in the previous 10
trials. As intertrial interval increased subjects tended to
choose A (the more "abstract" reinforcer) more. However,
patterning trials into threes (experimental group) yielded
most choices of A. The vertical bars are standard
deviations.