Below is the unedited penultimate draft of:
Nevin, John A. and Grace, Randolph C. (1999) Behavioral Momentum and the Law of Effect. Behavioral and Brain Sciences 23 (1): XXX-XXX. This is the unedited penultimate draft of a BBS target article that has been accepted for publication (Copyright 1999: Cambridge University Press U.K./U.S. -- publication date provisional).
In the metaphor of behavioral momentum, the rate of a free operant in the presence of a discriminative stimulus is analogous to the velocity of a moving body, and resistance to change measures an aspect of behavior that is analogous to its inertial mass. An extension of the metaphor suggests that preference measures an analog to the gravitational mass of that body. The independent functions relating resistance to change and preference to the conditions of reinforcement may be construed as convergent measures of a single construct, analogous to physical mass, that represents the effects of a history of exposure to the signaled conditions of reinforcement and that unifies the traditionally separate notions of the strength of learning and the value of incentives.
Research guided by the momentum metaphor encompasses the effects of reinforcement on response rate, resistance to change, and preference, and has implications for clinical interventions, drug addiction, and self-control. In addition, its principles can be seen as a modern, quantitative version of Thorndike's (1911) Law of Effect, providing a new perspective on some of the challenges to his postulation of strengthening by reinforcement.
addiction, choice, extinction, generalization, law of effect, learning, momentum, movement, operant, Pavlov, preference, reinforcement, self-control, Thorndike
The stimulus presented by the experimenter, the response of the organism, and the reinforcer that follows the response are fundamental elements in the science of behavior. Skinner (1969) suggested that these three terms define the discriminated operant as a unit for analysis. This target article will argue that there are two separable aspects of discriminated operant behavior that has been trained to asymptote: Its rate of occurrence, which depends primarily on the contingencies between the response and the reinforcer; and its resistance to change, which depends primarily on the contingencies between the stimulus and the reinforcer.
The distinction between response rate and resistance to change is captured by the metaphor of behavioral momentum, in which the rate of a simple, repeatable response in the presence of a distinctive stimulus is analogous to the velocity of a physical body in motion. Following Newton's second law, when responding is disrupted in some way that is analogous to imposing an external force on a moving body, resistance to change of response rate is related to an aspect of behavior that is analogous to inertial mass in classical mechanics.
To pursue the metaphor, Newton's law of gravitation suggests that an analog to gravitational mass may be derived from the attractiveness or value of access to reinforced responding as measured by preference. Although they refer to different aspects of behavior -- namely, resistance to change in the presence of a stimulus, and responding that gains access to that stimulus -- we will argue that resistance to change and preference covary, and that they provide independent, convergent measurement of a single construct analogous to the mass of a physical body. In terms of the traditional distinction between learning and performance, velocity (identified with response rate) characterizes ongoing performance, whereas mass (derived from resistance to change and preference) reflects the learning that results from a history of reinforcement in the presence of a distinctive stimulus situation.
We begin by contrasting response rate and resistance to change as measures of the traditional construct of response strength, and describe some research on resistance to change that distinguishes response rate and resistance. After a review of related work on preference, the convergence of resistance and preference is treated via the momentum metaphor. Supporting research involving concurrent measurement of resistance and preference is described, and the discrepancy between resistance and preference resulting from the partial reinforcement extinction effect is resolved by a model of resistance to change that incorporates generalization decrement. After considering some extensions to clinical interventions, drug effects, and self-control, we argue that the findings of research on behavioral momentum constitute a modern, quantitative version of Thorndike's (1911) Law of Effect, and review some challenges to the Law from the perspective of behavioral momentum.
The idea that behavior varies along a dimension of strength appears in Sherrington's (1906) studies of reflexive behavior, where strength was measured by the latency and amplitude of response to an eliciting stimulus. It appears also in Pavlov's (1927) studies of conditional reflexes, where strength was measured by resistance to extinction or to external inhibition as well as by the latency or amplitude of response to a conditional stimulus. Hull's (1943) theorizing relied heavily on the construct of habit strength, which was established by reinforcement and expressed in performance measures such as latency, amplitude, probability, and resistance to extinction of a learned response. However, these measures did not always covary, casting doubt on the utility of the construct. Moreover, as suggested by Logan (1956), responses that varied along dimensions such as latency and amplitude could be construed as different responses rather than as instances of a single response that varied in strength.
Following Skinner's (1938) relatively atheoretical approach, the experimental analysis of behavior has either identified response strength with the rate of a free operant (e.g., Vaughan & Miller, 1984) or eschewed the notion altogether. In this section, we consider resistance to change as an alternative to response rate as a measure of strength. In addition, we suggest that resistance to change is related to learning whereas response rate characterizes performance.
It has long been recognized that response rate depends on the contingencies of reinforcement as well as the rate or magnitude of the reinforcer. For example, ratio schedules routinely maintain higher response rates than interval schedules with comparable obtained rates of reinforcement. However, it is not clear that ratio-schedule performance should be deemed stronger than interval-schedule performance because the contingencies that shape and maintain them differ. Following Logan's (1956) argument for discrete responses, ratio and interval performances could be construed as belonging to different classes, rather than as instances of a single class that varies in strength.
Morse (1966) distinguished the shaping effects of reinforcement contingencies on response rate -- for example, the difference between ratio and interval schedules in the reinforcement of interresponse times -- from the strengthening effects of reinforcement on average response rate. Presumably, if shaping contingencies were kept constant across conditions that varied in the rate or amount of reinforcement, average steady-state response rate would give a direct measure of the strengthening effect of reinforcement.
In his review of the literature, however, Morse noted that the steady-state rate of a single response maintained by a single schedule of reinforcement was not always an orderly function of reinforcer amount when reinforcement contingencies were constant. For example, Keesey and Kling (1961) found that response rates maintained by variable-interval (VI) schedules were essentially constant when reinforcer amount was varied. However, Keesey and Kling also reported that response rate was positively related to reinforcer amount when each of three different stimuli signaled a different amount and alternated within each session (see also Shettleworth & Nevin, 1965). In effect, Keesey and Kling's method established three discriminated operants defined jointly by the antecedent stimuli, the responses in their presence, and the consequences of responding signaled by the stimuli (Skinner, 1969). Following Skinner, we take the discriminated operant to be a fundamental unit in the science of behavior. The relations between the strength of discriminated operant behavior and the signaled conditions of reinforcement will be explored in Sections 3, 4, and 7.
Extensions of the steady-state operant paradigm to choice between two continuously available operants inspired a new approach to the measurement of response strength and reinforcement value. In a much-cited study, Herrnstein (1961) arranged concurrent VI VI schedules for pigeons' responses to two keys, effectively arranging two simultaneous discriminated operants defined by key location, and found that the relative frequency of responses to one alternative roughly equalled (matched) the relative frequency of food obtained from that alternative in each condition. Herrnstein's finding can be stated as:
B1/(B1 + B2) = r1/(r1 + r2), (1)
where B1 and B2 designate the response rates to each alternative, and r1 and r2 designate the reinforcer rates obtained from each alternative. This matching result proved to have remarkable generality (see de Villiers, 1977, and Williams, 1988, for review).
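As a minimal sketch of what Equation 1 predicts, the following Python fragment computes the relative response rate expected under strict matching. The function name and the reinforcer rates (40 and 20 per hour) are hypothetical values chosen for illustration; they are not data from Herrnstein (1961).

```python
def matching_prediction(r1, r2):
    """Relative response rate B1/(B1 + B2) predicted by Equation 1."""
    total = r1 + r2
    if total <= 0:
        raise ValueError("at least one alternative must deliver reinforcers")
    return r1 / total


# Hypothetical concurrent VI VI condition: 40 vs. 20 reinforcers per hour.
predicted_share = matching_prediction(40.0, 20.0)
print(f"Predicted share of responses on alternative 1: {predicted_share:.3f}")
```

Under these assumed rates, two thirds of the reinforcers come from alternative 1, so strict matching predicts that about two thirds of the responses go to that alternative.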
Herrnstein (1970) extended the matching law to the rate of a single response by assuming that all an organism's behavior, including unmeasured behavior B0, summed to a constant k, where B0 and k are expressed in units of the measured response. From the matching law, Equation 1,
B/(B + B0) = r/(r + r0) (2)
where B represents response rate, r represents the obtained rate of experimentally arranged reinforcers, and r0 represents the rate of extraneous, unspecified reinforcers for other activities that occur in the experimental setting expressed in units of the measured reinforcer. Because B + B0 = k, the sum of all possible behavior,
B = kr/(r + r0) (3)
Although it fails at very high reinforcer rates (Baum, 1993) and in long experimental sessions (McSweeney, 1992), Equation 3 provides an excellent description of the relation between response rate and reinforcer rate on interval schedules under most conditions. Moreover, it applies to discrete-trial as well as free-operant performance, to reinforcer magnitude as well as rate, and to negative as well as positive reinforcement (de Villiers, 1977). In view of its generality, Equation 3 has come to be known as the Relative Law of Effect, and is widely accepted as a modern version of Thorndike's Law. However, Equation 3 is limited in two ways: It does not address the effects of antecedent stimuli, and although it describes asymptotic performance, it does not address other effects of a history of reinforcement.
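The shape of Equation 3 can be illustrated with a short Python sketch. The parameter values below (k = 100 responses per minute, r0 = 20 reinforcers per hour) are assumptions made only for this example and are not estimates from any of the studies cited.

```python
def herrnstein_rate(r, k, r0):
    """Equation 3: B = k * r / (r + r0)."""
    if r < 0 or r0 < 0 or (r + r0) == 0:
        raise ValueError("rates must be non-negative and not both zero")
    return k * r / (r + r0)


K = 100.0   # hypothetical asymptotic response rate (responses per minute)
R0 = 20.0   # hypothetical extraneous reinforcement (reinforcers per hour)

for r in (20.0, 60.0, 180.0):
    # Response rate grows with r but with diminishing returns, approaching K.
    print(f"r = {r:5.0f}/hr -> B = {herrnstein_rate(r, K, R0):.1f}")
```

With these assumed parameters, tripling the reinforcer rate from 20 to 60 per hour raises the predicted response rate from 50 to 75, whereas tripling it again to 180 per hour raises it only to 90, which shows the negatively accelerated form of the function.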
Learning has been defined as "a relatively permanent change in behavior potentiality which occurs as a result of reinforced practice" (Kimble, 1961, p. 6; emphases in original). Kimble's reference to potentiality suggests that although changes in behavior as a result of reinforced practice may be directly observable in current performance, reinforcement may also have effects that can only be detected by a separate test. The possibility of differences between the effects of reinforcement on behavior as evaluated during training and as evaluated by a later test of "potentiality," such as resistance to change, accords well with intuition, and is embodied in the long-standing distinction between "performance" and "learning" in the literature of learning theory.
The construct of response strength is similar to behavior potentiality. It is presumed to increase with reinforcement, and the connotations of "reinforcement" in our everyday language may help to identify a useful way to characterize strength. For example, concrete is said to be reinforced with steel rods to make it stronger as a building material. In this expression, "reinforcement" implies an increase in durability or resistance: Under an added load, a reinforced concrete wall does not collapse as readily as an unreinforced wall. However, an observer could not determine, by looking at it before a load test, whether the wall had been reinforced or how many steel rods had been used; the load that makes the wall collapse must be known. By analogy, we suggest that more frequently or generously reinforced behavior becomes more resistant to challenge or disruption, and this increase in its resistance need not imply an observable increase in the rate or probability of currently observed behavior. Instead, the strengthening effects of reinforcement may be evident only when responding is disrupted in some way.
A theoretical article by K. Smith (1974) proposed that reinforcement value could be measured by training the reinforced response to asymptote and then determining the intensity of "some standard attenuator required to just abort the behavior" (p. 141). He noted a consequence of this approach: "'the most potent reinforcer' is the one able to engender behavior most highly resistant to attenuation. To reinforce -- to 'strengthen' -- is thus to make refractory to attenuation" (p. 141).
Nevin (1974) independently suggested that response strength be equated with resistance to change, and explored this notion by arranging different conditions of food reinforcement in the presence of two successively alternating stimuli (the components of a multiple schedule) with pigeons as subjects. He found that resistance to disruption by an alternative source of food and resistance to extinction in a given component were both positively related to the rate or amount of food in that component during baseline training.
In Sections 3 and 4 of this article, we consider resistance to change as a measure of the strength of a discriminated operant. In Sections 5 and 6, we consider preference for access to that discriminated operant as an independent measure of the value of the conditions of reinforcement maintaining it. Metaphorical relations between resistance and preference are treated in Section 7, and empirical research linking resistance and preference is described in Section 8.
The majority of empirical research on resistance to change has employed multiple schedules of reinforcement, which define two or more discriminated operants. We begin by describing the paradigm and then review research, most of which has used pigeons as subjects, that illustrates the study of resistance to change and its determiners.
Multiple schedules are arranged by correlating two (or more) successive stimuli with independent schedules or contingencies of reinforcement, where each stimulus-contingency combination defines a schedule component. The paradigm is illustrated in Figure 1, which shows the successive presentation of two stimuli, S1 and S2, separated by a brief time out. In this illustration, a single response is intermittently reinforced in the presence of each stimulus according to independent variable-interval (VI) schedules, with the schedules chosen so that the average rate of reinforcement per unit time in the presence of S1 is greater than in S2.
Figure 1. A schematic diagram of the multiple-schedule paradigm of discriminated operant behavior. Stimuli S1 and S2 are presented successively, separated by a brief time out. A free-operant response is reinforced intermittently according to separate schedules in the presence of S1 and S2, defining two schedule components.
The paradigm has the following features: 1) The experimenter can control the duration of each component, and can arrange that they alternate, regularly or irregularly, a number of times within each experimental session. Thus, response rates in both components can be measured and related to the component schedules within sessions as well as within subjects. 2) Interactions between components, such as behavioral contrast -- an inverse relation between response rate in one component and reinforcer rate in the other component -- can be minimized by arranging timeout periods between components. 3) Most importantly for present purposes, a disruptor can be applied equally to both component performances and its effects can be compared between components, again within sessions as well as within subjects. The use of VI schedules assures that the number or rate of reinforcers obtained by the subject is roughly equal to the number or rate arranged by the experimenter even when response rate is moderately reduced.
Nevin (1974, Experiment 1) trained food-deprived pigeons on multiple VI 1-min, VI 3-min schedules yielding 60 reinforcers per hour in Component 1 and 20 per hour in Component 2. Responding was disrupted by presenting food during the timeout periods between components for 6-10 hours at rates that varied across successive determinations. Average response rates for the final hour of baseline training preceding timeout food and for the first hour of timeout food are shown in the upper panel of Figure 2. Baseline response rates were somewhat higher in Component 1 than in Component 2, and when food was presented during timeout periods, the decrease in response rate, relative to baseline, was always smaller in Component 1 than Component 2.
To summarize the data, response rates in each component during the first hour of exposure to timeout food were expressed as logarithms of proportions of the immediately preceding baseline response rates for each pigeon and averaged across pigeons. These log proportions of baseline are shown in the lower panel of Figure 2 as a function of the rate of timeout food presentations.
Figure 2. Average response rates of pigeons in two components of a multiple VI VI schedule with 60 reinforcers per hour in one component and 20 per hour in the other, showing the effects of the rate of free food presentations during timeout periods between components. The upper panel shows response rates during the last hour of baseline training and the first hour of disruption by timeout food across four conditions. In the lower panel, these data are reexpressed as log ratios of response rates with free food to response rates in the immediately preceding baseline and plotted as functions of timeout food rate. Adapted from Nevin (1974, Experiment 1).
The data are presented as logarithms of proportion of baseline for several reasons. First, proportion of baseline is a direct measure of resistance to change: The smaller the decrease, the larger the proportion. Second, logarithms are unbounded and permit examination of functional relations without distortion by floor effects. Third, the logarithmic transform renders equal proportional changes as equal differences. Suppose that presenting a given rate of timeout food reduces response rate to 50% of baseline. If this reduced level is construed as a new baseline and is further disrupted by doubling the rate of timeout food, one might reasonably expect another 50% reduction, to 25% of the original baseline. When expressed as logarithms, these successive reductions are the same, and for this example, the relation between log proportion of baseline and timeout food rate is linear (for further discussion of measurement issues, see Nevin, Mandell, & Atak, 1983; Nevin, Smith, & Roberts, 1987; and Grace & Nevin, 1997).
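The 50%-then-25% example can be checked numerically. The sketch below assumes base-10 logarithms (the text does not specify the base) and uses a hypothetical baseline of 80 responses per minute; only the proportions matter for the result.

```python
import math


def log_proportion_of_baseline(disrupted_rate, baseline_rate):
    """Log10 of the response rate under disruption relative to baseline."""
    return math.log10(disrupted_rate / baseline_rate)


baseline = 80.0                                          # hypothetical baseline rate
first = log_proportion_of_baseline(40.0, baseline)       # 50% of baseline
second = log_proportion_of_baseline(20.0, baseline)      # 25% of baseline

step1 = first - 0.0      # change produced by the first disruption
step2 = second - first   # change produced by the further disruption

print(f"log proportions: {first:.3f}, {second:.3f}")
print(f"successive steps: {step1:.3f}, {step2:.3f}")
```

Both steps equal log10(0.5), about -0.301, so successive halvings appear as equal decrements on the logarithmic scale even though the raw decreases (40 and then 20 responses per minute) differ.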
Both functions in Figure 2 are roughly linear (except for the initial portion of the function for Component 2) and can be characterized adequately by their slopes: -.10 for Component 1, which arranged 60 reinforcers per hour, and -.15 for Component 2, which arranged 20 reinforcers per hour. Strength of responding, construed as resistance to change, is inversely related to the slope of the function relating log proportion of baseline to the value of the disruptor: The shallower the slope, the greater the resistance. Thus, in this example, response strength is directly related to the rate of reinforcement in a schedule component.
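One way to obtain such slopes is an ordinary least-squares fit of log proportion of baseline against disruptor value; this is an assumed procedure for illustration, since the fitting method is not stated here. The points below are synthetic: they are placed exactly on lines with the two slopes quoted above, and the disruptor values are in arbitrary units rather than the units of Figure 2.

```python
def least_squares_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# SYNTHETIC illustration only: arbitrary disruptor units, with log
# proportions constructed from the slopes reported in the text.
disruptor = [1.0, 2.0, 3.0, 4.0]
log_prop_c1 = [-0.10 * x for x in disruptor]   # richer component
log_prop_c2 = [-0.15 * x for x in disruptor]   # leaner component

slope_c1 = least_squares_slope(disruptor, log_prop_c1)
slope_c2 = least_squares_slope(disruptor, log_prop_c2)

# The shallower (less negative) slope marks the more resistant component.
more_resistant = "Component 1" if slope_c1 > slope_c2 else "Component 2"
print(f"slopes: {slope_c1:.2f}, {slope_c2:.2f}; more resistant: {more_resistant}")
```

The fit recovers the two slopes and identifies Component 1, the richer component, as the more resistant one.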
The finding that time-out food has a smaller disruptive effect on performance in a multiple VI VI schedule component with more frequent reinforcement has been repeated several times (McLean, Campbell-Tie, & Nevin, 1996; Nevin, 1974, Experiment 5; Nevin et al., 1983). Similar results have been obtained with home-cage prefeeding (Nevin, 1992a; Nevin, Mandell, & Yarensky, 1981; Nevin, Tota, Torquato, & Shull, 1990), signaled alternative reinforcement (Nevin et al., 1981), and extinction (Nevin, 1974, Experiment 2; Nevin, 1992a; Nevin et al., 1983; Nevin et al., 1990). Without exception, the rate of responding decreased relatively less in the component with the greater rate of reinforcement (the richer component) during training.
The effects of aversive disruptors are entirely consistent with those described above. For example, Bouzas (1978) arranged intermittent electric-shock punishers at equal rates in both components of a multiple VI VI schedule and observed relatively smaller decrements in the richer component. Lyon (1963) and Blackman (1968, Experiment 2) presented signaled unavoidable shocks during both components of a multiple VI VI schedule and observed less conditioned suppression to the signal in the richer component. Importantly, Blackman arranged that the same interresponse times were reinforced in each component to ensure that response rates were similar even though reinforcer rates differed between components.
Similar results have been obtained when reinforcer amount, rather than rate, has differed between components. For example, Shettleworth and Nevin (1965) observed greater resistance to extinction in the component with the larger reinforcer (see also Harper, 1996; Harper & McLean, 1992, Experiment 1; Millenson & de Villiers, 1972; Nevin, 1974, Experiment 3). In general, the effects of differential reinforcer rate and reinforcer amount on resistance to change are at least ordinally equivalent.
There have been some failures to find differential resistance in multiple VI VI schedules when drugs were used as disruptors (e.g., Cohen, 1986; Lucki & deLong, 1983; but see Egli, Schall, Thompson, & Cleary, 1992, and Section 10.2.1). Likewise, signaled or unsignaled within-component food appears not to have differential disruptive effects (Cohen, Riley, & Weigle, 1993; Nevin, 1984; J. B. Smith, 1974), and Harper and McLean (1992) failed to find differential effects of within-component changes in reinforcer rate (but see Harper, 1996). However, the overwhelmingly most general and reliable result is that asymptotic free-operant response rates in multiple VI VI schedules are more resistant to change in the presence of a signal for relatively frequent or large reinforcers than in the presence of a signal for relatively infrequent or small reinforcers. The convergence of these results across diverse disruptors confirms the utility of resistance to change as a measure of the strength of steady-state discriminated operant behavior.
There are several ways to distinguish the effects of reinforcer rate on response rate and resistance to change. One is to arrange schedules of interresponse time reinforcement that produce similar rates of responding despite differences in reinforcer rates, as in the work of Blackman (1968) cited above. Another is to arrange identical schedules in separate components that are followed by different components signaling reinforcer rates that are either richer or leaner. For example, Nevin et al. (1987, Experiment 2) trained pigeons in a four-component procedure where identical VI 100-s schedules were arranged successively on the left and right side keys. One side-key component was always followed by a richer VI 20-s component signaled by one color on the center key, and the other side-key component was always followed by a period of nonreinforcement signaled by a different color on the center key. During baseline, response rates were higher in the side-key component that preceded nonreinforcement, an effect termed following-schedule contrast (Williams, 1981). By contrast, resistance to extinction was greater in the side-key component that had preceded the richer center-key component. These results were replicated systematically, and shown to hold for resistance to satiation and to prefeeding as well as resistance to extinction, by Tota-Faucette (1991; see also Nevin, 1984).
Nevin et al. (1987) interpreted their resistance data in relation to stimulus-reinforcer relations. Specifically, they suggested that there is a stronger stimulus-reinforcer correlation for the side-key component that reliably preceded a higher rate of reinforcement than for the side-key component that reliably preceded nonreinforcement. Alternatively, one might argue that each side-key component was embedded within a serial compound stimulus, and resistance depended on the reinforcer rate correlated with the compound (for discussion see McLean et al., 1996). The important result is that the reinforcer rate in the following component produced opposite effects on response rate and resistance to change in otherwise identical components.2
Two experiments by Nevin et al. (1990) employed a different method to separate the effects of reinforcer rate on response rate and resistance to change. Experiment 1 arranged a two-component multiple schedule where key pecking was reinforced according to the same VI schedule in both components. Throughout baseline training, additional reinforcers were provided concurrently and independently of responding by a variable-time (VT) schedule in Component 1, and response rates were consistently lower in that component. However, resistance was greater in Component 1: When performance was disrupted by prefeeding or by extinction, response rate decreased more rapidly in Component 2 and fell below that in Component 1. The lower baseline response rate in Component 1 is consistent with Herrnstein's Relative Law of Effect (Equation 3) because the additional reinforcers increase its denominator. The fact that resistance was greater in Component 1 confirms the independence of baseline response rate and resistance to change, and suggests that resistance depends on the total rate of reinforcement arranged in a component.
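The dissociation described above can be sketched numerically. The code assumes that response-independent reinforcers simply add to the denominator of Equation 3, as the text indicates, and uses hypothetical parameter values (k, r0, and the VI and VT rates are not taken from Nevin et al., 1990).

```python
def target_response_rate(r_target, r_added, k, r0):
    """Equation 3 with added reinforcers r_added entering the denominator."""
    return k * r_target / (r_target + r_added + r0)


K = 100.0    # hypothetical asymptotic response rate
R0 = 20.0    # hypothetical extraneous reinforcement (reinforcers per hour)
VI = 60.0    # hypothetical response-contingent reinforcers per hour
VT = 120.0   # hypothetical response-independent reinforcers per hour

rate_c1 = target_response_rate(VI, VT, K, R0)    # Component 1: VI plus VT food
rate_c2 = target_response_rate(VI, 0.0, K, R0)   # Component 2: VI only

total_c1 = VI + VT   # total reinforcer rate signaled by the Component 1 stimulus
total_c2 = VI        # total reinforcer rate signaled by the Component 2 stimulus

print(f"baseline rates: C1 = {rate_c1:.1f}, C2 = {rate_c2:.1f}")
print(f"total reinforcers/hr: C1 = {total_c1:.0f}, C2 = {total_c2:.0f}")
```

Under these assumptions the predicted baseline rate is lower in Component 1 (30 versus 75), while the total reinforcer rate in that component is higher (180 versus 60 per hour). This is the pattern that separates baseline response rate from the variable on which resistance appears to depend.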
Similar results were obtained when the added reinforcers were contingent on a specified alternative response in a three-component multiple schedule (Nevin et al., 1990, Experiment 2). Component A arranged 15 reinforcers per hr on the right (target) key and 45 reinforcers per hr on the left (alternative) key of a two-key chamber; Component B arranged 15 reinforcers per hr on the right key and none on the left key; and Component C arranged 60 reinforcers per hr on the right key and none on the left key. Thus, relative right-key reinforcement was 0.25 in Component A and 1.0 in Components B and C. The critical comparisons involve the right key, where responding in Component C should be more resistant than in Component B on the basis of reinforcer rate for right-key responding. The effects of alternative reinforcement are given by comparing Components A and B, and the effects of total component reinforcer rate are given by comparing Components A and C.
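The arithmetic behind these comparisons can be laid out in a few lines of Python. The reinforcer rates are those stated above; the function and variable names are illustrative only.

```python
# Arranged reinforcers per hour on the right (target) and left (alternative) keys.
components = {
    "A": {"target": 15.0, "alternative": 45.0},
    "B": {"target": 15.0, "alternative": 0.0},
    "C": {"target": 60.0, "alternative": 0.0},
}


def summarize(schedule):
    """Return (relative target reinforcement, total component reinforcer rate)."""
    total = schedule["target"] + schedule["alternative"]
    return schedule["target"] / total, total


summary = {name: summarize(schedule) for name, schedule in components.items()}
for name, (relative, total) in summary.items():
    print(f"Component {name}: relative = {relative:.2f}, total = {total:.0f}/hr")
```

Components A and B share the same target-key rate (15 per hr) but differ in total rate (60 versus 15 per hr), whereas Components A and C share the same total rate (60 per hr) but differ in relative target-key reinforcement (0.25 versus 1.0).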
Figure 3. The left panel displays average baseline response rates on the right-hand key in three multiple-schedule components. In Component A, reinforcer rate for right-key responses was 15/hr, with 45 reinforcers per hr available concurrently for left-key responses. In Component B, reinforcer rate for right-key responses was 15/hr, and in Component C it was 60/hr; no reinforcers were given for left-key responses in Components B and C. The right panel displays the slopes of functions characterizing resistance to satiation, resistance to prefeeding, and resistance to extinction in these three components. Standard errors are indicated by the error bars. Adapted from Nevin et al. (1990).
Average response rates and the slopes of functions relating log proportion of baseline to successive sessions of satiation, prefeeding, and extinction are displayed in Figure 3. Baseline response rates on the right key were highest in Component C, next highest in Component B, and lowest in Component A, in keeping with Herrnstein's formulation. However, when responding was disrupted by progressive satiation, prefeeding, or extinction, right-key responding in Component A was consistently more resistant to change than in Component B, and was similar to that in Component C. As discussed by Nevin et al., these resistance results are not readily accommodated by Herrnstein's formulation. Most importantly for present purposes, the similarity of right-key resistance in Components A and C despite large differences in baseline response rates again demonstrates the independence of these aspects of behavior. That result, and the greater right-key resistance in Component A than in Component B despite the same reinforcer rates for that response, again suggest that resistance depends on total reinforcement in the presence of a component stimulus.
These two experiments by Nevin et al. (1990) show that although baseline response rate depends on relative reinforcement for the target response according to Herrnstein's Relative Law of Effect (Equation 3), resistance to change in a given component was independent of baseline response rate, and depended directly on the total rate of food reinforcers obtained in that component, regardless of whether they were contingent on the target response, independent of that response, or contingent on an alternative response. Taken all in all, the results reviewed in this section suggest that resistance to change depends on Pavlovian, stimulus-reinforcer relations.
Key pecking by pigeons is notorious for its susceptibility to the Pavlovian relation between a key light and food. For example, pigeons will peck a key that signals food even if pecking cancels food presentations (e.g., Williams & Williams, 1969; for review, see Schwartz & Gamzu, 1977). This is an instance of biological preparedness (Seligman, 1970). Virtually all of the research cited above has used pigeons as subjects, pecking at a lighted key as the response, and food as the reinforcer. Therefore, it is important that the results of Nevin et al. (1990) have also been obtained with other stimuli, responses, and species.
Mace, Lalli, Shea, Lalli, West, Roberts, and Nevin (1990) replicated Experiment 1 of Nevin et al. (1990) with retarded adults engaged in a sorting task, where performance was disrupted by turning on a television set. The results were strikingly similar to the pigeon data. Cohen (1996) also replicated Experiment 1 with college students engaged in a typing task, where performance was disrupted by providing a puzzle book, and again the results were similar to the pigeon data. Harper (in press) replicated Experiment 1 with rats, using separate levers to define the responses in the two components, and obtained similar results. Mauro and Mace (1996) replicated Experiment 2 with rats, and obtained similar results when they used visual (but not auditory) stimuli to define the three components. All in all, the effects of stimulus-reinforcer relations on resistance to change have considerable generality across stimuli, responses, and species.
To summarize: We have argued that resistance to change measures the strength of responding in a stimulus situation. The results presented above show that resistance is positively related to the total rate of reinforcement signaled by a stimulus, and is independent of the asymptotic rate of responding in the presence of that stimulus. Asymptotic response rate, by contrast, depends on relative reinforcement of the response according to Herrnstein's Relative Law of Effect. Therefore, response rate and resistance to change are separate aspects of discriminated operant behavior: Response rate depends on response-reinforcer relations, whereas resistance to change depends on stimulus-reinforcer relations.
We now consider a quantitative model characterizing resistance to change as a function of stimulus-reinforcer relations.
There are several ways of quantifying a Pavlovian contingency between stimuli and reinforcers (Gibbon, Berryman, & Thompson, 1974). A simple, intuitively reasonable, and empirically useful way is to compute the ratio of the reinforcer rate in the presence of a stimulus to the overall average reinforcer rate in both the presence and absence of the stimulus (Gibbon, 1981). Intuitively, this ratio measures the informativeness of the stimulus with respect to reinforcement. For example, if the reinforcer rate in the presence of a particular stimulus is identical to the overall average rate of reinforcement, the ratio is 1.0 and the stimulus is not informative. If the ratio is greater than 1.0, onset of the stimulus predicts an increase in the average rate of reinforcement, and if it is less than 1.0, onset of the stimulus predicts a decrease.
Stimulus-reinforcer contingency ratios (CRs) for the two components of a standard multiple schedule may be expressed as
CR1 = rC1/rS (4a)
and CR2 = rC2/rS, (4b)
where r represents the rate of reinforcement subscripted for the components C1 and C2 and for the overall session S. Even though rC1, rC2, and the intercomponent interval may vary from one experimental condition to another, rS is the same for both components within each condition so the relative contingency ratio for the two components reduces to rC1/rC2. Therefore, if the contingency ratio is an effective specification of the stimulus-reinforcer relation, relative resistance to change should vary with the relative contingency ratio and, equally important, it should be unaffected by any variable that changes only rS.
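The cancellation of rS in the relative contingency ratio can be checked numerically. The following Python sketch is illustrative only: the function name and the two session-wide rates are hypothetical values chosen for the example, not data from any experiment.

```python
def contingency_ratio(r_component, r_session):
    """Equations 4a and 4b: component reinforcer rate divided by the overall session rate."""
    return r_component / r_session


# Component reinforcer rates in reinforcers per hr (illustrative values).
r_c1, r_c2 = 300.0, 60.0

# Two hypothetical session-wide rates, as might result from short or long
# intercomponent intervals. These are assumptions, not reported values.
for r_s in (150.0, 20.0):
    cr1 = contingency_ratio(r_c1, r_s)
    cr2 = contingency_ratio(r_c2, r_s)
    # The relative contingency ratio reduces to r_c1 / r_c2 whatever r_s is.
    print(f"rS = {r_s:6.1f}  CR1 = {cr1:6.2f}  CR2 = {cr2:6.2f}  CR1/CR2 = {cr1 / cr2:.2f}")
```

Each contingency ratio changes with the session-wide rate, but their ratio stays at rC1/rC2 in both cases.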
An experiment by Nevin (1992a) confirmed these expectations. As summarized in his Table 2, Nevin arranged multiple VI VI schedules with 60 reinforcers per hr in a constant component, and either 300 or 10 reinforcers per hr in the alternated component. In Experiment 1, the intercomponent interval was 2 s, and in Experiment 2, it was 2 min. Figure 4 shows that average relative resistance to change, calculated as the inverse ratio of the slopes of resistance functions for prefeeding and extinction, depends similarly on the relative contingency ratio for both resistance tests in both experiments. This similarity holds despite substantial differences in average baseline response rates produced by the intercomponent interval. We conclude that relative resistance to change is independent of the overall context of reinforcement as determined by the intercomponent interval.
Figure 4. The logarithms of ratios of the slopes of functions relating response rates in the components of a multiple VI VI schedule to sessions of prefeeding (PF) or extinction (Ext), as functions of the logarithm of the ratio of contingency ratios or, equivalently, the log ratio of reinforcer rates in those components. Results are shown separately for conditions with 2-second and 2-minute timeouts between components. Adapted from Nevin (1992a).
Nevin (1992b) reviewed all of the two-component multiple-schedule data collected in his laboratory since 1965 and related relative resistance to change to the relative contingency ratio, as shown in Figure 5. For experiments that varied reinforcer duration rather than reinforcer rate between components, the contingency ratio is expressed as the duration ratio. Across experiments, or different conditions within experiments, the overall rate of reinforcement (rS) varied substantially, and the resistance tests employed timeout food, prefeeding, and extinction. There is no evidence that relative resistance was systematically affected by either rS or the testing method. Although there is a good deal of variation from one experiment to another, the overall trend of the data is adequately described by a linear function in logarithmic coordinates with a slope of about 0.35, which is quite similar to the slopes of the two-point functions shown in Figure 4. (Note: The example portrayed in Figure 2 appears as a single point, numbered 9, at x = -.48, y = -.26. The prefeeding and extinction data shown in Figure 3 appear as points numbered 5 and 6. The data shown in Figure 4 appear as points numbered 7 and 8.) To a first approximation, then, relative response strength, construed as relative resistance to change and measured as the reciprocal of the ratio of the slopes of resistance functions, is a power function of the ratio of reinforcer rates or durations experienced in the two components of a multiple schedule:
$m_{r1}/m_{r2} = (r_{C1}/r_{C2})^{b}$, (5)
where mr1 and mr2 represent resistance to change and rC1 and rC2 represent reinforcer rates or amounts in Components 1 and 2, and b is a parameter reflecting the sensitivity of resistance ratios to reinforcer ratios. As we will show below, preference between two schedules may be described by a similar function.
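To make the power function concrete, the following Python sketch evaluates Equation 5 for a few reinforcer-rate pairs. It is an illustration under stated assumptions: the function names are hypothetical, the default b = 0.35 is simply the approximate slope mentioned above rather than a fitted estimate, and base-10 logarithms are assumed.

```python
import math


def relative_resistance(r_c1, r_c2, b=0.35):
    """Equation 5: resistance ratio as a power function of the reinforcer ratio."""
    return (r_c1 / r_c2) ** b


def log_relative_resistance(r_c1, r_c2, b=0.35):
    """The same relation in logarithmic coordinates, where it is linear with slope b."""
    return b * math.log10(r_c1 / r_c2)


# Illustrative reinforcer rates (per hr) for Component 1 against 60 per hr in Component 2.
for r_c1 in (300.0, 60.0, 10.0):
    ratio = relative_resistance(r_c1, 60.0)
    log_ratio = log_relative_resistance(r_c1, 60.0)
    print(f"{r_c1:5.0f} vs 60: ratio = {ratio:.2f}, log ratio = {log_ratio:+.2f}")
```

With b well below 1, a fivefold difference in reinforcer rate yields a resistance ratio of only about 1.76, and a sixfold difference in the other direction yields about 0.53; that is, resistance ratios undermatch reinforcer ratios.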
Figure 5. The logarithms of ratios of slopes of functions characterizing resistance to change in two-component multiple VI VI schedules that differ in reinforcer rate or amount are related to the logarithm of the reinforcer ratio. The data points are averages across subjects for separate experimental conditions and methods for evaluating resistance to change, coded as follows: Nevin et al. (1983) 1: timeout food; 2: Extinction. Nevin et al. (1990) 3: Experiment 1, prefeeding; 4: Experiment 1, extinction; 5: Experiment 2, prefeeding; 6: Experiment 2, extinction. Nevin (1992a) 7: Prefeeding; 8: Extinction. Points numbered 9 represent single conditions from Shettleworth and Nevin (1965), extinction; Nevin (1974) Experiment 1, timeout food; Nevin (1974) Experiment 2, extinction; Nevin (1974) Experiment 3, timeout food; and Nevin (1988), extinction. From Nevin (1992b).
Nevin (1979) pointed out that there were a number of ordinal agreements between resistance to change and preference in the literature: Variables that increased resistance also increased preference relative to a constant alternative. We now describe steady-state research on preference in a way that parallels our discussion of resistance to change.
Preference has been studied extensively in a paradigm known as concurrent-chain schedules that is closely related to the multiple-schedule paradigm for evaluation of relative response strength. The basic concurrent-chain schedule paradigm is diagrammed in Figure 6. In a standard experiment, a pigeon is confronted with a pair of illuminated response keys where pecks on one key are followed by access to one signaled food-reinforcement schedule (C1) according to a VI schedule, and pecks to the other key are followed by access to a second signaled schedule (C2) according to a separate VI schedule, where C1 and C2 are mutually exclusive and occur successively as in multiple schedules. The choice phase of the experiment, when both keys are lighted, defines the initial links of two chains, and the multiple-schedule phase, when only one or the other key is lighted and food is available, defines their terminal links. If the initial-link schedules are the same, the allocation of responding between keys during the initial-link choice phase provides a direct measure of preference for the terminal-link, multiple-schedule components. If VI schedules are used in the initial links, subjects rarely respond exclusively to one or the other, and preference is continuously related to variations in the terminal links.
Concurrent-chain schedules separate preference for a schedule from the response rate controlled by that schedule, thus avoiding a difficulty with concurrent schedules. When qualitatively different schedules are defined for two concurrent operants, preference is confounded with the response rates shaped by the different contingencies of reinforcement arranged by the two schedules. For example, variable-ratio (VR) schedules usually maintain much higher response rates than VI schedules, and the allocation of responding in concurrent VR VI may reflect the shaping effects of the different schedule contingencies as well as the values of the schedules (e.g., Herrnstein & Heyman, 1979).
Figure 6. Schematic diagram of a typical concurrent-chains procedure. In the initial links, both keys are lighted white, and responding occasionally produces entry, according to equal concurrent VI schedules, into one of two mutually exclusive terminal links signaled by red or green. Responding in the terminal links produces reinforcement, after which the initial links are reinstated. The ratio of initial-link responses is taken as a measure of preference between the terminal-link discriminated operants.
Autor (1960/1969) arranged identical initial links and varied the reinforcer rates in the terminal links using VI, VR, and VI DRO schedules (where VI DRO signifies that food was presented at variable intervals if the subject refrained from responding) in three separate experiments. He found that the relative rate of responding to one initial link approximately matched the relative rate of reinforcement provided by its terminal link, regardless of the terminal-link contingencies or response rates. Subsequent research has confirmed these conclusions: Herrnstein (1964a) replicated Autor's results with VI and VR terminal links arranged within conditions, rather than between experiments, and Neuringer (1969) showed that pigeons were indifferent between terminal links that arranged response-contingent and response-independent reinforcement after the same delay, even though the pigeons rarely responded when reinforcement was independent of responding. Neuringer (1967) varied reinforcer amount in the terminal links and found that preference was directly related to amount even though there were no effects on terminal-link response rates. Thus, preference evidently does not depend on response-reinforcer contingencies or response rates, and depends directly on relative reinforcer rate or amount. In these respects, preference in concurrent chains is functionally similar to resistance to change in multiple schedules.
A number of models of preference in concurrent chains have been proposed since Autor's initial research; here, we consider two that are relevant to the model of resistance to change summarized above.
Although both Autor (1960/1969) and Herrnstein (1964a) observed approximate matching between relative response rates in the initial links of concurrent chains and relative terminal-link reinforcer rates, this matching result proved to be fortuitous when Fantino (1969) demonstrated that measured preference depended on the lengths of the identical initial links as well as the relative rates of food reinforcement in the terminal links. Fantino obtained matching with intermediate-length initial links, but preference approached indifference as the initial links were lengthened, and approached exclusive preference for the richer terminal link as initial links were shortened. Thus, matching appeared to be just one of a continuum of possible results.
To account for these and related results, Fantino (e.g., 1977) proposed that the value of a terminal link depended on the relative reduction in delay to food signaled by entry into that terminal link. More formally, Fantino's delay-reduction theory asserts that
B1/B2 = (T - t1)/(T - t2) (6)
where T is the overall average time from onset of the initial links to the delivery of a food reinforcer, t1 and t2 are the delays to food reinforcement in the terminal links, and B1 and B2 are the numbers of choice responses to the two keys in the initial links. The formulation is intuitively plausible: Signaled delays of 30 seconds and 1 minute differ by rather little relative to an overall delay lasting an hour, but differ by a great deal relative to an overall delay of 2 minutes. Indeed, Fantino's delay-reduction theory predicts exclusive preference for the shorter signaled delay to food when the length of the overall average delay is less than the longer signaled delay.
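The intuition in this example can be worked through numerically. The Python sketch below applies Equation 6 to the signaled delays of 30 seconds and 1 minute mentioned above. The function name is hypothetical, and the handling of non-positive delay reductions (returning an infinite or zero ratio to denote exclusive preference, and rejecting the case in which neither terminal link reduces delay) is an assumption made for illustration.

```python
import math


def delay_reduction_ratio(T, t1, t2):
    """Equation 6: predicted ratio B1/B2 of initial-link choice responses.

    T is the overall average time to food from initial-link onset; t1 and t2
    are the terminal-link delays. All three must be in the same time unit.
    """
    reduction_1 = T - t1
    reduction_2 = T - t2
    if reduction_1 <= 0 and reduction_2 <= 0:
        # Assumption: the model is not applied when neither link reduces delay.
        raise ValueError("T must exceed at least one terminal-link delay")
    if reduction_2 <= 0:
        return math.inf  # exclusive preference for alternative 1
    if reduction_1 <= 0:
        return 0.0  # exclusive preference for alternative 2
    return reduction_1 / reduction_2


# Delays of 30 s and 60 s against overall delays of 1 hr, 2 min, and 50 s.
for T in (3600.0, 120.0, 50.0):
    print(f"T = {T:6.0f} s: B1/B2 = {delay_reduction_ratio(T, 30.0, 60.0):.3f}")
```

With an overall delay of an hour the predicted ratio is about 1.01, close to indifference; with an overall delay of 2 minutes it is 1.5; and when the overall delay (50 s) is shorter than the longer signaled delay, the prediction is exclusive preference for the shorter delay.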
Fantino's account of preference in relation to delay reduction has some properties in common with the contingency-ratio account of resistance to change presented above. Note that T, the overall average time to reinforcement in Equation 6, is the same as 1/rS, the reciprocal of the overall average rate of reinforcement in Equations 4a and 4b; and likewise, t1 and t2 are the same as 1/rC1 and 1/rC2. Delay reduction theory suggests that the attractiveness or value of the terminal-link schedule in C1 (for example) is an increasing function of the difference between 1/rS and 1/rC1, whereas Nevin's account of response strength suggests that resistance to change in multiple-schedule component C1 is an increasing function of the ratio of rC1 to rS. However, both accounts embody the same intuition: The strength of responding in a multiple-schedule component, and the value of access to a terminal-link schedule, both depend on a comparison of component reinforcer rate (or terminal-link delay) with the overall average reinforcer rate (or delay) for the context in which the schedule appears.
Despite this similarity, there may be an important difference. Figure 4 suggests that relative response strength in C1 and C2 is roughly invariant with respect to the length of time-out periods between components, which influence rS. If the initial-link choice periods that precede access to the terminal links in concurrent chains are functionally equivalent to the timeout periods that precede multiple-schedule components, initial-link length should also have no effect on preference. However, according to delay reduction theory and as shown by Fantino (1969), preference for the richer terminal link in concurrent chains varies inversely with the length of initial-link choice periods, which influence 1/T. If resistance and preference are similarly determined, this difference must be resolved.
Grace (1994) has recently proposed a comprehensive account of performance in concurrent-chain schedules that assumes terminal-link values to be independent of the context of initial-link lengths within which they appear. Simply put, Grace's account assumes that terminal-link value depends only on the signaled delays to reinforcement, but the behavioral expression of relative value as preference in the initial links depends on the ratio of terminal-link to initial-link duration. The model is:
$B_{i1}/B_{i2} = b\,(r_{t1}/r_{t2})^{a_1}\left[\left(\frac{1/d_{t1}}{1/d_{t2}}\right)^{a_2}(x_{t1}/x_{t2})^{a_3}\right]^{T_t/T_i}$ (7)
where Bi1 and Bi2 represent initial-link response rates, rt1 and rt2 are the rates of terminal-link entries, and dt1 and dt2 are the delays to reinforcement in the terminal links. The parameters b, a1, and a2 represent response bias, sensitivity to number of entries, and sensitivity to delay, respectively. Other variables that influence preference, such as reinforcer amount, are represented by xt1 and xt2, where a3 is sensitivity to those variables. The exponent Tt/Ti is the ratio of average terminal-link duration to average initial-link duration, which accounts for the effects of initial-link length reported by Fantino (1969).
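The role of the exponent Tt/Ti can be seen by evaluating Equation 7 with the terminal links held constant while initial-link duration is varied. The Python sketch below is a minimal illustration: the function name is hypothetical, and the default parameter values of 1.0 (no bias, unit sensitivities) and the delays used are assumptions for the example, not estimates from any data set.

```python
def contextual_choice_ratio(r_t1, r_t2, d_t1, d_t2, T_t, T_i,
                            x_t1=1.0, x_t2=1.0,
                            b=1.0, a1=1.0, a2=1.0, a3=1.0):
    """Equation 7: predicted ratio of initial-link response rates.

    The default parameter values are placeholders chosen for illustration.
    """
    entry_term = (r_t1 / r_t2) ** a1                    # terminal-link entry rates
    immediacy_term = ((1.0 / d_t1) / (1.0 / d_t2)) ** a2  # relative immediacy
    other_term = (x_t1 / x_t2) ** a3                    # e.g., reinforcer amount
    # Terminal-link value is raised to the ratio of terminal- to initial-link duration.
    return b * entry_term * (immediacy_term * other_term) ** (T_t / T_i)


# Equal entry rates; terminal-link delays of 10 s and 20 s (average T_t = 15 s).
for T_i in (5.0, 15.0, 60.0):
    ratio = contextual_choice_ratio(1.0, 1.0, 10.0, 20.0, T_t=15.0, T_i=T_i)
    print(f"T_i = {T_i:4.0f} s: predicted Bi1/Bi2 = {ratio:.2f}")
```

Under these assumptions the predicted ratio is 8.0 with 5-s initial links, 2.0 with 15-s initial links, and about 1.19 with 60-s initial links: preference becomes more extreme as initial links are shortened and approaches indifference as they are lengthened, while the underlying relative value of the terminal links (here 2.0) is unchanged.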
Unlike Fantino's delay-reduction theory, Grace's formulation has a number of free parameters; but with the assumption that terminal-link value depends only on the delays to reinforcement, it provides an excellent descriptive summary of the results of a wide variety of concurrent-chain schedule experiments. Moreover, Grace (1996) has shown that estimates of value are consistent between the standard concurrent-chains procedure and Mazur's (1987) adjusting-delay procedure for determining indifference between two signaled conditions of reinforcement. The agreement between two different choice paradigms in their estimation of terminal-link value argues strongly for the identification of initial-link preference with the construct of reinforcement value. The contextual choice model also accords with Nevin's (1992a) results presented in Figure 4 in that relative terminal-link value in concurrent chains, like relative resistance to change in multiple schedules, is independent of the overall context of reinforcement.
The functional similarity of resistance to change and preference may be understood within the metaphor of behavioral momentum, which was characterized briefly in Section 1. Here, we explain it more fully.
In classical mechanics, momentum is given by the product of the velocity and mass of a moving body. Momentum cannot be ascertained by observing the steady-state velocity of a body unless its mass is known. If its mass is unknown, it is necessary to impose a known external force, observe the change in velocity, and then calculate mass from Newton's second law:
$\Delta v = f/m$, (8)
which states that the change in velocity is directly proportional to the imposed force and inversely proportional to the mass of the body.
Nevin et al. (1983) suggested that behavior can be treated similarly. Asymptotic response rate under baseline training conditions is a behavioral analog to velocity under constant conditions, and the change in that response rate when responding is disrupted by altering those conditions in a way that is analogous to an external force allows us to estimate a behavioral analog to inertial mass: The smaller the decrease in response rate, the greater the behavioral mass. And just as velocity and mass are independent dimensions of a moving body, so response rate and resistance to change are independent dimensions of behavior, determined primarily by response-reinforcer and stimulus-reinforcer relations respectively.3
In physical science, the universal application of MKS measurement units assures dimensional consistency and comparability in measuring momentum across different external forces. In the science of behavior, however, there is no obvious system of units that can be applied to different disruptors. The change in response rate is dimensionless if post-disruption response rate is expressed relative to its baseline. Therefore, if the disruptor consists of imposed electric-shock punishment, the mass-like aspect of behavior must be expressed in units of electric shock to make Equation 8 dimensionally consistent; but if the disruptor consists of prefeeding, it must be expressed in units of food. Moreover, any attempt to write an equation relating behavioral mass to the contingency ratio, which is dimensionless, must introduce a scaling constant having units of the disruptor.
Both of these problems may be resolved by imposing the same disruptor x on two independently measured ongoing response rates. Then
$\Delta v_1 = x/m_1$; $\Delta v_2 = x/m_2$; and thus
$m_1/m_2 = \Delta v_2/\Delta v_1$, (9)
where the change in velocity (response rate) is measured as log proportion of baseline. Equation 9, which is dimensionless, provides a measure of relative rather than absolute behavioral mass. The two-component multiple schedule, which permits within-subject, within-session comparison of the resistance to change established by two different reinforcement conditions, is ideally suited for relative measurement of this sort.
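As a worked illustration of Equation 9, the Python sketch below computes relative behavioral mass from baseline and disrupted response rates in two components. The function names and the response rates are hypothetical. Base-10 logarithms are assumed, although the choice of base does not affect the ratio.

```python
import math


def log_proportion_of_baseline(disrupted_rate, baseline_rate):
    """Change in 'velocity': log of disrupted response rate relative to baseline."""
    return math.log10(disrupted_rate / baseline_rate)


def relative_mass(baseline_1, disrupted_1, baseline_2, disrupted_2):
    """Equation 9: m1/m2 equals the change in Component 2 over the change in Component 1."""
    dv1 = log_proportion_of_baseline(disrupted_1, baseline_1)
    dv2 = log_proportion_of_baseline(disrupted_2, baseline_2)
    if dv1 == 0:
        raise ValueError("no change in Component 1; the ratio is undefined")
    return dv2 / dv1


# Hypothetical rates (responses per min) under the same disruptor:
# Component 1 falls from 60 to 48 (80% of baseline),
# Component 2 falls from 80 to 40 (50% of baseline).
print(f"m1/m2 = {relative_mass(60.0, 48.0, 80.0, 40.0):.2f}")
```

In this made-up case the relative mass of Component 1 is about 3.1 even though its baseline response rate is the lower of the two, which illustrates how response rate and resistance to change can vary independently.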
In physics, the inertial mass of a body, which is determined by imposing an external force and measuring the change in motion, is equal to the gravitational mass of that body, which may be determined independently by its force of attraction to another body of known mass at a known distance (e.g., its weight at the earth's surface). Newton's law of gravitation describes the relation:
$a = (m_1 m_2)/d^2$, (10)
where a is the force of attraction, m1 and m2 are the masses of the two bodies, and d is the distance separating their centers. In order to determine the relative gravitational masses of two bodies with masses m1 and m2, it is sufficient to measure their relative attractiveness to a third body, equidistant from both, with constant (but unknown) mass m3:
$a_1 = (m_1 m_3)/d^2$ and $a_2 = (m_2 m_3)/d^2$.
Thus, $m_1/m_2 = a_1/a_2$. (11)
The metaphorical connotations of "attraction" suggest that the behavioral equivalent of relative gravitational mass may be measured by the number of responses that bring the subject into contact with one or the other of two multiple-schedule components which are equidistant from choice, i.e., preference in concurrent-chain schedules with equal initial links. Preference, transformed via Grace's (1994) contextual choice model (Equation 7) may be construed as an estimate of the relative reinforcement value of those components. If behavioral mass is similar to physical mass, the relative inertial mass of a discriminated operant estimated from resistance to change and its relative gravitational mass estimated from preference should be related by a simple function, perhaps even by identity. The schematic diagram in Figure 7 summarizes the relations between these terms.
Figure 7. Summary of the relations between the conditions of reinforcement for two discriminated operants, their resistance to change (left branch) or preference between them (right branch), and the structural relation linking resistance and preference (bottom). Both resistance and preference are construed as expressions of a single central construct reflecting their strength, value, or behavioral mass.
The convergence of strength and value suggested by the momentum metaphor is supported by quantitative relations derived from previous research, as indicated in Figure 7, and by recent experimental evidence.
The relative-value kernel of Grace's (1994) model is:
$v_1/v_2 = [(1/d_1)/(1/d_2)]^{a}$, (12)
which is derived from the full model (Equation 7) by neglecting response bias and assuming that average terminal-link (Tt) and initial-link (Ti) durations are kept constant while relative terminal-link delay is varied, and that the rates of terminal-link entries (r) and other variables (x) that affect preference are equated between alternatives. That is, the relative value of a signaled schedule of reinforcement is a power function of the relative reciprocal of delay (equivalently, relative immediacy of reinforcement or average reinforcer rate) in the terminal links of concurrent chains. Equation 12 is closely related to Equation 5 for relative resistance to change, which we repeat for convenience:
$m_{r1}/m_{r2} = (r_{C1}/r_{C2})^{b}$, (13)
where the exponent b seems not to depend on the duration of intercomponent intervals within which C1 and C2 are set (see Figure 4 above). Likewise, Grace's account of relative value performs well if his exponent a is assumed not to depend on the duration of the initial links which precede the terminal links. If relative schedule value (preference) is a power function of the ratio of reinforcer rates arranged in the terminal links of concurrent chains, and relative response strength (resistance to change) is also a power function of the ratio of reinforcer rates in the components of multiple schedules, the relation between relative response strength and relative reinforcement value must also be a power function:
$m_{r1}/m_{r2} = (v_1/v_2)^{b/a}$. (14)
Grace and Nevin (1997) devised a method for examining the power-law prediction directly by evaluating preference in concurrent chains in one half of an experimental session and evaluating resistance to change in multiple schedules in the other half. Specifically, in the concurrent-chains portion, two side keys were lighted white during the initial links, and pecks at one or the other side key gave access to its corresponding terminal link, signaled by lighting the center key red or green. In the multiple-schedule portion, the center key was lighted red or green after an intercomponent timeout, and the component schedules were identical to the concurrent-chains terminal links. After performance stabilized in both portions of the procedure, resistance to change was evaluated by presenting response-independent food during the timeout between components in the multiple-schedule portion of the session. Because only one timeout food rate was employed, we used a variation of the slope ratio to estimate relative resistance to change:
$\log[(B_{X1}/B_{O1})/(B_{X2}/B_{O2})]$, where B refers to response rate subscripted for Component 1 or 2, and for timeout food (X) and baseline (O). If timeout food produces greater decreases in response rate in Component 2 than in Component 1, relative to their respective baselines, log relative resistance is positive, and if it produces smaller decreases in Component 2 than in Component 1, log relative resistance is negative (see Appendix, Grace & Nevin, 1997, for discussion of this measure).
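The sign convention of this measure can be illustrated with a short Python sketch. The function name and the response rates are hypothetical, and base-10 logarithms are assumed.

```python
import math


def log_resistance_ratio(b_x1, b_o1, b_x2, b_o2):
    """Log of the ratio of proportions of baseline in Components 1 and 2.

    b_x*: response rate with timeout food (X); b_o*: baseline response rate (O).
    """
    return math.log10((b_x1 / b_o1) / (b_x2 / b_o2))


# Hypothetical response rates (responses per min).
# Component 1 falls to 90% of baseline, Component 2 to 60%: positive value.
print(f"{log_resistance_ratio(45.0, 50.0, 30.0, 50.0):+.3f}")
# Reversing the two components reverses the sign.
print(f"{log_resistance_ratio(30.0, 50.0, 45.0, 50.0):+.3f}")
```

A positive value therefore indicates greater resistance to change in Component 1, a negative value indicates greater resistance in Component 2, and zero indicates equal proportional decreases.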
Preference and resistance to change were evaluated for five consecutive sessions in eight conditions, each of which arranged different pairs of variable delays whose sum was constant. Representative data for one pigeon (Bird 29) are shown in the three panels of Figure 8. The left panel shows the relation between preference (the log ratio of initial-link response rates) and the log ratio of the relative immediacy of food in the terminal links. The center panel shows the relation between relative resistance to change (calculated as described above) and the log ratio of the relative immediacy of food in the multiple-schedule components, which is the same as in the terminal links. The right-hand panel shows the structural relation between our two independently measured dependent variables, relative resistance and preference. That relation is a quantitative expression of the covariation of strength and value when the relative immediacy of reinforcement is varied.
Figure 8. The left panel shows the relation between preference in concurrent chains, measured as the logarithm of the ratio of initial-link responses, and the logarithm of the ratio of terminal-link reinforcer immediacy, for an individual pigeon. The center panel shows the relation between resistance to change in multiple-schedule components that were identical to the terminal links, measured as the logarithm of the ratio of response rate ratios with timeout food presentations to baseline, and the logarithm of the ratio of multiple-schedule reinforcer immediacy. The right panel shows the structural relation between resistance to change and preference. Adapted from Grace and Nevin (1997).
It is important to observe that deviations from linearity in the left and center panels are correlated: Pooled across all four of our subjects, the correlation is +.52 (p < .003). The fact that deviations are correlated suggests that both preference and relative resistance are related to a common factor that is largely, but not completely, determined by the ratio of experimentally arranged reinforcer rates. Whatever its additional determiners, which may vary between individuals and experimental conditions, that common factor represents the relative behavioral mass of the two operants defined by the terminal links or multiple-schedule components.
It is not surprising that the effects of reinforcer rate on resistance to change and preference are correlated within subjects, because both aspects of behavior have been shown to depend similarly on reinforcer rate in independent experiments. The same holds for reinforcer amount. However, some less obvious aspects of the conditions of reinforcement also have correlated effects on preference and resistance to change.
For example, Grace, Schwendiman, and Nevin (1998) degraded response-reinforcer contiguity by arranging a brief unsignaled delay before reinforcement in one terminal link of standard concurrent chains. All pigeons preferred the alternative terminal link, which arranged immediate reinforcement, even though the rates of reinforcement were about the same. In a separate multiple-schedule condition, they observed greater resistance to change in a component with immediate reinforcement. Moreover, the degree of preference covaried, across subjects, with the degree of differential resistance. Bell (in press) also found that resistance to change in a multiple-schedule component with immediate reinforcement was greater than in a second component with a brief unsignaled delay superimposed on the same VI schedule. In addition, Bell conducted choice probe tests in extinction and observed greater responding to the stimulus correlated with immediate reinforcement than to the stimulus correlated with unsignaled delayed reinforcement.
Signaling the delay to reinforcement may also have similar effects on preference and resistance. For example, with pigeons as subjects, Marcattilio and Richards (1981) reported preferences for a terminal link with a signaled delay over an otherwise identical terminal link with unsignaled delay. Relatedly, Roberts, Tarpy, and Lea (1984) examined response rate, resistance to prefeeding and resistance to extinction in a between-group study with rats as subjects, where one group received a brief signal before each reinforcer and the other received the same reinforcer delay but with signals presented randomly. Although baseline response rates were higher for the group with random signals, resistance to change was greater for the group with signaled delay.
Contingencies on response rate may also affect preference and resistance. Several experiments have shown that contingencies establishing low rates of responding generate greater resistance to disruption than high-rate contingencies when overall reinforcer rates are equated between multiple-schedule components (e.g., Blackman, 1968; Lattal, 1989; Nevin, 1974, Experiment 5; but see Fath, Fields, Malott, & Grosset, 1983). In concurrent-chains experiments, Fantino (1968) and Nevin (1979) found preference for low-rate over high-rate contingencies, and Nevin (1979) found that preference was greatest for the same birds that had most clearly shown an effect of low-rate vs. high-rate contingencies on resistance to change in his earlier Experiment 5 (1974).
These examples show that both resistance to change and preference may sometimes be affected by variables other than stimulus-reinforcer relations, but the effects are correlated. Although such findings challenge a purely Pavlovian account, they provide additional evidence that resistance to change and preference are independent measures of the strength, value, or behavioral mass of a discriminated operant.
An apparent exception to the agreement between resistance and preference arises when a fixed-interval (FI) schedule is compared with a VI schedule with the same arithmetic mean interval. Many studies (e.g., Herrnstein, 1964b; Killeen, 1968) have reported strong preferences for the VI schedule. Mandell (1980) confirmed this preference but found no difference in resistance between VI and FI schedules in the terminal links of chained VI VI and VI FI schedules. Mellon and Shull (1986) repeated part of her study and obtained modest evidence of greater resistance in the VI terminal links, but Mandell's failure to confirm the usual agreement between resistance and preference within her experiment remains to be explained. One source of interpretive difficulty is that FI performance is typically biphasic, consisting of an initial pause followed by rapid responding. Thus, changes in average response rate during resistance tests may not be a fair measure of the resistance of FI responding. In general, it may prove difficult to compare resistance to change between performances differing in temporal pattern or topography of responding, and apparent failures of agreement with preference may arise for this reason.
Our formulation of behavioral mass as a single construct expressed separately in resistance and preference is seriously challenged by any systematic dissociation between these aspects of behavior. The well-known and much-debated Partial Reinforcement Extinction Effect (PREE) presents a major challenge of this sort.
D'Amato, Lachman, & Kivy (1958) and several subsequent researchers (e.g., vom Saal, 1972) have shown that animals respond more to a stimulus correlated with continuous reinforcement (CRF) than to one correlated with partial or intermittent reinforcement (PRF). However, responding is less resistant to extinction after CRF than after PRF in a wide variety of procedures (see Mackintosh, 1974, for review). Thus, preference and resistance to extinction are related to the training schedule in opposite directions.
In addition, the PREE is a major exception to the general finding that resistance to change in multiple VI VI schedules, including resistance to extinction, depends directly on the rate of reinforcement. Clearly, the rate of reinforcement is greater when every response is reinforced (CRF) than when only some proportion of those responses is reinforced (PRF). Thus, the PREE runs counter both to the claim that resistance to any sort of change depends directly on rate of reinforcement and to the claim that resistance is correlated with preference.
When reinforcement is terminated after extensive training, there are two separable aspects of the transition to extinction that must be distinguished. First, reinforcers are no longer contingent on responding, and second, the overall stimulus situation changes because reinforcers no longer occur. These effects are separable: Response rate decreases when the contingency is removed even though reinforcers are presented independently of responding (e.g., Rescorla and Skucy, 1969), and response rate decreases when there is a change in the stimulus situation, at least temporarily, even though reinforcers may still be presented (e.g., Ferster & Skinner, 1957, p. 78). The latter effect is known as generalization decrement, which has been invoked frequently to explain the PREE: Reinforcers, considered as stimuli, are part of the stimulus situation in which training occurs; and when extinction begins, there is a smaller change in the overall stimulus situation after PRF than after CRF because the average reinforcer rate is lower.
We suggest that CRF establishes greater behavioral mass than PRF, consistent with all the research on the rate of reinforcement reviewed above, but that the transition to extinction may decrease responding more rapidly after CRF than PRF because of the greater generalization decrement. In terms of the momentum metaphor, the disruptive force of extinction must include both the suspension of the contingency and the decremental effect of situation change. We now consider a way to model these two forces during extinction by augmenting our basic model of resistance to change.
The basic model is:
$$\log(B_x/B_o) = -\frac{x}{m} \tag{15}$$
where log(Bx/Bo) is the change in responding during disruption relative to baseline, x is the value of the disruptor, and m is behavioral mass. Grace and Nevin (1997) suggested that for a given schedule component, m depends on reinforcer rate according to a power function, which is consistent with previous results for relative resistance to change (Figure 5) and with Grace's (1994) model of preference (Section 6.3). Thus,
$$\log(B_x/B_o) = -\frac{x}{r^a} \tag{16}$$
where r is reinforcer rate during training and a is the exponent of the function relating m to r. To capture the effects of suspending the reinforcement contingency and changing the situation by omitting reinforcers, the disruptor x, representing time in extinction, is multiplied by the additive combination of terms representing these separate factors:
$$\log(B_x/B_o) = -\frac{x(c + dr)}{r^a} \tag{17}$$
where c represents the decremental effect of suspending the contingency and d represents the decremental effect of situation change arising from terminating reinforcer rate r. Thus, the force-like term in the basic momentum model is augmented by an additive term for the effectiveness of situation change in extinction (dr). The units of c and d must be such that the right side of Equation 17 is dimensionless.
The effects of disruptors that do not involve termination of reinforcement, such as deprivation change, can be captured by Equation 16 with the addition of a parameter f that scales the effectiveness of deprivation change in units that retain dimensional consistency:
$$\log(B_x/B_o) = -\frac{xf}{r^a} \tag{18}$$
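As a rough illustration of how Equations 17 and 18 behave, the following Python sketch computes the predicted slope of a resistance function, that is, the change in log relative response rate per unit of the disruptor x. The function names, the parameter values, and the choice of reinforcers per hr as the unit for r are assumptions made for illustration only; they are not fitted estimates.

```python
def extinction_slope(r, a, c, d):
    """Slope of log(Bx/Bo) against x under Equation 17: -(c + d*r) / r**a."""
    return -(c + d * r) / r ** a


def disruptor_slope(r, a, f):
    """Slope of log(Bx/Bo) against x under Equation 18: -f / r**a."""
    return -f / r ** a


def log_relative_rate(x, slope):
    """Predicted log(Bx/Bo) for a disruptor of value x, given a slope."""
    return x * slope


# Hypothetical illustrative values (not estimates from any data set).
A, C, D, F = 0.5, 1.0, 0.01, 1.0
LEAN, RICH = 25.0, 100.0  # reinforcer rates, assumed to be per hr

for r in (LEAN, RICH):
    # Print the rate, the Equation 18 slope, and the Equation 17 slope.
    print(r, disruptor_slope(r, A, F), extinction_slope(r, A, C, D))
```

With these illustrative values, both slopes are shallower in the richer component, but the difference between components is smaller under Equation 17 because the situation-change term dr grows with r. Setting d to zero and c equal to f makes the two equations coincide.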
To illustrate the application of the model set forth in Equations 17 and 18, we estimated their parameters by fits to the average slope data of Experiment 2 of Nevin et al. (1990 -- see Figure 3). The parameter c in Equation 17 was set at 1.0, so that fp (for prefeeding) and fs (for satiation) express the effectiveness of those disruptors relative to the effect of suspending the contingency. Estimated parameter values are: fs = 1.03, suggesting that satiation was about as effective as suspending the contingency; fp = 1.45, suggesting that prefeeding was about half again as effective; a = 0.35; and d = 0.001. The relation between obtained and predicted slopes is shown in the left panel of Figure 9. The model accounts for 91% of the data variance, and predicted values are usually within the range of the standard error of the data.4
Figure 9. The left panel relates the average slopes of resistance functions obtained by Nevin et al. (1990 -- see Figure 3) to the predictions of Equations 17 and 18. Error bars show the standard errors of the mean obtained slopes. The right panel shows the slope of the extinction curve predicted by Equation 17 as a function of reinforcer rate during training, with parameters a and d set at the values estimated by fits to the data of Nevin et al. (1990).
These data do not constitute a good test of the model because there were only two reinforcer rates in the three schedule components, and four free parameters were estimated from nine slopes. Larger data sets would provide a more stringent test of the model. Nevertheless, the model predicts the effects of wider variations in reinforcer rates. The right panel of Figure 9 shows the predicted slope of the extinction curve when the reinforcer rate during training is varied from 10 to 5000 per hr (the latter value may be unrealistically high, but reinforcer rates obtained on CRF, corrected for eating time, have sometimes exceeded 4000/hr in our laboratory). Predictions were derived from Equation 17 using the parameter values estimated for a and d from the data of Nevin et al. (1990). Note that the slope becomes shallower as the reinforcer rate increases up to about 500 reinforcers per hr, and then becomes steeper as the reinforcer rate increases further. Thus, even though behavioral mass is a continuous positive function of reinforcer rate, resistance to extinction is predicted to be lower after training with high reinforcer rates under CRF than after training with somewhat lower rates characteristic of PRF.
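The non-monotonic prediction can be checked directly. Setting the derivative of (c + dr)/r^a with respect to r to zero gives r* = ac/(d(1 - a)) as the reinforcer rate at which the predicted extinction slope is shallowest. The Python sketch below evaluates this expression with c = 1.0, a = 0.35, and d = 0.001 as reported above, assuming r is expressed in reinforcers per hr. The function names and the grid of rates are our own choices for illustration.

```python
def extinction_slope(r, a=0.35, c=1.0, d=0.001):
    """Equation 17 slope of log(Bx/Bo) per unit time in extinction."""
    return -(c + d * r) / r ** a


def shallowest_rate(a=0.35, c=1.0, d=0.001):
    """Rate r* at which d/dr[(c + d*r) / r**a] = 0, valid for 0 < a < 1."""
    return a * c / (d * (1.0 - a))


r_star = shallowest_rate()
print(f"shallowest slope at about {r_star:.0f} reinforcers/hr")

# Numerical cross-check on an arbitrary grid from 10 to 5000 per hr.
grid = range(10, 5001, 10)
r_grid = max(grid, key=extinction_slope)  # least negative slope on the grid
print("grid maximum at", r_grid)

# Slopes at a few rates spanning the range shown in Figure 9.
for r in (10, 100, 500, 5000):
    print(r, round(extinction_slope(r), 3))
```

Under these assumptions the analytic turning point falls near 538 reinforcers per hr, in line with the value of about 500 noted above. The predicted slope is steeper at both 10 and 5000 reinforcers per hr than at 500.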
We have suggested above that behavioral mass is also measured by preference, which is directly related to reinforcer rate. Thus, the apparent dissociation between preference and resistance to extinction after training with CRF as opposed to PRF is resolved if this model is accepted.
Equation 17 cannot be fully correct because the effects of situation change when reinforcement is terminated must decrease as time elapses in extinction, suggesting that the value of d must decay with time. Moreover, free-operant extinction often shows an initial increase in response rate that is sometimes described as a frustrative effect of reinforcer omission, whereas Equation 17 predicts only decreases. Modifying Equation 17, fitting it to the many extant data sets, and determining how its parameters depend on experimental variables such as reinforcer magnitude or length of training is a task for the future.
In addition to guiding basic research and theory, the momentum metaphor may be fruitful in applied work. The next three sections describe some applications of our work to clinical intervention, drug addiction, and self-control.
An important goal of clinical intervention is to establish desirable behavior so that it occurs reliably during therapy and persists effectively when therapy ends; in metaphorical terms, to maximize both its velocity and mass. This goal suggests the use of high rates of contingent reinforcement during therapy, which should maximize both terms.
Many researchers have discussed the persistence of therapeutic gains in the client's natural environment, without the reinforcers mediated by the therapist, in relation to resistance to extinction, and have recommended partial reinforcement in order to capitalize on the PREE (e.g., Nation & Woods, 1980). However, many other disruptors that inevitably occur in everyday life, including competition from the undesirable behavior that led to therapy, must also be considered. As summarized in Sections 3 and 4, resistance to other disruptors such as distraction or competing behavior increases monotonically with increasing reinforcer rates, and the therapist must consider the relative importance of disruptors other than extinction in designing clinical interventions (see Lerman & Iwata, 1996, for a review of extinction in relation to other factors in applied settings).
Biofeedback has been used extensively to help clients manage a variety of health problems including muscle tension. However, effects established in the clinic have often failed to generalize to everyday life, presumably because of the absence of explicit biofeedback (unless the client acquires the necessary apparatus) as well as situation change. Tota-Faucette (1991) addressed this problem in a study of biofeedback for muscle relaxation with normal children. She arranged two distinctively signaled situations: In one, the children received tones plus points exchangeable for toys for meeting the relaxation criterion, and in the second, they also received additional, noncontingent points or toys. During training, the children achieved significant reductions in muscle action potential (EMG) levels and significant increases in the proportion of time spent at or below the relaxation criterion. After 24 30-s trials with each situation, levels of relaxation were similar in both situations. However, when all auditory feedback and points were discontinued in an extinction test, relaxation was substantially more persistent in the situation that had included additional noncontingent reinforcers. This result is a systematic replication of the experiments described in Section 3.4 with a clinically important response.
Unfortunately, added reinforcers should similarly increase the persistence of undesirable responses. This expectation is particularly important because a common procedure for reducing the rate of undesired behavior is to provide reinforcers for a competing alternative response, or for unspecified behavior occurring in the absence of the target response. As shown by McDowell (1982), the reduction in the target response is predicted by Herrnstein's Relative Law of Effect (Section 2.2). However, the addition of explicit reinforcers to the unknown reinforcers that maintain the undesired target response may increase its persistence even as they reduce its rate of occurrence. Mace (1991) obtained this perverse outcome with food-stealing by a retarded child. After explicit reinforcement for proper eating, the rate of food theft was substantially lower than baseline; but when thefts were physically blocked, attempts to steal food persisted far longer than in a previous condition where alternative reinforcement had not been provided. This outcome is entirely consistent with the research described in Section 3.4, and with the metaphorical notion that although alternative reinforcement reduced the velocity-like aspect of food theft, it also increased its mass-like aspect and thus tended to counteract the purpose of the intervention. At the least, the possibility of such outcomes must be considered by therapists who use alternative reinforcement to reduce undesired behavior.
The momentum metaphor has been used effectively by Mace, Hock, Lalli, West, Belfiore, Pinter, and Brown (1988) to establish compliance with demanding requests that were normally resisted. Briefly, Mace et al. presented retarded adults in a group home with a series of easy requests that were fun to comply with (e.g., "Give me five"), and gave social reinforcement for compliance, immediately before a more demanding request (e.g., "Empty the trash"). They obtained substantially greater compliance than when the demanding requests were not preceded by easy requests. The metaphorical interpretation is that the series of easy requests endowed compliance as a general response class with both velocity and mass, thereby increasing its rate and reducing its disruptability by more demanding requests. Nevin (1996a) discussed the interpretation of this procedure for establishing compliance in relation to the momentum metaphor, and concluded that its effectiveness can be understood and, perhaps, enhanced by reference to research on resistance to change.
When people persist in efforts to procure and consume drugs, to the detriment of their health, occupation, and social life, their behavior is often characterized as addictive. As Heyman (1996) pointed out, the compulsive quality of addiction has led many researchers to conclude that it is not under the control of its long-term consequences. Heyman argued to the contrary, and showed that a model of choice which incorporates changes in the value of drugs and nondrug reinforcers, together with control by local relative value, can account for addictive behavior. However, his model does not stress the role of environmental stimuli in addiction.
Evidence for control by environmental stimuli comes from studies of relapse after drug use has been eliminated during treatment in an inpatient facility. For example, relapse is very likely when a former addict returns to a situation in which drug use has previously occurred (Brownell, Marlatt, Lichtenstein, & Wilson, 1986; Hunt & Oderoff, 1962). Conversely, when a former addict moves to a radically different stimulus situation, as when soldiers who were addicted to heroin in Vietnam returned to the United States, there is little evidence of relapse (Robins, Helzer, Hesselbrook, & Wish, 1977). As noted by Nevin (1996b), these observations suggest that addictive behavior has considerable stimulus-specific mass.
In line with the distinction between response rate and resistance to change that has been made repeatedly above, we suggest that Heyman's choice model can account for the rate of addictive behavior, but its persistence may depend on historical stimulus-reinforcer relations. The effects of choice processes and stimulus-reinforcer relations may converge to endow drug-taking with high momentum, thereby making addiction especially refractory to treatment and prone to relapse in the addict's normal environment (Nevin, 1996b). In particular, drugs may be viewed as disrupting many everyday activities that do not involve drug-taking and simultaneously reinforcing the behavior that procures them in a way that is particularly resistant to change. Here, we ask whether the effects of drugs are consistent with research on behavioral momentum.
There has been a vast amount of relevant research with nonhuman subjects, and we cannot review it systematically here. Instead, we will consider a few examples involving cocaine -- a highly addictive drug that has created serious personal and public health problems. We begin by considering cocaine as a disruptor of ongoing operant behavior maintained by conventional reinforcers, which may be construed as a model for the deleterious effects of cocaine use on everyday activities.
The disruptive effects of acute and chronic cocaine administration have been studied by Hoffman, Branch, and Sizemore (1987) in a three-component multiple FR schedule with food reinforcement and pigeons as subjects. They found evidence that acute administration affected responding in ways consistent with other disruptors reviewed above: Relative to performance in vehicle control sessions, decreases in response rate were greatest in the component with the largest fixed ratio, and least in the component with the smallest fixed ratio, and thus were ordered with respect to obtained reinforcer rate. Hoffman et al. also found that development of tolerance was directly related to reinforcer rate: With repeated administration of a moderate cocaine dose, response rate recovered to near baseline levels in the component with the smallest ratio and recovered least, if at all, in the component with the largest ratio. Thus, cocaine administration is analogous to a disruptive force: Its effects are greatest on behavior maintained by a relatively low reinforcer rate, both upon initial administration and as its effectiveness wanes during the development of tolerance.
Cocaine is also a highly effective reinforcer. In monkeys, characteristic patterns of operant behavior are maintained by fixed-interval and fixed-ratio schedules of cocaine reinforcement (e.g., Goldberg & Kelleher, 1976), and choice between two concurrently available cocaine doses roughly matches relative dose level in a fashion similar to the relative magnitude of conventional reinforcers (Llewellyn, Iglauer, & Woods, 1976). There is some evidence that increasing doses of cocaine reinforcement may also increase resistance to change. For example, Glowa, Wojnicki, Matecka, Bacher, Mansbach, Balster, and Rice (1995) administered a dopamine reuptake inhibitor to their monkeys before selected experimental sessions with cocaine reinforcement. They found that reductions in cocaine-maintained responding were inversely related to cocaine dose per reinforcer. The pre-session inhibitor may be viewed as similar to prefeeding, which also produces reductions in response rate that are inversely related to magnitude of food reinforcers in pigeons (Nevin et al., 1981).
Research by Carroll and Lac (1993) is more directly relevant to the prevention of cocaine addiction. They found that access to a glucose plus saccharin solution interfered with the acquisition of cocaine-reinforced autoshaping and subsequent cocaine self-administration, but only if glucose plus saccharin were given in the operant chamber. This could be interpreted as an instance of blocking by the experimental context (e.g., Tomie, 1976). Alternatively, it may be that access to alternative reinforcers in the operant chamber endowed unmeasured behavior that competed with cocaine self-administration with high mass. This interpretation is admittedly speculative, but there may be some practical utility to the notion that arranging a high density of conventional (i.e., non-drug) reinforcers in a given environment may increase resistance to the reinforcing as well as the disruptive effects of drugs.
Here, we discuss two approaches to the problem of self-control in relation to behavioral momentum. Self-control may be characterized as accepting some short-term deprivation (as in refraining from an addictive drug) and thereby obtaining some larger, long-term good (health and well-being).
One experimental analog of self-control involves choice between small, immediate reinforcers and large, delayed reinforcers. Pigeons and (in many situations) humans generally exhibit impulsiveness by choosing the smaller, more immediate reinforcer (see Logue, 1988, for review). These preference results are well described by a version of the generalized matching law which assumes that the effects of reinforcer amount and immediacy are additive (Logue, Rodriguez, Pena-Correal, & Mauro, 1984; Grace, 1995). In a parametric study, Bedell, Grace, and Nevin (1997) assessed preference and relative resistance to change in pigeons choosing between alternatives that differed in reinforcer amount and immediacy. They found that the effects of these variables on resistance to change were additive, consistent with the preference data.
Effective methods for enhancing choice of large, delayed reinforcers in nonhuman subjects include progressively lengthening the delay and presenting stimuli that bridge the delay to the larger reinforcer (Mazur & Logue, 1978), or increasing the delay equally for both alternatives (Green, Fisher, Perlow, & Sherman, 1981). It would be interesting to determine whether these methods also enhance the resistance to change of responding for the large delayed reinforcer, consistent with the general correlation between preference and resistance. Such an outcome would have immediate relevance for the transfer of self-control training in the clinic to everyday life.
Rachlin (1995) has suggested a different approach that is related to the principles of behavioral momentum. Specifically, he argued that self-control involves an extended pattern of engagement in high-valued behavior (e.g., a healthy lifestyle) that persists despite occasional tempting alternatives, even though those alternatives, considered individually and locally, have a higher value than individual components of the pattern.
We suggest that Rachlin's extended pattern is analogous to sustained responding in the initial link of a chain schedule in that, from a molar perspective, continued access to the terminal-link reinforcer (analogous to health) depends on continued initial-link responding (analogous to moderate drinking, low-fat diet, etc.) throughout the experiment. In a study of resistance to change in chained schedules, Nevin et al. (1981, Experiment 2) showed that average initial-link response rates in pigeons were less disrupted by the occasional availability of a single immediate reinforcer on an adjacent key (mimicking temptation) when terminal-link food was relatively large and immediate. Similar results were obtained with prefeeding, suggesting that resistance to a tempting alternative is functionally equivalent to resistance to the other disruptors reviewed in this article.
In real life, as opposed to the pigeon chamber, the presumed ultimate reinforcer for living a healthy lifestyle -- a long, healthy life -- does not occur at any particular moment, and indeed may not occur at all (one could be hit by a bus). Therefore, the contingency between living a healthy lifestyle and its ultimate benefits is at best remote. How, then, is the healthy lifestyle to be maintained? In view of the strengthening effects of added reinforcers (Sections 3.4, 13.1), we suggest that "self-control" -- e.g., maintaining a healthy lifestyle despite succumbing occasionally to the third martini or seconds on cheesecake -- may be enhanced by arranging additional reinforcers that are unrelated to health, such as listening to music, in a person's normal environment. The same general approach may be useful in sustaining any desirable pattern of action where the intended consequences are remote, such as political efforts on behalf of world peace.
Almost a century ago, Thorndike proposed his famous Law of Effect:
"Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur; those which are accompanied or closely followed by discomfort to the animal will, other things being equal, have their connections with that situation weakened, so that, when it recurs, they will be less likely to occur. The greater the satisfaction or discomfort, the greater the strengthening or weakening of the bond" (Thorndike, 1911, p. 244).
Although many aspects of this oft-quoted law have been challenged, we suggest that its central principles are compatible with the work on resistance to change and preference that we have described above.
Thorndike (1911) proposed to define "satisfaction" and thereby achieve "more detailed and perfect prophecy" as follows:
"By a satisfying state of affairs is meant one which the animal does nothing to avoid, often doing such things as attain or preserve it (p. 245)."
In Section 7.3, we suggested that the value of a discriminated operant may be estimated by its attractiveness, expressed as preference for access to that operant -- a notion that is quite similar to attaining or preserving a satisfying state of affairs.
Thorndike also anticipated the identification of asymptotic strength with resistance to change:
"In certain cases in which the probability that the connection will be made is 100 per cent, the connections may still exist with different degrees of strength, shown by the fact that the probability of 100 per cent will hold for a week only or for a year; will succumb to a slight, or prevail over a great distraction; or otherwise show much or little strength" (Thorndike, 1913, p. 3).
Although Thorndike's Law was principally concerned with acquisition, its initial statement that the probability of a response depends on its consequences relative to those of other responses is amply supported by molar analyses of steady-state response rate in relation to schedules of reinforcement, as summarized by Herrnstein's Relative Law of Effect. Thorndike's statement that satisfaction establishes a connection between the situation and the response is amply supported by research on resistance to change in a stimulus situation, which measures the strength of a discriminated operant and is directly related to the rate or magnitude of reinforcers contiguous with that stimulus.
Although Thorndike anticipated the possibility that strength may be independent of the asymptotic rate or probability of responding before resistance is evaluated, he did not distinguish their determiners. As we have shown, response rate depends on response-reinforcer relations, whereas resistance to change is determined primarily by stimulus-reinforcer relations. Although other variables may also influence resistance to change, it appears that whatever its determiners, response strength, as estimated from resistance to change in multiple schedules, is positively related to reinforcer value, as estimated by preference in concurrent chains. Thus, Thorndike's statement that the strength of connection is directly related to the magnitude of satisfaction is supported by the structural relation linking resistance to change and preference. Because these terms are measured independently, the relation between them is immune to the charges of tautology that have often been leveled against the Law of Effect (e.g., Postman, 1947).
Thorndike's assertion that a stimulus-response bond is strengthened or stamped in by reinforcement appears to be at odds with research demonstrating abrupt changes in behavior when the reinforcer is changed or devalued. We now consider some of these studies from the perspective of behavioral momentum.
A number of early studies demonstrated that reducing the magnitude or quality of the reinforcer resulted in abrupt decrements in behavior. For example, in a frequently cited study, Crespi (1942) trained rats in an alley with a large reinforcer and then shifted to a small reinforcer. Running speed decreased substantially in the next trial, to a level below that maintained by training with the small reinforcer only. The result suggests that although running may have been acquired as a result of reinforcement, the reinforcer did not stamp in a habitual connection between the alley and running as expected according to the Law of Effect. In a review of this and related studies, Mackintosh (1974) concluded that "... reinforcers do not increase the strength of an association between stimulus and response; they are themselves associated with the response" (p. 216).
In Mackintosh's terms, we suggest that whether or not instrumental learning involves response-reinforcer associations, reinforcers do increase the strength of an association between stimulus and response as measured by resistance to change. Consider Crespi's result in relation to Equation 17 above. Abrupt reduction in reinforcer magnitude may be construed as a resistance test, on a continuum with reduction to zero -- i.e., extinction -- and its effects may be attributed, at least in part, to the change in the stimulus situation that necessarily accompanies changes in the reinforcer, construed as a part of the set of events associated with training. These effects would compete with the persistence of running based on the behavioral mass established by the alley-reinforcer relation during training, which could be assessed independently by a resistance test such as prefeeding that did not involve changing the reinforcer. Thus, abrupt changes in behavior when the reinforcer changes are not incompatible with the development of a Thorndikean bond -- they merely complicate its measurement.
A number of studies have evaluated response-reinforcer associations by devaluing the reinforcer, usually by pairing it with a drug that causes gastric upset. For example, Colwill & Rescorla (1985a) arranged liquid sucrose or food reinforcers for lever pressing or chain pulling, counterbalanced across groups of rats. In a second phase of the experiment, they devalued one reinforcer by pairing it with gastric upset with the lever and chain removed from the chamber. In a final extinction test, they observed selective suppression of the response that had produced that reinforcer during training. This result suggests that the rats had associated each response with its respective reinforcer during training, as suggested by Mackintosh (1974, quoted above), and then anticipated those reinforcers in the final test. These and related findings (e.g., Adams & Dickinson, 1981) are contrary to Thorndike's original Law because situation-response connections should have been equally strong for both responses.
Although our approach to the strength of discriminated operant behavior does not address the mechanism of response-specific reinforcer devaluation when responding is precluded, there is at least one aspect of the results that is related to analyses of resistance to change. Both Adams and Dickinson (1981) and Colwill and Rescorla (1985a,b) found that although responding established by a contingent and subsequently devalued reinforcer was suppressed relative to that established by a reinforcer that was not devalued, it was not totally suppressed despite the fact that the rats never consumed the devalued reinforcer. In other words, responding persisted despite the joint disruptive effects of reinforcer devaluation and extinction. In keeping with our arguments above, the persistence of responding suggests that situation-response connections had been formed during training. This conclusion is consistent with Dickinson's (1994) suggestion that "instrumental training established lever pressing partly as a goal-directed action, mediated by knowledge of the instrumental relation, and partly as an S-R habit impervious to outcome devaluation" (pp. 51-52).
The effects of reinforcer devaluation may be isolated by comparing resistance to extinction of a response when its reinforcer had been devalued with that of a response when its reinforcer had not been devalued. Examination of Colwill and Rescorla's (1985a) Figure 1 shows that when the reinforcer had not been devalued, training with sucrose led to substantially more responding during extinction than training with food. Thus, sucrose was the more effective reinforcer in that it established greater resistance to extinction. Consistent with this interpretation, sucrose-reinforced responding after devaluation was greater, relative to responding when sucrose had not been devalued, than was food-reinforced responding after devaluation relative to responding when food had not been devalued. Thus, sucrose apparently established stronger situation-response connections as evidenced by greater resistance to reinforcer devaluation as well as to extinction.
Because the data are presented as averages, we cannot determine whether this difference in relative responding is statistically significant, and in any case Colwill and Rescorla's experiment was not designed to evaluate relative resistance to devaluation. It would be interesting to examine devaluation effects in multiple schedules with substantially different reinforcer rates and ascertain whether responding during an extinction test depends on stimulus-reinforcer relations in the same way as responding that has been reduced by other disruptors. Experiments of this sort could lead to a fruitful interaction between analyses of resistance to change and research concerned with the associative structure of learning.
On the basis of the research described in this article, we propose a modern version of Thorndike's Law of Effect for discriminated operant behavior:
When a response has been reinforced in a distinctive stimulus situation, its probability or rate of occurrence depends on the response-reinforcer contingencies. At the same time, it becomes connected to the situation and will tend to recur despite challenging disruptions. The greater the value of the situation, as determined by the conditions of reinforcement and as measured by preference, the greater the strength of connection as measured by resistance to change.
Adams, C. D., & Dickinson, A. (1981). Instrumental responding following reinforcer devaluation. Quarterly Journal of Experimental Psychology, 33B:109-121.
Autor, S. M. (1960). The strength of conditioned reinforcers as a function of frequency and probability of reinforcement. Unpublished doctoral dissertation, Harvard University; reprinted in Hendry, D. P. (Ed.) (1969) Conditioned reinforcement (pp. 127-162). Homewood, IL: Dorsey.
Baum, W. M. (1993). Performance on interval and ratio schedules of reinforcement: Data and theory. Journal of the Experimental Analysis of Behavior, 59:245-264.
Bedell, M. A., Grace, R. C., & Nevin, J. A. (1997). Effects of reinforcement delay and magnitude on preference and resistance to change. Poster presented at the meetings of the Association for Behavior Analysis, Chicago, IL, May.
Bell, M. (in press). Pavlovian contingencies and resistance to change. Journal of the Experimental Analysis of Behavior.
Blackman, D. E. (1968). Response rate, reinforcement frequency, and conditioned suppression. Journal of the Experimental Analysis of Behavior, 11:503-516.
Bouzas, A. (1978). The relative law of effect: Effects of shock intensity on response strength in multiple schedules. Journal of the Experimental Analysis of Behavior, 30:307-314.
Brownell, K. D., Marlatt, G. A., Lichtenstein, E., & Wilson, G. T. (1986). Understanding and preventing relapse. American Psychologist, 41:765-782.
Carroll, M. E., & Lac, S. T. (1993). Autoshaping i.v. cocaine self-administration in rats: Effects of nondrug alternative reinforcers on acquisition. Psychopharmacology, 110:5-12.
Cohen, S. L. (1986). A pharmacological examination of the resistance-to-change hypothesis of response strength. Journal of the Experimental Analysis of Behavior, 46:363-379.
Cohen, S. L. (1996). Behavioral momentum of typing behavior in college students. Journal of Behavior Analysis and Therapy, 1:36-51.
Cohen, S. L., Riley, D. S., & Weigle, P. A. (1993). Tests of behavioral momentum in simple and multiple schedules with rats and pigeons. Journal of the Experimental Analysis of Behavior, 60:255-291.
Colwill, R. M., & Rescorla, R. A. (1985a). Postconditioning devaluation of a reinforcer affects instrumental responding. Journal of Experimental Psychology: Animal Behavior Processes, 11:120-132.
Colwill, R. M., & Rescorla, R. A. (1985b). Instrumental responding remains sensitive to reinforcer devaluation after extensive training. Journal of Experimental Psychology: Animal Behavior Processes, 11:520-536.
Crespi, L. P. (1942). Quantitative variation of incentive and performance in the white rat. American Journal of Psychology, 55:467-517.
D'Amato, M. R., Lachman, R., & Kivy, P. (1957). Secondary reinforcement as affected by reward schedule and the testing situation. Journal of Comparative and Physiological Psychology, 51:734-741.
de Villiers, P. A. (1977). Choice in concurrent schedules and a quantitative formulation of the law of effect. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 233-287). Englewood Cliffs, NJ: Prentice-Hall.
Dickinson, A. (1994). Instrumental conditioning. In N. J. Mackintosh (Ed.) Animal learning and cognition (pp. 45-79). New York: Academic Press.
Egli, M., Schaal, D. W., Thompson, T., & Cleary, J. (1992). Opioid-induced response-rate decrements in pigeons responding under variable-interval schedules: Reinforcement mechanisms. Behavioural Pharmacology, 3:581-591.
Fantino, E. (1968). Effects of required rates of responding upon choice. Journal of the Experimental Analysis of Behavior, 11:15-22.
Fantino, E. (1969). Choice and rate of reinforcement. Journal of the Experimental Analysis of Behavior, 12:723-730.
Fantino, E. (1977). Conditioned reinforcement: Choice and information. In W. K. Honig & J. E. R. Staddon (Eds.), Handbook of operant behavior (pp. 313-339). Englewood Cliffs, NJ: Prentice-Hall.
Fath, S. J., Fields, L., Malott, M. K., & Grossett, D. (1983). Response rate, latency, and resistance to change. Journal of the Experimental Analysis of Behavior, 39:267-274.
Ferster, C. B., & Skinner, B. F. (1957). Schedules of reinforcement. New York: Appleton-Century-Crofts.
Gibbon, J. (1981). The contingency problem in autoshaping. In C. M. Locurto, H. S. Terrace, & J. Gibbon (Eds.) Autoshaping and conditioning theory (pp. 285-308). New York: Academic Press.
Gibbon, J., Berryman, R., & Thompson, R. L. (1974). Contingency spaces and measures in classical and instrumental conditioning. Journal of the Experimental Analysis of Behavior, 21:585-605.
Glowa, J. R., Wojnicki, F. H. E., Mateka, D., Bacher, J. D., Mansbach, R. S., Balster, R. L., & Rice, K. C. (1995). Effects of dopamine reuptake inhibitors on food- and cocaine-maintained responding: I. Dependence on unit dose of cocaine. Experimental and Clinical Psychopharmacology, 3:219-231.
Goldberg, S. R., & Kelleher, R. T. (1976). Behavior controlled by scheduled injections of cocaine in squirrel and rhesus monkeys. Journal of the Experimental Analysis of Behavior, 25:93-104.
Grace, R. C. (1994). A contextual model of concurrent-chains choice. Journal of the Experimental Analysis of Behavior, 61:113-129.
Grace, R. C. (1995). Independence of reinforcement delay and magnitude in concurrent chains. Journal of the Experimental Analysis of Behavior, 63:255-276.
Grace, R. C. (1996). Choice between fixed and variable delays to reinforcement in the adjusting-delay procedure and concurrent chains. Journal of Experimental Psychology: Animal Behavior Processes, 22:362-383.
Grace, R. C., & Nevin, J. A. (1997). On the relation between preference and resistance to change. Journal of the Experimental Analysis of Behavior, 67:43-65.
Grace, R. C., Schwendiman, J. I., & Nevin, J. A. (1998). Effects of delayed reinforcement on preference and resistance to change. Journal of the Experimental Analysis of Behavior, 69:247-261.
Green, L., Fisher, E. B. Jr., Perlow, S., & Sherman, L. (1981). Preference reversal and self control: Choice as a function of reward amount and delay. Behaviour Analysis Letters, 1:43-51.
Harper, D. N. (1996). Response-independent food delivery and behavioral resistance to change. Journal of the Experimental Analysis of Behavior, 65:549-560.
Harper, D. N. (1999, in press). Drug induced changes in responding are dependent upon baseline stimulus-reinforcer contingencies. Psychobiology, __:______.
Harper, D. N., & McLean, A. P. (1992). Resistance to change and the law of effect. Journal of the Experimental Analysis of Behavior, 57:317-337.
Herrnstein, R. J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4:267-272.
Herrnstein, R. J. (1964a). Secondary reinforcement and rate of primary reinforcement. Journal of the Experimental Analysis of Behavior, 7:27-36.
Herrnstein, R. J. (1964b). Aperiodicity as a factor in choice. Journal of the Experimental Analysis of Behavior, 7:179-182.
Herrnstein, R. J. (1970). On the law of effect. Journal of the Experimental Analysis of Behavior, 13:243-266.
Herrnstein, R. J., & Heyman, G. M. (1979). Is matching compatible with maximization on concurrent variable interval variable ratio? Journal of the Experimental Analysis of Behavior, 31:209-223.
Heyman, G. M. (1996). Resolving the contradictions of addiction. Behavioral and Brain Sciences, 19:561-610.
Hoffman, S. H., Branch, M. N., & Sizemore, G. M. (1987). Cocaine tolerance: Acute versus chronic effects as dependent upon fixed-ratio size. Journal of the Experimental Analysis of Behavior, 47:363-376.
Hull, C. L. (1943). Principles of behavior. Appleton-Century-Crofts.
Hunt, G. H., & Odoroff, M. E. (1962). Follow-up study of narcotic drug addicts after hospitalization. Public Health Reports, 77:41-54.
Keesey, R. E., & Kling, J. W. (1961). Amount of reinforcement and free-operant responding. Journal of the Experimental Analysis of Behavior, 4:125-132.
Killeen, P. (1968). On the measurement of reinforcement frequency in the study of preference. Journal of the Experimental Analysis of Behavior, 11:263-269.
Kimble, G. A. (1961). Hilgard and Marquis's Conditioning and Learning, 2nd Edition. New York: Appleton-Century-Crofts.
Lattal, K. A. (1989). Contingencies on response rate and resistance to change. Learning and Motivation, 20:191-203.
Lerman, D. C., & Iwata, B. A. (1996). Developing a technology for the use of operant extinction in clinical settings: An examination of basic and applied research. Journal of Applied Behavior Analysis, 29:345-382.
Llewellyn, M. E., Iglauer, C., & Woods, J. H. (1976). Relative reinforcer magnitude under a nonindependent concurrent schedule of cocaine reinforcement in rhesus monkeys. Journal of the Experimental Analysis of Behavior, 25:81-91.
Logan, F. A. (1956). A micromolar approach to behavior theory. Psychological Review, 63:63-73.
Logue, A. W. (1988). Research on self-control: An integrating framework. Behavioral and Brain Sciences, 11:665-679.
Logue, A. W., Rodriguez, M. L., Pena-Correal, T. E., & Mauro, B. C. (1984). Choice in a self-control paradigm: Quantification of experience-based differences. Journal of the Experimental Analysis of Behavior, 41:53-67.
Lucki, I., & deLong, R. E. (1983). Control rate of response or reinforcement and amphetamine's effect on behavior. Journal of the Experimental Analysis of Behavior, 40:123-132.
Lyon, D. O. (1963). Frequency of reinforcement as a parameter of conditioned suppression. Journal of the Experimental Analysis of Behavior, 6:95-98.
Mace, F. C. (1991). Recent advances and functional analysis of behavior disorders. Paper presented at the meetings of the Association for Behavior Analysis, Atlanta, GA, May.
Mace, F. C., Hock, M. L., Lalli, J. S., West, B. J., Belfiore, P., Pinter, E., & Brown, D. K. (1988). Behavioral momentum in the treatment of noncompliance. Journal of Applied Behavior Analysis, 21:123-141.
Mace, F. C., Lalli, J. S., Shea, M. C., Lalli, E. P., West, B. J., Roberts, M., & Nevin, J. A. (1990). The momentum of human behavior in a natural setting. Journal of the Experimental Analysis of Behavior, 54:163-172.
Mackintosh, N. J. (1974). The psychology of animal learning. London: Academic Press.
Mandell, C. (1980). Response strength in multiple periodic and aperiodic schedules. Journal of the Experimental Analysis of Behavior, 33:221-241.
Marcattilio, A. J. M., & Richards, R. W. (1981). Preference for signaled vs. unsignaled reinforcement delay in concurrent-chain schedules. Journal of the Experimental Analysis of Behavior, 36:221-229.
Mauro, B. C., & Mace, F. C. (1996). Differences in the effect of Pavlovian contingencies upon behavioral momentum using auditory versus visual stimuli. Journal of the Experimental Analysis of Behavior, 65:389-399.
Mazur, J. E. (1987). An adjusting procedure for studying delayed reinforcement. In M. L. Commons, J. E. Mazur, J. A. Nevin, & H. Rachlin (Eds.) Quantitative analyses of behavior: Vol. 5. The effect of delay and intervening events on reinforcement value (pp. 55-73). Hillsdale, NJ: Erlbaum.
Mazur, J. E., & Logue, A. W. (1978). Choice in a "self-control" paradigm: Effects of a fading procedure. Journal of the Experimental Analysis of Behavior, 30:11-17.
McDowell, J. J (1982). The importance of Herrnstein's mathematical statement of the law of effect for behavior therapy. American Psychologist, 37:771-779.
McLean, A. P., Campbell-Tie, P., & Nevin, J. A. (1996). Resistance to change as a function of stimulus-reinforcer and location-reinforcer contingencies. Journal of the Experimental Analysis of Behavior, 66:169-191.
McSweeney, F. K. (1992). Rate of reinforcement and session duration as determinants of within-session patterns of responding. Animal Learning and Behavior, 20:160-169.
Mellon, R. C., & Shull, R. L. (1986). Resistance to change produced by access to fixed-delay versus variable-delay terminal links. Journal of the Experimental Analysis of Behavior, 46:79-92.
Millenson, J. R., & de Villiers, P. A. (1972). Motivational properties of conditioned anxiety. In R. M. Gilbert & J. R. Millenson (Eds.) Reinforcement: Behavioral analyses (pp. 98-128). New York: Academic Press.
Morse, W. H. (1966). Intermittent reinforcement. In W. K. Honig (Ed.), Operant behavior: Areas of research and application (pp. 52-108). New York: Appleton-Century-Crofts.
Nation, J. R., & Woods, D. J. (1980). Persistence: The role of partial reinforcement in psychotherapy. Journal of Experimental Psychology: General, 109:175-207.
Neuringer, A. J. (1967). Effects of reinforcement magnitude on choice and rate of responding. Journal of the Experimental Analysis of Behavior, 10:417-424.
Neuringer, A. J. (1969). Delayed reinforcement versus reinforcement after a fixed interval. Journal of the Experimental Analysis of Behavior, 12:375-383.
Nevin, J. A. (1974). Response strength in multiple schedules. Journal of the Experimental Analysis of Behavior, 21:389-408.
Nevin, J. A. (1979). Reinforcement schedules and response strength. In M. D. Zeiler & P. Harzem (Eds.) Reinforcement and the organization of behaviour (pp. 117-158). Chichester, England: Wiley.
Nevin, J. A. (1984). Pavlovian determiners of behavioral momentum. Animal Learning and Behavior, 12:363-370.
Nevin, J. A. (1988). Behavioral momentum and the partial reinforcement effect. Psychological Bulletin, 103:44-56.
Nevin, J. A. (1992a). Behavioral contrast and behavioral momentum. Journal of Experimental Psychology: Animal Behavior Processes, 18:126-133.
Nevin, J. A. (1992b). An integrative model for the study of behavioral momentum. Journal of the Experimental Analysis of Behavior, 57:301-316.
Nevin, J. A. (1996a). The momentum of compliance. Journal of Applied Behavior Analysis, 29:535-547.
Nevin, J. A. (1996b). Stimulus factors in addiction. Behavioral and Brain Sciences, 19:590-591.
Nevin, J. A., & Grace, R. C. (1999, in press). Does the context of reinforcement affect resistance to change? Journal of Experimental Psychology: Animal Behavior Processes.
Nevin, J. A., Mandell, C., & Atak, J. R. (1983). The analysis of behavioral momentum. Journal of the Experimental Analysis of Behavior, 39:49-59.
Nevin, J. A., Mandell, C., & Yarensky, P. (1981). Response rate and resistance to change in chained schedules. Journal of Experimental Psychology: Animal Behavior Processes, 7:278-294.
Nevin, J. A., Smith, L. D., & Roberts, J. (1987). Does contingent reinforcement strengthen operant behavior? Journal of the Experimental Analysis of Behavior, 48:17-33.
Nevin, J. A., Tota, M. E., Torquato, R. D., & Shull, R. L. (1990). Alternative reinforcement increases resistance to change: Pavlovian or operant contingencies? Journal of the Experimental Analysis of Behavior, 53:359-379.
Pavlov, I. P. (1927). Conditioned reflexes (G. V. Anrep, trans.) Oxford University Press.
Postman, L. (1947). The history and present status of the Law of Effect. Psychological Bulletin, 44:489-563.
Rachlin, H. (1995). Self-control: Beyond commitment. Behavioral and Brain Sciences, 18:109-121.
Rescorla, R. A., & Skucy, J. C. (1969). Effect of response-independent reinforcers during extinction. Journal of Comparative and Physiological Psychology, 67:381-389.
Roberts, J. E., Tarpy, R. M., & Lea, S. E. G. (1984). Stimulus-response overshadowing: Effects of signaled reward on instrumental responding as measured by response rate and resistance to change. Journal of Experimental Psychology: Animal Behavior Processes, 10:244-255.
Robins, L. N., Helzer, J. E., Hesselbrock, M., & Wish, E. D. (1977). Vietnam veterans three years after Vietnam: How our study changed our view of heroin. In L. Harris (Ed.), Problems of drug dependence. Richmond, VA: Committee on Problems of Drug Dependence.
Schwartz, B., & Gamzu, E. (1977). Pavlovian control of operant behavior. In W. K. Honig & J. E. R. Staddon (Eds.) Handbook of operant behavior (pp. 53-97). Englewood Cliffs, NJ: Prentice-Hall.
Seligman, M. E. P. (1970). On the generality of the laws of learning. Psychological Review, 77:406-418.
Sherrington, C. S. (1906). The integrative action of the nervous system. Yale University Press.
Shettleworth, S., & Nevin, J. A. (1965). Relative rate of responding and relative magnitude of reinforcement in multiple schedules. Journal of the Experimental Analysis of Behavior, 8:199-202.
Skinner, B. F. (1938). The behavior of organisms. New York: Appleton-Century-Crofts.
Skinner, B. F. (1969). Contingencies of reinforcement: A theoretical analysis. New York: Appleton-Century-Crofts.
Smith, J. B. (1974). Effects of response rate, reinforcement frequency, and the duration of a stimulus preceding response-independent food. Journal of the Experimental Analysis of Behavior, 21:215-221.
Smith, K. (1974). The continuum of reinforcement and attenuation. Behaviorism, 2:124-145.
Tomie, A. (1976). Interference with autoshaping by prior context conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 2:323-334.
Thorndike, E. L. (1911). Animal intelligence: An experimental study of the associative processes in animals. Psychological Review Monograph Supplement, 2:Whole No. 8.
Thorndike, E. L. (1913). Educational psychology: Volume 2: The psychology of learning. New York: Teachers College Press, Columbia University.
Tota-Faucette, M. E. (1991). Alternative reinforcement and resistance to change. Unpublished doctoral dissertation, University of North Carolina at Greensboro.
Vaughan, W. Jr., & Miller, H. L. Jr. (1984). Optimization versus response strength accounts of behavior. Journal of the Experimental Analysis of Behavior, 42:337-348.
vom Saal, W. (1972). Choice between stimuli previously presented separately. Learning and Motivation, 3:209-222.
Williams, B. A. (1981). The following schedule of reinforcement as a fundamental determinant of steady state contrast in multiple schedules. Journal of the Experimental Analysis of Behavior, 35:293-310.
Williams, B. A. (1988). Reinforcement, choice, and response strength. In R. C. Atkinson, R. J. Herrnstein, G. Lindzey, & R. D. Luce (Eds.) Stevens' handbook of experimental psychology, 2nd edition, Vol. 2 (pp. 167-244). New York: Wiley.
Williams, B. A. (1991). Behavioral contrast and reinforcement value. Animal Learning and Behavior, 19:337-344.
Williams, D. R., & Williams, H. (1969). Auto-maintenance in the pigeon: Sustained pecking despite contingent non-reinforcement. Journal of the Experimental Analysis of Behavior, 12:511-520.
Preparation of this article was supported by National Science Foundation Grant IBN-9507584. We thank the reviewers, especially Peter Killeen, for their thoughtful comments.
Williams (1991) performed a related experiment in which stimuli signaling the two identical schedules were presented simultaneously in occasional choice probe tests. He found that his subjects made more choice-probe responses to the stimulus preceding the richer schedule than to the stimulus preceding the leaner schedule even though baseline response rate in the component preceding the richer schedule was lower. As will be seen in Section 8.3, the agreement between relative resistance in Nevin et al. (1987) and probe choice in Williams (1991) may exemplify the general correlation between resistance and preference.
We recognize that the analogy between velocity and response rate is inexact, in that velocity is a vector that measures direction as well as distance per unit time. Nevertheless, the analogy to number of repeatable free-operant responses per unit time is suggestive, and may be especially helpful in applied behavior analyses, as suggested by Nevin (1996a).
Nevin and Grace (1999, in press) have recently used a version of this model to interpret differences between resistance to prefeeding and resistance to extinction in their data.