To be published in Behavioral and Brain Sciences (in press)

© Cambridge University Press 2007

 

 


Below is the unedited, uncorrected final draft of a BBS target article that has been accepted for publication. This preprint has been prepared for potential commentators who wish to nominate themselves for formal commentary invitation. Please DO NOT write a commentary until you receive a formal invitation. If you are invited to submit a commentary, a copyedited, corrected version of this paper will be posted.

 

 


Base-rate Respect: From Ecological Rationality to Dual Processes

 

Running head: Base-rate respect

 

Aron K. Barbey                               

Department of Psychology       

Emory University                              

Atlanta, GA 30322                   

(404) 727-7386                               

abarbey@emory.edu

http://www.psychology.emory.edu/cognition/abarbey/index.html         

                                                                                   

Steven A. Sloman

Cognitive and Linguistic Sciences

Brown University, Box 1978

Providence, RI 02912

(401) 863-7595

Steven_Sloman@brown.edu

http://www.cog.brown.edu/~sloman/

 

Abstract: The phenomenon of base-rate neglect has elicited much debate.  One arena of debate concerns how people make judgments under conditions of uncertainty.  Another more controversial arena concerns human rationality.  In this paper, we attempt to unpack the perspectives in the literature on both kinds of issues and evaluate their ability to explain existing data and their conceptual coherence.  We will conclude that the best account of the data should be framed in terms of a dual-process model of judgment that attributes base-rate neglect to associative judgment strategies that fail to adequately represent the set structure of the problem.  Base-rate neglect is reduced when problems are presented in a format that affords accurate representation in terms of nested sets of individuals.

 

Keywords:  Base-rate neglect, Probability judgment, Bayesian reasoning, Dual process theory, Nested set hypothesis

 

 

1.0.  Introduction

Diagnosing whether a patient has a disease, predicting whether a defendant is guilty of a crime, and other everyday as well as life-changing decisions in part reflect the decision-maker’s subjective degree of belief in uncertain events.  Intuitions about probability frequently deviate dramatically from the dictates of probability theory (e.g., Gilovich et al., 2002).  One form of deviation is notorious:  People’s tendency to neglect base-rates in favor of specific case data.  A number of theorists (e.g., Cosmides & Tooby, 1996; Brase, 2002a; Gigerenzer & Hoffrage, 1995) have argued that such neglect reveals little more than experimenters’ failure to ask about uncertainty in a form that naïve respondents can understand, specifically in the form of a question about natural frequencies.  The brunt of our argument will be that this perspective is far too narrow.  After surveying the theoretical perspectives on the issue, we will show that both data and conceptual considerations demand that judgment be understood in terms of dual processing systems, one that is responsible for systematic error and another that is capable of reasoning not just about natural frequencies, but about relations among any kind of set representation.

Base-rate neglect has been extensively studied in the context of Bayes’ theorem, which provides a normative standard for updating the probability of a hypothesis in light of new evidence.  Research has evaluated the extent to which intuitive probability judgment conforms to the theorem by employing a Bayesian inference task in which the respondent is presented a word problem and has to infer the probability of a hypothesis (e.g., the presence versus absence of breast cancer) on the basis of an observation (e.g., a positive mammography).  Consider the following Bayesian inference problem motivated by Eddy (1982; cf. Gigerenzer & Hoffrage, 1995):

The probability of breast cancer is 1% for a woman at age forty who participates in routine screening [base-rate].  If a woman has breast cancer, the probability is 80% that she will get a positive mammography [hit-rate].  If a woman does not have breast cancer, the probability is 9.6% that she will also get a positive mammography [false-alarm rate].  A woman in this age group had a positive mammography in a routine screening.  What is the probability that she actually has breast cancer? __%


According to Bayes’ theorem[1], the probability that the patient has breast cancer given that she has a positive mammography is 7.8 per cent.  Evidence that people’s judgments on this problem accord with Bayes’ theorem would be consistent with the claim that the mind embodies a calculus of probability, whereas the lack of such a correspondence would demonstrate that people’s judgments can be at variance with sound probabilistic principles and, as a consequence, that people can be led to make incoherent decisions (Savage, 1954; Ramsey, 1964).  Thus, the extent to which intuitive probability judgment conforms to the normative prescriptions of Bayes’ theorem has implications for the nature of human judgment (for a review of the theoretical debate on human rationality, see Stanovich, 1999).  In the case of Eddy’s study, fewer than 5 per cent of the respondents generated the Bayesian solution.
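The 7.8 per cent figure can be checked with a few lines of arithmetic.  The following is a minimal sketch, assuming only the three quantities stated in the mammography problem; the function and parameter names are hypothetical, chosen for illustration.

```python
def posterior_probability(base_rate, hit_rate, false_alarm_rate):
    """Return P(H | D) by Bayes' theorem for a binary hypothesis H
    and a positive observation D.  (Illustrative helper, hypothetical name.)"""
    # P(H and D): the hypothesis is true and the observation is positive
    true_positive = base_rate * hit_rate
    # P(not-H and D): the hypothesis is false but the observation is positive
    false_positive = (1.0 - base_rate) * false_alarm_rate
    # P(H | D) = P(H and D) / P(D)
    return true_positive / (true_positive + false_positive)


# Values taken from the mammography problem above.
posterior = posterior_probability(base_rate=0.01, hit_rate=0.80, false_alarm_rate=0.096)
print(f"{posterior:.1%}")  # 7.8%
```

The numerator is 0.01 × 0.80 = 0.008 and the denominator is 0.008 + 0.99 × 0.096 = 0.10304, so the posterior is far closer to the 1 per cent base-rate than to the 80 per cent hit-rate.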


Early studies evaluating Bayesian inference under single-event probabilities also showed systematic deviations from Bayes’ theorem.  Hammerton (1973), for example, found that only 10 per cent of the physicians tested generated the Bayesian solution, with the median response approximating the hit-rate of the test.  Similarly, Casscells, Schoenberger, and Graboys (1978) and Eddy (1982) found that a low proportion of respondents generated the Bayesian solution:  18 per cent in the former and 5 per cent in the latter, with the modal response in each study corresponding to the hit-rate of the test.  All of this suggests that the mind does not normally reason in a way consistent with the laws of probability theory.

 

1.1.   Base-rate facilitation

However, this conclusion has not been drawn universally.  Eddy’s (1982) problem concerned a single event, the probability that a particular woman has breast cancer.  In some problems, when probabilities that refer to the chances of a single event occurring (e.g., 1%) are reformulated and presented in natural frequency formats (e.g., 10 out of 1,000), people more often produce probability estimates that conform to Bayes’ theorem.  Consider the following mammography problem presented in a natural frequency format by Gigerenzer and Hoffrage (1995).

10 out of every 1,000 women at age forty who participate in routine screening have breast cancer [base-rate].  8 out of every 10 women with breast cancer will get a positive mammography [hit-rate].  95 out of every 990 women without breast cancer will also get a positive mammography [false-alarm rate].  Here is a new representative sample of women at age forty who got a positive mammography in routine screening.  How many of these women do you expect to actually have breast cancer?  ___ out of ___.

The proportion of responses conforming to Bayes’ theorem increased by a factor of about three in this case:  46 per cent under the natural frequency format versus 16 per cent under the single-event probability format.  The observed facilitation has motivated researchers to argue that coherent probability judgment depends on representing events in the form of natural frequencies (e.g., Cosmides & Tooby, 1996; Brase, 2002a; Gigerenzer & Hoffrage, 1995).
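The calculation that the natural frequency version invites can be sketched as follows.  This is an illustration under the stated problem values only; the function and argument names are hypothetical.  Note that the base-rate figure (10 out of 1,000) never enters the computation explicitly.

```python
from fractions import Fraction


def bayesian_answer(positives_with_cancer, positives_without_cancer):
    """Return the share of positive cases in which the hypothesis is true.
    (Illustrative helper, hypothetical name.)"""
    total_positives = positives_with_cancer + positives_without_cancer
    # Fraction keeps the answer in "k out of n" form; note that it reduces
    # to lowest terms (8/103 is already in lowest terms).
    return Fraction(positives_with_cancer, total_positives)


# Values taken from the natural frequency problem above.
ratio = bayesian_answer(positives_with_cancer=8, positives_without_cancer=95)
print(f"{ratio.numerator} out of {ratio.denominator}")  # 8 out of 103
print(f"{float(ratio):.1%}")  # 7.8%
```

The answer, 8 out of 103, matches the 7.8 per cent obtained from the single-event probability version of the problem.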

Cosmides and Tooby (1996) also conducted a series of experiments that employed Bayesian inference problems that had previously elicited judgmental errors under single-event probability formats.  In Experiment 1, they replicated Casscells et al. (1978), demonstrating that only 12 per cent of their respondents produced the Bayesian answer when presented single-event probabilities.  Cosmides and Tooby then transformed the single-event probabilities into natural frequencies, resulting in a remarkably high proportion of Bayesian responses:  72 per cent of respondents generated the Bayesian solution, supporting the authors’ conclusion that Bayesian inference depends on the use of natural frequencies.

Gigerenzer (1996b) explored whether physicians, who frequently assess and diagnose medical illness, would demonstrate the same pattern of judgments as clinically untrained college undergraduates.  Consistent with the judgments drawn by college students (e.g., Gigerenzer & Hoffrage, 1995), Gigerenzer found that the sample of 48 physicians tested generated the Bayesian solution in only 10 per cent of the cases under single-event probability formats whereas 46 per cent did with natural frequency formats.  Physicians spent about 25 per cent more time on the single-event probability problems, suggesting that they found these problems more difficult to solve than problems presented in a natural frequency format.  Thus, the physicians’ judgments were consistent with those of non-physicians, suggesting that formal training in medical diagnosis does not lead to more accurate Bayesian reasoning and that natural frequencies facilitate probabilistic inference across populations.

Further studies have demonstrated that the facilitatory effect of natural frequencies on Bayesian inference observed in the laboratory has the potential for improving the predictive accuracy of professionals in important real-world settings.  Gigerenzer and his colleagues have shown, for example, that natural frequencies facilitate Bayesian inference in AIDS counseling (Gigerenzer et al., 1998), in the assessment of statistical information by judges (Lindsey et al., 2003), and in teaching Bayesian reasoning to college undergraduates (Sedlmeier & Gigerenzer, 2001; Kurzenhäuser & Hoffrage, 2002).  In summary, the reviewed findings demonstrate facilitation in Bayesian inference when single-event probabilities are translated into natural frequencies, consistent with the view that coherent probability judgment depends on natural frequency representations.

 

1.2.    Theoretical accounts

Explanations of facilitation in Bayesian inference can be grouped into five types that can be arrayed along a continuum of cognitive control, from accounts that ascribe facilitation to processes that have little to do with strategic cognitive processing to those that appeal to general-purpose reasoning procedures.  The five accounts we discuss can be contrasted at the coarsest level on five dimensions (see Table 1).  We do not claim that theorists have consistently made these distinctions in the past, only that these distinctions are in fact appropriate ones.

 

Table 1.  Prerequisites for reduction of base-rate neglect according to 5 theoretical frameworks.

 

 

Columns:  (1) Mind as Swiss army knife;  (2) Natural frequency algorithm;  (3) Natural frequency heuristic;  (4) Non-evolutionary natural frequency heuristic;  (5) Nested sets and dual processes

Prerequisite                                                        (1)   (2)   (3)   (4)   (5)

Cognitive impenetrability                                            X     -     -     -     -
Informational encapsulation                                          X     X     -     -     -
Appeal to evolution                                                  X     X     X     -     -
Cognitive process uniquely sensitive to natural frequency formats    X     X     X     X     -
Transparency of nested set relations                                 X     X     X     X     X

 

Note.  The prerequisites of each theory are indicated by an ‘X’.

 

A parallel taxonomy for theories of categorization can be found in Sloman, Lombrozo, and Malt (in press).  We briefly introduce the theoretical frameworks here.  The discussion of each will be elaborated as required to reveal assumptions and derive predictions in the following sections in order to compare and contrast them.

 

1.2.1.  Mind as Swiss army knife

Several theorists have argued that the human mind consists of a number of specialized modules (Cosmides & Tooby, 1995; Gigerenzer & Selten, 2001).  Each module is assumed to be unavailable to conscious awareness or deliberate control (cognitively impenetrable) and able to process only a specific type of information (informationally encapsulated; see Fodor, 1983).  One module in particular is designed to process natural frequencies.  This module is thought to have evolved because natural frequency information is what was available to our ancestors in the environment of evolutionary adaptiveness.  On this view, facilitation occurs because natural frequency data are processed by a computationally effective processing module.


Two arguments have been advanced in support of the ecological validity of natural frequency data.  First, as natural frequency information is acquired it can be “easily, immediately, and usefully incorporated with past frequency information via the use of natural sampling, which is the method of counting occurrences of events as they are encountered and storing the resulting knowledge base for possible use later” (Brase, 2002, p. 384).  Second, information stored in a natural frequency format preserves the sample size of the reference class (e.g., 10 out of 1,000 women have breast cancer) and is arranged into subset relations (e.g., of the 10 women that have breast cancer, 8 are positively diagnosed) that indicate how many cases of the total sample there are in each subcategory (i.e., the base-rate, the hit-rate, and false-alarm rate).  Because natural frequency formats entail the sample and effect sizes, posterior probabilities consistent with Bayes’ theorem can be calculated without explicitly incorporating base-rates, thereby allowing simple calculations[2] (Kleiter, 1994).  Thus proponents of this view argue that the mind has evolved to process natural frequency formats over single-event probabilities and, in particular, includes a cognitive module that “maps frequentist representations of prior probabilities and likelihoods onto a frequentist representation of a posterior probability in a way that satisfies the constraints of Bayes’ theorem” (Cosmides & Tooby, 1996, p. 60).


Theorists who take this position uniformly motivate their hypothesis by appeal to a process of natural selection.  However, the cognitive and evolutionary claims are in fact conceptually independent.  The mind could consist of cognitively impenetrable and informationally encapsulated modules whether or not any or all of those modules evolved for the specific reasons offered.

 

1.2.2.  Natural frequency algorithm

A weaker claim is that the mind includes a specific algorithm for effectively processing natural frequency information (Gigerenzer & Hoffrage, 1995).  Unlike the mind-as-Swiss-army-knife view, this hypothesis makes no general claim about the architecture of mind.  Despite this difference in scope, these theories adopt the same computational and evolutionary commitments. 

Consistent with the mind-as-Swiss-army-knife view, this approach proposes that coherent probability judgment derives from a simplified form of Bayes’ theorem.  The proposed algorithm computes the number of cases where the hypothesis and observation co-occur, N(H and D), out of the total number of cases where the observation occurs, N(H and D) + N(not-H and D) = N(D) (Kleiter, 1994; Gigerenzer & Hoffrage, 1995).  Because this form of Bayes’ theorem expresses a simple ratio of frequencies, we refer to it as “the Ratio.”
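The relation between the Ratio and the standard form of Bayes’ theorem can be written out explicitly.  The following is a sketch in notation introduced here for illustration only:  N denotes the total number of cases in the sample, P(·) denotes the corresponding single-event probabilities, and the frequencies are assumed to reflect those probabilities exactly.

```latex
% The Ratio: cases where hypothesis and observation co-occur,
% out of all cases where the observation occurs.
\[
P(H \mid D) \;=\; \frac{N(H \text{ and } D)}{N(D)}
\;=\; \frac{N(H \text{ and } D)}{N(H \text{ and } D) + N(\text{not-}H \text{ and } D)}
\]

% With N total cases (assumed symbol):
%   N(H and D)     = N P(H) P(D | H)
%   N(not-H and D) = N P(not-H) P(D | not-H)
% N cancels, leaving the standard form of Bayes' theorem:
\[
P(H \mid D) \;=\;
\frac{P(H)\,P(D \mid H)}
     {P(H)\,P(D \mid H) + P(\text{not-}H)\,P(D \mid \text{not-}H)}
\]

% Mammography problem in natural frequencies (Gigerenzer & Hoffrage, 1995):
\[
P(H \mid D) \;=\; \frac{8}{8 + 95} \;=\; \frac{8}{103} \;\approx\; 0.078
\]
```

In the frequency form the base-rate is carried implicitly by the counts themselves, which is why no separate base-rate term appears in the Ratio.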

Following the mind-as-Swiss-army-knife view, proponents of this approach have ascribed the origin of the Bayesian ratio to evolution.  Gigerenzer and Hoffrage (1995, p. 686), for example, state “The evolutionary argument that cognitive algorithms were designed for frequency information, acquired through natural sampling, has implications for the computations an organism needs to perform when making Bayesian inferences….  Bayesian algorithms are computationally simpler when information is encoded in a frequency format rather than a standard probability format.”  As a consequence, this view predicts that “Performance on frequentist problems will satisfy some of the constraints that a calculus of probability specifies, such as Bayes’ rule.  This would occur because some inductive reasoning mechanisms in our cognitive architecture embody aspects of a calculus of probability” (Cosmides & Tooby, 1996, p. 17).

The proposed algorithm is necessarily informationally encapsulated, as it operates on a specific information format, natural frequencies, but it is not necessarily cognitively impenetrable, as no one has claimed that other cognitive processes cannot affect or use the algorithm’s computations.  The primary motivation for the existence of this algorithm has been computational (Kleiter, 1994; Gigerenzer & Hoffrage, 1995).  As reviewed above, the value of natural frequencies is that these formats entail the sample and effect sizes and, as a consequence, simplify the calculation of Bayes’ theorem:  Probability judgments are coherent with Bayesian prescriptions even without explicit consideration of base-rates.

 

1.2.3.  Natural frequency heuristic

A claim that puts facilitation under more cognitive control is that people use heuristics to make judgments (Tversky & Kahneman, 1974; Gigerenzer & Selten, 2001) and that the Ratio is one such heuristic (Gigerenzer, Todd & the ABC research group, 1999).  According to this view, “heuristics can perform as well, or better, than algorithms that involve complex computations….  The astonishingly high accuracy of these heuristics indicates their ecological rationality; fast and frugal heuristics exploit the statistical structure of the environment, and they are adapted to this structure” (Gigerenzer, 2006).  Advocates of this approach motivate the proposed heuristic by pointing to the ecological validity of natural frequency formats.  As Gigerenzer further states (p. 52), “To evaluate the performance of the human mind, one needs to look at its environment and, in particular, the external representation of the information.  For most of the time during which the human mind evolved, information was encountered in the form of natural frequencies…”  Thus, this view proposes that the mind evolved to process natural frequencies and that this evolutionary adaptation gave rise to the proposed heuristic that computes the Bayesian Ratio from natural frequencies.

 

1.2.4.  Non-evolutionary natural frequency heuristic

Evolutionary arguments about the ecological validity of natural frequency representations provide part of the motivation for the preceding theories.  In particular, proponents of the theories argue that throughout the course of human evolution natural frequencies were acquired via natural sampling (i.e., encoding event frequencies as they are encountered and storing them in the appropriate reference class). 

In contrast, the non-evolutionary natural frequency theory proposes that natural sampling is not necessarily an evolved procedure for encoding statistical regularities in the environment, but a useful sampling method that, one way or another, people can appreciate and use.  The natural frequency representations that result from natural sampling, on this view, simplify the calculation of Bayes’ theorem and, as a consequence, facilitate Bayesian inference (Kleiter, 1994).  Thus, this view differs from the preceding accounts by resting on a purely computational argument that is independent of any commitments to which cognitive processes have been selected for by evolution. 

This theory proposes that the computational simplicity afforded by natural frequencies gives rise to a heuristic that computes the Bayesian Ratio from natural frequencies.  The proposed heuristic implies a higher degree of cognitive control than the modular algorithms proposed by the preceding accounts.

 

1.2.5.  Nested sets and dual processes

The most extreme departure from the modular view claims that facilitation is a product of general-purpose reasoning processes (Evans et al., 2000; Fox & Levav, 2004; Girotto & Gonzalez, 2001; Johnson-Laird et al., 1999; Kahneman & Frederick, 2002, 2005; Over, 2003; Sloman et al., 2003).  On this view, people use two systems to reason (Evans & Over, 1996; Kahneman & Frederick, 2002, 2005; Sloman, 1996; Stanovich & West, 2000).  These are often called Systems 1 and 2, but in an effort to use more expressive labels we will employ Sloman’s terms “associative” and “rule-based.”

The dual-process model attributes responses based on associative principles like similarity or retrieval from memory to a primitive associative judgment system.  It attributes responses based on more deliberative processing that involves working memory, such as the elementary set operations that respect the logic of set inclusion and facilitate Bayesian inference, to a second, rule-based system.  Judgmental errors produced by cognitive heuristics are generated by associative processes, whereas the induction of a representation of category instances that makes nested set relations transparent also induces use of rules about elementary set operations, operations of the sort perhaps described by Fox and Levav (2004) or Johnson-Laird et al. (1999).

According to this theory, base-rate neglect results from associative responding and facilitation occurs when people correctly use rules to make the inference.  Rule-based inference is more cognitively demanding than associative inference and is therefore more likely to occur when participants have more time, more incentives, or more external aids to make a judgment and are under fewer other demands at the moment of judgment.  It is also more likely for people who have greater skill employing the relevant rules.  This last prediction is supported by Stanovich and West (2000), who find correlations between intelligence and use of base-rates.

Rules are effective devices for solving a problem to the extent that the problem is represented in a way compatible with the rules.  For example, long division is an effective method for solving division problems but only if numbers are represented using Arabic numerals; division with Roman numerals requires different rules.  By analogy, the reason that natural frequencies facilitate use of base-rates on this view is that the rules that people have access to and are able to use to solve the specific kind of problem studied in the base-rate neglect literature are more compatible with natural frequency formats than single-event probability formats.

Specifically, people are adept at using rules consisting of simple elementary set operations.  But these operations are only applicable when problems are represented in terms of sets, as opposed to single events.  According to this view, facilitation in Bayesian inference occurs under natural frequencies because these formats are an effective cue to the representation of the set structure underlying a Bayesian inference problem.  This is the nested sets hypothesis of Tversky & Kahneman (1983).  On this view, natural frequency formats prompt the respondent to adopt an outside view by inducing a representation of category instances (e.g., 10 out of 1,000 women have breast cancer) that reveals the set structure of the problem and makes the nested set relations transparent for problem solving[3].  We refer to this hypothesis as the nested sets theory (Ayton & Wright, 1994; Evans et al., 2000; Fox & Levav, 2004; Girotto & Gonzalez, 2001, 2002; Johnson-Laird et al., 1999; Kahneman & Tversky, 1983; Macchi, 2000; Mellers & McGraw, 1999; Sloman et al., 2003).  Unlike the other theories, it predicts that facilitation should be observable in a variety of different tasks, not just posterior probability problems, when nested set relations are made transparent. 


 

2.0.  Overview of empirical and conceptual issues reviewed

We now turn to an evaluation of these five theoretical frameworks, considering a range of empirical and conceptual issues that bear on their validity.

 

2.0.  Review of empirical literature

The theories are evaluated with respect to the empirical predictions summarized in Table 2.  The predictions of each theory derive from (i) the degree of cognitive control attributed to probability judgment (see Table 1), and (ii) the proposed cognitive operations that underlie estimates of probability. 

Theories that adopt a low degree of cognitive control — proposing cognitively impenetrable modules or informationally encapsulated algorithms — restrict Bayesian inference to contexts that satisfy the assumptions of the processing module or algorithm.  In contrast, theories that adopt a high degree of cognitive control — appealing to a natural frequency heuristic or a domain general capacity to perform set operations — predict Bayesian inference in a wider range of contexts.  The latter theories are distinguished from one another in terms of the cognitive operations they propose:  The evolutionary and non-evolutionary natural frequency heuristics depend on structural features of the problem like question form and reference class.  They imply the accurate encoding and comprehension of natural frequencies and an accurate weighting of the encoded event frequencies to calculate the Bayesian ratio.  In contrast, the nested sets theory does not rely on natural frequencies and instead predicts facilitation in Bayesian inference, and in a range of other deductive and inductive reasoning tasks, when the set structure of the problem is made transparent, thereby promoting use of elementary set operations and inferences about the logical (i.e., extensional) properties they entail. 

 

Table 2.  Empirical predictions of the five theoretical frameworks.

 

 

Columns:  (1) Mind as Swiss army knife;  (2) Natural frequency algorithm;  (3) Natural frequency heuristic;  (4) Non-evolutionary natural frequency heuristic;  (5) Nested sets and dual processes

Prediction                                                                                              (1)   (2)   (3)   (4)   (5)

Facilitation with natural frequencies (information format and judgment domain)                           X     X     X     X     X
Facilitation with questions that prompt the respondent to compute the Bayesian ratio (question form)     -     -     X     X     X
Facilitation with statistical information organized in a partitive structure (reference class)           -     -     X     X     X
Facilitation with diagrammatic representations that highlight the set structure of the problem           -     -     X     X     X
Inaccurate frequency judgments                                                                           -     -     -     -     X
Equivalent comprehension of natural frequencies and single-event probabilities                           -     -     -     -     X
Non-normative weighting of likelihood ratio and prior odds                                               -     -     -     -     X
Facilitation with set representations in deductive and inductive reasoning                               -     -     -     -     X

 

Note.  The predictions of each theory are indicated by an ‘X.’ 

 

2.1.    Information format and judgment domain

The preceding review of the literature found that natural frequency formats consistently reduced base-rate neglect relative to probability formats.  However, the size of this effect varied considerably across studies (see Table 3).

Cosmides and Tooby (1996), for example, observed a difference of 60 percentage points between the proportions of Bayesian responses under natural frequencies versus single-event probabilities, whereas Gigerenzer and Hoffrage (1995) reported a difference only half that size.  The wide variability in the size of the effects makes it clear that in no sense do natural frequencies eliminate base-rate neglect, though they do reduce it.
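The variability can be made concrete by computing the frequency-minus-probability difference for each study that reports both conditions.  The sketch below simply restates the percentages listed in Table 3; the variable names are hypothetical, and the differences are descriptive only (no significance test is implied).

```python
# Percent Bayesian responses as (probability format, frequency format),
# copied from Table 3 for studies that report both conditions.
percent_bayesian = {
    "Cosmides & Tooby (1996; Exp. 2)": (12, 72),
    "Evans et al. (2000; Exp. 1)": (24, 35),
    "Gigerenzer (1996)": (10, 46),
    "Gigerenzer & Hoffrage (1995)": (16, 46),
    "Macchi (2000)": (6, 40),
    "Sloman et al. (2003; Exp. 1)": (20, 51),
}

# Facilitation in percentage points: frequency minus probability.
facilitation = {
    study: frequency - probability
    for study, (probability, frequency) in percent_bayesian.items()
}

# Print studies from largest to smallest facilitation.
for study, points in sorted(facilitation.items(), key=lambda item: item[1], reverse=True):
    print(f"{study}: {points} percentage points")
```

On these figures the facilitation ranges from 11 percentage points (Evans et al., 2000) to 60 (Cosmides & Tooby, 1996).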

Sloman, Over, Slovak, and Stibel (2003) conducted a series of experiments that attempted to replicate the effect sizes observed by the previous studies (e.g., Cosmides & Tooby, 1996, Experiment 2, Condition 1).  Although Sloman et al. found facilitation with natural frequencies, the size of the effect was smaller than that observed by Cosmides and Tooby:  The percentage of Bayesian solutions generated under single-event probabilities (20%) was comparable to that of Cosmides and Tooby (12%), but the percentage of Bayesian answers generated under natural frequencies was smaller (51% for Sloman et al. versus 72% for Cosmides and Tooby).  In a further replication, Sloman et al. found that only 31 per cent of their respondents generated the Bayesian solution, a statistically non-significant advantage for natural frequencies.

 

Table 3.  Percent correct for Bayesian inference problems reported in the literature (sample sizes in parentheses)

 

                                            Information format and judgment domain

Study                                       Probability        Frequency

Casscells et al. (1978)                     18 (60)            ---
Cosmides & Tooby (1996; Exp. 2)             12 (25)            72 (25)
Eddy (1982)                                 5 (100)            ---
Evans et al. (2000; Exp. 1)                 24 (42)            35 (43)‡
Gigerenzer (1996b)                          10 (48)            46 (48)
Gigerenzer & Hoffrage (1995)                16 (30)            46 (30)
Macchi (2000)                               6 (30)             40 (30)
Sloman et al. (2003; Exp. 1)                20 (25)            51 (45)
Sloman et al. (2003; Exp. 1b)               ---                31 (48)‡

 

Note.  Probability problems require that the respondent compute a conditional-event probability from data presented in a non-partitive form, whereas frequency problems include questions that prompt the respondent to evaluate the two terms of the Bayesian ratio and present data that is partitioned into these components.

‡ p > 0.05

 

Evans, Handley, Perham, Over, and Thompson (2000; Experiment 1) similarly found only a small effect of information format.  They report 24 per cent Bayesian solutions under single-event probabilities and 35 per cent under natural frequencies, a difference that was not reliable.

Brase, Fiddick, and Harries (in press) examined whether methodological factors contribute to the observed variability in effect size.  They identified two factors that modulate the facilitatory effect of natural frequencies in Bayesian inference: (1) the academic selectivity of the university the participants attend, and (2) whether or not the experiment offered a monetary incentive for participation.  Experiments whose participants attended a top-tier national university and were paid reported a significantly higher proportion of Bayesian responses (e.g., Cosmides & Tooby, 1996) than experiments whose participants attended a second-tier regional university and were not paid (e.g., Brase et al., in press, Experiments 3 and 4).  These results suggest that a higher proportion of Bayesian responses is observed in experiments that (a) select participants with a higher level of general intelligence, as indexed by the academic selectivity of the university the participant attends (Stanovich & West, 1998), and (b) increase motivation by providing a monetary incentive.  The former observation is consistent with the view that Bayesian inference depends on domain general cognitive processes to the degree that intelligence is domain general.  The latter suggests that Bayesian inference is strategic, and not supported by automatic (e.g., modularized) reasoning processes.

 

2.2.  Question form

One methodological factor that may mediate the effect of problem format is the form of the Bayesian inference question presented to participants (Girotto & Gonzalez, 2001).  The Bayesian solution expresses the ratio between the size of the subset of cases where the hypothesis and observation co-occur and the total number of observations.  Thus, it follows that the respondent should be more likely to arrive at this solution when prompted to adopt an outside view by utilizing the sample of category instances presented in the problem (e.g., “Here is a new sample of patients who have obtained a positive test result in routine screening.  How many of these patients do you expect to actually have the disease? ___ out of ___”) than when given a question that presents information about category properties (e.g., “… Pierre has a positive reaction to the test…”) and prompts the respondent to adopt an inside view by considering the fact about Pierre to compute a probability estimate.  As a result, the form of the question should modulate the observed facilitation.

In the preceding studies, however, information format and judgment domain were confounded with question form: Only problems that presented natural frequencies prompted use of the sample of category instances presented in the problem to compute the two terms of the Bayesian solution (an outside view), whereas single-event probability problems prompted the use of category properties to compute a conditional probability. 

To dissociate these factors, Girotto and Gonzalez (2001) proposed that single-event probabilities (e.g., 1%) can be represented as chances[4] (e.g., “1 chance out of 100”).  Under the chance formulation of probability, the respondent can be asked either for the standard conditional probability or for values that correspond more closely to the ratio expressed by Bayes’ theorem.  The latter question asks the respondent to evaluate the chances that Pierre has a positive test and the infection, out of the total chances that Pierre has a positive test, thereby prompting consideration of the chances that Pierre — who could be anyone with a positive test in the sample — has the infection.  In addition to encouraging an outside view, this question prompts the computation of the Bayesian ratio in two clearly defined steps: First calculate the overall number of chances where the conditioning event is observed, then compare this quantity to the number of chances where the conditioning event is observed in the presence of the hypothesis.   


 To evaluate the role of question form in Bayesian inference, Girotto and Gonzalez (2001, Study 1) conducted an experiment that manipulated question form independently of information format and judgment domain.  The authors presented the following Bayesian inference scenario to 80 college undergraduates of the University of Provence, France:

 

A person who was tested had 4 chances out of 100 of having the infection.  3 of the 4 chances of having the infection were associated with a positive reaction to the test.  12 of the remaining 96 chances of not having the infection were also associated with a positive reaction to the test.

 

Half of the respondents were then asked to compute a conditional probability (i.e., “If Pierre has a positive reaction, there will be ___ chance(s) out of ___ that the infection is associated with his positive reaction.”), whereas the remaining respondents were asked to evaluate the ratio of probabilities expressed in the Bayesian solution (i.e., “Imagine that Pierre is tested now.  Out of the total 100 chances, Pierre has ___ chances of having a positive reaction, ___ of which will be associated with having the infection.”). 
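The two steps prompted by the second question can be made concrete with the numbers from the scenario.  The following Python fragment is a minimal illustrative sketch and is not part of Girotto and Gonzalez's materials; the function name two_step_answer is a hypothetical label introduced here.

```python
from fractions import Fraction

def two_step_answer(infected_positive, uninfected_positive):
    """Return the two blanks of the two-step question as a pair.

    Step 1: total chances of a positive reaction.
    Step 2: how many of those chances go with the infection.
    (Hypothetical helper, introduced only for illustration.)
    """
    positive_total = infected_positive + uninfected_positive
    return positive_total, infected_positive

# Numbers from the scenario: 3 of the 4 infection chances and
# 12 of the 96 no-infection chances go with a positive reaction.
positive_total, positive_and_infected = two_step_answer(3, 12)
posterior = Fraction(positive_and_infected, positive_total)
print(positive_total, positive_and_infected, posterior)  # 15 3 1/5
```

Both blanks follow from a single addition; no multiplication of probabilities is required.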

Girotto and Gonzalez (2001) found that only 8 per cent of the respondents generated the Bayesian solution when asked to compute a conditional probability, consistent with the earlier literature.  But the proportion of Bayesian answers increased to 43 per cent when the question prompted the respondent to evaluate the two terms of the Bayesian solution.  The same pattern was observed with the natural frequency format problem.  Only 18 per cent of the respondents generated the Bayesian solution when asked to compute a conditional frequency, whereas this proportion increased to 58 per cent when asked to evaluate the two terms separately.  This level of performance is comparable to that observed under standard natural frequency formats (e.g., Gigerenzer & Hoffrage, 1995), and supports Girotto and Gonzalez’s claim that the two-step question approximates the question asked with standard natural frequency formats.  In further support of Girotto and Gonzalez’s predictions, there were no reliable effects of information format or judgment domain across all the reported comparisons. 

These findings suggest that people are not predisposed against using single-event probabilities but instead appear to be highly sensitive to the form of the question:  When asked to reason about category instances to compute the two terms of the Bayesian ratio, respondents were able to draw the normative solution under single-event probabilities.  Facilitation in Bayesian inference under natural frequencies need not imply that the mind is designed to process these formats, but instead can be attributed to the facilitory effect of prompting use of the sample of category instances presented in the problem to evaluate the two terms of the Bayesian ratio. 

 

2.3.  Reference class

To assess the role of problem structure in Bayesian inference, we review studies that have manipulated structural features of the problem.  Girotto and Gonzalez (2001) report two experiments that systematically assess performance under different partitionings of the data:  Defective frequency partitions and non-partitive frequency problems.  Consider the following medical diagnosis problem, which presents natural frequencies under what Girotto and Gonzalez (2001, Study 5) term a defective partition: 

 

4 out of 100 people tested were infected.  3 of the 4 infected people had a positive reaction to the test.  84 of the 96 uninfected people did not have a positive reaction to the test.  Imagine that a group of people is now tested.  In a group of 100 people, one can expect ___ individuals to have a positive reaction, ___ of whom will have the infection.

 

In contrast to the standard partitioning of the data under natural frequencies, here the frequency of uninfected people who did not have a positive reaction to the test is reported, instead of the frequency of uninfected, positive reactions.  As a result, to derive the Bayesian solution, the reported value must first be subtracted from the total number of uninfected individuals to obtain the number of uninfected people with a positive reaction (96 – 84 = 12), and the result can then be used to determine the proportion of infected, positive people out of the total number of people who obtain a positive test (i.e., 3 / (3 + 12) = 3 / 15 = 1 / 5).  Although this problem exhibits a partitive structure, Girotto and Gonzalez predicted that the defective partitioning of the data would produce a greater proportion of errors than observed under the standard data partitioning, because the former requires an additional computation.  Consistent with this prediction, only 35 per cent of respondents generated the Bayesian solution, whereas 53 per cent did under the standard data partitioning.  Nested set relations were more likely to facilitate Bayesian reasoning when the data were partitioned into the components that are needed to generate the solution.
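The additional computation imposed by the defective partition can be sketched as follows.  This is an illustrative fragment with a hypothetical function name (defective_partition_answer), using the counts given in the problem for a group of 100 people.

```python
from fractions import Fraction

def defective_partition_answer(infected_positive, uninfected_total, uninfected_negative):
    """Hypothetical helper: fill the two blanks of the defective-partition problem."""
    # Extra step required by the defective partition: recover the
    # uninfected people who DID test positive by subtraction.
    uninfected_positive = uninfected_total - uninfected_negative
    # From here on the computation matches the standard partition.
    positive_total = infected_positive + uninfected_positive
    return positive_total, infected_positive

positive_total, infected_positive = defective_partition_answer(3, 96, 84)
print(positive_total, infected_positive, Fraction(infected_positive, positive_total))
# 15 3 1/5
```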


Girotto and Gonzalez (2001, Study 6) also assessed performance under natural frequency formats that were not partitioned into nested set relations (i.e., unpartitioned frequencies).  As in the case of standard natural frequency format problems (e.g., Cosmides & Tooby, 1996), these multiple-sample problems employed natural frequencies and prompted the respondent to compute the two terms of the Bayesian solution[5].  Such a problem must be treated in the same way as a single-event probability problem (i.e., using the conditional probability and additivity laws) to determine the two terms of the Bayesian ratio.  Girotto and Gonzalez therefore predicted that performance under multiple samples would be poor, approximating that observed under standard probability problems.  As predicted, none of the respondents generated the Bayesian solution under the multiple sample or standard single-event probability frames.  Natural frequency formats facilitate Bayesian inference only when they partition the data into components needed to draw the Bayesian solution.
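For comparison, the computation that a non-partitive problem demands can be sketched as follows.  This is an illustrative assumption rather than the Study 6 materials: the values (a 4 per cent base rate, a hit rate of 3/4, and a false-positive rate of 12/96) are borrowed from the chance scenario of Study 1, and the function name posterior_probability is hypothetical.

```python
def posterior_probability(base_rate, hit_rate, false_alarm_rate):
    """Hypothetical helper: P(H|D) from three single-event probabilities."""
    # Additivity law: P(D) = P(D|H)P(H) + P(D|not H)P(not H)
    p_data = hit_rate * base_rate + false_alarm_rate * (1 - base_rate)
    # Conditional probability law: P(H|D) = P(D and H) / P(D)
    return hit_rate * base_rate / p_data

# Values assumed for illustration (taken from the Study 1 scenario).
p = posterior_probability(base_rate=0.04, hit_rate=0.75, false_alarm_rate=0.125)
print(round(p, 3))  # 0.2
```

Unlike the partitive case, the two terms of the ratio are not given as counts; each must be constructed by multiplying and adding probabilities.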


Converging evidence is provided by Macchi (2000), who presented Bayesian inference problems in either a partitive or non-partitive form. Macchi found that only 3 per cent of respondents generated the Bayesian solution when asked to evaluate the two terms of the Bayesian ratio with non-partitive frequency problems.  Similarly, only 6 per cent of the respondents generated the Bayesian solution when asked to compute a conditional probability under non-partitive probability formats (see also Sloman et al., 2003, Experiment 4).  But when problems were presented under a partitive formulation and respondents were asked to evaluate the two terms of the Bayesian ratio, the proportions increased to 40 per cent under partitive natural frequency formats, 33 per cent under partitive single-event probabilities, and 36 per cent under the modified partitive single-event probability problems. The findings reinforce the nested sets view that information structure is the factor determining predictive accuracy. 

To further explore the contribution of information structure and question form in Bayesian inference, Sloman et al. (2003) assessed performance using a conditional chance question. In contrast to the standard conditional probability question that presents information about a particular individual (e.g., “… Pierre has a positive reaction to the test”), their conditional chance question asked the respondent to evaluate “the chance that a person found to have a positive test result actually has the disease.”  This question requests the probability of an unknown category instance and therefore prompts the respondent to consult the data presented in the problem to assess the probability that this person (who could be any randomly chosen person with a positive result in the sample) has the disease.  In Experiment 1, they tested the prediction of the nested sets hypothesis that prompting use of the sample of category instances presented in the problem to compute a conditional probability would facilitate Bayesian inference on a partitive single-event probability problem.  Forty-eight per cent of the 48 respondents tested generated the Bayesian solution, demonstrating that making partitive structure transparent facilitates Bayesian inference. 

In summary, the reviewed findings suggest that when the data are partitioned into the components needed to arrive at the solution and participants are prompted to use the sample of category instances in the problem to compute the two terms of the Bayesian ratio, the respondent is more likely to (1) understand the question, (2) see the underlying nested set structure by partitioning the data into exhaustive subsets, and (3) select the pieces of evidence that are needed for the solution. According to the nested sets theory, accurate probability judgments derive from the ability to perform elementary set operations whose computations are facilitated by external cues.  Facilitation does not require prompting to compute the two terms of the Bayesian ratio; it can be produced by any cue that increases the transparency of the relevant set relations. 
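The elementary set operations at issue can be illustrated with a short Python sketch.  The population labels are arbitrary assumptions made for illustration, with counts matching the medical diagnosis problems reviewed above (4 infected out of 100, 3 infected positives, and 12 uninfected positives).

```python
# Hypothetical population of 100 labelled individuals (0..99); the labels
# are arbitrary and chosen only so that the counts match the scenario.
population = set(range(100))
infected = set(range(4))                       # 4 infected people
positive = set(range(3)) | set(range(4, 16))   # 3 infected + 12 uninfected positives

# Both sets are nested within the population.
assert infected <= population and positive <= population
uninfected = population - infected

# Partition the positive cases into exhaustive, non-overlapping subsets.
infected_and_positive = infected & positive
uninfected_and_positive = uninfected & positive

print(len(positive), len(infected_and_positive))   # 15 3
print(len(infected_and_positive) / len(positive))  # 0.2
```

Once the subsets are represented explicitly, the Bayesian answer reduces to comparing the sizes of two sets.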

 

2.4.  Diagrammatic representations

Sloman et al. (2003, Experiment 2) explored whether Euler circles, which were employed to construct a nested set structure for standard non-partitive single-event probability problems (e.g., Cosmides & Tooby, 1996), would facilitate Bayesian inference (see Figure 1).  These authors found that 48 per cent of the 25 respondents tested generated the Bayesian solution when presented with non-partitive single-event probability problems accompanied by an Euler diagram that depicted the underlying nested set relations.  This finding demonstrates that the set structure of standard non-partitive single-event probability problems can be represented by Euler diagrams to produce facilitation. Supporting data can be found in Yamagishi (2003), who used diagrams to make nested set relations transparent in other inductive reasoning problems.  Similar evidence is provided by Bauer and Johnson-Laird (1993) in the context of deductive reasoning. 

 

 

 

 

 

2.5.  Accuracy of frequency judgments

Theories based on natural frequency representations (i.e., the mind-as-Swiss-army-knife, natural frequency algorithm, natural frequency heuristic, and non-evolutionary natural frequency heuristic theories) propose that “the mind is a frequency monitoring device” and that the cognitive algorithm that computes the Bayesian ratio encodes and processes event frequencies in naturalistic settings (Gigerenzer, 1993, p. 300).  The literature that evaluates the encoding and retrieval of event frequencies is extensive and includes assessments of frequency judgments in well-controlled laboratory settings based on relatively simple and distinct stimuli (e.g., letters, pairs of letters, or words), and in naturalistic settings in which respondents report the frequency of their own behaviors (e.g., the medical diagnosis of patients).  Laboratory studies tend to find that frequency judgments are surprisingly accurate (see Zacks & Hasher, 2002, for a recent review), whereas naturalistic studies often find systematic errors in frequency judgments (see Bradburn et al., 1987).  Recent efforts have been made to integrate these findings under a unified theoretical framework (e.g., Sedlmeier & Betsch, 2002; Schwartz & Sudman, 1994; Schwartz & Wanke, 2002). 

Are frequency judgments relatively accurate under the naturalistic settings described by standard Bayesian inference problems?  Bayesian inference problems tend to involve hypothetical situations that, if real, would be based on autobiographical memories encoded under naturalistic conditions, such as the standard medical diagnosis problem in which a particular set of patients is hypothetically encountered (cf. Sloman & Over, 2003).  Thus, the present review focuses on the accuracy of frequency judgments for the autobiographical events alluded to by standard Bayesian inference problems (see Sections 2.1, 2.2, and 2.3) to assess whether Bayesian inference depends on the accurate encoding of autobiographical events. 

Gluck and Bower (1988) conducted an experiment that employed a learning paradigm to assess the accuracy of frequency judgments in medical diagnosis.  The respondent learned to diagnose a rare (25%) or a common (75%) disease on the basis of four potential symptoms exhibited by the patient (e.g., stomach cramps, discolored gums).  During the learning phase, the respondent diagnosed 250 hypothetical patients and in each case was provided feedback on the accuracy of their diagnosis.  After the learning phase, the respondent estimated the relative frequency of patients who had the diseases given each symptom.  Gluck and Bower found that relative frequency estimates of the disease were determined by the diagnosticity of the symptom (the degree to which the respondent perceived that the symptom provided useful information in diagnosing the disease) and not the base-rate frequencies of the disease.  These findings were replicated by Estes, Campbell, Hatsopoulos, and Hurwitz (1989, Experiment 1) and Nosofsky, Kruschke, and McKinley (1992, Experiment 1).

Bradburn, Rips, and Shevell (1987) evaluated the accuracy of autobiographical memory for event frequencies by employing a range of surveys that assessed quantitative facts, such as “During the last 2 weeks, on days when you drank liquor, about how many drinks did you have?” These questions require the simple recall of quantitative facts, in which the respondent “counts up how many individuals fall within each category” (Cosmides & Tooby, 1996, p. 60).  Recalling the frequency of drinks consumed over the last 2 weeks, for example, is based on counting the total number of individual drinking occasions stored in memory. 

Bradburn et al. (1987) found that autobiographical memory for event frequencies exhibits systematic errors characterized by (a) the failure to recall the entire event or the loss of details associated with a particular event (e.g., Linton, 1975; Wagenaar, 1986), (b) the combining of similar but distinct events into a single generalized memory (e.g., Linton, 1975, 1982), or (c) the inclusion of events that did not occur within the reference period specified in the question (e.g., Pillemer et al., 1986).  As a result, Bradburn et al. propose that the observed frequency judgments do not reflect the accurate encoding of event frequencies, but instead entail a more complex inferential process that typically operates on the basis of incomplete, fragmentary memories that do not preserve base-rate frequencies. 

These findings suggest that the observed facilitation in Bayesian inference under natural frequencies cannot be explained by an (evolved) capacity to encode natural frequencies.  Apparently, people don’t have that capacity.

 

2.6.  Comprehension of formats

Advocates of the nested sets view have argued that the facilitation of Bayesian inference under natural frequencies can be fully explained by a simple computation that is afforded by transparent nested set relations and delivers the same result as Bayes’ theorem, without appealing to an (evolved) capacity to process natural frequencies (e.g., Johnson-Laird et al., 1999).  The question therefore arises whether the ease of processing natural frequencies goes beyond the reduction in computational complexity of Bayes’ theorem that they provide (Brase, 2002a).  To assess this issue, we review evidence that evaluates whether natural frequencies are understood more easily than single-event probabilities.

Brase (2002b) conducted a series of experiments to evaluate the relative clarity and ease of understanding a range of statistical formats, including natural frequencies (e.g., 1 out of 10) and percentages (e.g., 10%).  Brase distinguished natural frequencies that have a natural sampling structure (e.g., 1 out of 10 have the property, 9 out of 10 do not) from “simple frequencies” that refer to single numerical relations (e.g., 1 out of 10 have the property).  This distinction, however, is not entirely consistent with the literature, as natural frequency theorists have often used single numerical statements for binary hypotheses to express natural frequencies (e.g., Gigerenzer & Hoffrage, 1995).  In any case, for binary hypotheses the natural sampling structure can be directly inferred from simple frequencies.  If we observe, for example, that I win the weekly poker game “1 out of 10 nights,” we can infer that I lose “9 out of 10 nights” and construct a natural sampling structure that represents the size of the reference class and is arranged into subset relations.  Thus, single numerical statements of this type have a natural sampling structure, and we therefore refer to Brase’s “simple frequencies” as natural frequencies in the following discussion.  

Percentages express single-event probabilities in that they are normalized to an arbitrary reference class (e.g., 100) and can refer to the likelihood of a single event (Brase, 2002b; Gigerenzer & Hoffrage, 1995).  We therefore examine whether natural frequencies are understood more easily and have a greater impact on judgment than percentages. 

To test this prediction, Brase (2002b, Experiment 1) assessed the relative clarity of statistical information presented in a natural frequency format versus percentage format at small, intermediate, and large magnitudes.  Each respondent received four statements in one statistical format, each at a different magnitude, and rated the clarity, impressiveness, and “monetary pull” of the presented statistics according to a 5-point scale.  Example questions are shown in Table 4.

 

Table 4.  Example questions presented by Brase (2002b)
______________________________________________________________________________

Statements

It is estimated that by the year 2020, 1 of every 100 Americans will have been exposed to Flu strain X [natural frequency format of low magnitude]

It is estimated that by the year 2020, 33% of all Americans will have been exposed to Flu strain X [single-event probability of intermediate magnitude]

Questions

1.  How clear and easy to understand is the statistical information presented in the above sentence?  [Clarity rating]

2.  How serious do you think the existence of virus X is?  [Impressiveness rating]

3.  If you were in charge of the annual budget for the U.S. Department of Health, how much of every $100 would you dedicate to dealing with virus X?  ___ out of every $100  [Monetary pull rating]

______________________________________________________________________________

 

Brase (2002b) found that across all statements and magnitudes both natural frequencies and percentages were rated as “Very Clear,” with average ratings of 3.98 and 3.89, respectively.  These ratings were not reliably different, demonstrating that percentages are perceived as clearly and are as understandable as natural frequencies.  Furthermore, Brase found no reliable differences in the impressiveness ratings (from question 2) of natural frequencies and percentages at intermediate and large statistical magnitudes, suggesting that these formats are typically viewed as equally impressive.  A significant difference between these formats was observed, however, at low statistical magnitudes:  On average, natural frequencies were rated as “Impressive,” whereas percentages were viewed as “Fairly Impressive.”  The observed difference in the impressiveness ratings at low statistical magnitudes did not accord with respondents’ monetary pull ratings (i.e., their willingness to allocate funds to support research studying the issue at hand), which were approximately equal for the two formats across all statements and magnitudes.  Hence, the difference in the impressiveness ratings at low magnitudes does not denote differences in people’s willingness to act.

These data are consistent with the conclusion that percentages and natural frequency formats (a) are perceived equally clearly and are equally understandable, (b) are typically viewed as equally impressive (i.e., at intermediate and large statistical magnitudes), and (c) have the same degree of impact on behavior.  Natural frequency formats do apparently increase the perceptual contrast of small differences.  Overall, however, the two formats are perceived similarly, suggesting that the mind is not designed to process natural frequency formats over single-event probabilities.

 

2.7.  Are base-rates and likelihood ratios equally weighted?

Does the facilitation of Bayesian inference under natural frequencies entail that the mind naturally incorporates this information according to Bayes’ theorem or that elementary set operations can be readily computed from problems that are structured in a partitive form?  Natural frequencies preserve the sample size of the reference class and are arranged into subset relations that preserve the base-rates.  As a result, judgments based on these formats will entail the sample and effect sizes; the respondent need not calculate them.  To assess whether the cognitive operations that underlie Bayesian inference are consistent with the application of Bayes’ theorem, studies that evaluate how the respondent derives Bayesian solutions are reviewed. 

Griffin and Buehler (1999) employed the classic lawyer-engineer paradigm developed by Kahneman and Tversky (1973) involving personality descriptions randomly drawn from a population of either 70 engineers and 30 lawyers or 30 engineers and 70 lawyers.  Participants’ task is to predict whether the description was taken from an engineer or a lawyer (e.g., “My probability that this man is one of the engineers in this sample is ___%”). Kahneman and Tversky’s original findings demonstrated that the respondent consistently relied upon category properties to guide their judgment (i.e., how representative the personality description is of an engineer or a lawyer) without fully incorporating information about the population base-rates (for a review see Koehler, 1996).  However, when the base-rates were presented via a counting procedure that induces a frequentist representation of each population and the respondent is asked to generate a natural frequency prediction (e.g., “I would expect that ___ out of the 10 descriptions would be engineers”), base-rate usage increased (Gigerenzer et al., 1988). 

To assess whether the observed increase in base-rate usage reflects the operation of a Bayesian algorithm that is designed to process natural frequencies, Griffin and Buehler (1999) evaluated whether participants derived the solution by utilizing event frequencies according to Bayes’ theorem.  This was accomplished by first collecting estimates of each of the components of Bayes’ theorem in odds form[6]:  Respondents estimated (a) the probability that the personality description was taken from the population of engineers or lawyers, (b) the degree to which the personality description was representative of these populations, and (c) the perceived population base-rates.  Each of these estimates was then divided by its complement to yield the posterior odds, likelihood ratio, and prior odds, respectively. Theories based on the Bayesian ratio predict that under frequentist representations the likelihood ratios and prior odds will be weighted equally (Griffin & Buehler, 1999).
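The odds form of Bayes’ theorem referred to here states that the posterior odds equal the likelihood ratio multiplied by the prior odds.  A minimal sketch of the conversion step follows; the two estimates are hypothetical values chosen for illustration, not data from Griffin and Buehler (1999), and the helper name to_odds is likewise an assumption.

```python
from fractions import Fraction

def to_odds(p):
    """Convert a probability estimate to odds by dividing it by its complement."""
    return p / (1 - p)

# Hypothetical estimates, chosen for illustration only.
base_rate = Fraction(70, 100)           # perceived proportion of engineers
representativeness = Fraction(60, 100)  # how typical of an engineer the description seems

prior_odds = to_odds(base_rate)                 # 7/3
likelihood_ratio = to_odds(representativeness)  # 3/2

# Bayes' theorem in odds form: posterior odds = likelihood ratio * prior odds
posterior_odds = likelihood_ratio * prior_odds
posterior_probability = posterior_odds / (1 + posterior_odds)
print(prior_odds, likelihood_ratio, posterior_odds, posterior_probability)
# 7/3 3/2 7/2 7/9
```

A respondent's stated posterior estimate, converted to odds in the same way, can then be compared with the product of the other two terms.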


Griffin and Buehler evaluated this prediction by conducting a regression analysis using the respondent’s estimated likelihood ratios and prior odds to predict their posterior probability judgments (cf. Keren & Thijs, 1996).  Consistent with the observed increase in base-rate usage under frequentist representations (Gigerenzer et al., 1988), Griffin and Buehler (1999, Experiment 3b) found that the prior odds (i.e., the base-rates) were weighted more heavily than the likelihood ratios, with corresponding regression weights (β values) of 0.62 and 0.39.  The failure to weight them equally violates Bayes’ theorem.  Although frequentist representations may enhance base-rate usage, they apparently do not induce the operation of a mental analogue of Bayes’ theorem. 
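Why equal weighting serves as the Bayesian benchmark can be seen on the log-odds scale, where log posterior odds = log likelihood ratio + log prior odds, so that an idealized Bayesian respondent yields raw regression weights of 1 on both predictors.  The sketch below simulates such a respondent with hypothetical values and fits the regression by ordinary least squares; it does not reproduce the exact specification or the standardized β values of the cited analysis, and fit_two_predictors is a name introduced here for illustration.

```python
import math

def fit_two_predictors(x1, x2, y):
    """Ordinary least squares for y = a + b1*x1 + b2*x2; returns (b1, b2)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]          # centered predictors and outcome
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    det = s11 * s22 - s12 * s12        # nonzero when predictors are not collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

# Hypothetical judgments from an idealized Bayesian respondent.
likelihood_ratios = [0.25, 0.5, 1.0, 2.0, 4.0, 0.5, 2.0, 1.0]
prior_odds = [3 / 7, 3 / 7, 3 / 7, 3 / 7, 7 / 3, 7 / 3, 7 / 3, 7 / 3]
posterior_odds = [lr * po for lr, po in zip(likelihood_ratios, prior_odds)]

log_lr = [math.log(v) for v in likelihood_ratios]
log_prior = [math.log(v) for v in prior_odds]
log_post = [math.log(v) for v in posterior_odds]

w_lr, w_prior = fit_two_predictors(log_lr, log_prior, log_post)
print(round(w_lr, 3), round(w_prior, 3))  # 1.0 1.0
```

Fitted weights that depart from this pattern, such as a heavier weight on the prior odds, indicate that judgments were not formed by combining the two terms as Bayes’ theorem prescribes.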

Further support for this conclusion is provided by Evans, Handley, Over, and Perham (2002), who conducted a series of experiments demonstrating that probability judgments do not reflect equal weighting of the prior odds and likelihood ratio.  Evans et al. (2002, Experiment 5) employed a paradigm that extended the classic lawyer-engineer experiments by assessing Bayesian inference under conditions where the base-rates are supplied by commonly held beliefs and only the likelihood ratios are explicitly provided.  These authors found that when prior beliefs about the base-rate probabilities were rated immediately before the presentation of the problem, the prior odds (i.e., the base-rates) were weighted more heavily than the likelihood ratios, with corresponding regression weights (β values) of 0.43 and 0.19.

Additional evidence supporting this conclusion is provided by Kleiter, Krebs, Doherty, Garavan, Chadwick, and Blake (1997) who found that participants assessing event frequencies in a medical diagnosis setting employed statistical evidence that is irrelevant to the calculation of Bayes’ theorem.  Kleiter et al. (1997, Experiment 1) presented a list of event frequencies to respondents, which included those that were necessary for the calculation of Bayes’ theorem (e.g., Pr(DH)) and other statistics that were irrelevant (e.g., Pr(~D)).  Participants were then asked to identify the event frequencies that were needed to diagnose the probability of the disease given the symptom (i.e., the posterior probability).  Of the 4 college faculty and 26 graduate students tested, only 3 made the optimal selection by identifying only the event frequencies required to calculate Bayes’ theorem.

These data suggest that the mind does not utilize a Bayesian algorithm that “maps frequentist representations of prior probabilities and likelihoods onto a frequentist representation of a posterior probability in a way that satisfies the constraints of Bayes’ theorem” (Cosmides & Tooby, 1996, p. 60).   Importantly, the findings that the prior odds and likelihood ratio are not equally weighted according to Bayes’ theorem (Griffin & Buehler, 1999; Evans et al., 2002) imply that Bayesian inference does not rely on Bayesian computations per se. 

Thus, the findings are inconsistent with the mind-as-Swiss-army-knife, natural frequency algorithm, natural frequency heuristic, and non-evolutionary natural frequency heuristic theories, which propose that coherent probability judgment reflects the use of the Bayesian ratio.  The finding that base-rate usage increases under frequentist representations (Griffin & Buehler, 1999; Evans et al., 2002) supports the proposal that the facilitation in Bayesian inference from natural frequency formats is due to the property of these formats to induce a representation of category instances that preserves the sample and effect sizes and, as a consequence, clarifies the underlying set structure of the problem, making the relevance of base-rates more obvious without providing an equation that generates Bayesian quantities.

 

2.8.  Convergence with disparate data

A unique characteristic of the dual process position is that it predicts that nested sets should facilitate reasoning whenever people tend to rely on associative rather than extensional, rule-based processes; facilitation should be observed beyond the context of Bayesian probability updating.  The natural frequency theories expect facilitation only in the domain of probability estimation.

In support of the nested sets position, facilitation through nested set representations has been observed in a number of studies of deductive inference.  Grossen and Carnine (1990) and Monaghan and Stenning (1998) report significant improvement in syllogistic reasoning when participants were taught using Euler circles.  The effect was restricted to participants who were ‘learning impaired’ (Grossen & Carnine, 1990) or had a low GRE score (Monaghan & Stenning, 1998).  Presumably those who did not show improvement did not require the Euler circles because they were already representing the nested set relations.

Newstead (1989, Experiment 2) evaluated how participants interpreted syllogisms when represented by Euler circles versus quantified statements.  Newstead found that although Gricean errors of interpretation occurred when syllogisms were represented by Euler circles and quantified statements, the proportion of conversion errors, such as converting “Some A are not B” to “Some B are not A,” was significantly reduced in the Euler circle task.  For example, fewer than 5% of the participants generated a conversion error for “Some… not” on the Euler circle task, whereas this error occurred on 90% of the responses for quantified statements. 

Griggs and Newstead (1982) tested participants on the THOG problem, a difficult deductive reasoning problem involving disjunction.   They obtained a substantial amount of facilitation by making the problem structure explicit using trees.  According to the authors, the structure is normally implicit due to negation and the tree structure facilitates performance by cuing formation of a mental model similar to that of nested sets.

Facilitation has also been obtained by making extensional relations more salient in the domain of categorical inductive reasoning.  Sloman (1998) found that people who were told that all members of a superordinate have some property, e.g., all flowers are susceptible to thrips, did not conclude that all members of one of its subordinates inherited the property, e.g., they did not assert that this guaranteed that all roses are susceptible to thrips.  This was true even for those people who believed that roses are flowers.  But if the assertion that roses are flowers was included in the argument, then people did abide by the inheritance rule, assigning a probability of 1 to the statement about roses.  Sloman argued that this occurred because induction is mediated by similarity and not by class inclusion unless the categorical – or set – relation is made transparent within the statement composing the argument (for an alternative interpretation, see Calvillo & Revlin, 2005).

Facilitation in other types of probability judgment can also be obtained by manipulating the salience and structure of set relations.  Sloman et al. (2003) found that almost no one exhibited the conjunction fallacy when the options were presented as Euler circles, a representation that makes set relations explicit.  Fox and Levav (2004) and Johnson-Laird et al. (1999) also improved judgments on probability problems by manipulating the set structure of the problem.

 

2.9.  Empirical summary and conclusions

In summary, the empirical review supports five main conclusions.  First, the facilitatory effect of natural frequencies on Bayesian inference varied considerably across the reviewed studies (see Table 3), potentially resulting from differences in the general intelligence level and motivation of participants (Brase et al., in press).  These findings support the nested sets hypothesis to the degree that intelligence and motivation reflect the operation of domain-general and strategic, rather than automatic (i.e., modular), cognitive processes. 

Second, questions that prompt use of category instances and divide the solution into the sets needed to compute the Bayesian ratio facilitate probability judgment, suggesting that facilitation depends on cues to the set structure of the problem rather than (an evolved) capacity to process natural frequencies.  In further support of this conclusion, partitioning the data into nested sets facilitates Bayesian inference regardless of whether natural frequencies or single-event probabilities are employed (see Table 5). 

Third, frequency judgments are guided by inferential strategies that reflect incomplete, fragmentary memories that do not entail the base-rates (e.g., Gluck & Bower, 1988; Bradburn et al., 1987), suggesting that Bayesian inference does not derive from the accurate encoding and retrieval of natural frequencies.  In addition, natural frequencies and single-event probabilities are rated similarly in their perceived clarity, understandability, and impact on the respondent’s behavior (Brase, 2002b), further suggesting that the mind does not embody inductive reasoning mechanisms (that are designed) to process natural frequencies. 

Fourth, people (a) do not accurately weight and combine event frequencies, and (b) utilize event frequencies that are irrelevant in the calculation of Bayes’ theorem (e.g., Griffin & Buehler, 1999; Kleiter et al., 1997), suggesting that the cognitive operations that underlie Bayesian inference do not conform to Bayes’ theorem.  Furthermore, base-rate usage increases under frequentist representations (e.g., Griffin & Buehler, 1999), suggesting that facilitation results from the property of natural frequencies to represent the sample and effect sizes, which highlight the set structure of the problem and make transparent what is relevant for problem solving.

 

 

 

Table 5.  Percent correct for Bayesian inference problems reported in the literature (sample sizes in parentheses)

                                              Information structure

                                       Non-partitive               Partitive
Study                                  Probability   Frequency     Probability   Frequency

Girotto & Gonzalez (2001, Study 5)     ---           ---           ---           53 (20)
Girotto & Gonzalez (2001, Study 6)     0 (20)        0 (20)        ---           ---
Macchi (2000)                          6 (30)*       3 (30)        36 (30)       40 (30)
Sloman et al. (2003, Exp. 1)           20 (25)*      ---           48 (48)*      51 (45)
Sloman et al. (2003, Exp. 2)           ---           ---           48 (25)*      ---
Sloman et al. (2003, Exp. 4)           ---           21 (33)       ---           ---

Note.  Studies that present questions that require the respondent to compute a conditional-event probability are indicated by *.  The remaining studies present questions that prompt the respondent to compute the two terms of the Bayesian solution.

 

Finally, nested set representations facilitate reasoning in a range of classic deductive and inductive reasoning tasks.  This finding supports the nested sets hypothesis that the mind embodies a domain-general capacity to perform elementary set operations, and that these operations can be induced by cues to the set structure of the problem to facilitate reasoning in any context where people tend to rely on associative rather than extensional, rule-based processes.  

 

3.0.  Conceptual Issues

This section provides a conceptual analysis that addresses (1) the plausibility of the natural frequency assumptions, and (2) whether natural frequency representations support properties that are central to human inductive reasoning competence, including reasoning about statistical independence, estimating the probability of unique events, and reasoning on the basis of similarity, analogy, association, and causality. 

 

3.1.  Plausibility of natural frequency assumptions

The natural sampling framework was established by the seminal work of Kleiter (1994), who assessed “the correspondence between the constraints of the statistical model of natural sampling on the one hand, and the constraints under which human information is acquired on the other” (p. 376).  Kleiter proved that under natural sampling and other conditions (e.g., independent identical sampling), the frequencies corresponding to the base-rates are redundant and can be ignored.  Thus, conditions of natural sampling can simplify the calculation of the relevant probabilities and, as a consequence, facilitate Bayesian inference (see footnote 2).  Kleiter’s computational argument does not appeal to evolution and was advanced with careful consideration of the assumptions upon which natural sampling is based.  Kleiter noted, for example, that the natural sampling framework (a) is limited to hypotheses that are mutually exclusive and exhaustive, and (b) depends on collecting a sufficiently large sample of event frequencies to reliably estimate population parameters.

Although people may sometimes treat hypotheses as mutually exclusive (e.g., “this person is a Democrat, so they must be anti-business”), this constraint is not always satisfied:  Many hypotheses are nested (e.g., “she has breast cancer” vs. “she has a particular type of breast cancer”) or overlapping (e.g., “this patient is anxious or depressed”).  People’s causal models typically provide a wealth of knowledge about classes and properties, allowing consideration of many kinds of hypotheses that do not necessarily come in mutually exclusive, exhaustive sets.  As a consequence, additional principles are needed to broaden the scope of the natural sampling framework to address probability estimates drawn from hypotheses that are not mutually exclusive and exhaustive.  In this sense, the nested sets theory is more general:  It can represent nested and overlapping hypotheses by taking the intersection (e.g., “she has breast cancer and it is type X”) and union (e.g., “the patient is anxious or depressed”) of sets, respectively.
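The set operations mentioned above can be illustrated with a minimal sketch.  The patient identifiers, group memberships, and the helper name `proportion` are hypothetical and chosen only for illustration; they do not come from any of the reviewed studies.

```python
from fractions import Fraction

# Hypothetical sample of ten patients, identified by number (illustrative only).
population = set(range(1, 11))
breast_cancer = {1, 2, 3, 4}
type_x = {2, 3}          # nested: every type X case is also a breast cancer case
anxious = {5, 6, 7}
depressed = {7, 8}       # overlapping: patient 7 is both anxious and depressed


def proportion(subset: set, reference: set) -> Fraction:
    """Relative size of a subset within its reference class."""
    return Fraction(len(subset), len(reference))


# Nested hypothesis: "she has breast cancer and it is type X" (intersection).
cancer_and_type_x = breast_cancer & type_x

# Overlapping hypothesis: "the patient is anxious or depressed" (union).
anxious_or_depressed = anxious | depressed

p_nested = proportion(cancer_and_type_x, population)
p_union = proportion(anxious_or_depressed, population)

# Adding the two proportions directly would count patient 7 twice.
p_naive_sum = proportion(anxious, population) + proportion(depressed, population)

print(p_nested, p_union, p_naive_sum)
```

Under these assumed memberships, the union yields a smaller proportion than the simple sum of the two overlapping categories, which is the kind of error that an explicit set representation is meant to prevent.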

As Kleiter further notes, inferences about hypotheses from encoded event frequencies are warranted to the extent that the sample is sufficiently large and provides a reliable estimate of the population parameters.  The efficacy of the natural sampling framework therefore depends on establishing (1) the approximate number of event frequencies that are needed for a reliable estimate, (2) whether this number is relatively stable or varies across contexts, and (3) whether or not people can encode and retain the required number of events.

 

3.2.  Representing qualitative relations

In contrast to single-event probabilities, natural frequencies preserve information about the size of the reference class and, as a consequence, do not directly indicate whether an observation and hypothesis are statistically independent.  For example, probability judgments drawn from natural frequencies do not tell us that a symptom present in (a) 640 out of 800 patients with the disease and (b) 160 out of 200 patients without the disease is not diagnostic because 80% have the symptom in both cases (Over 2000a, 2000b; Over & Green, 2001; Sloman & Over, 2003).  Thus, probability estimates drawn from natural frequencies do not capture important qualitative properties. 
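The non-diagnostic symptom in this example can be checked with a short sketch.  The frequencies are those given in the text; the function name and the use of a likelihood ratio as the test of diagnosticity are illustrative assumptions rather than part of the cited analyses.

```python
from fractions import Fraction


def symptom_rate(cases_with_symptom: int, group_size: int) -> Fraction:
    """Normalize a natural frequency into a proportion of its reference class."""
    return Fraction(cases_with_symptom, group_size)


# Frequencies from the example in the text.
rate_disease = symptom_rate(640, 800)      # symptom among patients with the disease
rate_no_disease = symptom_rate(160, 200)   # symptom among patients without the disease

# A likelihood ratio of 1 means the symptom does not discriminate between groups.
likelihood_ratio = rate_disease / rate_no_disease

# Consequently, the posterior probability of disease given the symptom
# equals the base-rate of the disease in the sample.
posterior = Fraction(640, 640 + 160)
base_rate = Fraction(800, 800 + 200)

print(float(rate_disease), float(rate_no_disease), likelihood_ratio, posterior == base_rate)
```

The equality of the two normalized rates is immediate once the frequencies are expressed as proportions, whereas the raw counts (640 of 800 versus 160 of 200) do not display it directly.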

Furthermore, in contrast to the cited benefits of non-normalized representations (e.g., Gigerenzer & Hoffrage, 1995), normalization may serve to simplify a problem.  For example, is someone offering us the same proportion if he tries to pay us back with 33 out of 47 nuts he has gathered (i.e., 70%), after we have earlier given him 17 out of 22 nuts we have gathered (i.e., 77%)?  This question is trivial after normalization, as it is transparent that 70 out of 100 and 77 out of 100 form nested sets (Over, in press).
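The arithmetic behind this example can be sketched as follows.  The frequencies are taken from the text; the variable names are illustrative.

```python
from fractions import Fraction

# Frequencies from the example in the text.
repaid = Fraction(33, 47)   # nuts offered back, out of those he gathered
given = Fraction(17, 22)    # nuts we gave, out of those we gathered

# Normalize both to a common reference class of 100.
repaid_per_100 = round(repaid * 100)
given_per_100 = round(given * 100)

same_proportion = repaid == given

print(repaid_per_100, given_per_100, same_proportion)
```

With different denominators the comparison requires computation, but once both quantities share a reference class of 100 the answer can be read off directly.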

 

3.3.  Reasoning about unique events and associative processes

One objection to the claim that the encoding of natural frequencies supports Bayesian inference is that intuitive probability judgment often concerns (a) beliefs regarding single events or (b) the assessment of hypotheses about novel or partially novel contexts, for which prior event frequencies are unavailable.  For example, the estimated likelihoods of specific outcomes are often based on novel and unique one-time events, such as the likelihood that a particular constellation of political interests will lead to a coalition.  Thus, Kahneman and Tversky (1996, p. 589) argue that the subjective degree of belief in hypotheses derived from single events or novel contexts “cannot be generally treated as a random sample from some reference population, and their judged probability cannot be reduced to a frequency count.”  

Furthermore, theories based on natural frequency representations do not allow for the widely observed role of similarity, analogy, association, and causality in human judgment (for recent reviews of the contribution of these factors, see Gilovich et al., 2002 and Sloman, 2005).  The nested sets hypothesis presupposes these determinants of judgment by appealing to a dual-process model of judgment (Evans & Over, 1996; Sloman, 1996; Stanovich & West, 2000), a move that natural frequency theorists are not (apparently) willing to make (Gigerenzer & Regier, 1996).  The dual-process model attributes responses based on associative principles like similarity or responses based on retrieval from memory like analogy to a primitive associative judgment system.  It attributes responses based on more deliberative processing involving rule-based inference such as the elementary set operations that respect the logic of set inclusion and facilitate Bayesian inference to a second deliberative system.  However, this second system is not limited to analyzing set relations.  It can also, under the right conditions, do the kinds of structural analyses required by analogical or causal reasoning.

Within this framework, natural frequency approaches can be viewed as making claims about rule-based processes (i.e., the application of a psychologically plausible rule for calculating Bayesian probabilities), without addressing the role of associative processes in Bayesian inference.  In light of the substantial literatures that demonstrate the role of associative processes in human judgment, Kahneman and Tversky (1996, p. 589) conclude, “there is far more to inductive reasoning and judgment under uncertainty than the retrieval of learned frequencies.”

 

4.0.  Summary and Conclusions

The conclusions drawn from the diverse body of empirical and conceptual issues addressed by this review consistently challenge theories of Bayesian inference that depend on natural frequency representations (see Table 2), demonstrating that coherent probability estimates are not derived according to an equational form for calculating Bayesian posterior probabilities that requires the use of natural frequency representations.  

The evidence instead supports the nested sets hypothesis that judgmental errors and biases are attenuated when Bayesian inference problems are represented in a way that reveals underlying set structure, thus demonstrating that the cognitive capacity to perform elementary set operations constitutes a powerful means of reducing associative influences and facilitating probability estimates that conform to Bayes’ theorem.  An appropriate representation can induce people to substitute reasoning by rules for reasoning by association.  In particular, the review demonstrates that judgmental errors and biases were attenuated when (a) the question induced an outside view by prompting the respondent to utilize the sample of category instances presented in the problem and (b) the sample of category instances was represented in a nested set structure that partitioned the data into the components needed to compute the Bayesian solution. 

Although we disagree with the various theoretical interpretations one could attribute to natural frequency theorists regarding the architecture of mind, we do believe that they have focused on and enlightened us about an important phenomenon.  Frequency formulations are a highly efficient way to obtain drastically improved reasoning performance in some cases.  Not only is this an important insight to improve and teach reasoning, but it also focuses theorists on a deep and fundamental problem:  What are the conditions that compel people to overcome their natural associative tendencies in order to reason extensionally?

 

Acknowledgements

This work was supported by National Science Foundation Grants DGE-0536941 and DGE-0231900 to Aron K. Barbey.  We are grateful to Gary Brase, Jonathan Evans, Vittorio Girotto, Philip Johnson-Laird, Gernot Kleiter, and David Over for their very helpful comments on prior drafts of this paper.  AKB would also like to thank Lawrence W. Barsalou, Sergio Chaigneau, Brian R. Cornwell, Pablo A. Escobedo, Shlomit R. Finkelstein, Corey Kallenberg, Robert N. McCauley, Richard Patterson, Diane Pecher, Philippe Rochat, Ava Santos, W. Kyle Simmons, Irwin Waldman, Christine D. Wilson, and Phillip Wolff for their encouragement and support while writing this paper. 

 

References

Ayton, P. & Wright, G. (1994).  Subjective probability: What should we believe.  In G. Wright & P. Ayton (Eds.), Subjective Probability (pp. 163-183).  Chichester, UK: Wiley.

Barwise, J. & Etchemendy, J. (1989).  Model-theoretic semantics.  In M. Posner (Ed.), Foundations of cognitive science (pp. 207-243).

Bauer, M.I. & Johnson-Laird, P.N. (1993).  How diagrams can improve reasoning.  Psychological Science, 4, 372-378.

Bradburn, N.M., Rips, L.J. & Shevell, S.K. (1987).  Answering autobiographical questions: The impact of memory and inference on surveys.  Science, 236, 157-161.

Brase, G. (2002a).  Ecological and evolutionary validity: Comments on Johnson-Laird, Legrenzi, Girotto, Legrenzi, and Caverni’s (1999) mental-model theory of extensional reasoning.  Psychological Review, 109, 722-728.

Brase, G. (2002b).  Which statistical formats facilitate what decisions?  The perception and influence of different statistical information formats.  Journal of Behavioral Decision Making, 15, 381-401.

Brase, G., Fiddick, L. & Harries, C. (in press).  Participant recruitment methods and statistical reasoning performance.  The Quarterly Journal of Experimental Psychology.

Brown, N.R., Rips, L.J. & Shevell, S.K. (1985).  The subjective dates of natural events in long-term memory.  Cognitive Psychology, 17, 139-177.

Calvillo, D.P. & Revlin, R. (2005).  The role of similarity in deductive categorical inference.  Psychonomic Bulletin and Review, 12, 938-944.

Casscells, W., Schoenberger, A. & Graboys, T.B. (1978).  Interpretation by physicians of clinical laboratory results.  The New England Journal of Medicine, 299, 999-1000.

Cosmides, L. & Tooby, J. (1996).  Are humans good intuitive statisticians after all?  Rethinking some conclusions from the literature on judgment under uncertainty.  Cognition, 58, 1-73.

Eddy, D.M. (1982).  Probabilistic reasoning in clinical medicine: Problems and opportunities.  In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment Under Uncertainty: Heuristics and Biases (pp. 249-267).  Cambridge, England: Cambridge University Press.

Estes, W.K., Campbell, J.A., Hatsopoulos, N. & Hurwitz, J.B. (1989).  Base-rate effects in category learning: A comparison of parallel network and memory storage-retrieval models.  Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 556-571.

Evans, J.St.B.T., Handley, S.J., Perham, N., Over, D.E. & Thompson, V.A. (2000).  Frequency versus probability formats in statistical word problems.  Cognition, 77, 197-213.

Evans, J.St.B.T., Handley, S.J., Over, D.E. & Perham, N. (2002).  Background beliefs in Bayesian inference.  Memory and Cognition, 30, 179-190.

Evans, J.St.B.T. & Over, D.E. (1996).  Rationality and reasoning.  Hove: Psychology Press.

Fodor, J.A. (1983).  Modularity of Mind.  Cambridge, MA: The MIT Press.

Fox, C. & Levav, J. (2004).  Partition-edit-count: Naïve extensional reasoning in judgment of conditional probability.  Journal of Experimental Psychology: General, 133, 626-642.

Gigerenzer, G. (1991).  How to make cognitive illusions disappear: Beyond “heuristics and biases.”  European Review of Social Psychology, 2, 83-115.

Gigerenzer, G. (1993).  The superego, the ego, and the id in statistical reasoning.  In G. Keren & G. Lewis (Eds.), A Handbook of Data Analysis in the Behavioral Sciences (pp. 331-339).  Hillsdale, NJ: Erlbaum.

Gigerenzer, G. (1996a).  On narrow norms and vague heuristics: A reply to Kahneman and Tversky (1996).  Psychological Review, 103, 592-596.

Gigerenzer, G. (1996b).  The psychology of good judgment: Frequency formats and simple algorithms.  Medical Decision Making, 16, 273-280.

Gigerenzer, G. (1998).  Ecological intelligence: An adaptation for frequencies.  In D.D. Cummins & C. Allen (Eds.), The Evolution of Mind.  New York, NY: Oxford University Press.

Gigerenzer, G. (2000).  Adaptive Thinking: Rationality in the Real World.  New York, NY: Oxford University Press.

Gigerenzer, G. (2002).  Calculated Risks.  New York, NY: Simon and Schuster Press.

Gigerenzer, G. (2006).  Center for Adaptive Behavior and Cognition summary of research area II: Ecological rationality.  Retrieved October 1, 2006, from the Center for Adaptive Behavior and Cognition Web site: http://www.mpib-berlin.mpg.de/en/forschung/abc/forschungsfelder/feld2.htm

Gigerenzer, G., Hell, W. & Blank, H. (1988).  Presentation and content: The use of base-rates as a continuous variable.  Journal of Experimental Psychology: Human Perception and Performance, 14, 513-525.

Gigerenzer, G. & Hoffrage, U. (1995).  How to improve Bayesian reasoning without instruction: Frequency formats.  Psychological Review, 102, 684-704.

Gigerenzer, G. & Regier, T.P. (1996).  How do we tell an association from a rule?  Psychological Bulletin, 119, 23-26.

Gigerenzer, G. & Selten, R. (Eds.) (2001).  Bounded Rationality: The Adaptive Toolbox.  Cambridge, MA: MIT Press.

Gigerenzer, G., Todd, P. & the ABC research group (1999).  Simple Heuristics That Make Us Smart.  New York, NY: Oxford University Press.

Gilovich, T., Griffin, D. & Kahneman, D. (Eds.) (2002).  Heuristics and Biases: The Psychology of Intuitive Judgment.  Cambridge, England: Cambridge University Press.

Girotto, V. & Gonzalez, M. (2001).  Solving probabilistic and statistical problems: A matter of information structure and question form.  Cognition, 78, 247-276.

Girotto, V. & Gonzalez, M. (2002).  Chances and frequencies in probabilistic reasoning: Rejoinder to Hoffrage, Gigerenzer, Krauss, and Martignon.  Cognition, 84, 353-359.

Gluck, M.A. & Bower, G.H. (1988).  From conditioning to category learning: An adaptive network model.  Journal of Experimental Psychology: General, 117, 227-247.

Griffin, D. & Buehler, R. (1999).  Frequency, probability, and prediction: Easy solutions to cognitive illusions?  Cognitive Psychology, 38, 48-78.

Griggs, R. & Newstead, S. (1982).  The role of problem structure in a deductive reasoning task.  Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 297-307.

Grossen, B. & Carnine, D. (1990).  Diagramming a logic strategy: Effects on difficult problem types and transfer.  Learning Disability Quarterly, 13, 168-182.

Hammerton, M. (1973).  A case of radical probability estimation.  Journal of Experimental Psychology, 101, 252-254.

Hertwig, R. & Gigerenzer, G. (1999).  The ‘conjunction fallacy’ revisited: How intelligent inferences look like reasoning errors.  Journal of Behavioral Decision Making, 12, 275-305.

Hoffrage, U., Gigerenzer, G., Krauss, S. & Martignon, L. (2002).  Representation facilitates reasoning: What natural frequencies are and what they are not.  Cognition, 84, 343-352.

Johnson-Laird, P.N., Legrenzi, P., Girotto, V., Sonino Legrenzi, M. & Caverni, J. (1999).  Naïve probability: A mental model theory of extensional reasoning.  Psychological Review, 106, 62-88.

Kahneman, D. & Frederick, S. (2002).  Representativeness revisited: Attribute substitution in intuitive judgment.  In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and Biases: The Psychology of Intuitive Judgment.  Cambridge, England: Cambridge University Press.

Kahneman, D. & Frederick, S. (2005).  A model of heuristic judgment.  In K.J. Holyoak & R.G. Morris (Eds.), The Cambridge Handbook of Thinking and Reasoning (pp. 267-293).  Cambridge University Press.

Kahneman, D. & Tversky, A. (1973).  On the psychology of prediction.  Psychological Review, 80, 237-251.

Kahneman, D. & Tversky, A. (1983).  Can rationality be intelligently discussed?  Behavioral and Brain Sciences, 6, 509-510.

Kahneman, D. & Tversky, A. (1996).  On the reality of cognitive illusions.  Psychological Review, 103, 582-591.

Kahneman, D., Slovic, P. & Tversky, A. (1982).  Judgment Under Uncertainty: Heuristics and Biases.  Cambridge, England: Cambridge University Press.

Keren, K. & Thijs, L.J. (1996).  The base-rate controversy: Is the glass half-full or half-empty?  Behavioral and Brain Sciences, 19, 26.

Kleiter, G.D. (1994).  Natural sampling: Rationality without base-rates.  In G.H. Fischer & D. Laming (Eds.), Contributions to Mathematical Psychology, Psychometrics, and Methodology (pp. 375-388).  New York, NY: Springer-Verlag.

Kleiter, G.D., Krebs, M., Doherty, M.E., Gavaran, H., Chadwick, R. & Brake, G. (1997).  Do subjects understand base-rates?  Organizational Behavior and Human Decision Processes, 72, 25-61.

Koehler, J.J. (1996).  The base-rate fallacy reconsidered: Descriptive, normative, and methodological challenges.  Behavioral and Brain Sciences, 19, 1-53.

Kurzenhauser, S. & Hoffrage, U. (2002).  Teaching Bayesian reasoning: An evaluation of a classroom tutorial for medical students.  Medical Teacher, 24, 516-521.

Lagnado, D. & Sloman, S.A. (2004).  Inside and outside probability judgment.  In D.J. Koehler & N. Harvey (Eds.), Blackwell Handbook of Judgment and Decision Making (pp. 157-176).  Oxford, UK: Blackwell Publishing.

Lichtenstein, S., Slovic, P., Fischhoff, B., Layman, M. & Combs, B. (1978).  Judged frequency of lethal events.  Journal of Experimental Psychology: Human Learning and Memory, 4, 551-578.

Lindsey, S., Hertwig, R. & Gigerenzer, G. (2003).  Communicating statistical DNA evidence.  Jurimetrics, 43, 147-163.

Linton, M. (1975).  Memory for real-world events.  In D.A. Norman & D.E. Rumelhart (Eds.), Explorations in Cognition (pp. 376-404).  San Francisco: Freeman.

Linton, M. (1982).  Transformations of memory in everyday life.  In U. Neisser (Ed.), Memory Observed (pp. 77-91).  San Francisco: Freeman.

Macchi, L. (2000).  Partitive formulation of information in probabilistic problems: Beyond heuristics and frequency format explanations.  Organizational Behavior and Human Decision Processes, 82, 217-236.

Mellers, B. & McGraw, A.P. (1999).  How to improve Bayesian reasoning: Comments on Gigerenzer & Hoffrage (1995).  Psychological Review, 106, 417-424.

Monaghan, P. & Stenning, K. (1998).  Effects of representational modality and thinking style on learning to solve reasoning problems.  Proceedings of the 20th Annual Meeting of the Cognitive Science Society, 716-721.  Mahwah, NJ: Lawrence Erlbaum Associates.

Newstead, S.E. (1982).  The role of problem structure in a deductive reasoning task.  Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 297-307.

Newstead, S.E. (1989).  Interpretational errors in syllogistic reasoning.  Journal of Memory and Language, 28, 78-91.

Nosofsky, R.M., Kruschke, J.K. & McKinley, S.C. (1992).  Combining exemplar-based category representations and connectionist learning rules.  Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 211-233.

Over, D.E. (2000a).  Ecological rationality and its heuristics.  Thinking and Reasoning, 6, 182-192.

Over, D.E. (2000b).  Ecological issues: A reply to Todd, Fiddick, & Krauss.  Thinking and Reasoning, 6, 385-388.

Over, D.E. (2003).  From massive modularity to meta-representation: The evolution of higher cognition.  In D.E. Over (Ed.), Evolution and the Psychology of Thinking: The Debate (pp. 121-144).  New York, NY: Psychology Press.

Over, D.E. (in press).  Content-independent conditional inference.  In M.J. Roberts (Ed.), Integrating the Mind: Domain General versus Domain Specific Processes in Higher Cognition.  Hove, UK: Psychology Press.

Over, D.E. & Green, D.W. (2001).  Contingency, causation, and adaptive inference.  Psychological Review, 108, 682-684.

Over, D.E. (Ed.) (2003).  Evolution and the Psychology of Thinking: The Debate.  New York: Psychology Press.

Pillemer, E.D., Rhinehart, E.D. & White, S.H. (1986).  Memories of life transitions: The first year in college.  Human Learning: Journal of Practical Research & Applications, 5, 109-123.

Ramsey, F.P. (1964).  Truth and probability.  In H.E. Kyburg, Jr. & E. Smokler (Eds.), Studies in Subjective Probability (pp. 61-92).  New York: Wiley.

Ross, M. & Sicoly, F. (1979).  Egocentric biases in availability and attribution.  Journal of Personality and Social Psychology, 37, 322-336.

Savage, L.J. (1954).  The Foundations of Statistics.  New York: Wiley.

Schwartz, N. & Wanke, M. (2002).  Experimental and contextual heuristics in frequency judgment: Ease of recall and response scales.  In P. Sedlmeier & T. Betsch (Eds.), Etc. Frequency Processing and Cognition (pp. 89-108).  New York: Oxford University Press.

Schwartz, N. & Sudman, S. (1994).  Autobiographical memory and the validity of retrospective reports.  New York: Springer Verlag.

Sedlmeier, P. & Betsch, T. (2002).  Etc. Frequency Processing and Cognition (pp. 89-108).  New York: Oxford University Press.

Sedlmeier, P. & Gigerenzer, G. (2001).  Teaching Bayesian reasoning in less than two hours.  Journal of Experimental Psychology: General, 130, 380-400.

Sloman, S.A. (1996).  The empirical case for two systems of reasoning.  Psychological Bulletin, 119, 3-22.

Sloman, S.A. (1998).  Categorical inference is not a tree: The myth of inheritance hierarchies.  Cognitive Psychology, 35, 1-33.

Sloman, S.A. (2005).  Causal models: How we think about the world and its alternatives.  New York: Oxford University Press.

Sloman, S.A. & Over, D.E. (2003).  Probability judgment from the inside and out.  In D.E. Over (Ed.), Evolution and the Psychology of Thinking: The Debate (pp. 145-170).  New York: Psychology Press.

Sloman, S.A., Lombrozo, T. & Malt, B.C. (in press).  Mild ontology and domain-specific categorization.  In M.J. Roberts (Ed.), Integrating the mind.  Hove, UK: Psychology Press.

Sloman, S.A., Over, D.E., Slovak, L. & Stibel, J.M. (2003).  Frequency illusions and other fallacies.  Organizational Behavior and Human Decision Processes, 91, 296-309.

Stanovich, K.E. (1999).  Who is Rational?  Studies of Individual Differences in Reasoning.  Mahwah, NJ: Lawrence Erlbaum Associates.

Stanovich, K.E. & West, R.F. (1998).  Individual differences in rational thought.  Journal of Experimental Psychology: General, 127, 161-188.

Stanovich, K.E. & West, R.F. (2000).  Individual differences in reasoning: Implications for the rationality debate.  Behavioral and Brain Sciences, 23, 645-726.

Stanovich, K.E. & West, R.F. (2003).  Evolutionary versus instrumental goals: How evolutionary psychology misconceives human rationality.  In D.E. Over (Ed.), Evolution and the Psychology of Thinking (pp. 171-230).  New York, NY: Psychology Press.

Stenning, K. (2002).  Seeing Reason: Image and Language in Learning to Think.  Oxford: Oxford University Press.

Tversky, A. & Kahneman, D. (1973).  Availability: A heuristic for judging frequency and probability.  Cognitive Psychology, 5, 207-232.

Tversky, A. & Kahneman, D. (1974).  Judgment under uncertainty: Heuristics and biases.  Science, 185, 1124-1131.

Tversky, A. & Kahneman, D. (1983).  Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment.  Psychological Review, 90, 293-315.

Tversky, A. & Koehler, D. (2002).  Support theory: A nonextensional representation of subjective probability.  In T. Gilovich, D. Griffin, & D. Kahneman (Eds.), Heuristics and Biases: The Psychology of Intuitive Judgment (pp. 441-473).  New York: Cambridge University Press.

Vranas, P.B.M. (2000).  Gigerenzer’s normative critique of Kahneman and Tversky.  Cognition, 76, 179-193.

Vranas, P.B.M. (2001).  Single-case probabilities and content-neutral norms: A reply to Gigerenzer.  Cognition, 81, 105-111.

Wagenaar, W.A. (1986).  My memory: A study of autobiographical memory over six years.  Cognitive Psychology, 18, 225-252.

Yamagishi, K. (2003).  Facilitating normative judgments of conditional probability: Frequency or nested sets?  Experimental Psychology, 50, 97-106.

Zacks, R.T. & Hasher, L. (2002).  Frequency processing: A twenty-five year perspective.  In P. Sedlmeier & T. Betsch (Eds.), Etc. Frequency Processing and Cognition (pp. 21-36).  New York: Oxford University Press.

 

 

 


Footnotes



1 The respondent’s subjective degree of belief in the hypothesis (H) that the patient has breast cancer given the observed datum (D) that she has a positive mammography (i.e., the posterior probability, Pr(H│D)) can be expressed numerically as the ratio between (a) the probability that the patient has the disease and obtains a positive mammogram (Pr(H ∩D)), and (b) the probability that the patient obtains a positive mammogram (Pr(D)).  To calculate this ratio, Bayes’ theorem incorporates two axioms of mathematical probability theory: the conditional probability and additivity laws.  According to the former, (a) can be expressed by the probability that the patient has the disease (i.e., the base-rate of the hypothesis) multiplied by the probability that the patient obtains a positive mammogram given that she has the disease (i.e., the hit-rate of the test): Pr(H ∩D) = Pr(H) Pr(D│H).  The additivity rule is then applied to express (b) as the probability that the patient has the disease and obtains a positive mammogram, plus the probability that the patient does not have the disease and obtains a positive mammogram: Pr(D) = Pr(H ∩D) + Pr(~H ∩D).  The conditional probability rule can be further applied to express this latter quantity as the complement of the base-rate multiplied by the probability that the patient obtains a positive mammogram given that she does not have the disease (i.e., the false alarm rate of the test):  Pr(~H ∩D) = Pr(~H) Pr(D│~H).  Thus, according to Bayes’ theorem, the probability that the patient has breast cancer given that she has a positive mammography equals Pr(H│D) = Pr(H ∩D) / Pr(D) = Pr(H) Pr(D│H) / (Pr(H) Pr(D│H) + Pr(~H) Pr(D│~H)) = (0.01)(0.80) / ((0.01)(0.80) + (0.99)(0.096)), or 7.8 per cent.
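The calculation in footnote 1 can be reproduced with a minimal sketch.  The three input values are those given in the footnote; the function and parameter names are illustrative assumptions.

```python
def posterior(base_rate: float, hit_rate: float, false_alarm_rate: float) -> float:
    """Posterior Pr(H|D) from the base-rate, hit-rate, and false alarm rate."""
    # Conditional probability rule: Pr(H and D) = Pr(H) * Pr(D|H).
    joint_h_and_d = base_rate * hit_rate
    # Same rule applied to the complement: Pr(~H and D) = Pr(~H) * Pr(D|~H).
    joint_not_h_and_d = (1 - base_rate) * false_alarm_rate
    # Additivity rule: Pr(D) = Pr(H and D) + Pr(~H and D).
    prob_d = joint_h_and_d + joint_not_h_and_d
    return joint_h_and_d / prob_d


# Values from the mammography problem in footnote 1.
p = posterior(base_rate=0.01, hit_rate=0.80, false_alarm_rate=0.096)
print(f"{p:.3f}")
```

The printed value, rounded to three decimal places, corresponds to the 7.8 per cent reported in the footnote.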

 

2 Because natural frequency formats and formats expressing numbers of chances carry the sample and effect sizes, posterior probabilities can be calculated from them without multiplying the probabilities by the base-rates.  The following simple form can be used to calculate the probability of a hypothesis (H) given datum (D): Pr(H│D) = N(H ∩ D) / [N(H ∩ D) + N(~H ∩ D)], where N(H ∩ D) is the number of cases having the datum in the presence of the hypothesis, and N(~H ∩ D) is the number of cases having the datum in the absence of the hypothesis.  This form requires that the respondent attend only to N(H ∩ D) and N(~H ∩ D), whereas estimating posteriors with percentages requires transforming percentage values into conditional probabilities by incorporating base-rates, making the calculation more complex than under natural frequency formats.
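
As a rough illustration of the count-based calculation described in this footnote, the sketch below uses hypothetical counts: in a sample of 1,000 women, 8 have the disease and a positive mammogram, and 95 do not have the disease but also test positive. These counts are an assumption, obtained by rounding the probabilities in footnote 1 to whole cases; the function name is likewise illustrative.

```python
def posterior_from_counts(n_h_and_d, n_not_h_and_d):
    """Return Pr(H|D) from two counts; no base-rate multiplication is needed."""
    return n_h_and_d / (n_h_and_d + n_not_h_and_d)


# Hypothetical counts (assumed, rounded from the footnote 1 probabilities).
p_counts = posterior_from_counts(n_h_and_d=8, n_not_h_and_d=95)
print(f"{p_counts:.3f}")  # prints 0.078
```

Only two numbers enter the calculation, which is the sense in which the frequency format is computationally simpler than the percentage format.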

 

3 There may be an important relation between sensitivity to nested-set structure and the law of the excluded middle that appears in logic.  By this rule, all propositions of the form ‘p or not-p’ hold.  We apply the rule, for example, to infer that everyone either has a disease or does not have the disease.  We use it again to infer that everyone has some symptom or does not have it.  Thus, the logical trees cited by natural frequency theorists are consistent with this fundamental logical rule (Over, in press).   

 

4 These authors point out that the chance representation of probability is commonly employed in everyday situations, such as when someone says, “A tossed coin has 1 out of 2 chances of landing head up” or that “there are 1 out of a million chances of winning the lottery.”  Chances preserve information about the size of the reference class (i.e., the total population of chances).  Hoffrage et al. (2002) argue that chances are just frequencies.  This is false (see Girotto & Gonzalez, 2002).  Chances refer to the probability of a single event and are based on the total population of chances rather than a finite sample of observations.  The chances, for example, of drawing an ace from a standard deck of playing cards are “4 out of 52”:  There are 4 ways that an ace can be drawn from the deck of 52 cards.  In contrast to natural frequencies, the size of the reference class here represents the total population (i.e., the deck of 52 cards).  We might observe, for example, that 1 out of 10 cards randomly drawn from the deck is an ace, but this method of “natural sampling” would not represent the chance or number of ways of drawing an ace from the full deck.  Chances cannot be directly assessed by “counting occurrences of events as they are encountered and storing the resulting knowledge base for possible use later” (i.e., natural sampling; Brase, 2002, p. 384).  Chances are thus distinct from natural frequencies.
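
The contrast drawn in this footnote between a chance and a naturally sampled frequency can be sketched as follows. This is only an illustration under assumed names and an arbitrary random seed: the chance is fixed by the composition of the full deck, whereas the sampled frequency depends on which 10 cards happen to be drawn.

```python
import random
from fractions import Fraction

# The full population of chances: a standard 52-card deck with 4 aces.
deck = ["ace"] * 4 + ["other"] * 48

# Chance of an ace: number of ways to draw an ace over the whole population.
chance_of_ace = Fraction(deck.count("ace"), len(deck))  # 4 out of 52

# "Natural sampling": observe a finite sample and count occurrences.
rng = random.Random(0)          # arbitrary seed, for repeatability only
sample = rng.sample(deck, 10)   # 10 cards drawn without replacement
aces_in_sample = sample.count("ace")

print(f"chance: {chance_of_ace.numerator} in {chance_of_ace.denominator}")
print(f"observed: {aces_in_sample} out of {len(sample)}")
```

The observed count can differ from sample to sample, while the chance does not change unless the deck itself changes.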

 

5 The mind-as-Swiss-army-knife, natural frequency algorithm, and natural frequency heuristic theories do not concern the encoding of event frequencies under naturalistic settings in general, but focus only on event frequencies that have a partitive structure.  Thus, these approaches do not address the encoding of non-partitive event frequencies (e.g., the event frequency of naturally occurring independent events).  Given that both kinds of frequencies exist in nature, it is unclear why only frequencies of the partitive type are deemed important.

 

6 Bayes’ theorem in odds form expresses the probability of a hypothesis (H) relative to the probability of an alternative hypothesis (~H) given observed datum (D) (i.e., the posterior odds: Pr(H│D) / Pr(~H│D)).  To compute the posterior odds, Bayes’ theorem incorporates two factors: the likelihood ratio and the prior odds.  The likelihood ratio, Pr(D│H) / Pr(D│~H), is a measure of whether the datum is diagnostic with respect to the hypothesis (H).  If the evidence is diagnostic in favor of H, then the likelihood ratio will be greater than one, demonstrating that the observed datum is more likely to occur in the presence of the hypothesis (H) than under the alternative hypothesis (~H).  The prior odds is the ratio of base-rate probabilities (Pr(H) / Pr(~H)).  Bayes’ theorem in odds form states that the product of these quantities yields the posterior odds: Pr(H│D) / Pr(~H│D) = [Pr(D│H) / Pr(D│~H)] × [Pr(H) / Pr(~H)].  To directly estimate the relative weight of the likelihood ratio and prior odds, Bayes’ theorem in odds form can be logarithmically transformed to yield log [Pr(H│D) / Pr(~H│D)] = log [Pr(D│H) / Pr(D│~H)] + log [Pr(H) / Pr(~H)].  Under this formulation, the log likelihood ratio and log prior odds can be treated as independent variables in a regression analysis to assess the relative contribution of each factor in Bayesian inference.
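
The odds form and its logarithmic transformation can be illustrated with a short sketch. The input values are borrowed from the mammography problem in footnote 1 purely as an example; the function and variable names are assumptions, not the authors' notation.

```python
import math


def odds_form(base_rate, hit_rate, false_alarm_rate):
    """Return (posterior odds, likelihood ratio, prior odds)."""
    likelihood_ratio = hit_rate / false_alarm_rate   # Pr(D|H) / Pr(D|~H)
    prior_odds = base_rate / (1 - base_rate)         # Pr(H) / Pr(~H)
    return likelihood_ratio * prior_odds, likelihood_ratio, prior_odds


# Example values taken from footnote 1.
odds, lr, prior = odds_form(base_rate=0.01, hit_rate=0.80, false_alarm_rate=0.096)

# Log form: the two factors become additive terms.
log_posterior_odds = math.log(lr) + math.log(prior)

# Converting the odds back to a probability recovers the footnote 1 result.
prob = odds / (1 + odds)
print(f"{prob:.3f}")  # prints 0.078
```

Here the likelihood ratio exceeds one (the datum favors H), yet the posterior odds remain well below one because the prior odds are so low. In the log form the two contributions are simply summed, which is what allows them to serve as separate predictors in a regression.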