To be published in Behavioral and Brain Sciences (in press)
© Cambridge University Press 2007
Below is the unedited, uncorrected final draft of a BBS target article that has been accepted for publication. This preprint has been prepared for potential commentators who wish to nominate themselves for formal commentary invitation. Please DO NOT write a commentary until you receive a formal invitation. If you are invited to submit a commentary, a copyedited, corrected version of this paper will be posted.
Base-rate Respect: From Ecological Rationality to Dual Processes
Running head: Base-rate respect
Aron K. Barbey
Department of Psychology
(404) 727-7386
abarbey@emory.edu
http://www.psychology.emory.edu/cognition/abarbey/index.html
Steven A. Sloman
Cognitive and Linguistics Science
(401) 863-7595
Steven_Sloman@brown.edu
http://www.cog.brown.edu/~sloman/
Abstract: The
phenomenon of base-rate neglect has elicited much debate. One arena of debate concerns how people make
judgments under conditions of uncertainty.
Another more controversial arena concerns human rationality. In this paper, we attempt to unpack the
perspectives in the literature on both kinds of issues and evaluate their
ability to explain existing data and their conceptual coherence. We will conclude that the best account of the
data should be framed in terms of a dual-process model of judgment that
attributes base-rate neglect to associative judgment strategies that fail to
adequately represent the set structure of the problem. Base-rate neglect is reduced when problems
are presented in a format that affords accurate representation in terms of
nested sets of individuals.
1.0. Introduction
Diagnosing
whether a patient has a disease, predicting whether a defendant is guilty of a
crime, and other everyday as well as life-changing decisions in part reflect
the decision-maker’s subjective degree of belief in uncertain events. Intuitions about probability frequently
deviate dramatically from the dictates of probability theory (e.g., Gilovich et
al., 2002). One form of deviation is
notorious: People’s tendency to neglect
base-rates in favor of specific case data.
A number of theorists (e.g., Cosmides & Tooby, 1996; Brase, 2002a;
Gigerenzer & Hoffrage, 1995) have argued that such neglect reveals little
more than experimenters’ failure to ask about uncertainty in a form that naïve
respondents can understand, specifically in the form of a question about
natural frequencies. The brunt of our
argument will be that this perspective is far too narrow. After surveying the theoretical perspectives
on the issue, we will show that both data and conceptual considerations demand
that judgment be understood in terms of dual processing systems, one that is
responsible for systematic error and another that is capable of reasoning not just
about natural frequencies, but about relations among any kind of set
representation.
Base-rate neglect has been extensively studied in the context of Bayes’ theorem, which provides a normative standard for updating the probability of a hypothesis in light of new evidence. Research has evaluated the extent to which intuitive probability judgment conforms to the theorem by employing a Bayesian inference task in which the respondent is presented a word problem and has to infer the probability of a hypothesis (e.g., the presence versus absence of breast cancer) on the basis of an observation (e.g., a positive mammography). Consider the following Bayesian inference problem motivated by Eddy (1982; cf. Gigerenzer & Hoffrage, 1995):
The probability of breast cancer is 1% for a woman at age forty who participates in routine screening [base-rate]. If a woman has breast cancer, the probability is 80% that she will get a positive mammography [hit-rate]. If a woman does not have breast cancer, the probability is 9.6% that she will also get a positive mammography [false-alarm rate]. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer? __%
According to Bayes’ theorem[1], the
probability that the patient has breast cancer given that she has a positive
mammography is 7.8 per cent. Evidence
that people’s judgments on this problem accord with Bayes’ theorem would be
consistent with the claim that the mind embodies a calculus of probability,
whereas the lack of such a correspondence would demonstrate that people’s
judgments can be at variance with sound probabilistic principles and, as a
consequence, that people can be led to make incoherent decisions (Savage, 1954;
Ramsey, 1964). Thus, the extent to which
intuitive probability judgment conforms to the normative prescriptions of
Bayes’ theorem has implications for the nature of human judgment (for a review
of the theoretical debate on human rationality, see Stanovich, 1999). In the case of Eddy’s study, fewer than 5 per
cent of the respondents generated the Bayesian solution.
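The 7.8 per cent figure can be checked directly from the three quantities given in the problem. The following is a minimal sketch of that arithmetic, not part of the original studies; the function and parameter names are our own labels for the base-rate, hit-rate, and false-alarm rate.

```python
def posterior(base_rate: float, hit_rate: float, false_alarm_rate: float) -> float:
    """P(H | D) from Bayes' theorem for a binary hypothesis H and an observation D."""
    true_positives = base_rate * hit_rate                  # P(H) * P(D | H)
    false_positives = (1 - base_rate) * false_alarm_rate   # P(not-H) * P(D | not-H)
    return true_positives / (true_positives + false_positives)


# Values from the mammography problem: 1% base-rate, 80% hit-rate, 9.6% false-alarm rate.
p = posterior(base_rate=0.01, hit_rate=0.80, false_alarm_rate=0.096)
print(f"{p:.1%}")  # 7.8%
```

Responding with the hit-rate alone (80 per cent) amounts to dropping the `base_rate` terms from this calculation, which is the pattern of base-rate neglect described in the text.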
Early studies evaluating Bayesian inference under single-event probabilities also showed systematic deviations from Bayes’ theorem. Hammerton (1973), for example, found that only 10 per cent of the physicians tested generated the Bayesian solution, with the median response approximating the hit-rate of the test. Similarly, Casscells, Schoenberger, and Graboys (1978) and Eddy (1982) found that a low proportion of respondents generated the Bayesian solution: 18 per cent in the former and 5 per cent in the latter, with the modal response in each study corresponding to the hit-rate of the test. All of this suggests that the mind does not normally reason in a way consistent with the laws of probability theory.
1.1. Base-rate facilitation
However,
this conclusion has not been drawn universally.
Eddy’s (1982) problem concerned a single event, the probability that a
particular woman has breast cancer. In
some problems, when probabilities that refer to the chances of a single event
occurring (e.g., 1%) are reformulated and presented in terms of natural frequency
formats (e.g., 10 out of 1,000), people more often draw probability estimates
that conform to Bayes’ theorem.  Consider
the following mammography problem presented in a natural frequency format by
Gigerenzer and Hoffrage (1995).
10
out of every 1,000 women at age forty who participate in routine screening have
breast cancer [base-rate]. 8 out of
every 10 women with breast cancer will get a positive mammography
[hit-rate]. 95 out of every 990 women
without breast cancer will also get a positive mammography [false-alarm
rate]. Here is a new representative
sample of women at age forty who got a positive mammography in routine
screening. How many of these women do
you expect to actually have breast cancer?
___ out of ___.
The proportion of responses conforming
to Bayes’ theorem increased by a factor of about three in this case, 46 per
cent under natural frequency formats versus 16 per cent under a single-event
probability format. The observed
facilitation has motivated researchers to argue that coherent probability
judgment depends on representing events in the form of natural frequencies
(e.g., Cosmides & Tooby, 1996; Brase, 2002a; Gigerenzer & Hoffrage,
1995).
Cosmides and
Tooby (1996) also conducted a series of experiments that employed Bayesian
inference problems that had previously elicited judgmental errors under
single-event probability formats. In
Experiment 1, they replicated Casscells et al. (1978), demonstrating that only
12 per cent of their respondents produced the Bayesian answer when presented
single-event probabilities. Cosmides and
Tooby then transformed the single-event probabilities into natural frequencies,
resulting in a remarkably high proportion of Bayesian responses: 72 per cent of respondents generated the
Bayesian solution, supporting the authors’ conclusion that Bayesian inference
depends on the use of natural frequencies.
Gigerenzer
(1996b) explored whether physicians, who frequently assess and diagnose medical
illness, would demonstrate the same pattern of judgments as clinically untrained
college undergraduates. Consistent with
the judgments drawn by college students (e.g., Gigerenzer & Hoffrage,
1995), Gigerenzer found that the sample of 48 physicians tested generated the
Bayesian solution in only 10 per cent of the cases under single-event
probability formats whereas 46 per cent did with natural frequency
formats. Physicians spent about 25 per
cent more time on the single-event probability problems, suggesting that they
found these problems more difficult to solve than problems presented in a
natural frequency format. Thus, the
physicians’ judgments were consistent with those of non-physicians, suggesting
that formal training in medical diagnosis does not lead to more accurate
Bayesian reasoning and that natural frequencies facilitate probabilistic
inference across populations.
Further
studies have demonstrated that the facilitatory effect of natural frequencies on
Bayesian inference observed in the laboratory has the potential for improving
the predictive accuracy of professionals in important real-world settings. Gigerenzer and his colleagues have shown, for
example, that natural frequencies facilitate Bayesian inference in AIDS
counseling (Gigerenzer et al., 1998), in the assessment of statistical
information by judges (Lindsey et al., 2003), and in teaching Bayesian
reasoning to college undergraduates (Sedlmeier & Gigerenzer, 2001;
Kurzenhäuser & Hoffrage, 2002).  In summary, the reviewed findings
demonstrate facilitation in Bayesian inference when single-event probabilities are
translated into natural frequencies, consistent with the view that coherent
probability judgment depends on natural frequency representations.
Explanations of facilitation in
Bayesian inference can be grouped into five types that can be arrayed along a
continuum of cognitive control, from accounts that ascribe facilitation to
processes that have little to do with strategic cognitive processing to those
that appeal to general-purpose reasoning procedures. The five accounts we discuss can be
contrasted at the coarsest level on five dimensions (see Table 1). We do not claim that theorists have
consistently made these distinctions in the past, only that these distinctions
are in fact appropriate ones.
Table 1. Prerequisites for reduction of base-rate neglect according to 5 theoretical frameworks.
| | Mind as Swiss army knife | Natural frequency algorithm | Natural frequency heuristic | Non-evolutionary natural frequency heuristic | Nested sets and dual processes |
|---|---|---|---|---|---|
| Cognitive impenetrability | X | | | | |
| Informational encapsulation | X | X | | | |
| Appeal to evolution | X | X | X | | |
| Cognitive process uniquely sensitive to natural frequency formats | X | X | X | X | |
| Transparency of nested set relations | X | X | X | X | X |
Note. The prerequisites of each theory are indicated by an ‘X’.
A parallel taxonomy for theories of
categorization can be found in Sloman, Lombrozo, and Malt (in press). We briefly introduce the theoretical
frameworks here. The discussion of each
will be elaborated as required to reveal assumptions and derive predictions in
the following sections in order to compare and contrast them.
1.2.1. Mind as Swiss army knife
Several theorists have argued that the
human mind consists of a number of specialized modules (Cosmides & Tooby,
1995; Gigerenzer & Selten, 2001).
Each module is assumed to be unavailable to conscious awareness or
deliberate control (cognitively impenetrable) and able to process only a
specific type of information (informationally encapsulated; see Fodor,
1983). One module in particular is
designed to process natural frequencies.
This module is thought to have evolved because natural frequency
information is what was available to our ancestors in the environment of
evolutionary adaptiveness. On this view,
facilitation occurs because natural frequency data are processed by a
computationally effective processing module.
Two arguments have been advanced in
support of the ecological validity of natural frequency data. First, as natural frequency information is
acquired it can be “easily, immediately, and usefully incorporated with past
frequency information via the use of natural sampling, which is the method of
counting occurrences of events as they are encountered and storing the
resulting knowledge base for possible use later” (Brase, 2002, p. 384). Second, information stored in a natural
frequency format preserves the sample size of the reference class (e.g., 10 out
of 1,000 women have breast cancer), and is arranged into subset relations
(e.g., of the 10 women that have breast cancer, 8 are positively diagnosed)
that indicate how many cases of the total sample there are in each subcategory
(i.e., the base-rate, the hit-rate, and false-alarm rate). Because natural frequency formats entail the
sample and effect sizes, posterior probabilities consistent with Bayes’ theorem
can be calculated without explicitly incorporating base-rates, thereby allowing
simple calculations[2]
(Kleiter, 1994). Thus proponents of this
view argue that the mind has evolved to process natural frequency formats over
single-event probabilities and, in particular, includes a cognitive module that “maps frequentist representations of
prior probabilities and likelihoods onto a frequentist representation of a
posterior probability in a way that satisfies the constraints of Bayes’
theorem” (Cosmides & Tooby, 1996, p. 60).
Theorists who take this position uniformly
motivate their hypothesis via a process of natural selection. However, the cognitive and evolutionary
claims are in fact conceptually independent.
The mind could consist of cognitively impenetrable and informationally
encapsulated modules whether or not any or all of those modules evolved for the
specific reasons offered.
1.2.2. Natural frequency algorithm
A weaker claim is that the mind
includes a specific algorithm for effectively processing natural frequency
information (Gigerenzer & Hoffrage, 1995).
Unlike the mind-as-Swiss-army-knife view, this hypothesis makes no
general claim about the architecture of mind.
Despite this difference in scope, these theories adopt the same
computational and evolutionary commitments.
Consistent with the mind-as-Swiss-army-knife
view, this approach proposes that coherent probability judgment derives from a
simplified form of Bayes’ theorem. The
proposed algorithm computes the number of cases where the hypothesis and
observation co-occur, N(H and D), out of the total number of cases where the
observation occurs, N(H and D) + N(not-H and D) = N(D) (Kleiter, 1994;
Gigerenzer & Hoffrage, 1995).
Because this form of Bayes’ theorem expresses a simple ratio of
frequencies, we refer to it as “the Ratio.”
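To make the computational claim concrete, the Ratio can be sketched as a single division over two counts. This is an illustrative sketch only; the function name `bayesian_ratio` and its parameter names are our own, and the counts are taken from the natural frequency version of the mammography problem presented earlier.

```python
def bayesian_ratio(n_h_and_d: int, n_not_h_and_d: int) -> float:
    """The Ratio: N(H and D) / N(D), where N(D) = N(H and D) + N(not-H and D)."""
    n_d = n_h_and_d + n_not_h_and_d
    return n_h_and_d / n_d


# 8 women with breast cancer test positive; 95 women without breast cancer test positive.
ratio = bayesian_ratio(n_h_and_d=8, n_not_h_and_d=95)
print(f"8 out of {8 + 95} = {ratio:.1%}")  # 8 out of 103 = 7.8%
```

Note that the base-rate (10 out of 1,000) never appears as a separate argument: it is already carried by the two counts, which is the sense in which natural frequencies simplify the calculation.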
Following the mind-as-Swiss-army-knife
view, proponents of this approach have ascribed the origin of the Bayesian
ratio to evolution. Gigerenzer and
Hoffrage (1995, p. 686), for example, state “The evolutionary argument that
cognitive algorithms were designed for frequency information, acquired through
natural sampling, has implications for the computations an organism needs to
perform when making Bayesian inferences….
Bayesian algorithms are computationally simpler when information is encoded
in a frequency format rather than a standard probability format.” As a consequence, this view predicts that
“Performance on frequentist problems will satisfy some of the constraints that
a calculus of probability specifies, such as Bayes’ rule. This would occur because some inductive reasoning
mechanisms in our cognitive architecture embody aspects of a calculus of
probability” (Cosmides & Tooby, 1996, p. 17).
The proposed algorithm is necessarily
informationally encapsulated as it operates on a specific information format,
natural frequencies, but it is not necessarily cognitively impenetrable as no
one has claimed that other cognitive processes can’t affect or use the
algorithm’s computations. The primary
motivation for the existence of this algorithm has been computational (Kleiter,
1994; Gigerenzer & Hoffrage, 1995).
As reviewed above, the value of natural frequencies is that these
formats entail the sample and effect sizes and, as a consequence, simplify the
calculation of Bayes’ theorem:
Probability judgments are coherent with Bayesian prescriptions even
without explicit consideration of base-rates.
1.2.3. Natural frequency heuristic
A claim that puts facilitation under
more cognitive control is that people use heuristics to make judgments (Tversky
& Kahneman, 1974; Gigerenzer & Selten, 2001) and that the Ratio is one
such heuristic (Gigerenzer, Todd & the ABC research group, 1999). According to this view, “heuristics can
perform as well, or better, than algorithms that involve complex
computations…. The astonishingly high
accuracy of these heuristics indicates their ecological rationality; fast and
frugal heuristics exploit the statistical structure of the environment, and
they are adapted to this structure” (Gigerenzer, 2006). Advocates of this approach motivate the
proposed heuristic by pointing to the ecological validity of natural frequency
formats, as Gigerenzer further states (p. 52), “To evaluate the performance of
the human mind, one needs to look at its environment and, in particular, the
external representation of the information.
For most of the time during which the human mind evolved, information
was encountered in the form of natural frequencies…” Thus, this view proposes that the mind
evolved to process natural frequencies and that this evolutionary adaptation gave
rise to the proposed heuristic that computes the Bayesian Ratio from natural
frequencies.
1.2.4. Non-evolutionary natural frequency heuristic
Evolutionary arguments about the
ecological validity of natural frequency representations provide part of the
motivation for the preceding theories.
In particular, proponents of the theories argue that throughout the
course of human evolution natural frequencies were acquired via natural
sampling (i.e., encoding event frequencies as they are encountered and storing
them in the appropriate reference class).
In contrast, the non-evolutionary
natural frequency theory proposes that natural sampling is not necessarily an
evolved procedure for encoding statistical regularities in the environment, but
a useful sampling method that, one way or another, people can appreciate and
use. The natural frequency
representations that result from natural sampling, on this view, simplify the
calculation of Bayes’ theorem and, as a consequence, facilitate Bayesian
inference (Kleiter, 1994). Thus, this
view differs from the preceding accounts by resting on a purely computational
argument that is independent of any commitments to which cognitive processes
have been selected for by evolution.
This theory proposes that the computational
simplicity afforded by natural frequencies gives rise to a heuristic that
computes the Bayesian Ratio from natural frequencies. The proposed heuristic implies a higher
degree of cognitive control than the modular algorithms proposed by the preceding accounts.
1.2.5. Nested sets and dual processes
The most extreme departure from the
modular view claims that facilitation is a product of general-purpose reasoning
processes (Evans et al., 2000; Fox & Levav, 2004; Girotto & Gonzalez, 2001;
Johnson-Laird et al., 1999; Kahneman & Frederick, 2002, 2005; Over, 2003;
Sloman et al., 2003). On this view,
people use two systems to reason (Evans & Over, 1996; Kahneman &
Frederick, 2002, 2005; Sloman, 1996; Stanovich & West, 2000).  These are often called
Systems 1 and 2, but in an effort to use more expressive labels we will employ
Sloman’s terms “associative” and “rule-based.”
The dual-process model attributes
responses based on associative principles like similarity or retrieval from
memory to a primitive associative judgment system.  It attributes responses based on more
deliberative processing that involves working memory, such as the elementary set
operations that respect the logic of set inclusion and facilitate Bayesian
inference, to a second rule-based system.
Judgmental errors produced by cognitive heuristics are generated by
associative processes, whereas the induction of a representation of category
instances that makes nested set relations transparent also induces use of rules
about elementary set operations, operations of the sort perhaps described by
Fox and Levav (2004) or Johnson-Laird et al. (1999).
According to this theory, base-rate
neglect results from associative responding and facilitation occurs when people
correctly use rules to make the inference.
Rule-based inference is more cognitively demanding than associative
inference and is therefore more likely to occur when participants have more
time, more incentives, or more external aids to make a judgment and are under
fewer other demands at the moment of judgment.
It is also more likely for people who have greater skill employing the
relevant rules. This last prediction is
supported by Stanovich and West (2000) who find correlations between
intelligence and use of base-rates.
Rules are effective devices for solving
a problem to the extent that the problem is represented in a way compatible
with the rules. For example, long
division is an effective method for solving division problems but only if
numbers are represented using Arabic numerals; division with Roman numerals
requires different rules. By analogy,
the reason that natural frequencies facilitate use of base-rates on this view
is that the rules that people have access to and are able to use to solve the
specific kind of problem studied in the base-rate neglect literature are more
compatible with natural frequency formats than single-event probability
formats.
Specifically, people are adept at using
rules consisting of simple elementary set operations. But these operations are only applicable when
problems are represented in terms of sets, as opposed to single events. According to this view, facilitation in
Bayesian inference occurs under natural frequencies because these formats are
an effective cue to the representation of the set structure underlying a
Bayesian inference problem. This is the
nested sets hypothesis of Tversky & Kahneman (1983). On this view, natural frequency formats
prompt the respondent to adopt an outside view by inducing a representation of
category instances (e.g., 10 out of 1,000 women have breast cancer) that
reveals the set structure of the problem and makes the nested set relations
transparent for problem solving[3]. We refer to this hypothesis as the nested
sets theory (Ayton & Wright, 1994; Evans et al., 2000; Fox & Levav,
2004; Girotto & Gonzalez, 2001, 2002; Johnson-Laird et al., 1999; Kahneman
& Tversky, 1983; Macchi, 2000; Mellers & McGraw, 1999; Sloman et al.,
2003). Unlike the other theories, it
predicts that facilitation should be observable in a variety of different
tasks, not just posterior probability problems, when nested set relations are
made transparent.
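The elementary set operations that the nested sets theory appeals to can be illustrated with the mammography counts used earlier. The sketch below is our own illustration, not a model proposed by any of the cited authors; the individuals are hypothetical and are simply numbered 0 to 999.

```python
# Hypothetical individuals, numbered 0..999, standing in for the sample of 1,000 women.
women = set(range(1000))
cancer = set(range(10))                          # 10 of the 1,000 have breast cancer
positive = set(range(8)) | set(range(10, 105))   # 8 with cancer and 95 without test positive

# Both categories are nested within the sample.
assert cancer <= women and positive <= women

# The judgment requires only an intersection and two counts.
cancer_and_positive = cancer & positive
print(len(cancer_and_positive), "out of", len(positive))  # 8 out of 103
```

Once the problem is represented this way, the answer follows from an intersection and a count, with no explicit use of Bayes’ theorem. Nothing in the representation requires that the sets be given as natural frequencies rather than, for example, chances.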
2.0. Overview of empirical and conceptual issues reviewed
We now turn to an evaluation of these
five theoretical frameworks. We evaluate
a range of empirical and conceptual issues that bear on the validity of these
frameworks.
The theories
are evaluated with respect to the empirical predictions summarized in Table
2. The predictions of each theory derive
from (i) the degree of cognitive control attributed to probability judgment
(see Table 1), and (ii) the proposed cognitive operations that underlie
estimates of probability.
Theories that
adopt a low degree of cognitive control — proposing cognitively impenetrable
modules or informationally encapsulated algorithms — restrict Bayesian
inference to contexts that satisfy the assumptions of the processing module or
algorithm. In contrast, theories that
adopt a high degree of cognitive control — appealing to a natural frequency
heuristic or a domain general capacity to perform set operations — predict
Bayesian inference in a wider range of contexts. The latter theories are distinguished from one
another in terms of the cognitive operations they propose: The evolutionary and non-evolutionary natural
frequency heuristics depend on structural features of the problem like question
form and reference class. They imply the
accurate encoding and comprehension of natural frequencies and an accurate
weighting of the encoded event frequencies to calculate the Bayesian
ratio. In contrast, the nested sets
theory does not rely on natural frequencies and instead predicts facilitation
in Bayesian inference, and in a range of other deductive and inductive
reasoning tasks, when the set structure of the problem is made transparent,
thereby promoting use of elementary set operations and inferences about the
logical (i.e., extensional) properties they entail.
Table 2. Empirical predictions of the five theoretical frameworks.
| | Mind as Swiss army knife | Natural frequency algorithm | Natural frequency heuristic | Non-evolutionary natural frequency heuristic | Nested sets and dual processes |
|---|---|---|---|---|---|
| Facilitation with natural frequencies (information format and judgment domain) | X | X | X | X | X |
| Facilitation with questions that prompt the respondent to compute the Bayesian ratio (question form) | | | X | X | X |
| Facilitation with statistical information organized in a partitive structure (reference class) | | | X | X | X |
| Facilitation with diagrammatic representations that highlight the set structure of the problem | | | X | X | X |
| Inaccurate frequency judgments | | | | | X |
| Equivalent comprehension of natural frequencies and single-event probabilities | | | | | X |
| Non-normative weighting of likelihood ratio and prior odds | | | | | X |
| Facilitation with set representations in deductive and inductive reasoning | | | | | X |
Note. The predictions of each theory are indicated by an ‘X.’
2.1. Information format and judgment domain
The preceding review
of the literature found that natural frequency formats consistently reduced
base-rate neglect relative to probability formats. However, the size of this effect varied
considerably across studies (see Table 3).
Cosmides and
Tooby (1996), for example, observed a 60-percentage-point difference between the
proportions of Bayesian responses under natural frequencies versus single-event
probabilities, whereas Gigerenzer and Hoffrage (1995) reported a difference
only half that size. The wide
variability in the size of the effects makes it clear that in no sense do
natural frequencies eliminate base-rate neglect, though they do reduce it.
Sloman, Over,
Slovak, and Stibel (2003) conducted a series of experiments that attempted to
replicate the effect sizes observed by the previous studies (e.g., Cosmides
& Tooby, 1996, Experiment 2, Condition 1).
Although Sloman et al. found facilitation with natural frequencies, the
size of the effect was smaller than that observed by Cosmides and Tooby: The
percentage of Bayesian solutions generated under single-event probabilities (20%)
was comparable to that of Cosmides and Tooby (12%), but the percentage of Bayesian
answers generated under natural frequencies was smaller (i.e., 51% for Sloman et al.
versus 72% for Cosmides and Tooby).  In a further replication,
Sloman et al. found that only 31 per cent of their respondents generated the
Bayesian solution, a statistically non-significant advantage for natural
frequencies.
Table 3. Percent correct for Bayesian inference problems reported in the literature (sample sizes in parentheses)
Information format and judgment domain
| Study | Probability | Frequency |
|---|---|---|
| Casscells et al. (1978) | 18 (60) | --- |
| Cosmides & Tooby (1996; Exp. 2) | 12 (25) | 72 (25) |
| Eddy (1982) | 5 (100) | --- |
| Evans et al. (2000; Exp. 1) | 24 (42) | 35 (43)‡ |
| Gigerenzer (1996) | 10 (48) | 46 (48) |
| Gigerenzer & Hoffrage (1995) | 16 (30) | 46 (30) |
| Macchi (2000) | 6 (30) | 40 (30) |
| Sloman et al. (2003; Exp. 1) | 20 (25) | 51 (45) |
| Sloman et al. (2003; Exp. 1b) | --- | 31 (48)‡ |
Note. Probability problems require that the respondent compute a conditional-event probability from data presented in a non-partitive form, whereas frequency problems include questions that prompt the respondent to evaluate the two terms of the Bayesian ratio and present data that is partitioned into these components.
‡ p > 0.05
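The variability in effect size can be read off Table 3 by taking, for each study that reports both formats, the difference in percentage points between the frequency and probability conditions. The following sketch is our own tabulation of the values in Table 3; the variable names are illustrative.

```python
# (study, % Bayesian under probability format, % Bayesian under frequency format),
# copied from the rows of Table 3 that report both conditions.
table3 = [
    ("Cosmides & Tooby (1996; Exp. 2)", 12, 72),
    ("Evans et al. (2000; Exp. 1)", 24, 35),
    ("Gigerenzer (1996)", 10, 46),
    ("Gigerenzer & Hoffrage (1995)", 16, 46),
    ("Macchi (2000)", 6, 40),
    ("Sloman et al. (2003; Exp. 1)", 20, 51),
]

# Percentage-point gain of the frequency format over the probability format.
gains = {study: freq - prob for study, prob, freq in table3}

for study, gain in sorted(gains.items(), key=lambda item: item[1]):
    print(f"{study}: +{gain} points")
```

The gains range from 11 points (Evans et al.) to 60 points (Cosmides & Tooby), which is the spread referred to in the text.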
Evans,
Handley, Perham, Over, and Thompson (2000; Experiment 1) similarly found only
a small effect of information format. They
report 24 per cent Bayesian solutions under single-event probabilities and 35
per cent under natural frequencies, a difference that was not reliable.
Brase,
Fiddick, and Harries (in press) examined whether methodological factors
contribute to the observed variability in effect size. They identified two factors that modulate the
facilitatory effect of natural frequencies in Bayesian inference: (1) the
academic selectivity of the university the participants attend, and (2) whether
or not the experiment offered a monetary incentive for participation. Experiments whose participants attended a
top-tier national university and were paid reported a significantly higher
proportion of Bayesian responses (e.g., Cosmides & Tooby, 1996) than
experiments whose participants attended a second-tier regional university and
were not paid (e.g., Brase et al., in press, Experiments 3 and 4). These results suggest that a higher
proportion of Bayesian responses is observed in experiments that (a) select
participants with a higher level of general intelligence, as indexed by the
academic selectivity of the university the participant attends (Stanovich &
West, 1998), and (b) increase motivation by providing a monetary
incentive. The former observation is
consistent with the view that Bayesian inference depends on domain general
cognitive processes to the degree that intelligence is domain general. The latter suggests that Bayesian inference
is strategic, and not supported by automatic (e.g., modularized) reasoning
processes.
2.2. Question form
One
methodological factor that may mediate the effect of problem format is the form
of the Bayesian inference question presented to participants (Girotto &
Gonzalez, 2001). The Bayesian solution
expresses the ratio between the size of the subset of cases where the
hypothesis and observation co-occur and the total number of observations. Thus, it follows that the respondent should
be more likely to arrive at this solution when prompted to adopt an outside
view by utilizing the sample of category instances presented in the problem
(e.g., “Here is a new sample of patients who have obtained a positive test
result in routine screening. How many of
these patients do you expect to actually have the disease? ___ out of ___”)
versus a question that presents information about category properties (e.g., “…
In the
preceding studies, however, information format and judgment domain were
confounded with question form: Only problems that presented natural frequencies
prompted use of the sample of category instances presented in the problem to
compute the two terms of the Bayesian solution (an outside view), whereas
single-event probability problems prompted the use of category properties to
compute a conditional probability.
To dissociate
these factors, Girotto and Gonzalez (2001) proposed that single-event
probabilities (e.g., 1%) can be represented as chances[4]
(e.g., “1 chance out of 100”). Under the
chance formulation of probability, the respondent can be asked either for the
standard conditional probability or for values that correspond more closely to
the ratio expressed by Bayes’ theorem.
The latter question asks the respondent to evaluate the chances that
To evaluate the role of question form in
Bayesian inference, Girotto and Gonzalez (2001, Study 1) conducted an
experiment that manipulated question form independently of information format
and judgment domain. The authors
presented the following Bayesian inference scenario to 80 college
undergraduates of the
A
person who was tested had 4 chances out of 100 of having the infection. 3 of the 4 chances of having the infection
were associated with a positive reaction to the test. 12 of the remaining 96 chances of not having
the infection were also associated with a positive reaction to the test.
Half of the respondents were then asked
to compute a conditional probability (i.e., “If Pierre has a positive reaction,
there will be ___ chance(s) out of ___ that the infection is associated with
his positive reaction.”), whereas the remaining respondents were asked to
evaluate the ratio of probabilities expressed in the Bayesian solution (i.e.,
“Imagine that Pierre is tested now. Out
of the total 100 chances,
Girotto and
Gonzalez (2001) found that only 8 per cent of the respondents generated the
Bayesian solution when asked to compute a conditional probability, consistent
with the earlier literature. But the
proportion of Bayesian answers increased to 43 per cent when the question
prompted the respondent to evaluate the two terms of the Bayesian
solution. The same pattern was observed
with the natural frequency format problem.
Only 18 per cent of the respondents generated the Bayesian solution when
asked to compute a conditional frequency, whereas this proportion increased to 58
per cent when asked to evaluate the two terms separately. This level of performance is comparable to
that observed under standard natural frequency formats (e.g., Gigerenzer &
Hoffrage, 1995), and supports Girotto and Gonzalez’s claim that the two-step
question approximates the question asked with standard natural frequency
formats. In further support of Girotto
and Gonzalez’s predictions, there were no reliable effects of information
format or judgment domain across all the reported comparisons.
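As an illustration only (it is not part of Girotto and Gonzalez's materials), the following minimal sketch works the numbers of the chance scenario in two ways: by counting chances, as the two-step question invites, and by applying Bayes' theorem to the corresponding probabilities. The function names are hypothetical and were introduced for this example; both routes should yield the same value.

```python
from fractions import Fraction

def posterior_from_chances(hit_chances, false_alarm_chances):
    """Share of positive-test chances that also involve the infection."""
    return Fraction(hit_chances, hit_chances + false_alarm_chances)

def posterior_from_bayes(prior, p_pos_given_h, p_pos_given_not_h):
    """Bayes' theorem for a binary hypothesis (infected vs. not infected)."""
    numerator = p_pos_given_h * prior
    return numerator / (numerator + p_pos_given_not_h * (1 - prior))

# Scenario values: 4 chances in 100 of infection; 3 of those 4 chances and
# 12 of the remaining 96 chances are associated with a positive reaction.
by_counting = posterior_from_chances(3, 12)
by_theorem = posterior_from_bayes(Fraction(4, 100), Fraction(3, 4), Fraction(12, 96))
print(by_counting, by_theorem)  # 1/5 1/5
```

The counting route needs only one addition and one ratio, whereas the theorem route requires multiplying and normalizing three probabilities, which is one way of seeing why the two-step question might be easier to answer.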
These findings
suggest that people are not predisposed against using single-event
probabilities but instead appear to be highly sensitive to the form of the
question: When asked to reason about
category instances to compute the two terms of the Bayesian ratio, respondents
were able to draw the normative solution under single-event probabilities. Facilitation in Bayesian inference under
natural frequencies need not imply that the mind is designed to process these
formats, but instead can be attributed to the facilitory effect of prompting
use of the sample of category instances presented in the problem to evaluate
the two terms of the Bayesian ratio.
2.3. Reference class
To assess the
role of problem structure in Bayesian inference, we review studies that have
manipulated structural features of the problem.
Girotto and Gonzalez (2001) report two experiments that systematically
assess performance under different partitionings of the data: Defective frequency partitions and
non-partitive frequency problems.
Consider the following medical diagnosis problem, which presents natural
frequencies under what Girotto and Gonzalez (2001, Study 5) term a defective
partition:
4
out of 100 people tested were infected. 3
of the 4 infected people had a positive reaction to the test. 84 of the 96 uninfected people did not have a
positive reaction to the test. Imagine
that a group of people is now tested. In
a group of 100 people, one can expect ___ individuals to have a positive reaction,
___ of whom will have the infection.
In contrast to
the standard partitioning of the data under natural frequencies, here the
frequency of uninfected people who did not have a positive reaction to the test
is reported, instead of the frequency of uninfected, positive reactions. As a result, to derive the Bayesian solution,
the first value must be subtracted from the total population of uninfected
individuals to obtain the desired value (96 – 84 = 12), and the result can be
used to determine the proportion of infected, positive people out of the total
number of people who obtain a positive test (e.g., 3 / (3 + 12) = 3 / 15 = 1 / 5). Although this problem exhibits a partitive
structure, Girotto and Gonzalez predicted that the defective partitioning of
the data would produce a greater proportion of errors than observed under the
standard data partitioning, because the former requires an additional
computation. Consistent with this
prediction, only 35 per cent of respondents generated the Bayesian solution,
whereas 53 per cent did under the standard data partitioning. Nested set relations were more likely to
facilitate Bayesian reasoning when the data were partitioned into the
components that are needed to generate the solution.
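The additional computation can be made concrete with a short sketch. It assumes a reference class of 100 people (4 infected, 96 uninfected), as in the problem's question; the variable and function names are hypothetical and serve only this illustration.

```python
from fractions import Fraction

def bayesian_ratio(infected_positive, uninfected_positive):
    """Infected positives as a share of all positives."""
    return Fraction(infected_positive, infected_positive + uninfected_positive)

uninfected_total = 96
uninfected_negative = 84   # the value reported under the defective partition
infected_positive = 3

# Extra step that the standard partition would not require:
# recover the uninfected people who nonetheless test positive.
uninfected_positive = uninfected_total - uninfected_negative

all_positive = infected_positive + uninfected_positive
answer = bayesian_ratio(infected_positive, uninfected_positive)
print(all_positive, answer)  # 15 1/5
```

Under the standard partition the value 12 is given directly, so the subtraction step disappears and only the final ratio remains to be computed.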
Girotto and
Gonzalez (2001, Study 6) also assessed performance under natural frequency
formats that were not partitioned into nested set relations (i.e.,
unpartitioned frequencies). As in the
case of standard natural frequency format problems (e.g., Cosmides & Tooby,
1996), these multiple-sample problems employed natural frequencies and prompted
the respondent to compute the two terms of the Bayesian solution[5]. Such a problem must be treated in the same
way as a single-event probability problem (i.e., using the conditional
probability and additivity laws) to determine the two terms of the Bayesian
ratio. Girotto and Gonzalez therefore
predicted that performance under multiple samples would be poor, approximating
that observed under standard probability problems. As predicted, none of the respondents
generated the Bayesian solution under the multiple sample or standard
single-event probability frames. Natural
frequency formats facilitate Bayesian inference only when they partition the
data into components needed to draw the Bayesian solution.
Converging evidence
is provided by Macchi (2000), who presented Bayesian inference problems in
either a partitive or non-partitive form. Macchi found that only 3 per cent of
respondents generated the Bayesian solution when asked to evaluate the two
terms of the Bayesian ratio with non-partitive frequency problems. Similarly, only 6 per cent of the respondents
generated the Bayesian solution when asked to compute a conditional probability
under non-partitive probability formats (see also Sloman et al., 2003,
Experiment 4). But when presented under
a partitive formulation and asked to evaluate the two terms of the Bayesian
ratio the proportions increased to 40 per cent under partitive natural
frequency formats, 33 per cent under partitive single-event probabilities, and
36 per cent under the modified partitive single-event probability problems. The
findings reinforce the nested sets view that information structure is the
factor determining predictive accuracy.
To further
explore the contribution of information structure and question form in Bayesian
inference, Sloman et al. (2003) assessed performance using a conditional chance
question. In contrast to the standard conditional probability question that
presents information about a particular individual (e.g., “…
In summary,
the reviewed findings suggest that when the data are partitioned into the
components needed to arrive at the solution and participants are prompted to
use the sample of category instances in the problem to compute the two terms of
the Bayesian ratio, the respondent is more likely to (1) understand the
question, (2) see the underlying nested set structure by partitioning the data
into exhaustive subsets, and (3) select the pieces of evidence that are needed
for the solution. According to the nested sets theory, accurate probability
judgments derive from the ability to perform elementary set operations whose
computations are facilitated by external cues.
Facilitation does not require prompting to compute the two terms of the
Bayesian ratio, but can be produced by any cue that increases the transparency of the relevant
set relations.
2.4. Diagrammatic representations
Sloman et al.
(2003, Experiment 2) explored whether Euler circles, which were employed to
construct a nested set structure for standard non-partitive single-event
probability problems (e.g., Cosmides & Tooby, 1996), would facilitate
Bayesian inference (see Figure 1). These
authors found that 48 per cent of the 25 respondents tested generated the
Bayesian solution when presented non-partitive single-event probability
problems with an Euler diagram that depicted the underlying nested set
relations. This finding demonstrates
that the set structure of standard non-partitive single-event probability
problems can be represented by Euler diagrams to produce facilitation.
Supporting data can be found in Yamagishi (2003) who used diagrams to make
nested set relations transparent in other inductive reasoning problems. Similar evidence is provided by Bauer and
Johnson-Laird (1993) in the context of deductive reasoning.


2.5. Accuracy of frequency judgments
Theories based
on natural frequency representations (i.e., the mind-as-Swiss-army-knife,
natural frequency algorithm, natural frequency heuristic, and non-evolutionary
natural frequency heuristic theories) propose that “the mind is a frequency
monitoring device” and that the cognitive algorithm that computes the Bayesian
ratio encodes and processes event frequencies in naturalistic settings
(Gigerenzer, 1993, p. 300). The
literature that evaluates the encoding and retrieval of event frequencies is
extensive and includes assessments of frequency judgments under
well-controlled laboratory settings based on relatively simple and distinct
stimuli (e.g., letters, pairs of letters, or words), and naturalistic settings
in which respondents report the frequency of their own behaviors (e.g., the
medical diagnosis of patients).
Laboratory studies tend to find that frequency judgments are
surprisingly accurate (see Zacks & Hasher, 2002, for a recent review),
whereas naturalistic studies often find systematic errors in frequency
judgments (see Bradburn et al., 1987).
Recent efforts have been made to integrate these findings under a
unified theoretical framework (e.g., Sedlmeier & Betsch, 2002; Schwartz
& Sudman, 1994; Schwartz & Wanke, 2002).
Are frequency
judgments relatively accurate under the naturalistic settings described by
standard Bayesian inference problems?
Bayesian inference problems tend to involve hypothetical situations
that, if real, would be based on autobiographical memories encoded under naturalistic
conditions, such as the standard medical diagnosis problem in which a
particular set of patients is hypothetically encountered (cf. Sloman &
Over, 2003). Thus, the present review
focuses on the accuracy of frequency judgments for the autobiographical events
alluded to by standard Bayesian inference problems (see Sections 2.1, 2.2, and
2.3) to assess whether Bayesian inference depends on the accurate encoding of
autobiographical events.
Gluck and
Bower (1988) conducted an experiment that employed a learning paradigm to
assess the accuracy of frequency judgments in medical diagnosis. The respondent learned to diagnose a rare
(25%) or a common (75%) disease on the basis of four potential symptoms
exhibited by the patient (e.g., stomach cramps, discolored gums). During the learning phase, the respondent
diagnosed 250 hypothetical patients and in each case was provided feedback on
the accuracy of their diagnosis. After
the learning phase, the respondent estimated the relative frequency of patients
who had the diseases given each symptom.
Gluck and Bower found that relative frequency estimates of the disease
were determined by the diagnosticity of the symptom (the degree to which the
respondent perceived that the symptom provided useful information in diagnosing
the disease) and not the base-rate frequencies of the disease. These findings were replicated by Estes,
Campbell, Hatsopoulos, and Hurwitz (1989, Experiment 1) and Nosofsky,
Kruschke, and McKinley (1992, Experiment 1).
Bradburn,
Rips, and Shevell (1987) evaluated the accuracy of autobiographical memory for
event frequencies by employing a range of surveys that assessed quantitative
facts, such as “During the last 2 weeks, on days when you drank liquor, about
how many drinks did you have?” These questions require the simple recall of
quantitative facts, in which the respondent “counts up how many individuals
fall within each category” (Cosmides & Tooby, 1996, p. 60). Recalling the frequency of drinks consumed
over the last 2 weeks, for example, is based on counting the total number of
individual drinking occasions stored in memory.
Bradburn et
al. (1987) found that autobiographical memory for event frequencies exhibits
systematic errors characterized by (a) the failure to recall the entire event
or the loss of details associated with a particular event (e.g., Linton, 1975;
Wagenaar, 1986), (b) the combining of similar but distinct events into a single
generalized memory (e.g., Linton, 1975, 1982), or (c) the inclusion of events
that did not occur within the reference period specified in the question (e.g.,
Pillemer et al., 1986). As a result,
Bradburn et al. propose that the observed frequency judgments do not reflect
the accurate encoding of event frequencies, but instead entail a more complex
inferential process that typically operates on the basis of incomplete,
fragmentary memories that do not preserve base-rate frequencies.
These findings
suggest that the observed facilitation in Bayesian inference under natural
frequencies cannot be explained by an (evolved) capacity to encode natural
frequencies. Apparently, people don’t
have that capacity.
2.6. Comprehension of formats
Advocates of
the nested sets view have argued that the facilitation of Bayesian inference
under natural frequencies can be fully explained by a simple computation that
is afforded by transparent nested set relations and delivers the same result as
Bayes’ theorem, without appealing to an (evolved) capacity to process
natural frequencies (e.g., Johnson-Laird et al., 1999). The question therefore arises whether the
ease of processing natural frequencies goes beyond the reduction in
computational complexity of Bayes’ theorem that they provide (Brase, 2002a). To assess this issue, we review evidence that
evaluates whether natural frequencies are understood more easily than
single-event probabilities.
Brase (2002b)
conducted a series of experiments to evaluate the relative clarity and ease of
understanding a range of statistical formats, including natural frequencies
(e.g., 1 out of 10) and percentages (e.g., 10%). Brase distinguished natural frequencies that
have a natural sampling structure (e.g., 1 out of 10 have the property, 9 out of
10 do not) from “simple frequencies” that refer to single numerical relations
(e.g., 1 out of 10 have the property).
This distinction, however, is not entirely consistent with the
literature as natural frequency theorists have often used single numerical
statements for binary hypotheses to express natural frequencies (e.g.,
Gigerenzer & Hoffrage, 1995). In any
case, for binary hypotheses the natural sampling structure can be directly
inferred from simple frequencies. If we
observe, for example, that I win the weekly poker game “1 out of 10 nights,” we
can infer that I lose “9 out of 10 nights” and construct a natural sampling
structure that represents the size of the reference class and is arranged into
subset relations. Thus, single numerical
statements of this type have a natural sampling structure and therefore we
refer to Brase’s “simple frequencies” as natural frequencies in the following
discussion.
Percentages
express single-event probabilities in that they are normalized to an arbitrary
reference class (e.g., 100) and can refer to the likelihood of a single event
(Brase, 2002b; Gigerenzer & Hoffrage, 1995). We therefore examine whether natural
frequencies are understood more easily and have a greater impact on judgment
than percentages.
To test this
prediction, Brase (2002b, Experiment 1) assessed the relative clarity of
statistical information presented in a natural frequency format versus
percentage format at small, intermediate, and large magnitudes. Each respondent received four statements in
one statistical format, each at a different magnitude, and rated the clarity,
impressiveness, and “monetary pull” of the presented statistics according to a
5-point scale. Example questions are
shown in Table 4.
Table 4. Example questions presented by Brase (2002b)
______________________________________________________________________________
It is estimated that by the year 2020, 1 of every 100 Americans will have been exposed to Flu strain X [natural frequency format of low magnitude]
It is estimated that by the year 2020, 33% of all Americans will have been exposed to Flu strain X [single-event probability of intermediate magnitude]
1. How clear and easy to understand is the statistical information presented in the above sentence? [Clarity rating]
2. How serious do you think the existence of virus X is? [Impressiveness rating]
3. If you were in charge of the annual budget for the U.S. Department of Health, how much of every $100 would you dedicate to dealing with virus X? ___ out of every $100 [Monetary pull rating]
______________________________________________________________________________
Brase (2002b)
found that across all statements and magnitudes both natural frequencies and
percentages were rated as “Very Clear,” with average ratings of 3.98 and 3.89,
respectively. These ratings were not
reliably different, demonstrating that percentages are perceived as clearly and
are as understandable as natural frequencies.
Furthermore, Brase found no reliable differences in the impressiveness
ratings (from question 2) of natural frequencies and percentages at
intermediate and large statistical magnitudes, suggesting that these formats are
typically viewed as equally impressive.
A significant difference between these formats was observed, however, at
low statistical magnitudes: On average,
natural frequencies were rated as “Impressive,” whereas percentages were viewed
as “Fairly Impressive.” The observed
difference in the impressiveness ratings at low statistical magnitudes did not
accord with the respondent’s monetary pull ratings, their willingness to
allocate funds to support research studying the issue at hand, which were
approximately equal for the two formats across all statements and magnitudes,
hence the difference in the impressiveness ratings at low magnitudes does not
denote differences in people’s willingness to act.
These data are
consistent with the conclusion that percentages and natural frequency formats
(a) are perceived equally clearly and are equally understandable, (b) are
typically viewed as equally impressive (i.e., at intermediate and large
statistical magnitudes), and (c) have the same degree of impact on behavior. Natural frequency formats do apparently
increase the perceptual contrast of small differences. Overall, however, the two formats are
perceived similarly, suggesting that the mind is not designed to process
natural frequency formats over single-event probabilities.
2.7. Are base-rates and likelihood ratios equally weighted?
Does the facilitation of Bayesian
inference under natural frequencies entail that the mind naturally incorporates
this information according to Bayes’ theorem or that elementary set operations
can be readily computed from problems that are structured in a partitive
form? Natural frequencies preserve the
sample size of the reference class and are arranged into subset relations that
preserve the base-rates. As a result,
judgments based on these formats will entail the sample and effect sizes; the
respondent need not calculate them. To
assess whether the cognitive operations that underlie Bayesian inference are
consistent with the application of Bayes’ theorem, studies that evaluate how
the respondent derives Bayesian solutions are reviewed.
To assess
whether the observed increase in base-rate usage reflects the operation of a
Bayesian algorithm that is designed to process natural frequencies,
Further
support for this conclusion is provided by Evans, Handley, Over, and Perham
(2002) who conducted a series of experiments demonstrating that probability
judgments do not reflect equal weighting of the prior odds and likelihood
ratio. Evans et al. (2002, Experiment
5) employed a paradigm that extended the classic lawyer-engineer experiments by
assessing Bayesian inference under conditions where the base-rates are supplied
by commonly held beliefs and only the likelihood ratios are explicitly
provided. These authors found that when
prior beliefs about the base-rate probabilities were rated immediately before
the presentation of the problem, the prior odds (i.e., the base-rates) were
weighted more heavily than the likelihood ratios, with corresponding regression
weights (β values) of 0.43 and 0.19.
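What "equal weighting" means can be stated in log-odds form: under Bayes' theorem the posterior log-odds are the sum of the log prior odds and the log likelihood ratio, each with a coefficient of exactly 1. The sketch below is a hedged illustration of that benchmark. The weight parameters and the example values are assumptions introduced here; they are not the regression weights estimated by Evans et al., which come from their own analysis and need not share this parameterization.

```python
import math

def posterior_probability(prior, likelihood_ratio, w_prior=1.0, w_lr=1.0):
    """Weighted log-odds combination of prior odds and likelihood ratio.

    With w_prior = w_lr = 1 this is Bayes' theorem in odds form; any other
    weights describe a judge who over- or under-uses one source of evidence.
    """
    log_odds = (w_prior * math.log(prior / (1 - prior))
                + w_lr * math.log(likelihood_ratio))
    return 1 / (1 + math.exp(-log_odds))

# Illustrative values (assumed): a base-rate of .04 and a likelihood ratio of 6.
bayes = posterior_probability(0.04, 6.0)
# Hypothetical judge who gives the likelihood ratio half its normative weight.
under_lr = posterior_probability(0.04, 6.0, w_lr=0.5)
print(round(bayes, 3), under_lr < bayes)  # 0.2 True
```

On this benchmark, any reliable departure of the two coefficients from equality indicates that judgments are not produced by a computation equivalent to Bayes' theorem, whichever term happens to be favored.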
Additional
evidence supporting this conclusion is provided by Kleiter, Krebs, Doherty,
Garavan, Chadwick, and Blake (1997) who found that participants assessing event
frequencies in a medical diagnosis setting employed statistical evidence that
is irrelevant to the calculation of Bayes’ theorem. Kleiter et al. (1997, Experiment 1) presented
a list of event frequencies to respondents, which included those that were
necessary for the calculation of Bayes’ theorem (e.g., Pr(D│H))
and other statistics that were irrelevant (e.g., Pr(~D)). Participants were then asked to identify the
event frequencies that were needed to diagnose the probability of the disease
given the symptom (i.e., the posterior probability). Of the 4 college faculty and 26 graduate
students tested, only 3 made the optimal selection by identifying only the
event frequencies required to calculate Bayes’ theorem.
These data
suggest that the mind does not utilize a Bayesian algorithm that “maps
frequentist representations of prior probabilities and likelihoods onto a
frequentist representation of a posterior probability in a way that satisfies
the constraints of Bayes’ theorem” (Cosmides & Tooby, 1996, p. 60). Importantly, the findings that the prior
odds and likelihood ratio are not equally weighted according to Bayes’ theorem
(Griffin & Buehler, 1999; Evans et al., 2002) imply that Bayesian inference
does not rely on Bayesian computations per se.
Thus, the
findings are inconsistent with the mind-as-Swiss-army-knife, natural frequency
algorithm, natural frequency heuristic, and non-evolutionary natural frequency
heuristic theories, which propose that coherent probability judgment reflects
the use of the Bayesian ratio. The
finding that base-rate usage increases under frequentist representations
(Griffin & Buehler, 1999; Evans et al., 2002) supports the proposal that
the facilitation in Bayesian inference from natural frequency formats is due to
the property of these formats to induce a representation of category instances
that preserves the sample and effect sizes and, as a consequence, clarifies the
underlying set structure of the problem, making the relevance of base-rates
more obvious without providing an equation that generates Bayesian quantities.
A unique characteristic of the dual
process position is that it predicts that nested sets should facilitate
reasoning whenever people tend to rely on associative rather than extensional,
rule-based processes; facilitation should be observed beyond the context of
Bayesian probability updating. The natural
frequency theories expect facilitation only in the domain of probability
estimation.
In support of the nested sets position,
facilitation through nested set representations has been observed in a number
of studies of deductive inference.
Grossen and Carnine (1990) and Monaghan and Stenning (1998) report
significant improvement in syllogistic reasoning when participants were taught
using Euler circles. The effect was restricted to participants who were
‘learning impaired’ (Grossen & Carnine, 1990) or had low GRE scores
(Monaghan & Stenning, 1998). Presumably those who did not show
improvement did not require the Euler circles because they were already
representing the nested set relations.
Newstead (1989; Experiment 2) evaluated
how participants interpreted syllogisms when represented by Euler circles
versus quantified statements. Newstead found that although Gricean errors
of interpretation occurred when syllogisms were represented by Euler circles
and quantified statements, the proportion of conversion errors, such as
converting “Some A are not B” to “Some B are not A,” was significantly reduced
in the Euler circle task. For example, less than 5% of the participants
generated a conversion error for “Some… not” on the Euler circle task, whereas
this error occurred on 90% of the responses for quantified statements.
Griggs and Newstead (1982) tested
participants on the THOG problem, a difficult deductive reasoning problem
involving disjunction. They obtained a
substantial amount of facilitation by making the problem structure explicit
using trees. According to the authors,
the structure is normally implicit due to negation and the tree structure
facilitates performance by cuing formation of a mental model similar to that of
nested sets.
Facilitation has also been obtained by
making extensional relations more salient in the domain of categorical
inductive reasoning. Sloman (1998) found
that people who were told that all members of a superordinate have some
property, e.g., all flowers are susceptible to thrips, did not conclude that
all members of one of its subordinates inherited the property, e.g., they did
not assert that this guaranteed that all roses are susceptible to thrips. This was true even for those people who
believed that roses are flowers. But if
the assertion that roses are flowers was included in the argument, then people
did abide by the inheritance rule, assigning a probability of 1 to the
statement about roses. Sloman argued
that this occurred because induction is mediated by similarity and not by class
inclusion unless the categorical – or set – relation is made transparent within
the statement composing the argument (for an alternative interpretation, see
Calvillo & Revlin, 2005).
Facilitation in other types of
probability judgment can also be obtained by manipulating the salience and
structure of set relations. Sloman et
al. (2003) found that almost no one exhibited the conjunction fallacy when the
options were presented as Euler circles, a representation that makes set
relations explicit. Fox and Levav (2004)
and Johnson-Laird et al. (1999) also improved judgments on probability problems
by manipulating the set structure of the problem.
In summary, the
empirical review supports five main conclusions. First, the facilitory effect of natural
frequencies on Bayesian inference varied considerably across the reviewed
studies (see Table 3), potentially resulting from differences in the general
intelligence level and motivation of participants (Brase et al., in
press). These findings support the
nested sets hypothesis to the degree that intelligence and motivation reflect
the operation of domain general and strategic, rather than automatic (i.e.,
modular), cognitive processes.
Second, questions
that prompt use of category instances and divide the solution into the sets
needed to compute the Bayesian ratio facilitate probability judgment,
suggesting that facilitation depends on cues to the set structure of the
problem rather than (an evolved) capacity to process natural frequencies. In further support of this conclusion,
partitioning the data into nested sets facilitates Bayesian inference
regardless of whether natural frequencies or single-event probabilities are
employed (see Table 5).
Third, frequency
judgments are guided by inferential strategies that reflect incomplete,
fragmentary memories that do not entail the base-rates (e.g., Gluck &
Bower, 1988; Bradburn et al., 1987), suggesting that Bayesian inference does
not derive from the accurate encoding and retrieval of natural
frequencies. In addition, natural
frequencies and single-event probabilities are rated similarly in their
perceived clarity, understandability, and impact on the respondent’s behavior
(Brase, 2002b), further suggesting that the mind does not embody inductive
reasoning mechanisms (that are designed) to process natural frequencies.
Fourth, people
(a) do not accurately weight and combine event frequencies, and (b) utilize
event frequencies that are irrelevant in the calculation of Bayes’ theorem
(e.g., Griffin & Buehler, 1999; Kleiter et al., 1997), suggesting that the
cognitive operations that underlie Bayesian inference do not conform to Bayes’
theorem. Furthermore, base-rate usage
increases under frequentist representations (e.g., Griffin & Buehler,
1999), suggesting that facilitation results from the property of natural
frequencies to represent the sample and effect sizes, which highlight the set
structure of the problem and make transparent what is relevant for problem
solving.
Table 5. Percent correct for Bayesian inference problems reported in the literature (sample sizes in parentheses)

                                      Information structure
                                      Non-partitive               Partitive
Study                                 Probability   Frequency     Probability   Frequency
Girotto & Gonzalez (2001, Study 5)    ---           ---           ---           53 (20)
Girotto & Gonzalez (2001, Study 6)    0 (20)        0 (20)        ---           ---
Macchi (2000)                         6 (30)*       3 (30)        36 (30)       40 (30)
Sloman et al. (2003, Exp. 1)          20 (25)*      ---           48 (48)*      51 (45)
Sloman et al. (2003, Exp. 2)          ---           ---           48 (25)*      ---
Sloman et al. (2003, Exp. 4)          ---           21 (33)       ---           ---

Note. Studies that present questions that require the respondent to compute a conditional-event probability are indicated by *. The remaining studies present questions that prompt the respondent to compute the two terms of the Bayesian solution.
Finally, nested set
representations facilitate reasoning in a range of classic deductive and
inductive reasoning tasks, supporting the nested sets hypothesis that the mind
embodies a domain general capacity to perform elementary set operations and
that these operations can be induced by cues to the set structure of the
problem to facilitate reasoning in any context where people tend to rely on
associative rather than extensional, rule-based processes.
This section provides a conceptual
analysis that addresses (1) the plausibility of the natural frequency assumptions,
and (2) whether natural frequency representations support properties that are
central to human inductive reasoning competence, including reasoning about
statistical independence, estimating the probability of unique events, and
reasoning on the basis of similarity, analogy, association, and causality.
The natural sampling framework was
established by the seminal work of Kleiter (1994), who assessed “the
correspondence between the constraints of the statistical model of natural
sampling on the one hand, and the constraints under which human information is
acquired on the other" (p. 376).
Kleiter proved that under natural sampling and other conditions (e.g.,
independent identical sampling), the frequencies corresponding to the
base-rates are redundant and can be ignored. Thus conditions of natural
sampling can simplify the calculation of the relevant probabilities and, as a
consequence, facilitate Bayesian inference (see footnote 2). Kleiter’s computational argument does not
appeal to evolution and was advanced with careful consideration of the
assumptions upon which natural sampling is based. Kleiter noted, for example, that the natural
sampling framework (a) is limited to hypotheses that are mutually exclusive and
exhaustive, and (b) depends on collecting a sufficiently large sample of event
frequencies to reliably estimate population parameters.
Although people may sometimes treat hypotheses
as mutually exclusive (e.g., “this person is a Democrat, so they must be
anti-business”), this constraint is not always satisfied: Many hypotheses are nested (e.g., “she has
breast cancer” vs. “she has a particular type of breast cancer”) or overlapping
(e.g., “this patient is anxious or depressed”).
People’s causal models typically provide a wealth of knowledge about
classes and properties, allowing consideration of many kinds of hypotheses that
do not necessarily come in mutually exclusive, exhaustive sets. As a consequence, additional principles are
needed to broaden the scope of the natural sampling framework to address
probability estimates drawn from hypotheses that are not mutually exclusive and
exhaustive. In this sense, the nested
sets theory is more general: It can
represent nested and overlapping hypotheses by taking the intersection (e.g.,
“she has breast cancer and it is type X”) and union (e.g., “the patient
is anxious or depressed”) of sets, respectively.
As Kleiter further notes, inferences
about hypotheses from encoded event frequencies are warranted to the extent
that the sample is sufficiently large and provides a reliable estimate of the
population parameters. The efficacy of
the natural sampling framework therefore depends on establishing (1) the
approximate number of event frequencies that are needed for a reliable
estimate, (2) whether this number is relatively stable or varies across
contexts, and (3) whether or not people can encode and retain the required
number of events.
3.2. Representing qualitative relations
In contrast to single-event
probabilities, natural frequencies preserve information about the size of the
reference class and, as a consequence, do not directly indicate whether an
observation and hypothesis are statistically independent. For example, probability judgments drawn from
natural frequencies do not tell us that a symptom present in (a) 640 out of 800
patients with the disease and (b) 160 out of 200 patients without the disease
is not diagnostic because 80% have the symptom in both cases (Over 2000a,
2000b; Over & Green, 2001; Sloman & Over, 2003). Thus, probability estimates drawn from
natural frequencies do not capture important qualitative properties.
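The point can be checked with a short calculation. The following is a minimal sketch, not part of the original argument: it uses only the counts given in the example above, and the variable names are illustrative assumptions. It shows that the independence of symptom and disease becomes visible only after the frequencies are normalized.

```python
from fractions import Fraction

# Counts from the example in the text: (symptom present, group size).
with_disease = (640, 800)
without_disease = (160, 200)

# Normalizing each natural frequency yields the two conditional rates.
hit_rate = Fraction(*with_disease)            # Pr(symptom | disease)
false_alarm_rate = Fraction(*without_disease) # Pr(symptom | no disease)

# A likelihood ratio of 1 means the symptom is not diagnostic.
likelihood_ratio = hit_rate / false_alarm_rate

# The posterior computed from the raw counts equals the base-rate,
# which is another way of seeing that the symptom carries no information.
posterior = Fraction(640, 640 + 160)
base_rate = Fraction(800, 800 + 200)

print(hit_rate, false_alarm_rate, likelihood_ratio)  # 4/5 4/5 1
print(posterior == base_rate)                        # True
```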
Furthermore, in contrast to the cited
benefits of non-normalized representations (e.g., Gigerenzer & Hoffrage,
1995), normalization may serve to simplify a problem. For example, is someone offering us the same
proportion if he tries to pay us back with 33 out of 47 nuts he has gathered
(i.e., 70%), after we have earlier given him 17 out of 22 nuts we have gathered
(i.e., 77%)? This question is trivial
after normalization, as it is transparent that 70 out of 100 and 77 out of 100 are nested
sets (Over, in press).
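As an illustrative sketch of this example (the function name is a hypothetical helper, not drawn from the cited work), the two offers can be compared directly once each is expressed per hundred:

```python
from fractions import Fraction

repaid = Fraction(33, 47)  # nuts offered back, out of those he gathered
given = Fraction(17, 22)   # nuts we gave earlier, out of those we gathered

def per_hundred(share):
    """Express a proportion as a rounded count out of 100."""
    return round(100 * share)

# After normalization the comparison is immediate: 70 versus 77 out of 100.
print(per_hundred(repaid), per_hundred(given), repaid == given)  # 70 77 False
```

Without normalization, the same comparison requires cross-multiplying the raw counts (33 × 22 against 17 × 47), which is considerably less transparent.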
One objection
to the claim that the encoding of natural frequencies supports Bayesian
inference is that intuitive probability judgment often concerns (a) beliefs
regarding single events or (b) the assessment of hypotheses about novel or partially
novel contexts, for which prior event frequencies are unavailable. For example, the estimated likelihoods of
specific outcomes are often based on novel and unique one-time events, such as
the likelihood that a particular constellation of political interests will lead
to a coalition. Thus, Kahneman and
Tversky (1996, p. 589) argue that the subjective degree of belief in hypotheses
derived from single events or novel contexts “cannot be generally treated as a
random sample from some reference population, and their judged probability
cannot be reduced to a frequency count.”
Furthermore,
theories based on natural frequency representations do not allow for the widely
observed role of similarity, analogy, association, and causality in human
judgment (for recent reviews of the contribution of these factors, see Gilovich
et al., 2002 and Sloman, 2005). The
nested sets hypothesis presupposes these determinants of judgment by appealing
to a dual-process model of judgment (Evans & Over, 1996; Sloman, 1996;
Stanovich & West, 2000), a move that natural frequency theorists are not
(apparently) willing to make (Gigerenzer & Regier, 1996). The dual-process model attributes responses
based on associative principles, like similarity, and responses based on retrieval
from memory, like analogy, to a primitive associative judgment system. It attributes responses based on more
deliberative processing involving rule-based inference, such as the elementary
set operations that respect the logic of set inclusion and facilitate Bayesian
inference, to a second, deliberative system.
However, this second system is not limited to analyzing set
relations. It can also, under the right
conditions, do the kinds of structural analyses required by analogical or
causal reasoning.
Within this
framework, natural frequency approaches can be viewed as making claims about
rule-based processes (i.e., the application of a psychologically plausible rule
for calculating Bayesian probabilities), without addressing the role of
associative processes in Bayesian inference.
In light of the substantial literatures that demonstrate the role of
associative processes in human judgment, Kahneman and Tversky (1996, p. 589)
conclude, “there is far more to inductive reasoning and judgment under
uncertainty than the retrieval of learned frequencies.”
The conclusions drawn from the diverse
body of empirical and conceptual issues addressed by this review consistently
challenge theories of Bayesian inference that depend on natural frequency
representations (see Table 2), demonstrating that coherent probability
estimates are not derived according to an equational form for calculating
Bayesian posterior probabilities that requires the use of natural frequency
representations.
The evidence
instead supports the nested sets hypothesis that judgmental errors and biases
are attenuated when Bayesian inference problems are represented in a way that
reveals underlying set structure, thus demonstrating that the cognitive
capacity to perform elementary set operations constitutes a powerful means of
reducing associative influences and facilitating probability estimates that
conform to Bayes’ theorem. An
appropriate representation can induce people to substitute reasoning by rules
for reasoning by association. In
particular, the review demonstrates that judgmental errors and biases were
attenuated when (a) the question induced an outside view by prompting the
respondent to utilize the sample of category instances presented in the problem
and (b) the sample of category instances was represented in a nested set
structure that partitioned the data into the components needed to compute the
Bayesian solution.
Although we disagree with the various theoretical
interpretations one could attribute to natural frequency theorists regarding
the architecture of mind, we do believe that they have focused on and
enlightened us about an important phenomenon.
Frequency formulations are a highly efficient way to obtain drastically
improved reasoning performance in some cases.
Not only is this an important insight for improving and teaching reasoning,
but it also focuses theorists on a deep and fundamental problem: What are the conditions that compel people to
overcome their natural associative tendencies in order to reason extensionally?
Acknowledgements
This work was supported by National Science Foundation Grants DGE-0536941 and DGE-0231900 to Aron K. Barbey. We are grateful to Gary Brase, Jonathan Evans, Vittorio Girotto, Philip Johnson-Laird, Gernot Kleiter, and David Over for their very helpful comments on prior drafts of this paper. AKB would also like to thank Lawrence W. Barsalou, Sergio Chaigneau, Brian R. Cornwell, Pablo A. Escobedo, Shlomit R. Finkelstein, Corey Kallenberg, Robert N. McCauley, Richard Patterson, Diane Pecher, Philippe Rochat, Ava Santos, W. Kyle Simmons, Irwin Waldman, Christine D. Wilson, and Phillip Wolff for their encouragement and support while writing this paper.
References
Ayton, P. & Wright, G. (1994). Subjective probability: What should we
believe? In G.
Wright & P. Ayton
(Eds.), Subjective Probability (pp. 163-183).
Barwise, J. & Etchemendy, J.
(1989). Model-theoretic semantics. In M. Posner (Ed.)
Foundations of
cognitive science (pp. 207-243).
Bauer, M.I. & Johnson-Laird, P.N.
(1993). How diagrams can improve
reasoning.
Psychological
Science, 4, 372-378.
questions: The impact
of memory and inference on surveys. Science,
236, 157-
161.
Brase, G. (2002a). Ecological and evolutionary validity:
Comments on Johnson-Laird,
Legrenzi, Girotto,
Legrenzi, and Caverni’s (1999) mental-model theory of
extensional
reasoning. Psychological Review, 109,
722-728.
Brase, G. (2002b). Which statistical formats facilitate what
decisions? The perception
and influence of
different statistical information formats.
Journal of Behavioral
Decision
Making, 15, 381-401.
Brase, G., Fiddick, L. & Harries,
C. (in press). Participant recruitment
methods and
statistical reasoning
performance. The Quarterly Journal of
Experimental
Psychology (in
press).
Brown, N.R., Rips, L.J. & Shevell,
S.K. (1985). The subjective dates of
natural events
in long-term memory. Cognitive Psychology, 17, 139-177.
Calvillo, D.P. & Revlin, R.
(2005). The role of similarity in
deductive categorical
inference. Psychonomic Bulletin and Review, 12, 938–944.
Casscells, W., Schoenberger, A. &
Graboys, T.B. (1978). Interpretation by
physicians
of clinical
laboratory results. The
1000.
Cosmides, L. & Tooby, J.
(1996). Are humans good intuitive
statisticians after all?
Rethinking some
conclusions from the literature on judgment under uncertainty.
Cognition, 58,
1-73.
Eddy, D.M. (1982). Probabilistic reasoning in clinical medicine:
Problems and
opportunities. In D. Kahneman, P. Slovic, & A. Tversky
(Eds.), Judgment Under
Uncertainty:
Heuristics and Biases (pp. 249-267).
Estes, W.K.,
effects in category
learning: A comparison of parallel network and memory
storage-retrieval
models. Journal of Experimental Psychology:
Learning,
Memory, and
Cognition, 15, 556-571.
Evans, J.St.B.T., Handley, S.J.,
Perham, N., Over, D.E. & Thompson, V.A. (2000).
Frequency versus
probability formats in statistical word problems. Cognition, 77,
197-213.
Evans, J.St.B.T., Handley, S. J., Over,
D. E. & Perham, N. (2002). Background
beliefs in
Bayesian
inference. Memory and Cognition, 30,
179-190.
Evans, J.St.B.T. & Over, D. E.
(1996). Rationality and reasoning.
Press.
Fodor, J.A. (1983). Modularity of Mind.
Fox, C. & Levav, J. (2004). Partition-edit-count: Naïve extensional reasoning in
judgment of
conditional probability. Journal of
Experimental Psychology:
General,
133, 626-642.
Gigerenzer, G. (1991). How to make cognitive illusions disappear:
Beyond “heuristics
and biases.” European Review of Social Psychology, 2,
83-115.
Gigerenzer, G. (1993). The superego, the ego, and the id in
statistical reasoning. In G.
Keren & G. Lewis
(Eds.), A Handbook of Data Analysis in the Behavioral
Sciences (pp.
331-339).
Gigerenzer, G. (1996a). On narrow norms and vague heuristics: A reply
to Kahneman
and Tversky
(1996). Psychological Review, 103, 592-596.
Gigerenzer, G. (1996b). The psychology of good judgment: Frequency
formats and
simple
algorithms. Medical Decision Making,
16, 273-280.
Gigerenzer, G. (1998). Ecological intelligence: An adaptation for
frequencies. In
D. Cummins & C.
Allen (Eds.), The Evolution of Mind.
University Press.
Gigerenzer, G. (2000). Adaptive Thinking: Rationality in the Real
World.
NY:
Gigerenzer, G. (2002). Calculated Risks.
Gigerenzer, G. (2006). Center for Adaptive Behavior and Cognition
summary of
research area
II: Ecological rationality. Retrieved October 1, 2006, from the
Center for Adaptive
Behavior and Cognition Web site:
http://www.mpib-berlin.mpg.de/en/forschung/abc/forschungsfelder/feld2.htm
Gigerenzer, G., Hell, W. & Blank,
H. (1988). Presentation and content: The
use of base-
rates as a continuous
variable. Journal of Experimental
Psychology: Human
Perception and
Performance, 14, 513-525.
Gigerenzer, G. & Hoffrage, U.
(1995). How to improve Bayesian
reasoning without
instruction: Frequency formats. Psychological Review, 102, 684-704.
Gigerenzer,
G. & Regier, T.P. (1996). How do we
tell an association from a rule?
Psychological
Bulletin, 119, 23-26.
Gigerenzer, G. & Selten, R. (Eds.)
(2001). Bounded Rationality: The
Adaptive Toolbox.
Gigerenzer, G., Todd, P. & the ABC
research group (1999). Simple
Heuristics That
Make Us Smart.
Gilovich, T.,
Psychology of
Intuitive Judgment.
Press.
Girotto, V. & Gonzalez, M.
(2001). Solving probabilistic and
statistical problems: A
matter of information
structure and question form. Cognition,
78, 247-276.
Girotto, V. & Gonzalez, M.
(2002). Chances and frequencies in
probabilistic reasoning:
Rejoinder to
Hoffrage, Gigerenzer, Krauss, and Martignon.
Cognition, 84, 353-
359.
Gluck, M.A. & Bower, G.H. (1988). From conditioning to category learning: An
adaptive network
model. Journal of Experimental
Psychology: General, 117,
227-247.
to cognitive
illusions? Cognitive Psychology, 38,
48-78.
Griggs, R. & Newstead, S.
(1982). The role of problem structure in
a deductive
reasoning task. Journal of Experimental Psychology: Learning, Memory,
and Cognition,
8, 297-307.
Grossen, B. & Carnine, D.
(1990). Diagramming a logic strategy: Effects on difficult
problem types and
transfer. Learning Disability Quarterly, 13, 168-182.
Hammerton, M. (1973). A case of radical probability
estimation. Journal of
Experimental
Psychology, 101, 252-254.
Hertwig, R. & Gigerenzer, G.
(1999). The ‘conjunction fallacy’
revisited: How
intelligent
inferences look like reasoning errors. Journal
of Behavioral Decision
Making, 12,
275-305.
Hoffrage, U., Gigerenzer, G., Krauss,
S. & Martignon, L. (2002). Representation
facilitates
reasoning: What natural frequencies are and what they are not.
Cognition, 84,
343-352.
Johnson-Laird, P.N., Legrenzi, P.,
Girotto, V., Sonino Legrenzi, M. & Caverni, J.
(1999). Naïve probability: A mental model theory of extensional
reasoning.
Psychological
Review, 106, 62-88.
Kahneman, D. & Frederick, S.
(2002). Representativeness revisited:
Attribute
substitution in
intuitive judgment. In T. Gilovich, D.
(Eds.). Heuristics and Biases: The Psychology of
Intuitive Judgment.
Kahneman, D. & Frederick, S.
(2005). A model of heuristic
judgment. In K. J. Holyoak
& R. G. Morris
(Eds.), The
Kahneman, D. & Tversky, A.
(1973). On the psychology of
prediction. Psychological
Review, 80,
237-251.
Kahneman, D. & Tversky, A.
(1983). Can rationality be intelligently
discussed?
Behavioral and
Brain Sciences, 6, 509-510.
Kahneman, D. & Tversky, A.
(1996). On the reality of cognitive
illusions.
Psychological
Review, 103, 582-591.
Kahneman, D., Slovic, P. & Tversky,
A. (1982). Judgment Under
Uncertainty:
Heuristics and
Biases.
Keren, K. & Thijs, L.J.
(1996). The base-rate controversy: Is the glass half-full or half
empty? Behavioral
and Brain Sciences, 19, 26.
Kleiter, G.D. (1994). Natural sampling: Rationality without
base-rates. In G.H. Fischer
& D. Laming
(Eds.) Contributions to Mathematical Psychology, Psychometrics,
and
Methodology (pp. 375-388).
Kleiter, G.D., Krebs, M.,
(1997). Do subjects understand base-rates? Organizational Behavior and Human
Decision
Processes, 72, 25-61.
Koehler, J.J. (1996). The base-rate fallacy reconsidered:
Descriptive, normative, and
methodological
challenges. Behavioral and Brain
Sciences, 19, 1-53.
Kurzenhauser, S. & Hoffrage, U.
(2002). Teaching Bayesian reasoning: An
evaluation
of a classroom
tutorial for medical students. Medical
Teacher, 24, 516-521.
Lagnado,
D. &
Koehler and
Decision
Making, pp. 157-176.
Lichtenstein, S., Slovic, P., Fischhoff,
B., Layman, M. & Combs, B.
(1978). Judged
frequency of lethal
events. Journal of Experimental
Psychology: Human
Learning and
Memory, 4, 551-578.
Lindsey, S., Hertwig, R. &
Gigerenzer, G. (2003). Communicating
statistical DNA
evidence. Jurimetrics, 43, 147-163.
Linton, M. (1975). Memory for real-world events. In
(Eds.) Explorations
in Cognition. (pp. 376-404).
Linton, M. (1982). Transformations of memory in everyday
life. In Neisser, U. (Ed.)
Memory
Observed (pp. 77-91).
Macchi, L. (2000). Partitive formulation of information in
probabilistic problems:
Beyond heuristics and
frequency format explanations. Organizational
Behavior
and Human
Decision Processes, 82, 217-236.
Mellers, B. & McGraw, A.P.
(1999). How to improve Bayesian reasoning:
Comments
on Gigerenzer &
Hoffrage (1995). Psychological
Review, 106, 417-424.
Monaghan, P. & Stenning, K.
(1998). Effects of representational modality and thinking
style on learning to
solve reasoning problems. Proceedings of the 20th Annual
Meeting of the
Cognitive Science Society, 716-721.
Associates,
Newstead, S.E. (1982). The role of problem structure
in a deductive
reasoning task.
Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 297-307.
Newstead, S.E. (1989).
Interpretational errors in syllogistic reasoning. Journal of
Memory and Language,
28, 78-91.
Nosofsky, R.M., Kruschke,
based category
representations and connectionist learning rules. Journal of
Experimental
Psychology: Learning, Memory, and Cognition, 18, 211-233.
Over, D.E. (2000a). Ecological rationality and its
heuristics. Thinking and Reasoning,
6,
182-192.
Over, D.E. (2000b). Ecological issues: A reply to Todd, Fiddick,
& Krauss. Thinking
and Reasoning,
6, 385-388.
Over, D.E. (2003). From massive modularity to
meta-representation: The evolution of
higher
cognition. In D.E. Over (Ed.) Evolution
and the Psychology of Thinking:
The Debate (pp.
121-144).
Over, D.E. (in press). Content-independent conditional
inference. In Maxwell J.
Roberts (Ed.), Integrating
the Mind: Domain General versus Domain Specific
Processes in
Higher Cognition.
Over, D.E. & Green, D.W.
(2001). Contingency, causation, and
adaptive inference.
Psychological
Review, 108, 682-684.
Over, D.E. (Ed.) (2003). Evolution and the Psychology of Thinking:
The Debate. New
Pillemer, E.D., Rhinehart, E. D. & White,
S. H. (1986). Memories of life
transitions:
The first year in
college. Human Learning: Journal of
Practical Research &
Applications,
5, 109-123.
Ramsey, F.P. (1964). Truth and probability. In H.E. Kyburg, Jr., &
Studies in
Subjective Probability (pp. 61-92).
Ross, M. & Sicoly, F. (1979). Egocentric biases in availability and
attribution. Journal
of Personality
and Social Psychology, 37, 322-336.
Savage, L.J. (1954). The Foundations of Statistics.
Schwartz, N. & Wanke, M.
(2002). Experimental and contextual
heuristics in frequency
judgment: Ease of
recall and response scales. In P.
Sedlmeier, & T. Betsch (Eds.)
Etc. Frequency
Processing and Cognition (pp. 89-108).
University Press.
Schwartz, N. & Sudman, S.
(1994). Autobiographical memory and
the validity of
retrospective
reports.
Sedlmeier, P. & Betsch, T. (2002). Etc.
Frequency Processing and Cognition (pp. 89-
108).
Sedlmeier, P. & Gigerenzer, G.
(2001). Teaching Bayesian reasoning in
less than two
hours. Journal of Experimental Psychology:
General, 130, 380-400.
Bulletin, 1, 3-22.
hierarchies. Cognitive Psychology, 35, 1-33.
alternatives.
D.E. Over (Ed.) Evolution
and the Psychology of Thinking: The Debate (pp. 145-
170).
specific
categorization. In M. J. Roberts (Ed.).
Integrating the mind.
Psychology Press.
other fallacies. Organizational Behavior and Human Decision
Processes, 91,
296-309.
Stanovich, K.E. (1999). Who is Rational? Studies of Individual Differences in
Reasoning. Mahwah, N. J.:
Stanovich, K.E. & West, R.F.
(1998). Individual differences in
rational thought.
Journal of
Experimental Psychology: General, 127, 161-188.
Stanovich, K.E. & West, R.F.
(2000). Individual differences in
reasoning:
Implications for the
rationality debate. Behavioral and
Brain Sciences, 23, 645-
726.
Stanovich, K.E. & West, R.F.
(2003). Evolutionary versus instrumental
goals: How
evolutionary
psychology misconceives human rationality.
In D. E. Over (Ed.)
Evolution and
the Psychology of Thinking (pp. 171-230).
Psychology Press.
Stenning, K. (2002). Seeing Reason: Image and Language in
Learning to Think.
Tversky, A. & Kahneman, D.
(1973). Availability: A heuristic for
judging frequency
and probability. Cognitive Psychology, 5, 207-232.
Tversky, A. & Kahneman, D.
(1974). Judgment under uncertainty:
Heuristics and
biases. Science, 185, 1124-1131.
Tversky, A. & Kahneman, D.
(1983). Extensional versus intuitive
reasoning: The
conjunction fallacy
in probability judgment. Psychological
Review, 90, 293-315.
Tversky, A. & Koehler, D.
(2002). Support theory: A nonextensional
representation of
subjective
probability. In T. Gilovich, D.
Heuristics and
Biases: The Psychology of Intuitive Probability Judgment. (pp.
441-473)
Vranas, P.B.M. (2000). Gigerenzer’s normative critique of Kahneman
and Tversky.
Cognition, 76,
179-193.
Vranas, P.B.M. (2001). Single-case probabilities and content-neutral
norms: A reply to
Gigerenzer. Cognition, 81, 105-111.
Wagenaar,
W.A. (1986). My memory: A study of
autobiographical memory over six years.
Cognitive
Psychology, 18, 225-252.
Yamagishi,
K. (2003). Facilitating normative
judgments of conditional probability: Frequency
or nested sets? Experimental Psychology, 50, 97-106.
Zacks, R.T. & Hasher, L.
(2002). Frequency processing: A
twenty-five year
perspective. In P. Sedlmeier, & T. Betsch (Eds.) Etc.
Frequency Processing and
Cognition (pp.
21-36).
Footnotes
1 The respondent’s subjective degree of belief in the hypothesis (H) that
the patient has breast cancer given the observed datum (D) that she has a
positive mammography (i.e., the posterior probability, Pr(H│D)) can be
expressed numerically as the ratio between (a) the probability that the patient
has the disease and obtains a positive mammogram (Pr(H ∩D)), and (b) the
probability that the patient obtains a positive mammogram (Pr(D)). To calculate this ratio, Bayes’ theorem
incorporates two axioms of mathematical probability theory: the conditional
probability and additivity laws.
According to the former, (a) can be expressed by the probability that
the patient has the disease (i.e., the base-rate of the hypothesis) multiplied
by the probability that the patient obtains a positive mammogram given that she
has the disease (i.e., the hit-rate of the test): Pr(H ∩D) = Pr(H) Pr(D│H). The additivity rule is then applied to
express (b) as the probability that the patient has the disease and obtains a
positive mammogram, plus the probability that the patient does not have the
disease and obtains a positive mammogram: Pr(D) = Pr(H ∩D) + Pr(~H ∩D). The conditional probability rule can be
further applied to express this latter quantity as the complement of the
base-rate multiplied by the probability that the patient obtains a positive
mammogram given that she does not have the disease (i.e., the false alarm rate
of the test): Pr(~H ∩D) = Pr(~H)
Pr(D│~H). Thus, according to Bayes’
theorem, the probability that the patient has breast cancer given that she has
a positive mammography equals Pr(H│D) = Pr(H ∩D) / Pr(D) = Pr(H)
Pr(D│H) / (Pr(H) Pr(D│H) + Pr(~H) Pr(D│~H)) = (0.01)(0.80) /
((0.01)(0.80) + (0.99)(0.096)), or 7.8 per cent.
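The arithmetic in this footnote can be reproduced with a minimal sketch. The function and variable names are illustrative assumptions; the three input values are those stated above.

```python
def posterior(base_rate, hit_rate, false_alarm_rate):
    """Bayes' theorem for a binary hypothesis H and a datum D."""
    # Conditional probability law: Pr(H ∩ D) = Pr(H) Pr(D | H)
    joint_h_d = base_rate * hit_rate
    # Same law for the complement: Pr(~H ∩ D) = Pr(~H) Pr(D | ~H)
    joint_not_h_d = (1 - base_rate) * false_alarm_rate
    # Additivity law: Pr(D) = Pr(H ∩ D) + Pr(~H ∩ D)
    return joint_h_d / (joint_h_d + joint_not_h_d)

p = posterior(base_rate=0.01, hit_rate=0.80, false_alarm_rate=0.096)
print(f"{p:.3f}")  # 0.078
```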
2 When estimated from natural frequency formats or
formats expressing numbers of chances, because they entail the sample and
effect sizes, posterior probabilities can be calculated in a way that does not require
that the probabilities be multiplied by the base-rates. The following simple form can be used to
calculate the probability of a hypothesis (H) given datum (D):
Pr(H│D) = N(H ∩D) / (N(H ∩D) + N(~H ∩D)), where
N(H ∩D) is the number of cases having the datum in the presence of
the hypothesis, and N(~H ∩D) is the number of cases
having the datum in the absence of the hypothesis. This form requires that the respondent attend
only to N(H ∩D) and N(~H ∩D), whereas estimating posteriors with percentages requires
transforming percentage values into conditional probabilities by
incorporating base-rates, making the calculation more complex than under
natural frequency formats.
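A minimal sketch of the simple form described in this footnote follows. The counts are hypothetical: they assume a sample of 1,000 women and round the rates from footnote 1 to whole cases (10 with the disease, of whom 8 test positive; 990 without, of whom 95 test positive). The names are likewise illustrative.

```python
from fractions import Fraction

def posterior_from_counts(n_h_and_d, n_not_h_and_d):
    """Posterior from two natural-frequency counts; no base-rate is multiplied in."""
    return Fraction(n_h_and_d, n_h_and_d + n_not_h_and_d)

# Hypothetical counts for an assumed sample of 1,000 (see lead-in).
n_h_and_d = 8        # have the disease and test positive
n_not_h_and_d = 95   # do not have the disease and test positive

p = posterior_from_counts(n_h_and_d, n_not_h_and_d)
print(p, f"{float(p):.3f}")  # 8/103 0.078
```

The result agrees with the probability-format calculation to three decimal places, but only two counts enter the computation.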
3 There may be an important relation between sensitivity to nested-set structure and the law of the excluded middle that appears in logic. By this rule, all propositions of the form ‘p or not-p’ hold. We apply the rule, for example, to infer that everyone either has a disease or does not have the disease. We use it again to infer that everyone has some symptom or does not have it. Thus, the logical trees cited by natural frequency theorists are consistent with this fundamental logical rule (Over, in press).
4 These authors point out that the chance representation of probability is
commonly employed in everyday situations, such as when someone says, “A tossed
coin has 1 out of 2 chances of landing head up” or that “there are 1 out of a
million chances of winning the lottery.”
Chances preserve information about the size of the reference class
(i.e., the total population of chances).
Hoffrage et al. (2002) argue that chances are just frequencies. This is false (see Girotto & Gonzalez,
2002). Chances refer to the probability of
a single event and are based on the total population of chances rather than a
finite sample of observations. The
chances, for example, of drawing an ace from a standard deck of playing cards
are “4 out of 52”: There are 4 ways that
an ace can be drawn from the deck of 52 cards.
In contrast to natural frequencies, the size of the reference class
represents the total population (i.e., the deck of 52 cards). We might observe, for example, that 1 out of
10 cards randomly drawn from the deck is an ace, but this method of “natural
sampling” would not represent the chance or number of ways of drawing an ace
from the full deck. Chances cannot be
directly assessed by “counting occurrences of events as they are encountered
and storing the resulting knowledge base for possible use later” (i.e., natural
sampling; Brase, 2002b, p. 384). Chances
are thus distinct from natural frequencies.
5 The mind-as-Swiss-army-knife, natural frequency algorithm, and natural frequency heuristic theories do not concern the encoding of event frequencies under naturalistic settings in general, but focus only on event frequencies that have a partitive structure. Thus, these approaches do not address the encoding of non-partitive event frequencies (e.g., the event frequency of naturally occurring independent events). Given that both kinds of frequencies exist in nature, it is unclear why only partitive frequencies are deemed important.
6 Bayes’ theorem in odds form refers to the
probability in favor of a hypothesis (H) over the probability of an
alternative hypothesis (~H) given observed datum (D) (i.e., the
posterior odds: Pr(H│D) / Pr(~H│D)). To compute the posterior odds, Bayes’ theorem
incorporates two factors: the likelihood ratio and the prior odds. The likelihood ratio is a measure of whether
the datum is diagnostic with respect to the hypothesis (H). If the evidence is diagnostic of the hypothesis then the
likelihood ratio will be greater than one, demonstrating that the observed datum is
more likely to occur under the hypothesis (H) than under
the alternative hypothesis (~H).
The prior odds is the ratio of base-rate probabilities (Pr(H) / Pr(~H)). Bayes’ theorem in odds form states that the
product of these quantities yields the posterior odds, Pr(H│D) / Pr(~H│D)
= (Pr(D│H) / Pr(D│~H)) * (Pr(H)
/ Pr(~H)). To directly
estimate the relative weight of the likelihood ratios and prior odds, Bayes’
theorem in odds form can be logarithmically transformed to yield log [Pr(H│D) /
Pr(~H│D)] = log [Pr(D│H) / Pr(D│~H)]
+ log [Pr(H) / Pr(~H)].
Under this formulation, the likelihood ratios and prior odds can be
treated as independent variables in a regression analysis to assess the
relative contribution of each factor in Bayesian inference.
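The odds form and its logarithmic transformation can be illustrated with a minimal sketch. It reuses the values from footnote 1 (base-rate 0.01, hit-rate 0.80, false alarm rate 0.096); the variable names are illustrative assumptions.

```python
import math

base_rate, hit_rate, false_alarm_rate = 0.01, 0.80, 0.096

prior_odds = base_rate / (1 - base_rate)        # Pr(H) / Pr(~H)
likelihood_ratio = hit_rate / false_alarm_rate  # Pr(D | H) / Pr(D | ~H)
posterior_odds = likelihood_ratio * prior_odds  # Pr(H | D) / Pr(~H | D)

# Converting odds back to a probability recovers the result of footnote 1.
posterior_prob = posterior_odds / (1 + posterior_odds)
print(f"{posterior_prob:.3f}")  # 0.078

# On the log scale the two factors combine additively, which is what
# allows them to be entered as separate predictors in a regression.
log_sum = math.log(likelihood_ratio) + math.log(prior_odds)
print(math.isclose(math.log(posterior_odds), log_sum))  # True
```

Here the likelihood ratio exceeds one (the datum is diagnostic of the hypothesis), yet the posterior remains low because the prior odds are small.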