To be published in Behavioral and Brain Sciences (in press)

© Cambridge University Press 2007

 

Below is the unedited, uncorrected final draft of a BBS target article that has been accepted for publication. This preprint has been prepared for potential commentators who wish to nominate themselves for formal commentary invitation. Please DO NOT write a commentary until you receive a formal invitation. If you are invited to submit a commentary, a copyedited, corrected version of this paper will be posted.

 

Darwin’s mistake: Explaining the discontinuity between human and nonhuman minds

PREVIOUS TITLE: "The relational reinterpretation hypothesis: Explaining the discontinuity between human and nonhuman minds"

 

by Derek C. Penn, Keith J. Holyoak and Daniel J. Povinelli

 

Derek C. Penn, dcpenn@ucla.edu (corresponding author)
Cognitive Evolution Group
University of Louisiana at Lafayette
New Iberia, LA, 70560 USA

 

Keith J. Holyoak, holyoak@lifesci.ucla.edu
Department of Psychology
University of California, Los Angeles
Los Angeles, CA 90095 USA

 

Daniel J. Povinelli, ceg@louisiana.edu
Cognitive Evolution Group
University of Louisiana at Lafayette
New Iberia, LA, 70560 USA


short abstract

There is a profound functional discontinuity between human and nonhuman minds. We argue that this discontinuity pervades nearly every domain of cognition and runs much deeper than even the spectacular scaffolding provided by language or culture can explain.  We hypothesize that the cognitive discontinuity between human and nonhuman animals is largely due to the degree to which human and nonhuman minds are able to approximate the higher-order, systematic, relational capabilities of a physical symbol system.  We conclude by suggesting that recent symbolic-connectionist models of cognition shed new light on the mechanisms that underlie the gap between human and nonhuman minds.

long abstract

Over the last quarter-century, the dominant tendency in comparative cognitive psychology has been to emphasize the similarities between human and nonhuman minds and to downplay the differences as “one of degree and not of kind” (Darwin 1871). In the present paper, we argue that Darwin was mistaken: the profound biological continuity between human and nonhuman animals masks an equally profound discontinuity between human and nonhuman minds.  To wit, there is a significant discontinuity in the degree to which human and nonhuman animals are able to approximate the higher-order, systematic, relational capabilities of a physical symbol system (Newell 1980).  We show that this symbolic-relational discontinuity pervades nearly every domain of cognition and runs much deeper than even the spectacular scaffolding provided by language or culture alone can explain. We propose a representational-level specification of where human and nonhuman animals’ abilities to approximate a PSS are similar and where they differ.  We conclude by suggesting that recent symbolic-connectionist models of cognition shed new light on the mechanisms that underlie the gap between human and nonhuman minds.

keywords

animals, causal reasoning, language of thought, propositional representations, reinterpretation hypothesis, relational reasoning, theory of mind

introduction

Human animals—and no other—build fires and wheels, diagnose each other’s illnesses, communicate using symbols, navigate with maps, risk their lives for ideals, collaborate with each other, explain the world in terms of hypothetical causes, punish strangers for breaking rules, imagine impossible scenarios, and teach each other how to do all of the above. At first blush, it might appear obvious that human minds are qualitatively different from those of every other animal on the planet. Ever since Darwin, however, the dominant tendency in comparative cognitive psychology has been to emphasize the continuity between human and nonhuman minds and to downplay the differences as “one of degree and not of kind” (Darwin 1871).  Particularly in the last quarter century, many prominent comparative researchers have claimed that the traditional hallmarks of human cognition—e.g., complex tool use, grammatically-structured language, causal-logical reasoning, mental state attribution, metacognition, analogical inferences, mental time travel, culture, etc…—are not nearly as unique as we once thought (see, for example, Bekoff et al. 2002; Call 2006; Clayton et al. 2003; de Waal & Tyack 2003; Matsuzawa 2001; Pepperberg 2002; Rendell & Whitehead 2001; Savage-Rumbaugh et al. 1998; Smith et al. 2003; Tomasello et al. 2003a).  Pepperberg (2005) aptly sums up the comparative consensus as follows: “for over 35 years, researchers have been demonstrating through tests both in the field and in the laboratory that the capacities of nonhuman animals to solve complex problems form a continuum with those of humans.”

Of course, many scholars continue to claim that there is something qualitatively different about at least some human faculties, particularly those associated with language and a representational theory of mind (see, for example, Bermudez 2003; Carruthers 2002; Donald 2001; Mithen 1996; Premack in press; Suddendorf & Corballis in press). Nearly everyone agrees that there is something uniquely human about our ability to represent and reason about our own and others’ mental states (e.g., Tomasello et al. 2005). And most linguists and psycho-linguists argue that there is a fundamental discontinuity between human and nonhuman forms of communication (e.g., Chomsky 1980; Jackendoff 2002; Pinker 1994). But the trend among comparative researchers is to construe the uniquely human aspect of these faculties in increasingly narrow terms.  Hauser, Chomsky et al. (2002a), for example, continue to claim that grammatically-structured languages are unique to the human species but suggest that the only component of the human language faculty which is, in fact, uniquely human is the computational mechanism of recursion.  The rest of our “conceptual-intentional” system, they argue, differs from that of nonhuman animals only in “quantity rather than kind” (p. 1573).  Similarly, Tomasello and Rakoczy (2003) argue that the ability to participate in cultural activities with shared goals and intentions is uniquely human, but claim that the cognitive skills of a human child born onto a desert island and somehow magically kept alive by itself until adulthood “would not differ very much—perhaps a little, but not very much” from those of other great apes (see also Tomasello et al. 2003a; Tomasello et al. 2005). 

Notwithstanding the broad comparative consensus arrayed against us, the hypothesis we will be proposing in the present paper is that Darwin was mistaken: the profound biological continuity between human and nonhuman animals masks an equally profound functional discontinuity between the human and nonhuman mind[1].  Indeed, we will argue that the functional discontinuity between human and nonhuman minds pervades nearly every domain of cognition—from reasoning about spatial relations to deceiving conspecifics—and runs much deeper than even the spectacular scaffolding provided by language or culture alone can explain.  

At the same time, we know from Darwin’s more well-grounded principles that there are no unbridgeable gaps in evolution. Thus, one of the most important challenges confronting cognitive scientists of all stripes, in our view, is to explain how the manifest functional discontinuity between extant human and nonhuman minds could have evolved in a biologically plausible manner.

The first, and probably most important, step in answering this question is to clearly identify the similarities and the dissimilarities between human and nonhuman cognition from a purely functional point of view. We thus spend the bulk of the paper reexamining the evidence for ‘human-like’ cognitive abilities among nonhuman animals at a functional level before speculating as to how these processes might be implemented. We cover a wide variety of domains, species, and experimental protocols ranging from spatial relations and mental state reasoning in the lab to dominance relations and transitive inferences in the wild. Across all these disparate cases, a consistent pattern emerges: while there is a profound similarity between human and nonhuman animals’ abilities to learn about and act on the perceptual relations between events, properties and objects in the world, only humans appear capable of reinterpreting the higher-order relation between these perceptual relations in a structurally systematic and inferentially productive fashion. In particular, only humans form general categories based on structural rather than perceptual criteria, find analogies between perceptually disparate relations, draw inferences based on the hierarchical or logical relation between relations, cognize the abstract functional role played by constituents in a relation as distinct from the constituents’ perceptual characteristics, or postulate relations involving unobservable causes such as mental states and hypothetical physical forces. There is not simply a consistent absence of evidence for any of these higher-order relational operations in nonhuman animals; there is compelling evidence of an absence.

In the last part of the paper, we argue for the representational-level implications of our analysis.  Povinelli and colleagues have previously proposed that humans alone are able to ‘reinterpret’ the world in terms of unobservable, hypothetical entities such as mental states and causal forces and that our ability to do so relies on a unique representational system that has been grafted onto the cognitive architecture we inherited from our nonhuman ancestors (Povinelli 2000, 2004; Povinelli & Giambrone 2001; Povinelli & Preuss 1995; Povinelli & Vonk 2003, 2004; Vonk & Povinelli 2006). Independently, Holyoak, Hummel and colleagues have argued that the ability to reason about higher-order relations in a structurally systematic and inferentially productive fashion is a defining feature of the human mind and requires the distinctive representational capabilities of a “biological symbol system” (Holyoak & Hummel 2000, 2001; Hummel & Holyoak 1997, 2001, 2003; Kroger et al. 2004; Robin & Holyoak 1995).  Herein we combine, revise and substantially expand on the hypotheses proposed by these two research groups. 

We argue that most of the salient functional discontinuities between human and nonhuman minds—including our species’ unique linguistic, mentalistic, cultural, logical and causal reasoning abilities—result in part from the difference in degree to which human and nonhuman cognitive architectures are able to approximate the higher-order, systematic, relational capabilities of a physical symbol system (Newell 1980; Newell & Simon 1976).  Although human and nonhuman animals share many similar cognitive mechanisms, our ‘Relational Reinterpretation’ hypothesis is that only human animals possess the representational processes necessary for systematically reinterpreting first-order perceptual relations in terms of higher-order, role-governed relational structures akin to those found in a physical symbol system. We conclude by suggesting that recent advances in symbolic-connectionist models of cognition provide one possible explanation for how our species’ unique ability to approximate the higher-order relational capabilities of a physical symbol system might have been grafted onto the proto-symbolic cognitive architecture we inherited from our nonhuman ancestors in a biologically plausible manner.

similarity

perceptual vs. relational similarity

We begin our review of the similarities and differences between human and nonhuman cognition with what William James (1890/1950) called “the very keel and backbone of our thinking”: sameness. The ability to evaluate the perceptual similarity between stimuli is clearly the sine qua non of biological cognition, subserving nearly every cognitive process from stimulus generalization and Pavlovian conditioning to object recognition, categorization, and inductive reasoning. Humans, however, are not limited to evaluating the similarity between objects based on perceptual regularities alone.  Humans not only recognize when two physical stimuli are perceptually similar, they can also recognize that two ideas, two mental states, two grammatical constructions or two causal-logical relations are similar as well. Even preschool-age children understand that the relation between a bird and its nest is similar to the relation between a dog and its doghouse despite the fact that there is little “surface” or “object” similarity between the relations’ constituents (Goswami & Brown 1989, 1990).   Indeed, as numerous researchers have shown, the propensity to evaluate the similarity between states of affairs based on the causal-logical and structural characteristics of the underlying relations rather than on their shared perceptual features appears quite early and spontaneously in all normal humans—as early as 2-5 yrs of age depending on the domain and complexity of the task (Gentner 1977; Goswami 2001; Halford 1993; Holyoak et al. 1984; Namy & Gentner 2002; Rattermann & Gentner 1998; Richland et al. 2006).

In short, there appear to be at least two kinds of similarity judgments at work in human thought: judgments of perceptual similarity based on the relation between observed features of stimuli, and judgments of non-perceptual relational similarity based on logical, functional and/or structural similarities between relations and systematic correspondences between the abstract roles that elements play in those relations (Gentner 1983; Gick & Holyoak 1980, 1983; Goswami 2001; Markman & Gentner 2000). The question we are interested in here is whether or not there is any evidence for non-perceptual relational similarity judgments in nonhuman animals as well.

same-different relations

Among comparative researchers, the most widely replicated test of relational concept learning over the last quarter-century has been the simultaneous same/different (“S/D”) task in which the subject is trained to respond one way if two simultaneously presented stimuli are the same and to respond a different way if the two stimuli are different.  In the more challenging relational match-to-sample (RMTS) task, the subject must select the choice display in which the perceptual similarity among elements in the display is the same as the perceptual similarity among elements in the sample stimulus. For example, presented with a pair of identical objects, AA, as a sample stimulus, the subject should select BB rather than CD; presented with a pair of dissimilar objects, EF, as the sample stimulus, the subject should select GH rather than JJ (see Thompson & Oden 2000 for a seminal discussion). 

While Premack (1983a; 1983b) initially reported that only language-trained chimpanzees passed S/D and RMTS tasks, success on two-item S/D tasks has since been demonstrated in parrots (Pepperberg 1987), dolphins (Herman et al. 1993; Mercado et al. 2000), baboons (Bovet & Vauclair 2001) and pigeons (Blaisdell & Cook 2005; Katz & Wright 2006) among others. Thompson et al. (1997) showed that language-naive chimpanzees with some exposure to token-based symbol systems are able to pass a two-item RMTS task (cf. Premack 1988).  Vonk (2003) has reported that three orangutans and one gorilla were able to pass a complex two-item RMTS task without any explicit symbol or language training at all.  Fagot et al. (2001) have shown that language-naïve baboons can pass an RMTS task involving arrays of elements (see discussion below); and Cook and Wasserman (in press) have reported successful results on an array-based RMTS task with pigeons. So passing S/D and RMTS tasks does not appear to be limited to language-trained apes or even primates.

Regardless of which nonhuman species are capable of passing S/D and RMTS tasks, the more critical and largely overlooked point is this: both of these experimental protocols lack the power, even in principle, to demonstrate that a subject cognizes sameness and difference as abstract, relational concepts which are 1) independent of any particular source of stimulus control, and 2) available to serve in a variety of further higher-order inferences in a systematic fashion. A functional decomposition of the S/D and RMTS protocols reveals that the minimum cognitive capabilities necessary to pass these tests are much more modest.

The fundamental problem is that the ‘same-different’ relation at stake in the classic S/D task can be reduced to a continuous, analog estimate of the degree of perceptual variability between the elements in each display. Halford et al. (1998b) refer to this type of cognitive trick as “conceptual chunking.” Chunking reduces the complexity of processing a relation at the cost of losing the original structure and components of the relation itself, but suffices when the task does not require the structure of the relation itself to be taken into account. A cognizer could pass a classic S/D task by calculating an analog estimate of the variability between items in the sample display and then employing a simple conditional discrimination to select the appropriate behavioral response to this chunked result. Thus, success on an S/D task may imply that a subject can generalize a rule-like discrimination beyond any particular feature in the training stimuli; but it cannot be taken as evidence that the subject has understood sameness and difference as structured relations that are mutually exclusive or that can be freely generalized beyond the modality-specific rule the subject used in a particular learning context.

The same deflationary functional analysis applies, mutatis mutandis, to the RMTS task. The apparent relational complexity of the RMTS task can be significantly reduced by segmenting the task into separate chunked operations that are evaluated sequentially. First, the subject can evaluate the variability within the first-order relations by chunking them into analog variables. Second, the subject can employ a straightforward conditional discrimination to select the appropriate choice display: e.g., <if the variability of the sample display is low, select the choice display with a low variability>. While this may qualify as a ‘higher-order’ operation, it does not qualify as a higher-order relational operation since the constituent structure of the first-order relations is no longer relevant or available to the higher-order process (see again Halford et al. 1998b). At best, the RMTS task demonstrates that nonhuman animals can select the choice display that has the same degree of between-item variability as the sample display. But the task says nothing about nonhuman animals’ ability to evaluate the non-perceptual relational similarity between those relations.
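The chunk-and-segment strategy described above can be illustrated with a minimal Python sketch. It is offered only as an illustration under stated assumptions, not as a model of any animal's actual processing: the variability measure (the proportion of mismatching item pairs), the response threshold, and the function names are all hypothetical choices made for this example.

```python
from itertools import combinations


def variability(display):
    """Chunk a display into a single analog value.

    Assumption for illustration: variability is the proportion of item
    pairs that mismatch (0.0 = all alike, 1.0 = all different).
    Displays must contain at least two items.
    """
    pairs = list(combinations(display, 2))
    return sum(a != b for a, b in pairs) / len(pairs)


def sd_response(display, threshold=0.5):
    """S/D task: a conditional discrimination on the chunked value.

    The threshold is a hypothetical parameter, not an empirical estimate.
    """
    return "same" if variability(display) < threshold else "different"


def rmts_choice(sample, choices):
    """RMTS task: pick the choice display whose chunked variability is
    closest to that of the sample. Only scalars are compared; the identity
    and roles of the items are never consulted."""
    target = variability(sample)
    return min(choices, key=lambda d: abs(variability(d) - target))


print(sd_response("AA"))                  # same
print(sd_response("AB"))                  # different
print(rmts_choice("AA", ["BB", "CD"]))    # BB
print(rmts_choice("EF", ["GH", "JJ"]))    # GH
```

The sketch passes both tasks even though, once each display has been reduced to a single number, nothing about the structure of the first-order relations remains available to the selection step.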

The preceding functional decomposition of the S/D and RMTS tasks is not merely a hypothetical possibility. There is now good experimental evidence that chunking and segmentation are precisely the tactics that nonhuman animals employ when they succeed at S/D and RMTS tasks. Wasserman and colleagues, for example, have shown that both pigeons and baboons have much less difficulty passing S/D tasks when there are 16 items in each set than when there are only 2 items in each set (Wasserman et al. 2001; Young & Wasserman 1997). Wasserman  et al. showed that  a simple measure of item variability, based on Shannon and Weaver’s (1949) measure of informational entropy, nicely captures the functional pattern of nonhuman subjects’ discriminations across a variety of experimental conditions (reviewed in Wasserman et al. 2004).  Nonhuman animals’ performance on S/D tasks differs markedly from the categorical, logical distinction that humans make between sameness and difference. Human subjects’ responses to S/D tasks are also influenced by the degree of variability in the stimuli (Castro et al. in press; Young & Wasserman 2001); but most human subjects exhibit a categorical distinction between displays with no item variability (i.e., same) and those with any item variability at all (i.e., different).

An analogous discontinuity between human and nonhuman judgments of similarity has also been documented on RMTS tasks. Fagot et al. (2001) presented 2 adult baboons and 2 adult human subjects with an RMTS task using arrays of 16 visual icons that were either all alike or all different. Both baboon and human subjects learned to pass the RMTS test and successfully generalized to novel sets of stimuli. When the authors reduced the number of items in the sample set from 16 to 2 icons, however, the two species diverged sharply. The impact on the human subjects’ responses was insignificant. The baboons’ performance, by contrast, fell to chance on different trials whereas their performance on same trials remained unchanged. This markedly asymmetric effect is exactly what one would expect if the baboons were discriminating between second-order same and different relations by comparing the amount of variability (e.g., entropy) in the two displays. That is, same trials with 2 icons continue to yield 0 entropy, but different trials now yield a small entropy value that is more difficult to discriminate from 0.
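The arithmetic behind this asymmetry can be worked through with a short Python sketch. It assumes the standard Shannon entropy in bits, H = sum over item types of p * log2(1/p), computed over the relative frequencies of the item types in a display; the exact form of the measure used in the studies cited above, and how entropy maps onto discriminability, are not specified here and are not modeled.

```python
from collections import Counter
from math import log2


def entropy_bits(display):
    """Shannon entropy (in bits) of the item types in a display.

    Assumption: each distinct icon is one category, and p is that
    category's relative frequency in the display.
    """
    n = len(display)
    return sum((c / n) * log2(n / c) for c in Counter(display).values())


same_16 = ["icon0"] * 16
diff_16 = [f"icon{i}" for i in range(16)]
same_2 = ["icon0"] * 2
diff_2 = ["icon0", "icon1"]

print(entropy_bits(same_16))  # 0.0
print(entropy_bits(diff_16))  # 4.0
print(entropy_bits(same_2))   # 0.0
print(entropy_bits(diff_2))   # 1.0
```

Under this assumption, same displays yield 0 bits at either array size, whereas different displays drop from 4 bits with 16 icons to 1 bit with 2 icons, which brings them much closer to the value for same displays.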

Entropy is certainly not the only factor modulating nonhuman subjects’ judgments of sameness and difference.  Stimulus oddity as well as spatial organization and degree of similarity also play an important role (see Cook & Wasserman 2006 for an important review). Vonk (2003) has shown that language-naive apes can judge variability along specific perceptual dimensions (e.g., color rather than size or shape). And Bovet and Vauclair (2001) have shown that baboons can pass a ‘conceptual’ S/D task in which pairs of objects are to be treated as ‘same’ if they share a similar learning history or biological significance (e.g., objects-I-have-eaten vs. objects-I-have-not-eaten).  These results demonstrate that nonhuman animals—and not just language-trained chimpanzees—are capable of learning novel, sophisticated, rule-governed discriminations that generalize beyond any specific perceptual cue.  But in all of the results reported to date, the relevant discriminations are bound to a particular source of stimulus control (e.g., entropy, oddity, edibility).  There is no evidence that nonhuman animals understand what ‘sameness’ in one task has in common with ‘sameness’ in another. For example, after passing a ‘perceptual’ S/D task and having been trained to categorize objects as either “food” or “not food”, Bovet and Vauclair’s (2001) baboons nevertheless required an average of 14,576 additional trials on the ‘conceptual’ S/D task before their responses were correct 80% of the time on trials involving novel pairs of objects. 

The available evidence thus suggests that the formative discontinuity in same-different reasoning does not lie between monkeys and apes, as Thompson and Oden (2000) proposed, but between nonhumans and humans. Chimpanzees and other nonhuman apes can pass RMTS tasks with only 2 items in the sample display (e.g., Thompson et al. 1997; Vonk 2003). Baboons can pass RMTS tasks with as few as 3-4 items in each sample (Fagot et al. 2001); and pigeons can pass RMTS tasks with 16 items in each sample (Cook & Wasserman in press). The difference between the performance of language-naïve pigeons and language-trained chimps on these tasks often comes down to a question of the number of items in each set and the number of trials necessary to reach criterion. As Katz and Wright (2006; 2002) point out, this strongly suggests that there is a difference in degree between various nonhuman species’ sensitivity to similarity discriminations (influenced by training regimen), not a difference in kind between their conceptual abilities to predicate same-different relations.

The performance of human subjects, on the other hand, contrasts sharply with the performance of all other animal species.  Humans manifest an abrupt, categorical distinction between displays in which there is no variability between items and displays in which there is any variability at all (Cook & Wasserman 2006; Wasserman et al. 2004).  More importantly, human subjects appear to possess a “qualitatively distinct learning system” (Castro et al. in press) for reinterpreting sameness and difference in a logical and abstract fashion that generalizes beyond any particular source of stimulus control. Thus, even with respect to the most basic and ubiquitous of all cognitive phenomena—judgments of similarity—there is already a distinctive seam between human and nonhuman minds.

analogical relations

Premack (1983a p. 357) suggested that the RMTS task is an implicit form of analogy and claimed that “animals that can make same/different judgments should be able to do analogies.” Indeed, it is still widely accepted that the ability to pass an RMTS task is the ‘cognitive primitive’ for analogical reasoning (see, for example, Thompson & Oden 2000 p. 378). We disagree. While recognizing perceptual similarities is certainly a necessary condition for making analogical inferences (inter alia), there is a qualitative difference between the kind of cognitive processes necessary to pass an S/D or RMTS task and the kind of cognitive processes necessary to reason in an analogical fashion. The relations at issue in S/D and RMTS tasks are based solely on the perceptual features of the constituents; and the constituents play undifferentiated and symmetrical roles in those relations (e.g., two objects are symmetrically either the same or different). Most true analogies, on the other hand, are based on relations in which the constituents play asymmetrical, causal-logical roles (e.g., the role that John plays in forming the relation, John loves Mary, is not equivalent to the role that Mary plays, perhaps to John’s dismay). Furthermore, genuine analogical inferences are made by finding systematic structural similarities between perceptually disparate relations, allowing the cognizer to draw novel inferences about the target domain independently from the perceptual similarity between the relations’ constituents (Gentner 1983; Gentner & Markman 1997; Holyoak & Thagard 1995). Accordingly, analogical relations sensu stricto cannot be reduced via chunking and segmentation, but require the cognizer to evaluate the abstract, higher-order relations at stake in a structurally systematic and inferentially productive fashion.

Analogical reasoning is a fundamental and ubiquitous aspect of human thought. It is at the core of creative problem solving, scientific heuristics, causal reasoning and poetic metaphor (Gentner 2003; Gentner et al. 2001; Holyoak & Thagard 1995, 1997; Lien & Cheng 2000). But it is also central to the more prosaic ways that typical human children learn about the world and each other (Goswami 1992, 2001; Halford 1993; Holyoak et al. 1984). To date, however, the only evidence that any nonhuman animal is capable of analogical reasoning sensu stricto comes from the unreplicated feats of a single chimpanzee, Sarah, reported more than twenty-five years ago by Gillan et al. (1981).

Sarah reportedly constructed and completed two distinct kinds of analogies.  The first was based on judging whether or not two geometric relationships were the same or different (e.g., large blue triangle is to small blue triangle as large yellow crescent is to small yellow crescent).  The second was based on judging the similarity between two “functional” relationships (e.g., padlock is to key as tin can is to can opener). Gillan et al. (1981) reported that Sarah was successful on both tests.

Savage-Rumbaugh was the first to point out that Sarah’s performance on the geometric version of the original tests could have been the result of a simple, feature-matching heuristic (cited by Oden et al. 2001).  In response, Oden et al. (2001) followed up Gillan et al.’s original experiment on geometric analogies with a series of more carefully constructed tests designed to flesh out Sarah’s actual cognitive strategy. These new experiments used geometric forms that varied along one or more featural dimensions (e.g., size, color, shape and/or fill).  After extensive testing, Oden et al. showed that Sarah was actually tracking the number of within-pair featural differences rather than the kind of relation between pairs of figures.  For example, whereas a human would see a color plus a shape change as differing from a size plus a fill change, Sarah saw these two transformations as equivalent because they both entailed two featural changes. 
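The contrast between tracking the number of featural changes and tracking the kind of change can be sketched in Python as follows. The figure encodings, feature values, and function names are hypothetical and chosen only for illustration; they do not reproduce the stimuli or the analysis of Oden et al. (2001).

```python
# The four featural dimensions named in the text.
DIMENSIONS = ("size", "color", "shape", "fill")


def changed_dimensions(first, second):
    """Return the set of dimensions on which two figures differ."""
    return frozenset(d for d in DIMENSIONS if first[d] != second[d])


def same_by_count(pair_a, pair_b):
    """Feature-count heuristic: pairs match if they involve the same
    number of featural changes, whatever those changes are."""
    return len(changed_dimensions(*pair_a)) == len(changed_dimensions(*pair_b))


def same_by_kind(pair_a, pair_b):
    """Kind-of-change comparison: pairs match only if the same
    dimensions are transformed in both."""
    return changed_dimensions(*pair_a) == changed_dimensions(*pair_b)


# Hypothetical figures: one pair changes color and shape,
# the other changes size and fill.
base = {"size": "large", "color": "blue", "shape": "triangle", "fill": "solid"}
color_shape_change = (base, {**base, "color": "yellow", "shape": "crescent"})
size_fill_change = (base, {**base, "size": "small", "fill": "hollow"})

print(same_by_count(color_shape_change, size_fill_change))  # True
print(same_by_kind(color_shape_change, size_fill_change))   # False
```

The count-based comparison treats the two transformations as equivalent because each involves two changes, whereas the kind-based comparison distinguishes them.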

Oden et al. (2001) argued that this strategy still demonstrates Sarah’s ability to reason about the “relation between relations”. But there is a profound difference between the feature-based heuristic Sarah apparently adopted and the role-based structural operations that are the basis of analogical inference sensu stricto. To be sure, keeping track of the number of within-pair featural changes certainly requires quite sophisticated representational processes. But the fact that Sarah apparently ignored the structure of the relation between pairs of figures suggests that she represented any featural change as an undifferentiated chunk for the purposes of this task. Thus her strategy on this task appears to be computationally equivalent to the kind of chunking and segmentation strategies other nonhuman primates use to solve RMTS tasks. According to Oden et al.’s (2001) own analysis, Sarah failed to demonstrate a systematic sensitivity to the higher-order structural relation between relations. It is this systematic sensitivity to higher-order structural relations which is, as Gentner (1983) has long argued, the hallmark of analogical reasoning in humans.

Thus, the claim that nonhuman animals are capable of analogical inferences rests solely on Sarah’s performance in the test of functional analogies reported by Gillan et al. (1981). There are many reasons to be skeptical of these results as well. First, Sarah’s performance on these analogies has never been replicated either by Sarah herself or by any other nonhuman subject. Second, of the two experiments (3A and 3B) devoted to functional analogies, the authors themselves admit that the first, 3A, is open to an alternative feature-based account. Furthermore, the second experiment, 3B, did not require Sarah to complete or construct analogies. It merely required her to respond to the relation between two pairs of objects with one of two plastic tokens that her experimenters interpreted as meaning ‘same’ and ‘different’. Sarah’s extensive prior exposure to the objects used in this experiment, however, makes it very difficult to judge how she learned to cognize the relation between these objects (for example, how exactly did Sarah understand that the relation between “torn cloth” and “needle and thread” is the ‘same’ as the relation between “marked, torn paper” and “tape”?). Indeed, the authors themselves admit that Sarah’s “unique experimental history” may have contributed to her success on these tasks (Gillan et al. 1981 p. 11).

In short, what is sorely needed is a more extensive series of tests, like those carried out by Oden et al. (2001), to systematically tease apart the salient parameters in Sarah’s cognitive strategy. Until then, Sarah’s remarkable and unreplicated success on experiment 3B of Gillan et al. (1981) constitutes thin support for claiming that nonhuman animals are capable of analogical reasoning.

rules

One of the hallmarks of human cognition is our ability to freely generalize abstract relational operations to novel cases beyond the scope in which the relation was originally learned (see Marcus 2001 for a lucid exposition). It is widely recognized, for example, that the ability to freely generalize relational operations over role-based variables is a necessary condition for using human languages (Gomez & Gerken 2000). Furthermore, experiments in artificial grammar learning (AGL) have shown that human subjects’ ability to learn and generalize abstract relations over role-based abstractions is not limited to natural languages (e.g., Altmann et al. 1995; Gomez 1997; Marcus et al. 1999; Reber 1967). While it is quite controversial how the human cognitive architecture performs these rule-like feats (see, for example, Marcus 1999; McClelland & Plaut 1999; Seidenberg & Elman 1999), the fact that human subjects manifest these rule-like generalizations is “undisputed” (Perruchet & Pacton 2006). The question we want to focus on here is whether or not this indisputable behavioral “fact” also holds for nonhuman animals.

To date, the strongest positive evidence that nonhuman animals are able to generalize novel rules in a systematic fashion comes from an experiment with tamarin monkeys (Hauser et al. 2002b), which replicated an AGL experiment Marcus et al. (1999) had previously performed on seven-month-old children.  In this “ga ti ga” protocol, subjects were habituated to sequences of nonsense syllables in one of two patterns (e.g., AAB vs. ABB).  Following habituation, the subjects were presented with test sequences drawn from an entirely novel set of syllables. Some of the test sequences followed the grammatical pattern presented during habituation and some did not.  Hauser et al. (2002b) showed that tamarin monkeys, like human children, were more likely to dishabituate to the novel, “ungrammatical” pattern.

In our view, the claim that this experiment provides evidence for ‘rule-learning’ in a nonhuman species is not entirely unfounded; but it needs to be carefully qualified since the kind of rule that tamarin monkeys learned in this experiment is qualitatively different from the kind of rules that are characteristic of human language and thought. Many early AGL experiments failed to distinguish tasks that required subjects to learn perceptually-bound relations from tasks that required subjects to learn non-perceptual structural relations over role-based variables (for a critical review see Redington & Chater 1996).  Tunney and Altmann (1999), for example, point out that there are at least two forms of sequential dependencies that might be learned in an AGL experiment: “repeating” dependencies in which the occurrence of an element in one position determines the occurrence of the same element in a subsequent position, and “nonrepeating” dependencies in which the occurrence of an element in one position determines the occurrence of a different element in a subsequent position. Repeating elements share a higher-order perceptual regularity (i.e., perceptual similarity), whereas purely structural dependencies between nonrepeating elements do not.  Thus, sensitivity to sequential dependencies between repeating elements does not necessarily imply sensitivity to sequential dependencies between nonrepeating elements.  Indeed, Tunney and Altmann (2001) demonstrate that adult human subjects appear to have distinct and dissociable mechanisms for learning each kind of dependency. At best, Hauser, Weiss et al.’s (2002b) results demonstrate that tamarin monkeys possess the ability to learn repeating, perceptually-based dependencies.

Similarly, Gomez and Gerken (2000) distinguish between “pattern-based” and “category-based” rules. In the former case, the rule is abstracted from the sequence of perceptual relations between elements in a given array of training stimuli; in the latter case, the rule is based on the structural relation between abstract functional roles.  The AAB and ABB patterns learned by tamarin monkeys in Hauser, Weiss et al. (2002b) are an example of the former, pattern-based type of rule; the NOUN-VERB-NOUN pattern learned by human language users is an example of the latter, role-based type of rule.  Both kinds of operations may qualify as ‘rule-like’ in the sense that they generalize a given relation beyond the feature set on which it was originally trained.  But it is role-based (i.e., “algebraic”) rules, as Marcus (2001) points out, that are the hallmarks of human thought and language. To date, there is no evidence for this kind of rule learning in any nonhuman animal.
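The difference between pattern-based and category-based rules can be made concrete with a minimal sketch. It is offered purely as an illustration and is not a model proposed by any of the authors cited above. The hypothetical function `repetition_pattern` recodes a sequence according to which positions repeat an earlier element, so it separates AAB from ABB in entirely novel syllables using perceptual identity alone. The hypothetical function `role_pattern` additionally needs a mapping from each element to an abstract role (here an assumed toy lexicon, `TOY_ROLES`), which is the extra knowledge a category-based rule depends on. The syllables and words are invented for the example.

```python
def repetition_pattern(seq):
    """Recode a sequence by element identity: first distinct item -> 'A', next -> 'B', ..."""
    labels = {}
    out = []
    for item in seq:
        if item not in labels:
            labels[item] = chr(ord("A") + len(labels))
        out.append(labels[item])
    return "".join(out)


def role_pattern(seq, roles):
    """Recode a sequence by abstract role; requires knowing the role of every element."""
    return "-".join(roles[item] for item in seq)


# Toy lexicon: an assumption made for illustration only.
TOY_ROLES = {"dog": "NOUN", "cat": "NOUN", "chases": "VERB", "sees": "VERB"}

# Identity alone separates AAB from ABB, even for syllables never heard before.
print(repetition_pattern(["ga", "ga", "ti"]))   # AAB
print(repetition_pattern(["wo", "fe", "fe"]))   # ABB

# Identity alone cannot separate NOUN-VERB-NOUN from VERB-NOUN-NOUN:
# neither sequence contains a repeated element, so both recode as ABC.
print(repetition_pattern(["dog", "chases", "cat"]))        # ABC
print(repetition_pattern(["sees", "cat", "dog"]))          # ABC
print(role_pattern(["dog", "chases", "cat"], TOY_ROLES))   # NOUN-VERB-NOUN
print(role_pattern(["sees", "cat", "dog"], TOY_ROLES))     # VERB-NOUN-NOUN
```

On this sketch, success on the AAB versus ABB contrast requires nothing beyond `repetition_pattern`, which is why such success does not by itself license the attribution of role-based rules.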

higher-order spatial relations

All normal adult humans are capable of using allocentric representations of spatial relations and of reasoning about the higher-order relation between spatial relations at different scales. The ubiquity of maps, diagrams, graphs, gestures and artificial spatial representations of all sorts in human culture speaks for itself.  Indeed, by the age of 3, all normal humans are able to reason about the higher-order relation between small-scale artificial spatial models and large-scale spatial relations in the real world (see Gattis 2005 for a review). DeLoache (2004) has argued that this ability represents a crucial step in children’s progress towards becoming “symbol minded”.  The question at hand is whether there is any evidence that nonhuman animals can reason about the higher-order relation between spatial relations in a similar fashion.

The best evidence to date for higher-order spatial reasoning in a nonhuman animal comes from the work of Kuhlmeier and colleagues (Kuhlmeier & Boysen 2001, 2002; Kuhlmeier et al. 1999).  Kuhlmeier et al. (1999) first instructed seven captive chimpanzees to associate the miniature and the full-sized versions of four distinct objects by drawing their attention to the association “verbally and gesturally” (p. 397). After this initial training, the chimpanzees watched as the experimenter hid a miniature can of soda behind a miniature version of one of the four objects within a 1:7 scale model of a full-sized room or outdoor enclosure. Then the chimpanzees were given the opportunity to find the real can of soda in the adjacent full-sized space.  When the chimpanzees were tested on a version of the task in which they were only rewarded if they retrieved the can of soda on the first search attempt (Kuhlmeier & Boysen 2001), six out of the seven subjects performed above chance.

These results demonstrate that chimpanzees are able to learn to associate two objects  (the real object and its miniature) that are highly similar perceptually and to locate a reward based on this association. But this is a far cry from being able to reason about the higher-order relation between a scale model and its real-world referent. Indeed, Kuhlmeier et al. (1999 p. 397) reported that one chimpanzee was able to locate the food rewards simply upon being shown the miniature version of the hiding place without referring to the scale model at all.  In short, this first protocol did not require the chimpanzees to reason about the higher-order spatial relation between the scale model and full-sized room.  A simple, learned association between two arbitrary cues sufficed. 

In a follow-up experiment designed to eliminate purely associative cues, Kuhlmeier and Boysen (2002) varied the congruency of the color, shape or position of the miniatures relative to the full-sized version of the hiding site.  As a group, the chimpanzees were successful when positional cues were absent.  However, when all the hiding sites were visually identical and the correct one had to be found based on its relative location within the scale model alone, only two of the seven chimpanzees performed above chance.

It is clear from these results that reasoning in terms of relative spatial locations alone is significantly more difficult for chimpanzees than reasoning in terms of object-based cues alone.  But it must be noted that even the successful performance of two out of the seven subjects does not demonstrate higher-order relational abilities since the four locations in which the hiding sites were placed remained constant across all of these experiments (Kuhlmeier, personal communication).  Thus, it is impossible to know whether or not the two successful chimpanzees were reasoning on the basis of a general, systematic understanding of the analogy between spatial locations in the scale model and spatial locations in the outdoor enclosure or, more modestly, had simply learned over the course of their long experimental history with this particular protocol to associate a particular location in the scale model with a particular location in the enclosure. 

It remains to be seen whether chimpanzees, or any other nonhuman animal, could succeed in this protocol if the hiding sites were randomly relocated on each trial. In the meantime, there is a conspicuous absence of evidence that any nonhuman animal can reason about scale models, maps or higher-order spatial relations in a human-like fashion.

transitive inference

Ever since Piaget (1928; 1955), the ability to make systematic inferences about unobserved transitive relations has been taken as a litmus test of logical-relational reasoning (but see Wright 2001).  For example, told that “Bill is taller than Charles” and “Abe is taller than Bill”, human children can infer that “Abe is taller than Charles” without being given any information about the absolute heights of Abe, Bill or Charles (Halford 1984).  Over the last quarter-century, comparative researchers have persistently claimed that nonhuman animals are capable of making transitive inferences in a purely logical-relational fashion as well.  Upon closer examination of the evidence, however, it becomes apparent that the kinds of transitive inferences made by nonhuman animals do not require a systematic, domain-general logical-relational competence but can be made using much more prosaic, domain-specific and egocentric information-processing mechanisms. 

transitive choices in the lab

For many decades now, the classic comparative test of transitive inference has been a nonverbal 5-item task developed by Bryant and Trabasso (1971) in which subjects are incrementally trained on pairs of stimuli  (i.e., A+B-, B+C-, C+D-, D+E-) and then tested on non-adjacent untrained pairs.  The discriminative relation between the stimuli used in most of these studies is not, in fact, transitive; it is the subjects’ choices that become transitive as a result of the pattern of differential reinforcement: i.e., repeated reinforcement of the choice of A over B and B over C eventually leads to the subject preferring A over C.  As Halford et al. (1998a) pointed out, a subject’s preferences can become transitive through incremental reinforcement without there being a transitive relation between the underlying task elements themselves and thus, without requiring the subject to understand anything about transitivity as a logical property.  Indeed, many researchers have shown that successfully selecting B over D in the traditional 5-item incremental protocol can be achieved using purely associative operations (De Lillo et al. 2001; Wynne 1995).
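How a preference for B over D can arise without any representation of transitivity can be sketched with a toy value-transfer scheme. This is an assumption chosen for brevity; it is not the specific associative model of De Lillo et al. (2001) or Wynne (1995). Each stimulus is given a stipulated direct reward rate (A is always rewarded, E never, and B, C and D half the time), and each stimulus also absorbs a fraction of the value of the stimuli it was presented with. The function name `value_transfer`, the `transfer` parameter and all numerical values are hypothetical.

```python
def value_transfer(direct, partners, transfer=0.3, iterations=200):
    """Iterate V(x) = direct(x) + transfer * mean(V(p) for each training partner p of x)."""
    values = dict(direct)
    for _ in range(iterations):
        values = {
            x: direct[x] + transfer * sum(values[p] for p in partners[x]) / len(partners[x])
            for x in direct
        }
    return values


# Stipulated direct reward rates under A+B-, B+C-, C+D-, D+E- training (an assumption).
DIRECT = {"A": 1.0, "B": 0.5, "C": 0.5, "D": 0.5, "E": 0.0}
# During training each stimulus only ever appears alongside its neighbours in the series.
PARTNERS = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}

values = value_transfer(DIRECT, PARTNERS)
ranking = sorted(values, key=values.get, reverse=True)
print(ranking)                      # stimuli ordered by learned value
print(values["B"] > values["D"])    # the untrained BD pair is resolved by value alone
```

B and D have identical direct reward rates in this sketch; B ends up with the higher value only because it was paired with the always-rewarded A, whereas D was paired with the never-rewarded E. The resulting choice is transitive, but nothing in the computation refers to a transitive relation.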

To be sure, reinforcement history cannot be the whole story, as Lazareva et al. (2004) have recently demonstrated.  Lazareva et al. (2004) trained eight hooded crows in a clever variation on Bryant and Trabasso’s 5-item protocol.  Five colored cards were used to represent the elements in the series A through E. The color on one side of the card served as the choice stimulus and a circle of the same color on the underside of the card served as the post-choice feedback stimulus.  The crows were asked to choose one of two simultaneously presented cards.  Importantly, the colored circles on the underside of the cards were only displayed to the crows after they had selected one of the two choice stimuli.  The crows were divided into two experimental groups. In the ordered-feedback group, the diameter of the circles associated with the choice stimuli became progressively smaller from A to E.  In the constant-feedback group, the diameter of the feedback circles did not change. After initial training, Lazareva et al. (2004) over-exposed both groups of crows to D+ E- pairings.  Under traditional associative models, massive over-exposure to D+ E- pairings should lead to preferentially selecting D over B.  Nevertheless, the crows in the ordered-feedback group selected B over D in the BD pairings, whereas the crows in the constant feedback group either chose at random or preferred D over B.

Lazareva et al.’s (2004) results show that reinforcement history alone cannot account for the emergence of choice transitivity among nonhuman animals.  Moreover, we agree with Lazareva et al. (2004) that these results are consistent with some kind of ‘spatial representation’ hypothesis (Gillan 1981).  But what is not often noted by comparative researchers is that evidence for an integrated representation of an ordered series is not in and of itself evidence for transitive reasoning or relational integration in a logical-deductive sense. There is more to making logically-underpinned transitive inferences than constructing an ordered representation of one’s choices.

As Lazareva et al. (2004) themselves point out, in order to claim evidence for logically-underpinned transitive inferences, one must show that the organism can, in fact, distinguish between transitive and nontransitive relations and that it makes its choices on the basis of this logical relation independently of other non-logical factors such as reinforcement history and training regime (see also Halford et al. 1998b; Wright 2001). The results reported by Lazareva et al. (2004) do not provide evidence for either of these criteria.

In a follow-up experiment, Lazareva and Wasserman (2006) showed that pigeons select B over D stimuli in the same protocol employed by Lazareva et al. (2004) even when the size of the post-choice cues is constant—which demonstrates that the transitive perceptual relation between the post-choice cues is not, in fact, computationally necessary for successfully passing this particular protocol.  It is unclear why crows—but not pigeons—were unable to pass the test in the constant-feedback condition.  There are many possible explanations. For example, Lazareva et al. (2004) did not rule out the possibility that it was simply the variability between post-choice cues that encouraged the crows’ successful responses rather than their transitivity per se.  In any case, in order to warrant the claim that the crows were reasoning on the basis of the logical relation between post-choice stimuli independently of other non-logical factors, it would be necessary to show that the crows could systematically generalize to novel stimuli on a first trial basis:  e.g., trained to associate a novel choice stimulus, X, with a colored circle of a given diameter, could the crows correctly choose between X and any stimulus from the set, A through E, on a first-trial basis in a systematic manner?  To date, there is no evidence that crows, or any other nonhuman animal, could pass such a test.

transitive inferences in the wild

Many researchers have argued that animals’ full transitive reasoning capabilities are most likely to manifest themselves in inferences involving social relations (e.g., Bond et al. 2003; Grosenick et al. 2007; Kamil 2004; Paz et al. 2004).  Much of the early field work focused on nonhuman primates (see Tomasello & Call 1997 for a review).  The strongest evidence to date for transitive social inferences in a nonhuman animal does not come from primates, however, but from birds (see review by Kamil 2004) and fish (see Grosenick et al. 2007).  Paz et al. (2004), for example, showed that male pinyon jays can anticipate their own subordinance relation to a stranger after having witnessed the stranger win a series of confrontations with a familiar but dominant conspecific.  Similarly, Grosenick et al. (2007) allowed territorial male A. burtoni fish to observe pairwise fights between five rivals (i.e., AB, BC, CD, DE) with the outcomes implying a dominance ordering of A > B > C > D > E.  When subsequently given a choice between B and D, observers preferred to spend more time adjacent to D rather than B.

Results such as these demonstrate that the ability to keep track of the dominance relations between tertiary dyads is not limited to nonhuman primates or even mammals (cf. Tomasello & Call 1997). Furthermore, fish and birds, in addition to nonhuman primates, can apparently use this information to make rational (i.e., ecologically adaptive) choices about how to respond to potential rivals (see also Bergman et al. 2003; Bond et al. 2003; Hogue et al. 1996; Silk 1999).  The accumulated evidence thus rules out a traditional associative explanation and strongly supports a more complex, information-processing account of how nonhuman animals keep track of and respond to dominance relations among conspecifics.  

But none of the available comparative evidence suggests that nonhuman animals are able to process transitive inferences in a systematic or logical fashion, even in the social domain.  The experiments reported by Paz et al. (2004) and Grosenick et al. (2007) only provide evidence for one particular kind of transitive inference: an inference from watching a series of agonistic interactions between conspecifics to an egocentric prediction about how to respond to a potentially dominant rival. Neither experiment provides any evidence that these subjects would also be able to systematically predict the relation between unobserved third-party dyads or could use their own interactions with a conspecific to predict that conspecific’s relation to other rivals—let alone answer the kind of omni-directional queries of which humans are manifestly capable: e.g., what individuals are dominant to B?, what is the relation between C and A?, is A dominant to C to a greater or lesser extent than B is dominant to C? (Goodwin & Johnson-Laird 2005; Halford et al. 1998b).
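What omni-directional access to a transitive relation would amount to computationally can be sketched as follows. The sketch assumes, purely for illustration, that dominance is stored as an explicit set of ordered (winner, loser) pairs closed under transitivity; it makes no claim about how humans or any other animal actually implement such inferences. All function names are hypothetical, and the observed outcomes mirror the AB, BC, CD, DE series described above.

```python
from itertools import product


def transitive_closure(observed):
    """Return every (winner, loser) pair implied by the observed pairwise outcomes."""
    closure = set(observed)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def dominant_to(closure, individual):
    """Which individuals are dominant to the given one?"""
    return sorted(winner for winner, loser in closure if loser == individual)


def relation(closure, x, y):
    """What is the relation between x and y, in whichever direction it holds?"""
    if (x, y) in closure:
        return f"{x} > {y}"
    if (y, x) in closure:
        return f"{y} > {x}"
    return "unknown"


def rank_gap(closure, x, y):
    """How many more individuals does x dominate than y does?"""
    def beaten(who):
        return sum(1 for winner, _ in closure if winner == who)
    return beaten(x) - beaten(y)


OBSERVED = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")}
CLOSURE = transitive_closure(OBSERVED)

print(dominant_to(CLOSURE, "B"))      # individuals dominant to B
print(relation(CLOSURE, "C", "A"))    # relation between C and A
print(rank_gap(CLOSURE, "A", "C") > rank_gap(CLOSURE, "B", "C"))
```

A cognizer with a representation of this kind can answer any of these queries from the same four observations. The experiments discussed above probe only a single, egocentric query, so they cannot show that the subjects possess anything like it.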

In short, while at least some nonhuman animals are clearly able to make transitive inferences about their own relation to potential rivals to a degree that rules out purely associative learning mechanisms, the comparative evidence accumulated to date is nevertheless consistent with the hypothesis that nonhuman animals’ understanding of transitive relations is punctate, egocentric, non-logical and context-specific.

hierarchical relations

Being able to process recursive operations over hierarchical relations is unarguably a key prerequisite for using a human language (Hauser et al. 2002a). And indeed, most normal human children are capable of reasoning about hierarchical class relations in a systematic and combinatorial fashion by the age of five (Andrews & Halford 2002; cf. Inhelder & Piaget 1964).  Given the ubiquity and importance of hierarchical relations in human thought, the lack of any similar ability in nonhuman animals would constitute a marked discontinuity between human and nonhuman minds. 

seriated cups and hierarchical reasoning

A number of comparative researchers have reinterpreted the behavior of nonhuman animals in hierarchical terms (e.g., Byrne & Russon 1998; Greenfield 1991; Matsuzawa 1996).  In each of these cases, however, there is no evidence that the nonhuman animals themselves cognized the task in hierarchical terms or employed hierarchically-structured mental representations to do so.  The most widely cited case of hierarchical reasoning among nonhuman animals, for example, has come from experiments involving seriated cups.  It has been claimed that “subassembly” (i.e., combining two or more cups as a subunit with one or more other cups) requires the subject to represent these nested relations in a combinatorial and “reversible” fashion (Greenfield 1991; Westergaard & Suomi 1994).  Indeed, Greenfield (1991) argued that children’s ability to nest cups develops in parallel with their ability to employ hierarchical phonological and grammatical constructions, and thus that the ability of nonhuman primates to seriate cups is the precursor to comprehending hierarchical grammars (see Matsuzawa 1996 for claims of a similar “isomorphism” between symbol and tool use). 

But is it actually necessary to cognize hierarchically-structured relations in order to assemble nested cups?  To date, Johnson-Pynn, Fragaszy and colleagues have provided the most convincing evidence that a nonhuman animal can use subassembly to assemble seriated cups (Fragaszy et al. 2002; Johnson-Pynn & Fragaszy 2001; Johnson-Pynn et al. 1999).  Yet, Johnson-Pynn and Fragaszy themselves dispute the claim that this behavior requires hierarchical relational operations of the kind suggested by Greenfield (1991).

Fragaszy et al. (2002), for example, presented seriated cups to adult capuchin monkeys, chimpanzees and 11-, 16- and 21-month old children.  Children of all three ages created five-cup sets less consistently than the nonhuman subjects and were rarely able to place a sixth cup into a seriated set. Bizarrely, at least for a purely relational interpretation of the results, monkeys were more successful than either apes or human children on the more challenging six-cup trials, yet were also the most inefficient (in terms of number of moves) of the three populations. 

Fragaszy et al.’s (2002) explanation for these anomalous results is quite sensible (see also Fragaszy & Cummins-Sebree 2005):  they hypothesize that the seriation task does not, in fact, require the subject to reason about combinatorial, hierarchical relations per se, but depends more simply on situated, embodied sensorimotor skills that are experientially, rather than conceptually, driven. Apes and monkeys do better than children because they are more physically adept than 11- to 21-month-old children, not because they have a more sophisticated representation of the combinatorial and hierarchical relations involved.  While subassembly may be a more physically ‘complex’ strategy than other methods of seriation, it does not necessarily require the subject to cognize the spatial-physical relations involved as hierarchical; and thus there is no reason to claim an isomorphism between the embodied manipulation of nested cups and the cognitive manipulation of symbolic-relational representations (cf. Greenfield 1991; Matsuzawa 1996).

hierarchical relations in the wild

The strongest evidence to date in support of the claim that nonhuman animals can reason about hierarchically-structured relations in the social domain comes from Bergman et al.’s (2003) study of free-ranging baboons. Bergman et al. designed an elegant playback experiment in which female baboons heard a sequence of recorded calls mimicking a fight between two other females. Mock agonistic confrontations were created by playing the “threat-grunt” of one individual followed by the subordinate screams of another.  On separate days, the same subject heard one of three different call sequences:  1) an anomalous sequence mimicking a rank reversal between members of the same matrilineal family (i.e., sisters, mothers, daughters or nieces); 2) an anomalous sequence mimicking a between-family rank reversal (i.e., between members of two different matrilineal families in which one of the families is dominant to the other); and 3) a control sequence replicating an existing dominant-subordinate relationship (i.e., no rank reversal) using between-family or within-family dyads. As predicted, there was a significant difference in the focal subjects’ responses to the three different kinds of call sequences.  Subjects looked longest at between-family rank reversals. There was no significant difference between within-family reversals and no-reversal control sequences.  According to Bergman et al., the reason the baboons responded more strongly to between-family rank reversals than within-family sequences is that the baboons recognized that the former imply a superordinate reorganization of matrilineal subgroups.  Bergman et al. (2003) conclude: “Our results suggest that baboons organize their companions into a hierarchical, rule-governed structure based simultaneously on kinship and rank” (see also Seyfarth et al. 2005).

In our view, the evidence reported by Bergman et al. (2003) does not support this conclusion. Even if baboons do make a categorical distinction between kin and non-kin dyads based on interaction history, familiarity, spatial proximity, phenotypic cues or some other observable regularity (see Silk 2002 for a review of the possibilities), this does not necessarily mean that they represent the entire matrilineal social structure as an integrated relational schema in which non-kin relations are logically superordinate to between-kin relations.  As Bergman et al. (2003) themselves point out, between-family rank reversals are much more disruptive to baboon social life than within-family rank reversals.  Thus, Bergman et al.’s (2003) results are consistent with the hypothesis that female baboons have learned that rank reversals among non-kin are more salient (i.e., associated with greater social turmoil and personal risk) than within-kin rank reversals occurring in someone else’s family (notably, Bergman et al. did not test rank reversals within the focal subject’s own family). While baboons clearly recognize particular conspecifics’ vocalizations and represent dominance and kin relations in a combinatorial manner, there is nothing in Bergman et al.’s data that remotely suggests a higher-order, hierarchical relation among these representations. 

Once again, there is not simply an absence of evidence; there is evidence of an absence.  Bergman et al. (2003) note that the subjects’ responses to apparent rank reversals were unrelated to the rank distance separating the two signalers:  i.e., subjects paid as much attention to mock rank reversals involving closely ranked opponents as those involving more distantly ranked opponents.  Bergman et al. use this fact to rebut the hypothesis that the baboons were responding more strongly to between-family rank reversals simply because the individuals involved had more disparate ranks.  However, the data cut both ways: if the baboons did cognize the relation between female conspecifics as an integrated matrilineal dominance hierarchy, ceteris paribus, they should have been more surprised at a rank reversal between a very low ranking and a very high ranking individual than by a rank reversal between two individuals of adjacent ranks.  Ironically, Bergman et al.’s results provide some of the strongest evidence to date that female baboons do not, in fact, cognize the structure of their conspecifics’ matrilineal social relationships in a systematic or hierarchical fashion.

causal relations

There is ample evidence that traditional associationist models are inadequate to account for nonhuman causal cognition; but the available comparative evidence also suggests that there is a critical and qualitative difference between the way that human and nonhuman animals reason about causal relations (see Penn & Povinelli 2007a for a more extensive review and discussion). Humans explicitly reason in terms of unobservable and/or hidden causes (Hagmayer & Waldmann 2004; Kushnir et al. 2005; Saxe et al. 2005), distinguish between ‘genuine’ and ‘spurious’ causes (Lien & Cheng 2000), reason diagnostically from effects to their possible causes (Waldmann & Holyoak 1992), and plan their own interventions in a quasi-experimental fashion to elucidate ambiguous causal relations (Hagmayer et al. in press). Numerous researchers have argued that normal humans—not just scientists or philosophers—form “intuitive theories” or “mental models” about the unobservable principles and causal forces that shape relations in a specific domain (e.g., Carey 1985; Gopnik & Meltzoff 1997; Keil 1989; Murphy & Medin 1985).  These tacit systems of higher-order relations at various levels of generality modulate how human subjects judge and discover novel relations within those domains by a process akin to analogical inference (Goldvarg & Johnson-Laird 2001; Lee & Holyoak 2007; Lien & Cheng 2000; Tenenbaum et al. in press).  In short, the ability to reason about higher-order, analogical relations in a systematic and productive fashion appears to be an integral aspect of human causal cognition.

In stark contrast to the human case, there is no compelling evidence that nonhuman animals form tacit theories about the unobservable causal mechanisms at work in the world, seek out explanations for anomalous causal relations, reason diagnostically about unobserved causes or distinguish between genuine and spurious causal relations on the basis of the subject’s prior knowledge of abstract causal mechanisms[2].  Indeed, there is consistent evidence of an absence across a variety of protocols (see, for example, Penn & Povinelli 2007a; Povinelli 2000; Povinelli & Dunphy-Lelii 2001; Visalberghi & Tomasello 1998).

A variety of nonhuman animal species—and certainly not primates alone (Emery & Clayton 2004)—are able to construct and use tools in a flexible and adaptive fashion. But a series of seminal experiments, initiated by Visalberghi and colleagues (see Visalberghi & Limongelli 1996 for a review), provides a particularly compelling example of how nonhuman animals’ remarkable use of tools nevertheless belies a fundamental discontinuity with our human understanding of causal relations.

Visalberghi and Limongelli (1994) tested capuchin monkeys’ ability to retrieve a piece of food placed inside a transparent tube using a straight stick. In the middle of the tube, there was a highly visible hole with a small transparent cup attached.  If the subject pushed the food over the hole, the food fell into the cup and was inaccessible. Visalberghi and Limongelli tested four capuchin monkeys to see if they would understand that they needed to push the food out the end of the tube away from the hole. After about 90 trials, only one out of the four capuchin monkeys learned to push the food away from the hole, and even this one learned the correct behavior through trial and error.  Worse, once the experimenters rotated the tube so that the trap-hole was now facing up and causally irrelevant, the only successful capuchin still persisted in treating the hole as if it needed to be avoided, making it obvious that even this subject misunderstood the causal relation between the trap hole and the retrieval of the reward.

Povinelli and colleagues (2000) subsequently replicated Visalberghi’s trap-tube protocol with seven chimpanzees, including both the inverted trap condition and a number of novel variations.  Povinelli performed the experiments once when the chimpanzees were juveniles (five to six years old) and again when they were young adults (ten years old).  Across 100 trials, only a single chimp, Megan, performed above chance on the normal trap tube condition.  When tested on the inverted trap condition, Megan, like the single successful capuchin in Visalberghi’s original experiment, failed to differentiate between conditions in which the trap was up or down. By way of comparison, it should be noted that children as young as 3 years of age successfully solve the trap-tube task after only a few trials (see Limongelli et al. 1995).

Most recently, Mulcahy and Call (2006) tested ten great apes on a modified version of the trap tube task that allowed subjects to choose whether to pull or push the reward through the tube.  Three out of the ten subjects learned to avoid the trap when pulling rather than pushing. Like the “trap table” task with which it is functionally equivalent (Povinelli 2000 chapter 5), the pulling-version of the trap tube task thus appears to be somewhat easier for apes to learn than the traditional push-only version. However, as in the trap table task, the majority of subjects still failed this ‘easier’ task. Indeed, even the three successful subjects took an average of 44 trials to achieve above-chance performance and then continued to fail the original trap tube task despite having mastered the modified version. Thus, these latest results confirm earlier hypotheses that nonhuman apes’ ‘causal knowledge’ in tool-use tasks is severely limited in its depth and generality relative to human children (Povinelli 2000; Visalberghi & Tomasello 1998).

Nonhuman primates are not the only animals that seem to be incapable of cognizing the causal relations at issue in the trap tube task without extensive prior learning.  Seed et al. (2006) recently presented eight rooks with a clever modification to Visalberghi’s trap-tube task.  Seven out of eight rooks learned the initial version of the modified trap-tube task and successfully transferred this solution to a novel but perceptually similar version of the task.  Nevertheless, when presented with a series of transfer tasks in which the visual cues that were associated with success in the initial tasks were absent or confounded, only one of the seven subjects passed.  In a separate follow-up experiment (Tebbich et al. in press), none of the rooks passed the transfer task.

Seed et al.’s (2006) results add to the growing evidence that corvids are quite adept at using stick-like tools (see, for example, Weir & Kacelnik 2007). But as Seed et al. (2006) point out, these results also suggest that rooks share a common cognitive limitation with nonhuman primates: they do not understand “unobservable causal properties” such as gravity and support; nor do they reason about the higher-order relation between causal relations in a systematic, theory-like fashion.  Instead, rooks, like other nonhuman animals, appear to solve tool-use problems based on evolved, domain-specific expectations about what features are likely to be most salient in a given context and a general ability to reason about the causal relation between observable contingencies in a flexible, goal-directed but context-specific fashion  (see also Penn & Povinelli 2007a).

theory of mind

Nonhuman animals certainly manifest many sophisticated social-cognitive abilities.  But having a “Theory of Mind” (ToM) sensu Premack and Woodruff (1978) means something more specific than being a socially savvy animal:  it means being able to impute unobservable, contentful mental states to other agents and then to reason in a theory-like fashion about the causal relation between these unobservable mental states and the agents’ subsequent behavior (see Penn & Povinelli 2007b for a more extensive discussion of this point).  Of course, theory-like inferences are not the only way in which a cognizer might reason about other agents’ mental states (see Carruthers & Smith 1996 for a review of the possibilities). Mentalistic simulation, for example, provides an alternative and popular explanation. However, all but the most radical simulation-oriented theories do not deny that humans represent causal relations involving other agents’ unobservable mental states. They simply propose an alternative, analogical mechanism for how humans do so.

Whiten (1996; 2000) has proposed another influential hypothesis about how nonhuman apes (and young children) might represent the mental states of their conspecifics without relying on theory-like metarepresentations. Whiten proposed that nonhuman apes use “intervening variables” to stand in for generalizations about the causal role played by a given mental state in a set of disparate behavioral patterns.  For example, a chimpanzee that encodes the observable patterns, “X saw Y put food in bin A”, “X hid food in bin A”, and “X sees Y glancing at bin A” as members of the same abstract equivalence class could be said, on Whiten’s account, to recognize that “X knows food is in bin A” and thus be capable of “explicit mindreading” (Whiten 1996).

Notice that Whiten’s example of “explicit mindreading” is a textbook example of analogical reasoning: Whiten’s hypothetical chimpanzee must infer a systematic higher-order relation among disparate behavioral patterns that have nothing in common other than a shared but unobservable causal mechanism:  i.e., what X “knows”.  If this is an “intervening variable”, it is an intervening variable that requires reasoning about the higher-order, role-governed relational similarity between perceptually disparate causal relations in order to be produced.

We believe Whiten is right in this sense:  if a nonhuman animal were capable of inferring that these disparate behavioral patterns were actually instances of the same superordinate causal relation, then the animal would surely have demonstrated that it possessed a ToM and the ability to reason analogically as well.  There is, however, no such evidence on offer.  Indeed, until recently, there has been a fragile consensus that nonhuman animals lacked anything even remotely resembling a ToM (Cheney & Seyfarth 1998; Heyes 1998; Tomasello & Call 1997; Visalberghi & Tomasello 1998).

A few years ago, however, Hare et al. (2000; 2001) reported “breakthrough evidence” that chimpanzees do, in fact, reason about certain psychological states in their conspecifics but not others (see, particularly, Tomasello et al. 2003a, 2003b).  And, since then, there has been a flurry of similar claims on behalf of corvids and monkeys based on similar protocols (Bugnyar & Heinrich 2005; Bugnyar & Heinrich 2006; Dally et al. 2006; Emery & Clayton 2001, in press; Flombaum & Santos 2005; Santos et al. 2006).

Since Povinelli and colleagues have provided detailed critiques of Hare et al.’s (2000; 2001) protocol and results elsewhere (see Penn & Povinelli 2007b; Povinelli 2004; Povinelli & Vonk 2003, 2004), here we will focus on the best available evidence for a ToM system among non-primates. As will become apparent, our original critique of Hare et al.’s (2000; 2001) protocol applies, mutatis mutandis, to the new claims being made on behalf of corvids as well.

The best evidence for a ToM system in a non-primate comes from work by Dally et al. (2006). Dally et al. had scrub-jays cache food items under one of four conditions: 1) in the presence of a dominant conspecific;  2) in the presence of a subordinate;  3) in the presence of the storer’s preferred partner;  or 4) in private.  The storers were allowed to cache the food in two trays, one nearer and one farther away from an observer, and then they were allowed to recover their caches in private three hours later. Dally et al. (2006) showed that birds who had stored food in the presence of a dominant or subordinate competitor tended to re-cache food predominantly from the near tray, and that the proportion of food that was re-cached was greatest for birds who had stored food in the presence of a dominant competitor. Moreover, significantly more food caches were re-cached when a previous observer was present than when the storers were allowed to retrieve their caches in private or in view of a control bird that had not witnessed the original caching.

Results such as these leave no doubt that corvids are remarkably intelligent creatures, able to keep track of the social context of specific past events as well as the ‘what’, ‘when’ and ‘where’ information associated with those events (Clayton et al. 2001).  But nothing in the results reported to date suggests that corvids actually reason about their conspecifics’ mental states (or even understand that their conspecifics have mental states at all) as distinct from their conspecifics’ past and occurrent behaviors and the subjects’ own knowledge of past and current states of affairs (Penn & Povinelli 2007b; Povinelli et al. 2000; Povinelli & Vonk 2003, 2004)[3].

In the case of Dally et al.’s (2006) experiment, for example, it suffices for the subjects to keep track of which competitor was present during which caching event and to formulate strategies on the basis of observable features of the task alone:  e.g., <Re-cache food if a competitor has oriented towards it in the past>, <Try to cache food in sites that are farther away from potential competitors>, <Attempt to pilfer food if the competitor who cached it is not present>, etc… Since none of the protocols required the subjects to reason in terms of the specific contents of the competitor’s epistemic mental states, the additional inference that the subjects acted the way they did because they understood that <the competitor knows where the food is located> does no additional cognitive or explanatory work. This additional mentalistic claim merely satisfies our all-too-human need to posit an explicit, conscious, propositional reason for the birds’ behaviors. But it should be obvious to any cognitive researcher that animals—including humans—do not necessarily need to ‘know’ why they are acting the way they are acting in order for a behavior to be flexible, effective and (biologically) rational (see lucid discussions by Heyes & Papineau 2006; Kacelnik 2006).

Experimental protocols such as these thus lack the power, even in principle, to provide positive evidence for a representational ToM system (for examples of the kind of protocols that could, in principle, provide such evidence, see Penn & Povinelli 2007b).  Ironically, many of the same researchers who claim evidence for ToM abilities in corvids explicitly acknowledge that an explanation based on responding to observed cues alone would be sufficient to account for the existing data.  Dally et al. (2006 p. 1665), for example, acknowledge that scrub jays’ ability to keep track of which competitors have observed which cache sites “need not require a human-like ‘theory of mind’ in terms of unobservable mental states, but […] may result from behavioral predispositions in combination with specific learning algorithms or from reasoning about future risk.”  Similarly, Bugnyar and Heinrich (2006 p. 374) acknowledge that a representation of “states in the physical world” and “responses to subtle behavioral cues given by the competitor” would be sufficient to explain the available evidence concerning the manipulative behaviors of ravens—as well, we would add, as all the other comparative evidence claiming to show ToM-like abilities in nonhuman animals to date.

explaining the discontinuity

Up to this point in the paper, we have focused solely on showing that there is, in fact, a pervasive functional discontinuity between human and nonhuman minds, and that this discontinuity is located specifically in the way that human and nonhuman animals reason about relations.  Now we turn to the daunting question of how to account for this pervasive discontinuity.  Let us first consider the three most influential hypotheses that have been proposed in recent years. 

the Massive Modularity hypothesis

A ‘modular’ explanation for the evolution of human cognition is popular among many evolutionary-minded theorists (e.g., Barkow et al. 1992). Certainly, many central cognitive processes—including almost all of the cognitive mechanisms we share with nonhuman animals—are at least moderately modular once the notion of modularity has been defined in a purely functional sense (see Barrett 2006). But the modular story alone does not provide a satisfying explanation for the disparity between human and nonhuman minds. 

As we have seen in our review of the comparative evidence, the pattern of similarities and differences between human and nonhuman relational reasoning is remarkably consistent across every domain of cognition, from same-different reasoning and spatial relations to tool use and ToM.  Thus, it seems highly implausible that the disparities in each domain are the result of independent, module-specific adaptations.  It seems much more likely (not to mention more parsimonious) that a common set of specializations, perhaps in some more general ‘supermodule’, is responsible for augmenting the relational capabilities of all of the cognitive modules we inherited from our nonhuman ancestors.  Unfortunately, the two most popular supermodules that have been proposed to date (i.e., ToM and language) do not do a good job of accounting for the comparative evidence.

the ToM hypothesis

A number of comparative researchers believe that the discontinuity between human and nonhuman minds can be traced back to some limitation in nonhuman animals’ social-cognitive abilities (e.g., Cheney & Seyfarth 1998; Terrace 2005; Tomasello et al. 2005).  While we certainly agree that nonhuman animals do not appear to possess anything remotely resembling a ToM, the hypothesis that some aspect of our ToM alone is responsible for the disparity between human and nonhuman cognition seems difficult to sustain.  For example, it is very hard to see how a discontinuity in social-cognitive abilities alone could explain the profound differences between human and nonhuman animals’ abilities to reason about causal relations in the physical world or nonhuman animals’ inability to reason about higher-order spatial relations.  Even Tomasello and Call have admitted that trying to explain all the differences between human and nonhuman cognition in terms of a difference in ToM skills is “highly speculative” at best (Tomasello & Call 1997 p. 418). Indeed, in a different context, Tomasello (e.g., Tomasello 2000) has himself argued that human language learners rely on cognitive capacities, such as analogical reasoning and abstract rule learning, that are independent from ToM and absent in nonhuman animals. So while our ability to participate in collaborative activities and to take each other’s mental states into account may be a distinctive feature of the hominid lineage, it is clearly not the only or even the most basic one.

the Language Only hypothesis

The oldest and still most popular explanation for the wide-ranging disparity between human and nonhuman animals’ cognitive abilities is language (for recent examples of this venerable argument, see Bermudez 2003; Carruthers 2002; Clark 2006).  Dennett (1996 p. 17) described the extreme version of this hypothesis in characteristically pithy terms:  “Perhaps the kind of mind you get when you add language to it is so different from the kind of mind you can have without language that calling them both minds is a mistake.”

To be sure, language clearly plays an enormous and crucial role in subserving the differences between human and nonhuman cognition. But we believe that language alone is not sufficient to account for the discontinuity between human and nonhuman minds.  In order to make our case, we need to distinguish between three distinct versions of the Language Only hypothesis: 1) the hypothesis that verbalized (or imaged) natural language sentences are responsible for the disparity between human and nonhuman cognition; 2) the hypothesis that some aspect of our internal “language faculty” is responsible for the disparity;  and 3) the hypothesis that the communicative and/or cognitive function of language served as the prime mover in the evolution of the uniquely human features of the human mind. 

are natural language sentences what makes the human mind human?

Natural language tokens clearly play an enormous role in ‘extending’ and even in ‘rewiring’ the human mind (Bermudez 2005; Clark 2006; Dennett 1996).  Gentner and colleagues, for example, have shown that relational labels play an instrumental role in facilitating young human learners’ sensitivity to relational similarities and potential analogies (Gentner & Rattermann 1991; Loewenstein & Gentner 2005).  Our ability to reason about large quantities of countable objects in a generative and systematic fashion seems to require the acquisition of numeric symbols and a linguistic counting system (Bloom & Wynn 1997).  Numerous studies have shown that subjects with language impairment exhibit a variety of cognitive deficits (e.g., Baldo et al. 2005) and that deaf children from hearing families (i.e., ‘late signers’) show persistent deficits in ToM tasks (see Siegal et al. 2001 for a review). Furthermore, there is good evidence that a child’s ability to pass certain kinds of ToM tests is intricately tied to the acquisition of specific sentential structures (de Villiers 2000).   Normal human cognition clearly depends on normal linguistic capabilities. 

But although natural language clearly subserves and catalyzes normal human cognition, there is compelling evidence that the human mind is distinctively human even in the absence of normal natural language sentences (see Bloom 2000; Garfield et al. 2001; Siegal et al. 2001). Varley and Siegal (2000), for example, studied the higher-order reasoning abilities of an agrammatic aphasic man who was incapable of producing or comprehending sentences and whose vocabulary was essentially limited to perceptual nouns.  In particular, he had lost all his vocabulary for mentalistic entities such as “beliefs” and “wants”. Yet this patient continued to take care of the family finances and passed a battery of causal reasoning and ToM tests (see also Varley et al. 2001; Varley et al. 2005). While late-signing deaf children’s cognitive abilities may not be ‘normal’, they nevertheless manifest grammatical, logical and causal reasoning abilities far beyond those of any nonhuman subject (Peterson & Siegal 2000). And the many remarkable cases of congenitally deaf children spontaneously ‘inventing’ gestural languages with hierarchical and compositional structure provide further confirmation that the human mind is indomitably human even in the absence of normal linguistic enculturation (see, for example, Goldin-Meadow 2003; Sandler et al. 2005; Senghas et al. 2004).

Of course, it is possible that the process of learning any rudimentary form of language at all ‘rewires’ the human brain in ways that make certain kinds of cognition possible that would not be possible otherwise, even if the subject subsequently loses the ability to use language later in life.  But this ontogenetic version of the ‘rewiring hypothesis’ (Bermudez 2005) begs the question of what allows language to so profoundly rewire the human mind, but no other.

Over the last 35 years, comparative researchers have invested considerable effort in teaching nonhuman animals of a variety of taxa to use and/or comprehend language-like symbol systems.  Many of these animals have experienced protracted periods of enculturation that rival those of modern (coddled) human children. The stars of these animal language projects have indeed been able to approximate certain superficial aspects of human language, including the ability to associate arbitrary sounds, tokens and gestures with external objects, properties and actions, and a rudimentary sensitivity to the order in which these ‘symbols’ appear when interpreting novel ‘sentences’ (Herman et al. 1984; Pepperberg 2002; Savage-Rumbaugh & Lewin 1994; Schusterman & Krieger 1986).  But even after decades of exhaustive training, no nonhuman animal has demonstrated a clear mastery of abstract grammatical categories, closed-class items, hierarchical syntactic structures or any of the other basic features of a human language (cf. Kako 1999).  Furthermore, there is still no evidence that symbol-trained animals are any more adept than symbol-naïve ones at reasoning about unobservable causal forces, mental states, analogical inferences or any of the other tasks that require the ability to cognize higher-order relations in a systematic, structural fashion (cf. Thompson & Oden 2000).

If the history of animal language research demonstrates nothing else, it demonstrates that you cannot create a human mind simply by taking a nonhuman one and teaching it to use language-like symbols.  There must be substantive differences between the human and nonhuman cognitive architectures that allow the former, but not any of the latter, to master grammatically-structured languages to begin with and to have their underlying thoughts ‘rewired’ in the process (cf. Clark 2001).

is some aspect of the human language faculty the key?

A more plausible variation on the Language Only Hypothesis is that some aspect of our internal faculty for language is responsible for our unique cognitive abilities.  In a recent and influential version of this proposal, Hauser et al. (2002a) distinguish between the faculty of language in the broad sense (FLB) and the faculty of language in the narrow sense (FLN).  They define FLN as including only the computational mechanisms specific to “narrow syntax” and to mapping syntactic representations into the systems of phonology and semantics.  FLB, on the other hand, encompasses all the aspects of our sensory and cognitive systems that go into the production and comprehension of language, including the sensory-motor systems responsible for perceiving and producing the perceptual patterns of language, and the conceptual-intentional systems responsible for representing the semantic/conceptual meaning of linguistic expressions and for reasoning about their implications. According to Hauser et al. (2002a), “most, if not all, of FLB is based on mechanisms shared with nonhuman animals” (p. 1573).  On the narrowest and most ambitious version of their hypothesis (i.e., “Hypothesis 3”, p. 1573), the only aspect of human cognition that is qualitatively unique to our species is specific to FLN, and, in particular, to the computational mechanisms responsible for recursion.

We believe the available comparative evidence firmly rules out the narrowest and most ambitious version of Hauser et al.’s (2002a) hypothesis.  While the computational mechanisms responsible for recursion—at least the kind of recursion characteristic of human languages—certainly appear to be unique to the human mind, there are many other aspects of human languages that are also uniquely human but not included in Hauser et al.’s (2002a) construal of FLN (see Pinker & Jackendoff 2005). More generally, over the course of this paper, we have argued that there are many aspects of the human conceptual-intentional system that are unique to human subjects but are not specifically linguistic, ranging from our ability to reason about hierarchical social relations to our ability to theorize about unobservable causal mechanisms and mental states. Some of these cognitive capabilities also seem to require recursive operations over hierarchically-structured representations (see our discussion ‘on the proper treatment of symbols in a nonhuman cognitive architecture’ below), suggesting that recursion is not specific to FLN. Indeed, Hauser et al. (2002a) themselves suggest that recursion evolved first in some noncommunicative domain. So, even according to their own hypothesis, the discontinuity between human and nonhuman minds presumably began before the evolution of the language faculty narrowly construed—though their hypothesis leaves unanswered what exactly changed in the human conceptual-intentional system to allow for the advent of recursive operations over hierarchically-structured representations.

Carruthers (2002; 2005b) has proposed a much broader—and we believe more plausible—role for the language faculty in subserving human cognition.  Carruthers argues that the structured logical form in which natural language sentences are internally rep