The neurophysiological evidence from Miyashita et al.'s experiments on monkeys, as well as cognitive experience common to us all, suggests that local neuronal spike rate distributions might persist in the absence of their eliciting stimulus. In Hebb's cell-assembly theory, learning dynamics stabilize such self-maintaining reverberations. Quasi-quantitative modelling of the experimental data on internal representations in association-cortex modules identifies the reverberations (delay spike activity) as the internal code (representation). This leads to cognitive and neurophysiological predictions, many following directly from the language used to describe the activity in the experimental delay period, others from the details of how the model captures the properties of the internal representations.
active memory, associative cortex, attractor dynamics, content sensitivity, internal representations, learning, modeling.
então concluiremos que uma palavra, quando dita, dura mais que o som e os sons que a formaram, fica por aí, invisivel e inaudivel para poder guardar o seu proprió segredo...
we may conclude that a word, once said, persists longer than the vibration or the sound that formed it, it stays there, invisible and inaudible so as to preserve its very own secret
Jose Saramago, A Jangada de Pedra (The Stone Raft), 1988
Modeling brain function by neural networks can be roughly divided along the lines separating feed-forward networks from feed-back, or attractor, networks. A very clear and informative manifestation of the presence of attractors in the temporal cortex of primates is presented in the recent studies of Miyashita et al. Here I will suggest that most of the results observed in these seminal experiments could be predicted on the basis of simple observations on common cognitive phenomena, without recourse to any specific model. One can then argue that considerations of this type most probably underlay Hebb's postulation of synaptic dynamics as a means for stabilizing reverberations in neural assemblies. The first consequence of this discussion is the potential usefulness of such a mechanism of reverberations as an organic way of dissecting the complexity of the connection between sensory input and motor reaction.
Consider the familiar everyday situation in which one is given the task of translating a word from one language to another. The word is known in both languages, i.e. both words are in memory. Suppose further that the task is communicated verbally (spoken). Both the task and the word in the first language are well understood. The acoustic stimulus disappears as soon as its communication is completed. It may be the case that the response, the word in the second language, is produced very rapidly and correctly. It may, though, be the case (equally familiar) that the first word is recognized and there is a strong sense that the translated word is also known, but the retrieval of the corresponding word is not possible. One knows that one knows, yet one does not know. One can go on with the effort of retrieving the translated word for quite a while, despite the absence of the provoking stimulus. Even the situation in which the entire episode, word and task, withers away from our consciousness, only to surface resolved hours or days later, is quite common.
During this long search period, conscious or unconscious, the first (the given) word must have been available. But it must have been available in a sense that goes beyond the fact that it was contained in our memory, i.e. we knew it. After all there are very many words we know in the first language. What is special about this particular word, of course, is that it has been tagged by the stimulus -- the sound of the spoken word.
This naive resolution of the problem posed by the familiar cognitive situation has quite significant implications. Somewhere in the skull, between the locus of the fully pre-processed stimulus and before the beginning of a generation of a response, there must be stations storing, passively, many memories. Those are the things we know and remember. They are, most likely, stored in the synaptic structure of each station. Each of these stations must be able to maintain one memory out of the passive stock (the one tagged by the stimulus) in a special status, for relatively long times. It must be able to maintain it in a status which will make it available for future attempts to perform the task.
In fact, this type of consideration also leads to the conclusion that the number of such stations must be relatively high. One seems to be able to maintain several items concurrently in the tagged status. These may be several items involved in a given task; items involved in different tasks that interweave during long intervals of inattentive processing; and possibly also the active presence of the tasks themselves, to which we return briefly below. Consequently, the cortex must be modular. Both anatomy and physiology provide hints as to the geometry (in terms of cortical volume, localization and number of neurons). See also the discussion of the experiments below.
The attractor picture of a cortical module is, briefly, as follows. The module consists of a relatively large ensemble of neurons (of order $10^5$ cells), in which the probability that any two neurons are connected by a synapse is high (a few percent suffice). Such an ensemble of cells is geometrically localized in about 1 mm$^2$ of cortical surface. Experience (training) structures, in a Hebbian way, the set of connections, i.e. the set of synaptic efficacies. The resulting synaptic structure is such that when a stimulus activates a subset of the neurons in the module, i.e. raises their spike rates, and is then turned off, the activity of the neurons in the ensemble may, depending on the stimulus, do one of two things. It may decay rapidly back to spontaneous levels: the stimulus is ignored. Or, for other classes of stimuli, a stimulus-selective subset of the neurons may be maintained at elevated rates for long times in the absence of the stimulus. In other words, the synaptic structure provides sufficient structured feedback so that the afferents (inputs) due to the distribution of activity among neurons with elevated spike rates maintain the same set of neurons active, leaving the others at spontaneous levels. The collective nature of this dynamical state of affairs, i.e. the fact that the elevated activity of each neuron is maintained by many others, makes the attractors impressively immune to disorder in the synaptic structure as well as to dynamical noise, both of which are unavoidable in cortical conditions.
What determines which configurations of neurons in the assembly can collectively maintain each other in the elevated activity state is the synaptic matrix. This is passive memory. What determines which of the possible self-maintaining configurations actually reverberates in the module is the stimulus. The short appearance of the stimulus 'tags' one of the passive memories by activating the particular attractor associated with this stimulus. The activated configuration in the assembly is an attractor in the sense that each of these configurations is activated by a wide class of stimuli, in some sense close to each other. The active configuration is then the representative of the class. This is what often goes under the name of content-addressable memory or, less commendably, associative memory, in contrast to the physical-address referencing of memory used in digital computers.
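The idea that a whole class of stimuli activates one representative configuration can be illustrated with a toy network. The sketch below is a minimal illustration under simplifying assumptions that are not in the text: 16 binary (+1/-1) units rather than some $10^5$ spiking neurons, three hand-picked mutually orthogonal patterns, an additive Hebbian matrix, and synchronous updates. The names `hebbian_matrix`, `relax` and `flip` are hypothetical.

```python
# Toy content-addressable memory. Illustrative assumptions (not from the text):
# +/-1 units, 16 neurons, three hand-picked orthogonal patterns, synchronous updates.

def hebbian_matrix(patterns):
    """Additive Hebbian efficacies: w[i][j] = sum over patterns of p[i]*p[j], no self-coupling."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w

def relax(w, state, max_sweeps=20):
    """Iterate the stimulus-free dynamics until the configuration stops changing."""
    state = list(state)
    for _ in range(max_sweeps):
        new_state = []
        for row, current in zip(w, state):
            field = sum(wij * sj for wij, sj in zip(row, state))
            # A zero field leaves the unit as it was.
            new_state.append(1 if field > 0 else -1 if field < 0 else current)
        if new_state == state:
            break
        state = new_state
    return state

def flip(pattern, positions):
    """Return a distorted copy of a pattern: a different stimulus of the same class."""
    return [-x if k in positions else x for k, x in enumerate(pattern)]

# Passive memory: the patterns are stored only through the matrix W.
PATTERNS = [
    [1] * 8 + [-1] * 8,
    [1, 1, 1, 1, -1, -1, -1, -1] * 2,
    [1, 1, -1, -1] * 4,
]
W = hebbian_matrix(PATTERNS)

# Three different distortions of the first pattern play the role of a class of stimuli.
cues = [flip(PATTERNS[0], {0, 5}), flip(PATTERNS[0], {3}), flip(PATTERNS[0], {9, 14})]
recalled = [relax(W, cue) for cue in cues]
print(all(r == PATTERNS[0] for r in recalled))
```

In this toy setting the three distinct cues end in the same stored configuration, which then plays the role of the representative of their class. The matrix alone fixes which configurations are available; the cue only selects among them.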
The attractor type of memory activation contrasts with the computer in yet another sense. In the computer, when a piece of information is to be acted on, it is taken from its address and put in a special section of the processor for action. This is, in some sense, analogous to the activation of a passive memory. In the neural module, however, not only is the addressing done by content and not by physical address, but the activation leaves the item in the module. In some sense the module is both the memory and the register. Thus, while in the computer the activation of a memory item is signalled by the special location (the register), in the cortical module, we argue, the activated memory is distinguished by the activity. The place remains invariant. Moreover, whereas the computer register can hold any information configuration for processing, the attractor module will hold only the representatives of classes that have been learned into the synaptic structure.
I am not emphasizing these distinctions to imply a preference. It is simply that in a system like the cortex the register option is not available: the register would also be made of neurons, maintaining an item in it for later processing could rely only on synapses, and we are back at square one.
It is important to make a clear distinction at this point between the tagged persistent memories and concepts such as short-, intermediate- and long-term memory. The latter usually refer to the stock of passive memories, i.e. to the duration of the synaptic programming or to the duration of its accessibility. They relate to the ability of the system (the brain) to manipulate incoming tasks. The tagged persistent item introduces an additional "temporal" category: the persistent activity distribution excited by a stimulus. A memory of this type (sometimes referred to as working memory) can belong to any of the three temporal categories. This basic distinction is sometimes overlooked, even by Hebb himself. See, e.g., the second quote from Hebb and its discussion below.
The simplest conceivable carrier of such a tagging signal is the persistent distribution of elevated spike rates (Hebbian reverberations) among the neurons in the module (Hebb's cell assembly). One may contemplate other stimulus-selective taggings of stored memories, but those would be much more difficult to observe. Since persistent spike distributions in the absence of the provoking stimulus are governed mainly by the synaptic structure in the local assembly, the tokens maintained there during the prolonged performance of the task will be prototypal. It is likely that the structure of such reverberations will not depend on details of the stimulus, such as the tone, the pitch or the modulation of the acoustic signal communicating the original word. In theoretical models the dynamics of multi-neuron systems, when maintaining activity distributions in the absence of the stimulus, gives rise to a global dependence on the stimulus: all stimuli in a large class provoke the same persistent spike distribution, while stimuli which are different enough provoke different persistent activities. In this sense the activity distribution in a reverberation can be considered as a representation of the class of stimuli that provoke it.
It may be useful to clarify the role that is ascribed in the present context to the term internal 'representation', given that it is at the center of so much debate in the cognitive science community. The computational situation described above, in which the performance of the task on the stimulus (the word to be translated) is to take place long after the stimulus has disappeared, seems to leave little choice. When the task is finally to be carried out, it must have an operand. That token, which survives somewhere in the cortex, is a representation of the set of equivalent stimuli. Such a token seems to be logically required, and a candidate for it is experimentally observed, as we recount below.
The independence of the reverberation in the local assembly from the details of the stimulus does not imply that the internal representation does not depend on the task. Since the stimulus contains the word as well as the task, the persistent token may, in principle, depend on both. Yet the fact that the same word can be involved in many different tasks (rhyming, opposites, synonyms, etc.) suggests that the task may be represented also, or only, elsewhere. The linking of two separate representations is still an open problem to be investigated both experimentally and theoretically.
Admitting the presence of such reverberating stations in the cortex splits a potential feed-forward picture, describing cognitive (psychological) behavior. The split takes place inside the cortex and the boundary lines may lie at different distances from different sensory mechanisms. Roughly speaking, there is a three-way division:
This split of the cognitive computational machine is advantageous for experimental as well as for theoretical study. The above comments imply that the persistent tagging represents (at least for a while) some abstract feature of the stimulus involved in the task. Hence the neuro-physiologist can search for these local modules (Hebb's assemblies) in the cortex of the performing mammal, in the course of the performance of a prescribed behavioral paradigm. The tentative acceptance of spike rate distributions allows an easy read-out, by the neuro-physiologist, of the relevant representations. This he can do by single-unit recordings to be analyzed off-line. He can count on the fact that the internal representations he will observe will not depend on particular details in the manifestation of the stimulus, provided the stimulus has been correctly classified (interpreted) by the subject animal. The latter can be monitored by the animal's response in the course of the experimental paradigm.
From the point of view of cognitive science one representation may be as good as any other. The opportunity provided by the Hebbian cut is that it allows a direct, empirical expression for the representations to replace a metaphorical one. As is argued below, the quantitative properties of the representations discussed here can be measured. A few tens of milliseconds following the disappearance of the stimulus, what is in these attractors is all there is for the completion of the mental computation. Given a direct and measurable commitment for the representations, the speculations of cognitive science can proceed with reference to a well-defined body of data on the neuro-physiological level. The realistic neural underpinning is required to inform the speculation about the potential as well as the limitations of the infrastructure.
The above properties are not common to all modeling paradigms of internal representations. For example, in a feed-forward description of computation, a particular pattern of neural activation persists only for as long as the stimulus is on. A given neural activity distribution, provoked during the imposition of the stimulus, cannot be supposed to be available for computation at a later time. Moreover, the activity distribution may be sensitive to the particular details of the imposed stimulus. Thus the activity distribution may be richer than what is actually used for continuing the computational process. It may therefore not provide sufficient constraints on the computation. The attractor, in contrast, contains measurable information for long times and that information is the same for all stimuli which are classified by the same attractor.
Moreover, the attractor dynamics distinguishes naturally between the course of an unfamiliar stimulus and that of a familiar one, the former being a stimulus significantly different from the ones that one has learned to classify. Following an unfamiliar stimulus, attractor dynamics leaves all neurons in the module at very low activity levels, despite the fact that during the presentation of the stimulus as many neurons may be excited as for a familiar stimulus. This distinction is not naturally available in alternative paradigms of representation.
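The contrast between familiar and unfamiliar stimuli can be sketched in a very small threshold network. Everything specific here is an assumption made for illustration and not taken from the text: 12 on/off units, two disjoint learned assemblies, unit efficacies inside an assembly, a firing threshold of 1.5, and spontaneous activity idealized as zero. The names `efficacy`, `step` and `run_trial` are hypothetical.

```python
# Toy module with 0/1 units. Illustrative assumptions (not from the text):
# 12 neurons, two disjoint learned assemblies, unit efficacies inside an assembly,
# a firing threshold of 1.5, and spontaneous activity idealized as 0.

N_NEURONS = 12
ASSEMBLIES = [{0, 1, 2, 3}, {4, 5, 6, 7}]   # learned configurations (hypothetical)
THRESHOLD = 1.5
DRIVE = 2.0                                  # afferent input while the stimulus is on

def efficacy(i, j):
    """1 if two distinct neurons belong to the same learned assembly, else 0."""
    return 1 if i != j and any(i in a and j in a for a in ASSEMBLIES) else 0

def step(active, stimulated):
    """One synchronous update; returns the new set of active neurons."""
    new_active = set()
    for i in range(N_NEURONS):
        recurrent = sum(efficacy(i, j) for j in active)
        external = DRIVE if i in stimulated else 0.0
        if recurrent + external > THRESHOLD:
            new_active.add(i)
    return new_active

def run_trial(stimulus, steps_on=3, steps_off=5):
    """Return (number active at the end of the stimulus, number active at the end of the delay)."""
    active = set()
    for _ in range(steps_on):
        active = step(active, stimulus)
    during = len(active)
    for _ in range(steps_off):
        active = step(active, set())
    return during, len(active)

familiar = run_trial({0, 1, 2, 3})     # coincides with a learned assembly
unfamiliar = run_trial({0, 4, 8, 9})   # same number of driven neurons, no learned structure
print(familiar, unfamiliar)
```

Both stimuli drive four neurons while they are on. Only the familiar one leaves four neurons active through the delay; after the unfamiliar one the module falls silent, because no driven neuron receives enough recurrent support once the afferent input is removed.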
The collection of such representations and of their dependence on the task can provide invaluable information about how computation is organized in the cortex. But before proceeding with the elaboration of the experimental and theoretical account and perspectives, I return to the subject of this essay, Donald Hebb.
Allusions to Hebb abound in the preceding text. They have not been explicitly formalized because I have been trying to emphasize the intuitive appeal and the almost imperative nature of the local internal representations. Yet I believe that whatever is valid in this picture must have been clearly perceived by Donald Hebb many years ago. Someone joining the field in the last decade finds innumerable references to Hebb's work. But the general tenor of these references is of a synaptic engineering type. Almost any type of synaptic learning in neural networks genuflects in Hebb's direction. Yet Hebb was not a neuro-chemist nor a neuro-physiologist. He was a psychologist searching for a neurally based infrastructure to psychology, to supplement, or replace, the mythological one.
The Hebbian paradigm is multi-dimensional. It is composed of a
prescription for synaptic modifications: synapses are modified by
afferent stimuli in a way that tends to stabilize the pattern of
activity provoking the synaptic modifications. The stable neural
activity distributions are excited in the local assembly by
each of the learned stimuli. This is an unsupervised
learning mode which aims at producing synaptic structures which
can sustain a selected set of activity distributions in the
local assembly, to use Hebb's language. The role of the resulting
synaptic structure is to sustain the local activity produced by a
stimulus in the absence of the provoking stimulus. To maintain the
activity provoked by the stimulus in the presence of the stimulus
does not require any synaptic modification.
Hebb's paradigm is not about the activity generated in one assembly
by the activity present in another. It is not a feed-forward
picture, for good or for bad. It can be summarized as a process
generating the feed-back connectivity required for maintaining
reverberations (persistent spike distributions) in a local network
by the activity in the same network. The citations below make this
point quite clearly.
Let us assume that the persistence or repetition of a reverberatory
activity (or "trace") tends to induce lasting cellular changes that add
to its stability... When an axon of cell A is near enough to excite a
cell B and repeatedly or persistently takes part in firing it, some
growth process or metabolic change takes place in one or both cells such
that A's efficiency, as one of the cells firing B, is
increased.
Elsewhere Hebb says "It seems that short-term memory may be a
reverberation in the closed loops of the cell assembly and between cell
assemblies, whereas long term memory is more structural, a lasting
change of synaptic connections." [p. 110, emphases in the original]
This is not intended as a hagiography of Donald Hebb, nor proof by dogma,
nor a decomposition of texts. Rather, it is an attempt to salvage
a profound idea from excessive fragmentation which has obscured the
potential functional three-way split of the cognitive computational
machine. In particular, it has caused an underestimation of
phenomena necessary for mental processing, phenomena which
can relatively easily be observed neuro-physiologically and which
can provide precious information for deciphering some basic
ideas pertinent to cortical computation.
Note that in the second quote the distinction between long
term memory and active memory is implied, yet the terminology is
not adopted. Clearly, the reverberation is an active state of the
assembly, while the structure of the synaptic organization is
not. Moreover, the reverberation must be sustained by the
underlying synaptic structure, and hence it is a particular
expression of the properties of the synaptic structure. A given
synaptic structure may persist for short, intermediate or long
terms.
To conclude this speculative part it may be of value to point out an
additional bonus promised by this outlook. It is known
anatomically, physiologically and neurologically that as one
proceeds along the elaboration path in the cortex, one always
finds back projections, as far back as into the primary sensory
areas. On the other hand, it is a very familiar experience to
have a given sensory power notably improved when the content of
the observed stimulus is known. For example, when vision is
impeded by distance or haze so that a given object cannot be
discerned (or read), receiving a cue as to the nature of the
object (or the written text) often produces a clear perception of
the target. In other words, suppose that the sensory
information about an object is not sufficient to produce
recognition. Suppose further that other information about the
object is given (vocally, for example) some time prior to the
observation of the object. The information in the vocal signal
had been recognized well, and hence has excited a reverberation in
some module. The back-projections from this module to the more
primary areas in which the visual signal is being processed may
provide sufficient additional specific information to make
recognition possible. The special role of the reverberation is to
make the contingent information available long after the signal
producing it has disappeared.
Hebb's idea about reverberation in cortical assemblies seems
to have been motivated by observations of Lorente de
Nó on neural diagrams of Golgi-stained cortical
slices made by Ramón y Cajal. The neural diagrams discussed by
Lorente de Nó contained a small number of neurons, because
the staining method makes only a small fraction of the neurons in the
slice visible. Hence the feed-back circuits observed seemed
relatively simple and suggested simple flows. Hebb was aware
that the circuits were too simple and would probably not
be able to sustain a reverberation for a sufficiently long time,
and that to ensure long-lived reverberations a much larger number
of neurons would be required.
The fact that local reverberations are almost a logical necessity
does not remove the need to test their existence empirically. This
has been done in the last few years in a particularly convincing
way by Miyashita et al., in a
culmination of a program started some twenty years ago by
Fuster and Niki. In these
experiments monkeys are trained to perform delayed image matching
(delayed match to sample, DMS) of visual images. The images are
supposed to be meaningless for the monkey and mutually uncorrelated in
their geometrical structure. For this purpose the images are
generated by a computer using a graphic procedure with several
stochastic components. Typical images are given in Fig. 1.
Following training, testing proceeds according to
the following protocol: 1. an image appears on the screen for a
short period (200 ms); 2. the screen remains blank for a prolonged
period (as long as 16 seconds); 3. a second image appears briefly;
4. the monkey should react selectively according to whether the first and
second images were the same or different.
Training is performed, prior to the insertion of recording
electrodes, by presenting the monkeys long sequences of pairs of
images as described above, and rewarding them for correct
responses. In different experiments the sequence of first images
is varied. For example, the first images can be drawn at random
from a store of generated images; or they can form a fixed sequence
shown in a fixed order; or the set of images can be divided into
pairs, with the members of each pair following each other in a fixed order
as first stimuli during training while the pairs themselves are selected at
random. The second stimulus is always randomly selected with about
50% chance of being equal to the first. Note that the existence
and the structure of the attractor depend only on the first
stimulus. The second stimulus, the one used for matching, serves to
maintain the attention to the task. The ordering of first stimuli
applies exclusively to the training phase. During testing, first
stimuli are drawn at random.
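The trial structure described above can be written down as a small schedule generator. This is only a sketch of the protocol as stated in the text (200 ms sample, delay of up to 16 seconds, second image equal to the first with about 50% chance). The number of images, the seed, the dictionary layout and the names `make_test_trial` and `training_first_images` are hypothetical, and the duration of the second image is left out because the text does not give it.

```python
import random

SAMPLE_DURATION_S = 0.2   # first image on screen (200 ms in the text)
MAX_DELAY_S = 16.0        # longest blank-screen delay mentioned in the text

def make_test_trial(n_images, rng, delay_s=MAX_DELAY_S, p_match=0.5):
    """Draw one testing-phase trial: random first image, second image equal to it with probability p_match."""
    first = rng.randrange(n_images)
    if rng.random() < p_match:
        second = first
    else:
        second = rng.choice([k for k in range(n_images) if k != first])
    return {
        "first": first,
        "sample_s": SAMPLE_DURATION_S,
        "delay_s": delay_s,
        "second": second,
        "match": first == second,   # the correct same/different answer
    }

def training_first_images(n_images, n_trials, rng, fixed_order=True):
    """First-image sequence for training: a fixed cyclic order, or random draws."""
    if fixed_order:
        return [k % n_images for k in range(n_trials)]
    return [rng.randrange(n_images) for _ in range(n_trials)]

rng = random.Random(0)          # seeded only to make the sketch reproducible
N_IMAGES = 10                   # hypothetical set size; the text does not give one
trials = [make_test_trial(N_IMAGES, rng) for _ in range(200)]
sequence = training_first_images(N_IMAGES, 25, rng)
print(sum(t["match"] for t in trials), "matches out of", len(trials))
```

The fixed cyclic order in `training_first_images` is the ingredient that matters for what follows: it is the contiguity of first images during training, not during testing, that the text links to the structure of the attractors.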
Figure 1. Several visual images used in the experiment.
By impressive use of circumstantial evidence Miyashita et al.
succeed in identifying a small part (about 1mm$^2$) of anterior
ventral temporal (AVT) cortex where stimulus selective persistent
activity is manifested during the delay period, i.e. in the
interval between the presentations of the first and second images,
in which the stimulus is absent. The fact that the selective
activity distribution can persist for as long as 16 seconds, in a
rather noisy environment, is convincing evidence for the local
maintenance of a reverberation by the feed-back in the synaptic
structure.
Figure 2. Reverberation dynamics. Four types of neuronal
behavior observed in single units. At the top of each window 12 spike rasters demonstrate the
reproducibility of the delay activity on the single unit level,
despite intervening presentations of other stimuli. Each row of
dots is the representation of spike times recorded from a given
neuron in a single trial, with the same image for first stimulus.
The similar density of spikes, in the delay period (the wide
central interval) in all 12 rasters, despite the fact that other
images intervened as first stimulus between them, is the
reproducibility underpinning the representation concept. Bottom:
spike rate histograms. a) Neuron active in presence of stimulus
and persisting in its absence; b) Same neuron unaffected by
stimulus, active in delay period (a different stimulus leading to
the same attractor); c) Same neuron, third stimulus, neuron active
in presence of stimulus, weakly active in reverberation, a
different attractor; d) Same neuron inhibited in presence of a
fourth stimulus and weakly active in delay period. Same attractor
as in (c). Under each window is the time course of the trial's
protocol: pre-stimulus; warning; first stimulus; delay; second
stimulus.
The main findings of these experiments, as seen by a theorist, are:
Figure 3. Correlated reverberations. Correlation coefficients
(Kendall rank coefficients) of spike activities in a neural
population in the delay period as a function of the positional
separation of the stimuli exciting the reverberations in the
training sequence. (From ref. [miya2] Fig. 3c.) Full
circles represent correlations of delay activity distribution for
learned images (used in training). Empty circles refer to
activity distributions provoked by `new' images (not used in
training). The different curves represent different samplings of
neurons in the module selected for the computation of the
correlations. The stars are irrelevant to the present
discussion.
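For readers unfamiliar with rank correlations, the following sketch shows how a Kendall coefficient between two delay-activity distributions could be computed. It is not the analysis code of the experiments: the tau-a variant (tied pairs count as neither concordant nor discordant) is an assumption, the spike rates are invented for illustration, and the name `kendall_tau` is hypothetical.

```python
# Kendall rank correlation (tau-a) between two delay-activity distributions.
# The choice of the tau-a variant and the spike rates below are illustrative assumptions.

def kendall_tau(x, y):
    """(concordant - discordant) / (number of pairs); tied pairs count as neither."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least two entries")
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                score += 1      # the two neurons are ordered the same way for both stimuli
            elif product < 0:
                score -= 1      # the ordering is reversed
    return score / (n * (n - 1) / 2)

# Hypothetical delay-period rates (spikes/s) of five neurons for two stimuli.
rates_stimulus_a = [12, 3, 8, 1, 5]
rates_stimulus_b = [10, 4, 9, 2, 3]
tau = kendall_tau(rates_stimulus_a, rates_stimulus_b)
print(tau)
```

Because the coefficient depends only on the ordering of the rates across neurons, it compares the shapes of two activity distributions without being sensitive to their overall scale.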
The above presentation of the empirical situation requires several
notes of clarification. On the one hand, we have emphasized the
distributive and collective nature of the internal local
representations and on the other, the reproducibility of
recordings of single units. This apparent contradiction
disappears if one keeps in mind that the enhanced, persistent
activity of any individual neuron can only be sustained due to the
support of its fellow neurons participating in the same
reverberation, via the synaptic local feed-back. The activity of
the single neuron, therefore, carries information about the
stimulus only if it is accidentally selective between the different
distributed reverberations. Among the active neurons there may be
several that have the same activity in different representations.
On the other hand, the fact that the reverberations are dominated
in their detail by the synaptic structure, and not by fine details
of the stimulus, implies that every time the same stimulus is
presented, the same reverberation is aroused. Consequently, the
internal representations, for whatever they represent, may be
perused and catalogued recording one neuron at a time, as is in
fact done in the experiments of Miyashita et al.
This precludes, of course, observing phenomena related to correlations
in spike emission times if those manifest themselves in different
neurons. To observe those, multi-unit probes are
required. Yet, I feel that the information, representational,
computational and cognitive, contained in rate distributions is far
from exhausted. On the other hand, the relative facility of access
to this type of information as well as of its analysis and
modeling, compared to multi-electrode data, makes it a very
attractive subject of investigation.
A significant aspect of the generation of random images in the
Miyashita experiments is that it is quite likely that the
correlations between internal representations of stimuli are due only to
their context dependence, i.e. their frequent contiguous appearance in
the training phase. The effort invested in the generation of the
images is recompensed by the fact that in some
sense the registered correlations between the attractors are the
minimal correlations among internal representations: those due purely to
the constraints imposed on the learning process by the sequential
training and the existence of attractors.
The attractor picture and the observed correlations it creates
among internal representations have a rather universal feature.
The dynamical tagging of a given memory may produce persistent
activities in several modules in the cortex. The representation of
one stimulus in different modules may represent different features
of the stimulus, such as color, shape etc... Different modules may
be involved in the generation of different types of reactions. One
gets the impression, from the experiments of the Miyashita group,
that the observed module in AVT is directly related to the pair
association [miya3], while it is not as directly related
to the matching task. The universality intended here is that this
does not matter, in the sense that wherever the attractor related
to the reaction is, it must represent similar correlations of the
internal representations, because those depend only on the fact
that one is learning one attractor while reverberating in a
previous one.
This observation suggests that one may use the lessons of these
experiments to speculate directly about human cognitive
psychology. The fact that correlations form between the
attractors representing semantically meaningless, uncorrelated
stimuli implies that priming phenomena should be observed among
stimuli of this kind. (Priming is the experimental
observation that the time for recognition of an incomplete pattern
is shortened if the presentation of the stimulus to be recognized
is preceded by the presentation of a cognitively related stimulus.)
The only condition is that they be presented in an ordered sequence
during training, independently of whether we can observe the
attractor cortical network involved in the cognitive task or not.
That priming should take place in a situation of correlated
attractors can be concluded intuitively, and is confirmed in model
networks. It is the mere observation that if the assembly is in a
given reverberation, due to the priming stimulus, the test stimulus
to be recognized will find it easier (and hence faster) to provoke
a transition to another reverberation attractor, the more similar
the activity distribution in the latter attractor is to the priming
attractor.
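The intuition can be made concrete in a deliberately crude caricature. The sketch below assumes, and none of this is in the text, that the state of the assembly is summarized by a single number (its overlap with the attractor of the test stimulus), that this overlap relaxes according to a mean-field equation of the kind used for attractor networks, and that recognition is declared when the overlap crosses a threshold. The priming attractor enters only through the initial overlap. All parameter values and the name `steps_to_recognition` are hypothetical.

```python
import math

def steps_to_recognition(initial_overlap, stimulus=0.3, gain=2.0,
                         dt=0.1, threshold=0.9, max_steps=1000):
    """Count update steps until the overlap with the target attractor reaches threshold.

    initial_overlap: similarity between the priming attractor and the target attractor.
    Dynamics (assumed): dm/dt = -m + tanh(gain * (m + stimulus)), integrated by Euler steps.
    Returns None if the threshold is not reached within max_steps.
    """
    m = initial_overlap
    for step in range(max_steps + 1):
        if m >= threshold:
            return step
        m += dt * (-m + math.tanh(gain * (m + stimulus)))
    return None

t_unrelated = steps_to_recognition(0.0)   # prime attractor uncorrelated with the target
t_related = steps_to_recognition(0.5)     # prime attractor correlated with the target
print(t_unrelated, t_related)
```

With these assumed parameters the correlated prime reaches threshold in fewer steps than the uncorrelated one, which is the qualitative content of the argument above. The sketch says nothing about the size of the effect in a real module.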
Thus, purely on the phenomenological, pre-theoretical level, given the
observation of correlated attractor representations, one is led to
consider for example:
Similarly, the attractor interpretation of the experiments suggests
informative extensions of the experiments in primate
`cognitive-neuro-physiology':
A very powerful methodology of modern physics is to investigate highly
structured phenomena in complicated systems by means of toy
models. In other words, phenomena like superconductivity, magnetism,
liquid crystals etc. are not searched for by starting from the well
established dynamical laws of systems of nuclei and electrons. Instead,
some essential features of the elements,
deemed relevant to the structured, emergent phenomenon under
investigation, are represented in a simple tractable model. If the
dynamics of the toy model actually produces the expected structure, the
robustness of the phenomenon to the reintroduction of the omitted
complexity of the underlying elements is investigated. This iteration
serves both to justify the toy model and to study further details of
the emerging structures.
The discussion in the sections above has left us with a double task:
The first task has been solved by the Hopfield model which has
served as the basic toy model in describing the emergence of a
diversity of structured attractors, robust to many types of random
damage and noise. This was done by employing an explicit form for a
synaptic matrix, that has a learning flavor, in the sense that the set
of synaptic efficacies is constructed, in an additive fashion, from the
correlations of activities of neural pairs in afferent patterns that
are to be the attractors of the network. This is in the grain of the
Hebbian paradigm. Namely, external stimuli to be learned impose
activity distributions on an assembly, which in the learning process
develops a set of synaptic efficacies that can autonomously maintain
such activity distributions as reverberations. It is, after all, the
synaptic matrix, and only the synaptic matrix, that can maintain the
delay activity in the absence of the stimulus. The possibility of
generating a synaptic matrix which endows a network with a large variety
of different robust attractors in a single module was the main
achievement of this model. Moreover, the formulation of the model
allowed for the detailed computation of many properties of the
dynamical response of the assembly to external afferent stimuli.
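As a minimal sketch of this construction, assuming the standard formulation with two-state (+1/-1) neurons, the following Python fragment builds the synaptic matrix additively from the pair correlations in a few random patterns and then lets the network relax from a corrupted cue. The sizes N and P, the 15% corruption level and all function names are illustrative choices, not taken from the text.

```python
import random

def hebbian_matrix(patterns):
    """J[i][j] = (1/N) * sum over patterns of xi[i] * xi[j], with zero diagonal."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for xi in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    J[i][j] += xi[i] * xi[j] / n
    return J

def relax(J, state, max_sweeps=20):
    """Noiseless asynchronous dynamics: each neuron aligns with its local field."""
    s = list(state)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(J[i][j] * s[j] for j in range(n))
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:  # fixed point reached: a self-maintaining state
            break
    return s

def overlap(a, b):
    """Normalized similarity between two +1/-1 activity distributions."""
    return sum(x * y for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
N, P = 200, 5  # illustrative sizes (assumption)
patterns = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(P)]
J = hebbian_matrix(patterns)

# Present a degraded version of the first pattern, then remove the stimulus.
cue = list(patterns[0])
for i in rng.sample(range(N), 30):
    cue[i] = -cue[i]
final = relax(J, cue)

print("overlap of cue with pattern 0:  ", overlap(cue, patterns[0]))
print("overlap of final with pattern 0:", overlap(final, patterns[0]))
```

The only information the dynamics uses after the cue is set is the matrix J, which is the point made above: the matrix alone sustains the activity distribution once the stimulus is gone.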
Furthermore, this toy model has also served to generate metaphors
for several psychiatric and neurological pathologies.
Hoffman has used it to describe a distinction
between mania and schizophrenia. Virasoro has used
the properties of this model under a random destruction of
synapses (a lesion) to capture phenomena such as prosopagnosia.
Yet this model has left unanswered a whole set of questions of
detail. A representative list includes:
Early criticisms of the Hopfield model have concentrated on the type of
synaptic matrices used. Symmetric, fully connected matrices, with
random distribution of excitatory and inhibitory synapses on the
axons of each neuron, and infinite analog depth (the ability
to maintain with high precision a large number of different,
closely spaced values) for each synapse, have made the analysis
by theoretical physicists easier. These are unrealistic
impositions: it is unlikely that the cortex generates
symmetric synaptic structures; cortical connectivity is at most at a
level of 10%; and excitation and inhibition find
their places on different neurons (Dale's rule). Yet these
criticisms have been shown to be relatively innocuous. The
synaptic dilution and the limited analog depth of the synapses
have been treated
by Sompolinsky and shown to affect the performance of
the network only mildly. In fact, in some cases the performance
per remaining material resource (such as storage capacity per
surviving synapse) was even found to improve in the less ideal
system.
The question of low coding levels has also been found to be
relatively simple, though its resolution has brought to light the
fact that, if one looks for a network with uniform thresholds for
the neurons, the behavior of the network is strongly dependent on
the choice of the representation for the neural states. If one
insists on representing neurons by two-state variables, there is a
clear advantage in representing these states by (0,1) over (-1,1).
A more elaborate modification of the dynamics of the original toy
model was required in order to account for the relatively low spike
rates in the attractors observed in experiment. It required a more
detailed treatment of the single neuron dynamics, arriving at a
description of neurons in terms of coupled systems of afferent
currents and efferent spike rates. This description has produced
networks with attractors operating far below the saturation of the
neurons composing the network.
The modified networks, with low (arbitrary) coding levels and low
spike rates preserved the main features of the original toy model:
robust diversity of attractors, classifying stimuli
auto-associatively. Two outstanding issues remained:
auto-associativity and bi-modality.
The difficult problem of modifying a network to form
attractors with correlations of the Miyashita type from
uncorrelated stimuli learned in a fixed order found a solution
with a flavor of the Miyashita training scenario, at the level of
the toy model, indicating once again the usefulness of such
models as drawing boards for new ideas. Auto-associative ANNs
are based on the idea that the synaptic matrix codes for the
correlations of activities of pairs of neurons as induced by a
given afferent stimulus. The neural pair correlations in different
stimuli are coded independently of each other. It was shown
that when synaptic modifications, induced by training on a
sequence of stimuli presented in a fixed order, also record
the correlations between the activity of one neuron of a pair, as
induced by one stimulus, and the activity of the other, as induced
by its immediate predecessor in the sequence, then the resulting
attractors display correlations of the Miyashita type.
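One way to write such a matrix, offered here as a sketch rather than as the exact form used in the cited work, adds to the auto-associative Hebbian term a cross term between each stimulus and its neighbor in the sequence, weighted by a relative strength a. The +1/-1 neuron convention, the cyclic closing of the sequence, the value a = 0.7, the network size and all names below are assumptions made for illustration. The fragment builds the matrix, relaxes the network from one of the learned stimuli, and prints the overlap of the resulting state with each stimulus as a function of its separation in the sequence. The printed numbers depend on the network size and on a, so none are quoted here.

```python
import random

def sequence_matrix(patterns, a):
    """Hebbian term plus a cross term linking consecutive patterns.

    J[i][j] = (1/N) * sum_mu [ xi_mu[i]*xi_mu[j]
                               + a * (xi_next[i]*xi_mu[j] + xi_mu[i]*xi_next[j]) ]
    The sequence is closed cyclically (last pattern is followed by the first),
    which is an assumption of this sketch.
    """
    p, n = len(patterns), len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for mu in range(p):
        cur, nxt = patterns[mu], patterns[(mu + 1) % p]
        for i in range(n):
            for j in range(n):
                if i != j:
                    auto = cur[i] * cur[j]
                    cross = nxt[i] * cur[j] + cur[i] * nxt[j]
                    J[i][j] += (auto + a * cross) / n
    return J

def relax(J, state, max_sweeps=50):
    """Noiseless asynchronous dynamics, stopped at a fixed point or after max_sweeps."""
    s = list(state)
    n = len(s)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            field = sum(J[i][j] * s[j] for j in range(n))
            new = 1 if field >= 0 else -1
            if new != s[i]:
                s[i] = new
                changed = True
        if not changed:
            break
    return s

def overlap_profile(state, patterns, start):
    """Overlap of `state` with the pattern d positions after `start`, for every d."""
    p, n = len(patterns), len(state)
    return [sum(s * x for s, x in zip(state, patterns[(start + d) % p])) / n
            for d in range(p)]

rng = random.Random(1)
N, P, A = 150, 12, 0.7  # illustrative values (assumptions)
patterns = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(P)]
J = sequence_matrix(patterns, A)

# Start from the first learned stimulus and let the network settle.
attractor = relax(J, patterns[0])
for d, m in enumerate(overlap_profile(attractor, patterns, 0)):
    print("separation", d, "overlap", round(m, 3))
```

Setting a = 0 recovers the purely auto-associative matrix, in which the pair correlations of different stimuli are stored independently.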
The resulting attractors, each classified by the uncorrelated
stimulus that had been learned and that excites it, are correlated for
stimuli as far as five positions apart in the training sequence. This
theoretical result manifests itself in a dramatic way: each persistent
delay activity (attractor) has a finite similarity index with
exactly five of the nearest stimuli in the sequence.
This is just the approximate range of significant correlations
observed
in the experiment. But this surprising five was deduced in
the rather artificial context of the toy model.
Note that two types of correlations enter this discussion, and
they should not be confused. The first is the correlations of activities
of pairs of neurons during the presentation of stimuli for learning.
These drive the Hebbian learning. Then, once learning has generated
the synaptic matrix, the network's dynamics are controlled by that
matrix. In particular, this synaptic matrix determines the structure
of the attractors (delay activity distributions). The correlations
between these attractors are the second type. They are the ones
measured by Miyashita.
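For concreteness, a rank coefficient of the Kendall type between two attractors can be computed from the delay rates of the same sample of neurons in each of them. The sketch below implements the simplest variant (tau-a, without tie correction); which variant was used in the experiments is not specified here, and the rate values are invented purely for illustration.

```python
def kendall_tau(x, y):
    """Kendall rank coefficient (tau-a): (concordant - discordant) / number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 values each")
    score = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:      # the pair is ordered the same way in both samples
                score += 1
            elif product < 0:    # the pair is ordered oppositely
                score -= 1
            # tied pairs contribute nothing
    return score / (n * (n - 1) / 2)

# Hypothetical delay rates (spikes/s) of six neurons in two different attractors.
rates_attractor_a = [22.0, 3.1, 0.4, 15.2, 7.5, 0.9]
rates_attractor_b = [18.3, 4.0, 0.2, 9.9, 11.1, 1.3]
print(kendall_tau(rates_attractor_a, rates_attractor_b))
```

Because the coefficient depends only on the ordering of the rates across neurons, it compares the shape of two activity distributions without assuming anything about their absolute scale.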
The promise of the result was in that
Figure 4. Confrontation with theoretical model (4000
neurons): Kendall rank coefficients of attractors as a function of the
separation of the stimuli in the training sequence. To each stimulus
corresponds an attractor (reverberation), the one excited by it. The
attractors are labeled by the serial position number (SPN) of the
corresponding stimuli in the training sequence. The learned stimuli
forming the synaptic matrix are uncorrelated, since the images
presented for learning were uncorrelated. The activity distributions in
the attractors are correlated. The error bars are standard errors in the
sample of neurons. (a) Correlations for the regular sample, second curve
from top. (b) Correlations in the enhanced sample, top curve. Full curve,
model results; dotted curve, experiment.
In fact, the phenomenon persists when the formal neurons are
replaced by quasi-realistic, integrate-and-fire
neurons. An ANN operating with a synaptic matrix
containing information on temporal contiguity in the training
process preserves the main features of the attractor correlations
found in the systems of discrete neurons. The correlation coefficients
(Kendall rank coefficients, as used in the experiments) of the
network of realistic neurons, which also includes reactive,
separated inhibition, agree quantitatively quite well with the
measured values. Moreover, the more
realistic model presents some additional features which bring the
model even closer to the biological experience:
The attractors expressing the Miyashita correlations do not exhibit a
simple bi-modal distribution of spike activity among the neurons in the
assembly. All previous ANN models produced sharp
bi-modality, either because the neurons were discrete,
i.e. quiet or at saturation frequencies, or because
the models implemented auto-associativity. Experiments manifest
attractors, but not simple bi-modality. The model predicts a
large, stimulus selective, peak at very low spike frequencies, and a
wide distribution of rates among the active neurons. Consequently,
the combination of realistic neurons and attractor correlations
(i.e. the departure from auto-associativity) gives a potential
response to the problem of the nature of the rate distribution.
To conclude this discussion we show one more measurement carried
out both on the model and on the performing monkey.
We show the distribution of activities produced in
a given neuron in the reverberations provoked by the presentation
of the complete set of learned stimuli. One sees the rate of
spike emission by this neuron, in the delay period, for each of the
stimuli plotted in the order in which they had been learned.
This is not the right context in which to develop a more detailed
discussion of the confrontation of theory with experiment.
The discussion has been opened here only to indicate
that the insights gained by interpreting the Miyashita
experiments in the language of attractor dynamics are accompanied
by a candidate model which captures the experimental results to a
very impressive degree of detail. Such a model can consequently
serve as a drawing board for the development of future paradigms
in cognitive psychology.
Figure 5. Average delay discharge rate vs serial
position separation on a given cell. (a) Fig. 3a (b) Model. This
displays the level of activation of the particular neuron in the
reverberation stimulated by each of the hundred stimuli in the learned
sequence. The existence of two peaks indicates that this neuron
participates in the representation of two uncorrelated stimuli. The
side wings of each peak are due to the correlations of the attractors,
developed in learning a fixed sequence.
The attractor description as a language and as a set of models is a
strongly predictive framework. It produces
several detailed experimental predictions even prior
to the elaboration of detailed models. The models enlarge the
predictive set of commitments of the approach. The detailed
studies lead to the following predictions:
The list can be continued.
The above picture sounds too good. It may still have to
undergo modifications. The main exposed flanks we perceive at this
stage are:
Even if some modifications are required, the reverberation picture is
too fertile to be rapidly discarded. We have inherited it from Hebb,
and it is by now some fifty years old. It has only recently been given a
direct empirical (neuro-physiological) dimension
and has been endowed with a precise mathematical
model. One is tempted to start drawing a host
of speculative conclusions from the new framework exposed by
Miyashita's monkeys. The context provides fertile ground for
adventures in cognitive psychology and even for some aspects of
linguistics ranging from binding, which may be related to syntax,
to priming, which may be extrapolated to semantics. It may even
suggest a substrate for psychology itself.
Yet the lessons learned from these experiments include the one
which advises restraint. It is just these experiments which
indicate that our imagination concerning brain computation is still
too much constrained by formal mathematics, by computer languages
and by artificial intelligence. In this connection it is well
worth recalling the wisdom of John von Neumann, writing 40 years
ago:
...Thus the outward forms of our mathematics are not
absolutely relevant from the point of view of evaluating what the
mathematical or logical language truly used by the central
nervous system is... the above remarks about reliability and
logical and mathematical depth prove that whatever the system is,
it cannot fail to differ considerably from what we consciously and
explicitly consider as mathematics. [The Computer and the Brain,
Yale 1954 p. 82, emphases in the original]
It is most likely that attending for a while longer to the details of
the contact between modeling and experiment would keep options open
which a premature harvest of speculation would close.
I am indebted to Prof. Peter Hillman for a critical reading of an
earlier version of this manuscript and to an anonymous referee
who has helped me improve the paper very significantly.
Miyashita Y and Chang HS 1988 Neuronal correlate of
pictorial short-term memory in the primate temporal cortex,
Nature 331 68
Miyashita Y 1988 Neuronal correlate of visual associative
long-term memory in the primate temporal cortex, Nature
335 817
Sakai K and Miyashita Y 1991 Neural organization
for the long-term memory of paired associates, Nature
354 152
Amit DJ 1993 In defense of single electrode recordings,
NETWORK 3 385
McNaughton BL, Barnes CA and Andersen P 1981 Synaptic
efficacy and EPSP summation in granule cells of rat fascia dentata in
vitro, J. Neurophysiol. 46 952
Sayer RJ, Redman SJ and Andersen P 1989 Amplitude
fluctuations in small EPSPs recorded from CA1 pyramidal cells in the
guinea pig hippocampal slice, J. Neurosci. 9 840
Mason A, Nicoll A and Stratford K 1991 Synaptic
transmission between individual pyramidal neurons of the rat visual
cortex in vitro, J. Neurosci. 11 72
Tanaka K 1992 Inferotemporal cortex and higher
visual function, Current Biology 2 502
O'Keefe J and Speakman A Single unit activity in the
rat hippocampus during a spatial memory task, Exp. Brain Res.
68 1
Amit DJ, Brunel N and Tsodyks MV 1993 Correlations of
Hebbian reverberations, J. Neurosci. in press
Hebb DO 1949 The Organization of Behavior (Wiley, NY)
Hebb DO and Donderi DC 1987 Textbook of Psychology,
fourth edition (Lawrence Erlbaum Associates, Hillsdale NJ)
Lorente de Nó 1949 in Fulton JF editor,
Physiology of the Nervous System (Oxford University Press, NY)
Fuster JM 1973 Behavioral electrophysiology of the
prefrontal cortex, J. Neurophysiol. 36 61
Niki H 1974 Prefrontal unit activity during delay alternation in
the monkey, Brain Res. 68 185
Braitenberg V and Schüz A 1991
Anatomy of Cortex (Berlin: Springer-Verlag)
Anisfeld M and Knapp M 1968 Association,
synonymity, and directionality in false recognition, Journal
of Experimental Psychology 77 171
Damasio AR and Damasio H 1991 Cortical systems underlying
knowledge retrieval: evidence from human lesion studies,
Background manuscript for the Dahlem Conference on Exploring Brain
Function: Models in Neuroscience, Berlin
Hopfield JJ 1982 Neural networks and physical
systems with emergent collective computational abilities, Proc.
Natl. Acad. Sci. USA 79 2554
Amit DJ 1989 Modeling Brain Function
(Cambridge University Press, NY)
Hoffman RE 1987 Computer simulations of neural
information processing and the schizophrenia-mania dichotomy,
Archives of General Psychiatry 44 178
Abeles M, Vaadia E and Bergman H 1990 Firing patterns
of single units in the prefrontal cortex and neural-network models,
NETWORK 1 13
Virasoro AM 1988 Categorization in neural networks and
prosopagnosia, Physics Reports 184 99
Tsodyks MV and Feigel'man MV 1988
The enhanced storage capacity in neural networks with low activity
level, Europhys. Lett. 46 101
Buhmann J, Divko R and
Schulten K 1989 Associative memory with high information content,
Phys. Rev. A39 2689
Sompolinsky H 1986 Neural networks with
nonlinear synapses and static noise, Phys. Rev. A34
2571; and The theory of neural networks: The Hebb rule and
beyond, in L. van Hemmen and I. Morgenstern eds., Heidelberg
Colloquium on Glassy Dynamics (Springer-Verlag, Heidelberg)
Amit DJ and Tsodyks MV 1991 Quantitative study of
attractor neural network retrieving at low spike rates I:
Substrate -- spikes, rates and neuronal gain, NETWORK 2 259
Amit DJ and Tsodyks MV 1991 Quantitative study of
attractor neural network retrieving at low spike rates II:
Low-rate retrieval in symmetric networks, NETWORK 2 275
Griniasty M, Tsodyks MV and Amit DJ 1993 Conversion
of temporal correlations between stimuli to spatial correlations
between attractors, Neural Computation 5 1
Cugliandolo L 1994 Correlated attractors from
uncorrelated stimuli, Neural Computation 6 220
On leave from Racah Institute of Physics, Hebrew University, Jerusalem
5. Experimental evidence for Hebbian reverberations and beyond
FIGURES AVAILABLE ONLY IN HARD COPY
6. Empirical and cognitive implications
7. Toy models and realistic modeling
8. Predictive theories
9. Some provisos and defensive outlook
Acknowledgments
References