The Plausibility of a Chaotic Brain Theory
Ichiro Tsuda
Department of Mathematics, Graduate School of Science,
Hokkaido University, Sapporo, 060-0810, Japan
Abstract: We consider the significance of
high-dimensional transitory dynamics in the brain and mind. In particular, we
highlight the roles of high-dimensional chaotic dynamical systems as an ``adequate
language" (Gelfand 1989), which should possess both explanatory and
predictive power of description. We discuss the methods of description of
dynamic behavior of the brain. These methods have been adopted to capture the
averaged or deterministic complexity, and further to allow for discussion of a
new approach to capture the complexity of the deviation from such an averaged
complexity and also the complexity of interactive modes. We also give arguments
in defense of our models for dynamic memory with chaotic itinerancy and Cantor
coding. In addition, we give discussion with regard to the reality that a model
of the brain and mind should reflect.
Key words: Adequate language; hermeneutic theory;
dynamic aspects of the brain; asymmetry; feedback code; skew product
transformations; dynamic memory
R1. What would a theory of the brain be like?
R1.1 Why hermeneutics?
As Érdi correctly points out, the brain is
a hermeneutic device in the sense that it interprets the world through sensory
information processing, and for us to understand the brain, we must develop
some way of interpreting its activity. The brain not only receives and
processes external information but also creates a new ``reality" and
``actuality". According to Bin Kimura (1998; 2000), ``reality" is a
kind of sensation that can be objectively understood, while ``actuality"
is a more subjective experience based on sensation relating to action and
behavior. From the revolutionary findings made by Freeman and his colleagues
over a period of more than 40 years, we now know with certainty the
following: Animals do not directly respond to external stimuli but rather
respond to internal images they create, and animals' perception is a result of
an autonomic interpretation process. In the case of human cognition, a more
direct interpretation process must exist. In this case, interpretation is a
recursive process evolving in time that acts between pre-understanding and
perceived information. A person's pre-understanding should be altered in
accordance with perceived signals, while the perceived information will also
change in a manner that depends on the change in the pre-understanding.
In
order to understand the global function of the brain we must learn how to
interpret brain activity. Many people have attempted in the laboratory to find
a neuronal representation of information, assuming the existence of neuronal
correlates to cognition. However, it is not possible to obtain such a
representation without knowing precisely the actual effects experienced by
neurons or neural assemblies during the performance of the task in question. On
the other hand, such actual effects themselves are the object of study.
In
order to observe how an artificial brain creates a new actuality in the sense
of Kimura, Ikegami & Tani (see
also Tani, 1998) attempted to interpret with cognitive language the interaction
between perception and action manifested in the behavioral self-control of
robots. Although it is still controversial whether the interpretation
provided by Ikegami & Tani
regarding the perception and behavior of a robot with a recurrent neural network
(RNN) is plausible, we consider their study to be a step in a potentially useful
direction. The work of Quoy, Banquet
& Daucé on robot navigation control utilizing a random RNN also
represents a promising direction. Actually, these works can be viewed as
implementations of a hermeneutic theory of the brain (Blomfield and Marr 1970;
Marr 1982; Tsuda 1984; Winograd and Flores 1986; Arbib and Hesse 1986;
Tsuda 1991; Érdi 1996; Arbib, Érdi and Szentágothai 1998; Érdi and Tsuda 2001).
Raffone & van Leeuwen criticize our theory as a
holistic theory, expressing opposition to the statements in our target article
that information representation in the brain is dynamically realized as a
whole, and that the precise nature of neural elements, such as neurons and
cortical sub-areas, is irrelevant. They interpret our theory as a top-down
theory and thus as a hermeneutic theory. However, as Érdi correctly states and as we emphasized above, a hermeneutic
theory is plausible and even provides an adequate language system (Gelfand
1989) that is sufficient to express an understanding of brain function in
terms of a relation of physiological phenomena to cognitive and psychological
phenomena. A proper theory of the brain must be a hermeneutic theory. Raffone & van Leeuwen
are correct to point out what their model suggests: in early vision, the
processing of local features dominates, and the processing of global structure
comes later. This local
dominance, however, exists only under the condition of tabula rasa in the sense
of Locke. After the development of learning, the processing of global structure
can become dominant. This situation is clearly seen in the cognitive process of
inference in a certain language game that I introduced previously (Tsuda 1991;
Kaneko and Tsuda 2001) as a Shannon test. Shannon invented this game when he
estimated the information content (the number of bits) contained in one
(English) word. In this game, there are two people, A and B. In the beginning,
A has a sentence in mind, of which B has not been informed. B attempts to
determine this sentence and does so by first attempting to determine the first
word, and hence the successive letters. This is done by asking a series of
questions to which A can give only Yes and No answers. Early in the game, B can
only ask questions randomly, in a bottom-up form of processing. As B's
knowledge increases, however, a top-down form of processing that relies on B's
context-dependent judgment becomes increasingly effective.
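The shift from bottom-up to top-down questioning can be made concrete with a small calculation. The following is a minimal sketch, not part of the original description of the game: it uses the Shannon entropy as a lower bound on the average number of Yes/No questions that B needs, and the two probability lists are hypothetical numbers chosen only for illustration.

```python
import math

def expected_questions(probs):
    """Shannon entropy in bits: a lower bound on the mean number of
    Yes/No questions needed to identify one item drawn from `probs`."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical toy distributions over 8 candidate words (assumed values).
# Early in the game B has no context, so every candidate is equally likely.
no_context = [1 / 8] * 8
# Later, B's accumulated context makes one candidate far more likely.
with_context = [0.7, 0.1, 0.1, 0.05, 0.05, 0.0, 0.0, 0.0]

bottom_up = expected_questions(no_context)    # uniform guessing
top_down = expected_questions(with_context)   # context-dependent guessing

print(f"bottom-up: {bottom_up:.2f} bits, top-down: {top_down:.2f} bits")
```

With these assumed numbers the uniform case requires 3 bits per word, while the context-informed case requires about 1.46 bits, which is the sense in which top-down processing becomes increasingly effective as B's knowledge grows.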
Let us
further discuss briefly the uncertainty regarding the understanding of neural
elements. Dinse has described recent
developments in the study of dynamic receptive fields (dynamic-RFs). Population
level activity in early sensory cortices expressed with respect to coordinates
of the stimulus space has been studied, and dynamic population-RFs have been
constructed (Jancke et al. 1999). Dinse
has discussed the possibility that these two types of RFs play a functional
role, influenced by the structure and function of surrounding neural networks in
early sensory cortices. In other words, the difference between spatio-temporal
dynamics in dynamic-RFs and those in dynamic population-RFs clearly reflects
the dependence of the identity of the neural element or functional unit on the
nature of the information processing. We would also like to mention Sakurai's
series of studies on the flexible coding and decoding of cortical neurons
(Sakurai 1996; 1998; 1999) as well as the work of Arieli et al., which both Dinse and we (in the target article)
cite. These studies clearly show the dependence of neuron activity on the
activity of the system as a whole or at least of the surrounding networks on a
large scale. Sakurai found that the type of activity displayed by individual
neurons is correlated with global behavior, which was characterized according
to different tasks performed by the experimental subject. Both Kay's comment and Freeman's finding on the olfactory bulb support the existence of
such a relation, although Foster argues against it. The fundamental fact
underlying our point of view is that the identity of the neuronal element
cannot be known a priori.
In
addition to the points raised above, adult neurogenesis (Sakakibara and Okano
1997; Kempermann et al. 1997; Eriksson et al. 1998; Pincus et al. 1998) in wide
areas of the brain, in particular, in the olfactory system, the hippocampus,
and even the neocortex may support the hypothesis of the variability of neural
representation due to dynamic reorganization of neural networks. The term `as
a whole' that we use in the target article expresses the variability of the
nature of the functional unit in the sense that what acts as the `unit' may
change in a manner that depends on interrelations within the surrounding
network and also the process of functional manifestation. Therefore, the word
`holistic' is not appropriate to characterize our theory. A term like
`relationally dynamic' would be more appropriate.
R1.2 The method of description
Freeman has proposed a mesoscopic level description
in order to identify the level at which functional dynamics emerge. Through his
own studies and other evidence, he has found that this level is not that of a
single neuron, that is, the microscopic level. In conventional phase
transitions studied in physics, many degrees of freedom at a microscopic level
begin to become correlated with each other in some critical regime of the
system's parameter(s), and as a result macroscopic order that can be described
by a few degrees of freedom emerges. These few degrees of freedom are
represented by so-called order parameters. However, in some complex systems,
the dynamics of these order parameters can be complex and even chaotic, though
chaotic behavior in this case is confined to a low-dimensional attractor. In
more complex systems, the identity of the quantities that act as the order
parameters may change in space and time. In this case, the macroscopic
description loses its descriptive power. It is natural to consider an
intermediate level between the microscopic and the macroscopic levels as a
level of description where dynamically complex behavior can be captured.
Physicists call this level the `mesoscopic' level. Freeman borrows this idea. Freeman's
use of the term `mesoscopic' in the description of input-driven chaotic
behavior in the olfactory system is appropriate, because in the brain,
dynamically transient motion is generic, as Breakspear & Friston, Dinse,
Freeman, Heath, and Kay correctly
point out in their commentaries.
Several
methods have been employed in the construction of scientific theories. Before
discussing the method we propose in the modeling of the brain, we give some
general discussion. It should be noted, though, that the distinction
between methods drawn below might be controversial, and many other
ways of distinguishing such methods based on different philosophical viewpoints
are possible. However, since we believe there is a difference between the
methods of constructing theories in the study of the brain and the study of
physics, we feel that the manner of thinking we use here is useful.
One
method employed in the construction of theories is that which begins from
`first principles'. Here, by a `first principle' we mean a hypothesis or an
axiom on which a theory is based. An example of a theory obtained using this
method is the formalism of Newtonian dynamics. However, when proper
first principles cannot be identified or when a theory based on first
principles is not feasible, a `phenomenological' method is adopted. The description of fluids using the Navier-Stokes
equations is a typical example of such a method. Thermodynamics is also such a
`phenomenological' theory. By a `phenomenological' theory we here mean a
theory that is based on experiential rules. In neural systems, a useful method of
modeling that seems to be based on something very different from both a method
of first principles and a phenomenological method has been proposed. This
method consists of modeling in terms of the Hodgkin-Huxley (H-H) equations
(Hodgkin and Huxley 1952). The distinguishing characteristic of this method is
that it includes a set of inductively derived equations that explicitly include
experiential equations. We call this type of model a `semi-experiential model'.
Banerjee's model of spiking neurons
is at this level. Also, Aihara &
Ryeu have studied the chaotic neuron model constructed by Aihara et al.
(Aihara, Takabe and Toyoda 1990; see also Fig. 9 in the target article). This
chaotic neuron model is a kind of abstraction of periodically forced
Hodgkin-Huxley equations.
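To give a concrete feel for this kind of abstraction, the following minimal sketch iterates a one-dimensional map of the form often quoted for a single chaotic neuron, y(t+1) = k y(t) - alpha f(y(t)) + a, with a steep sigmoid f. The exact form and all parameter values here are assumptions made for illustration and are not taken from the target article; the function names are hypothetical.

```python
import math

def sigmoid(y, eps):
    """Steep sigmoidal output function f(y) = 1 / (1 + exp(-y / eps))."""
    return 1.0 / (1.0 + math.exp(-y / eps))

def chaotic_neuron(y0, k, alpha, a, eps, steps):
    """Iterate y(t+1) = k*y(t) - alpha*f(y(t)) + a and return all internal states.

    k     : decay factor of the internal state (0 < k < 1)
    alpha : strength of the refractory (self-inhibitory) feedback
    a     : constant external bias
    """
    ys = [y0]
    for _ in range(steps):
        y = ys[-1]
        ys.append(k * y - alpha * sigmoid(y, eps) + a)
    return ys

# Illustrative parameter values (assumed, not from the source).
trajectory = chaotic_neuron(y0=0.1, k=0.7, alpha=1.0, a=0.5, eps=0.02, steps=200)
outputs = [sigmoid(y, 0.02) for y in trajectory]  # graded output in [0, 1]
```

Scanning the bias `a` and plotting the long-time values of `trajectory` is one way to explore which regimes of such a map are periodic and which are irregular; the sketch makes no claim about the regime for the values used above.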
It is
probably true that, among models which possess a physiological background, no
generic model other than the H-H equations has been found to date. Freeman's
population model simulates many types of dynamic behavior in the olfactory
system, as described by Freeman and Kozma in their commentary. Through
studies of population models like the KIII model, one can extract the essence
of the dynamics that might be embedded in various areas of the brain. Whether
this type of model can be directly applied also to cortical systems other than
the olfactory system is still unclear. Therefore, it is still not known if we
can use a KIII-type model as a generic model applicable to all areas of the
brain. Nevertheless, we believe that population models, rather than types of
models such as coupled H-H equations, are more suited to describe (macroscopic)
functional manifestations, such as perception and cognition. It is important to
determine the proper variable at a mesoscopic level which can be used to make a
bridge between the physiological level and the psychological level. In other
words, it is important to determine the `adequate language'. It should be noted
that an electric potential or a sequence of impulses as such cannot be
considered a proper variable for the description of cognition.
Instead
of a direct use of a KIII-type model, we have considered another mesoscopic
description in the target article. This description is based on the realization
of self-organization in memory through chaotic dynamics.
We
have also investigated a general method of study for complex dynamical systems,
which will, we believe, provide a high-power description also for the study of
the brain and mind (see also Section R2).
The basic steps of this method are as follows (Kaneko and Tsuda 2001):
(a)
find structural changes of complex behavior from both static and dynamic
viewpoints by means of dissecting phase space;
(b)
find universality, reconstructing structures and relationships, immanent in
various types of complex phenomena;
(c)
construct an artificial system, based on the fundamental conceptual elements of
the universal properties;
(d)
construct a model that describes both top and bottom levels from an
intermediate level which is neither macroscopic nor microscopic;
(e)
construct an adequate language system sufficient to describe complex systems, based
on a mathematical theory for treating the complex dynamics and processes;
(f)
acquire new intuition by formulating counter-intuitive situations and by
observing the simulated variety of complex phenomena.
In the
above described procedure, the method of modeling mentioned in (d) corresponds
to the mesoscopic description discussed by Dinse
and Freeman. The present attempt to
construct a theory of the brain and the modeling given in the target article
represent a realization of this generic method. The above steps (e) and (f) are
deeply related to hermeneutics. It is important to realize the existence of a
dual purpose regarding the predictive and the explanatory power of a model
(Gernert 1998). The above described generic method possesses such a duality.
This dual purpose, proposed by Gernert, is also alluded to by Ikegami & Tani. They correctly assert the importance of
dynamical systems as a tool or descriptive language, which is thought to
possess a stronger descriptive power than natural language. We agree with Ikegami & Tani, in particular with
regard to the point that high-dimensional dynamical systems including IFS may
provide an appropriate descriptive language for the brain. Certainly
physiological terminology itself cannot be considered as possessing explanatory
power regarding cognitive phenomena, as psychological terminology itself cannot
be considered as possessing predictive power regarding the physiological
phenomena. We wish to obtain a third language, a language system that is
capable of thoroughly describing brain and mind. At this time, of course, our
chaotic dynamical systems terminology is still too primitive to realize such a goal.
Quoy, Banquet & Daucé suggest a similar perspective,
inquiring about the standpoint of chaos theory concerning behaviorism
(stimulus-response) and cognitivism (mental representation). For the modeling
of animal experiments involving higher functions, most of which consist of a
type of stimulus-response, we have attempted to interpret the internal states
of the brain as mathematical functions or distributions by observing the
stimulus-response relations. We employ high-dimensional chaotic dynamical systems
for this inferred internal representation. In order to decrease the ambiguity
of an interpretation of this type of animal experiment, we have constructed
(Tsuda and Hatakeyama 2001; Hatakeyama and Tsuda 2001) a formal theory of the
structure of task-related functional manifestation. We have applied this theory
to a series of experiments conducted by Sakagami et al. (Sakagami and Niki
1994; Sakagami and Tsutsui 1999). The theory was able to predict all possible
types of discrimination of stimuli and conditions which can be represented by
neurons found in the prefrontal cortex. We believe that chaotic dynamical
systems can be used to represent the neural correlates of cognitive processes
that can be detected by mesoscopic level measurements, such as fMRI, optical
recordings, and (local) electroencephalography. If a neuronal dynamical system
possesses point attractors and limit cycles only, this neural system lacks
adaptability to varying environments. Thus it will inevitably become
destabilized if we attempt to use it to model a full range of animal behavior.
The emergent dynamics capable of providing this adaptability should possess a
moderate stability that may be a global stability. Our idea is to use chaotic
itinerancy as a means of guaranteeing both stability and adaptability.
R2. Reality of the model
R2.1 Falsifiability of the theory
In our
attempt to find a new method of understanding the brain and mind at a
mesoscopic level, we face several difficult problems. First, a hermeneutic
theory seems to lack the falsifiability property demanded by Popper as a
minimum condition which a scientific theory must possess, since this theory can
develop self-consistently through the evolution of the pre-understanding,
allowing for a self-consistent interpretation to be reached. A hermeneutic
theory consists of components, each of which could itself consist of a
quantitative model and its predictions, and these, rather than the theory as a
whole, can possess falsifiability. From another standpoint, we could construct
a theory -- something that could be termed a `qualitative model' -- for the
purpose of providing a plausible story of neurons or neuron assemblies. Such a
theory should be constructed to allow us to carry out a more proper and deeper
understanding of the brain and mind. Thus, as Rowe & Wright state,
a quantitative model like theirs can lead to elemental models supplying such a
qualitative theory. For example, the PDE and coupled ODEs which constitute an
elemental level of Wright's theory (2000) of the brain activity and its
cognitive function can possess falsifiability, while the whole theory can be
justified as a hermeneutic theory.
In
biology, it has been asserted that the correspondence between structure and
function is crucial (Szentágothai and Érdi 1989; Li and Hopfield 1989).
Following this assertion, we have attempted to construct a structure-based
model of biological function. Actually, we constructed a skeletal model, based
on anatomical data that were collected in detailed studies over thirty years by
Szentágothai. In the modeling, we hypothesized that a type of structure that is
common to various areas possesses a common function, and a structure specific
to a given area possesses a specific function (see also Li and Hopfield 1989).
Both models presented in the target article were constructed according to such
a principle, and thus these models are examples of a kind of skeletal model.
Heath has proposed a cognitive model consisting of
dynamic neural networks, which could possess a predictive power. He also
discusses the possibility of converting the principles given in Section 3.5 of
the target article into predictions that can be verified.
One of
the characteristics observed in chaotically itinerant behavior is a long time
tail of the time-dependent mutual information. This tail often exhibits an
algebraic damping. This characteristic exists even when noise is present,
because the frequency of stagnant motion in the vicinity of an attractor ruin
cannot be decreased by such a perturbation. The presence of a long time tail
indicates the presence of recurrence of similar dynamic behavior in the
evolution of the system, and hence it may provide a mechanism capable of
producing short-term memory, like working memory. Nicolis and Tsuda (1985) proposed
a feasible mechanism for the magic number seven plus or minus two, based on chaotic dynamics
with large fluctuations, and further demonstrated (Nicolis and Tsuda 1989) that
these long-range correlations may lead to a universal power law known in the
study of natural languages as the Zipf law. The presence of recurrence of
similar dynamic behavior can work effectively when an episode is embedded in
the CA1 region by the use of Cantor coding. During a period of approximately
100-200 msec cortical-hippocampal loop time, only a few events in an episode
will be able to be embedded in a Cantor set. This loop time would not be
sufficient for the transformation of an episode from a short-term memory to a
long-term memory. Some kind of recursive dynamic behavior may facilitate this
type of transformation.
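The time-dependent mutual information mentioned above can be estimated from any symbolic time series. The following is a minimal sketch of a plug-in estimator, assuming the dynamics have already been coarse-grained into discrete symbols; the function name and the two test sequences are hypothetical and serve only to check the estimator, not to reproduce chaotic itinerancy.

```python
import math
from collections import Counter

def delayed_mutual_information(symbols, lag):
    """Plug-in estimate (in bits) of I(s(t); s(t + lag)) for lag >= 1."""
    pairs = list(zip(symbols[:-lag], symbols[lag:]))
    n = len(pairs)
    joint = Counter(pairs)                  # counts of (s(t), s(t+lag))
    left = Counter(a for a, _ in pairs)     # counts of s(t)
    right = Counter(b for _, b in pairs)    # counts of s(t+lag)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (left[a] * right[b]))
    return mi

periodic = [0, 1] * 50   # perfectly predictable sequence
constant = [0] * 100     # carries no information at any lag

mi_periodic = delayed_mutual_information(periodic, 1)
mi_constant = delayed_mutual_information(constant, 1)
```

Applied to a symbolized trajectory over a range of lags, a long time tail would appear as a slow, power-law-like decay of the estimate with increasing lag rather than a rapid exponential drop.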
At
this point, we must consider the fact that a memory is not independent of its
cortical context, as Heath points
out. Therefore, taking into account context cues in studying the process of
memory dynamics and formation is essential. It is certain that our present
model lacks this feature. Although the significance of context cues has been
emphasized by many authors, no mathematical model that is capable of
incorporating them has yet been proposed, as far as we know. We believe that
such context cues are input into other lower cortical areas in the more
abstract form of codes rather than raw sensory information. Usually, this input
corresponds to a feedback signal. For proper modeling, cortical neural activity
representing codes must differ from that representing raw sensory information.
In our hippocampal model, CA3 activity consists of chaotic itinerancy, but CA1
activity does not. This is because Cantor coding is carried out in the cross
section on which CA3 chaotic activity is constant. Coding hierarchy is
generally limited only by the nonlinearity of the chaos that provides a grammar
of chaotic motion. This limitation can be observed in the present model (see
also Aihara & Ryeu). If code
signals strongly affect the chaotic behavior in CA3, the Cantor coding will be
fragile, and this calls into question its realism, as pointed out by Érdi, Freeman and Kay in the case with feedback. We believe that the feedback signal
is generally different in quality from the feedforward signal. Thus we doubt
that the feedforward and the feedback connections can be thoroughly described
in the same form as in coupled oscillators.
It
would be very useful for further development of the study of dynamic memory to
identify those features of our model with stochastic renewal and Heath's model
with chaos control that are similar and those features that are different,
since they have a similar structure of the interacting `modules' (see also
Heath 2000).
R2.2 Could inputs and modifiable synapses be a bifurcation parameter?
The
brain is an open system in both energetic and informatic senses. With respect
to energy, the brain is a far-from-equilibrium system, since it is maintained
in a high energy state by the influx and outflux of energy and matter. With
respect to information, the brain receives external information and dictates
action on the environment in response to this information. However, contrary to
Banerjee's claim, we assert that
such inputs should not take the form of bifurcation parameters.
Banerjee's observation concerning transitory dynamics
made with regard to his treatment of spiking neurons is correct. This
observation is that the attractor created in any cortical `column' is
continually influenced by neighboring `columns', subcortical areas, and the
environment. For this reason, this attractor changes or disappears, and a new
attractor is created. This is the nature of the transitory dynamics
characterizing the system. Through this observation, Banerjee studied these
transitory dynamics using a treatment in which the inputs are represented by a
bifurcation parameter. Quoy, Banquet
& Daucé also consider inputs as bifurcation parameters. Although we
appreciate the models proposed by Banerjee and by Quoy et al., we are skeptical
of their assumption that inputs can be treated as bifurcation parameters.
We now
consider the situation in which a system receives inputs from other systems. In
the case that invariant sets like attractors are present in phase space, the
change of such invariant sets in parameter space can be described as a
bifurcation. To treat an input as a bifurcation parameter is equivalent to
assuming the presence of such invariant sets, and hence this treatment becomes
feasible only when the inputs change extremely slowly, compared with the
system's dynamics. This is not a valid assumption in a dynamic system like the
brain or any of its subsystems, in which the input varies rapidly. It is crucial
for understanding a brain of this nature to investigate it as a dynamical system
influenced by varying inputs that may be produced by other dynamical systems
either with or without noisy perturbations. By viewing inputs as originating
from other systems, rather than as bifurcation parameters internal to the
system, the dynamic behavior of the model can be better characterized.
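The dependence of this argument on time scales can be illustrated with a deliberately simple example. The sketch below is an assumption-laden toy, not one of the models discussed here: a linear relaxation dx/dt = -(x - u(t)) is driven by a sinusoidal input u(t) = sin(omega t), and the state is compared with the fixed point x* = u that would be obtained by freezing the input and treating it as a parameter.

```python
import math

def tracking_error(omega, total_time=200.0, dt=0.01, discard=10.0):
    """RMS distance between the state x(t) and the frozen-input fixed point u(t).

    Integrates dx/dt = -(x - u(t)), u(t) = sin(omega * t), with the Euler method.
    The first `discard` time units are skipped to remove the initial transient.
    """
    x = 0.0
    sq_sum, count = 0.0, 0
    steps = int(total_time / dt)
    for i in range(steps):
        t = i * dt
        u = math.sin(omega * t)
        if t >= discard:
            sq_sum += (x - u) ** 2
            count += 1
        x += dt * (-(x - u))   # Euler step
    return math.sqrt(sq_sum / count)

slow_error = tracking_error(omega=0.01)  # input much slower than the relaxation
fast_error = tracking_error(omega=5.0)   # input faster than the relaxation
```

When the input is slow, the state stays close to the instantaneous fixed point and the parameter picture is a good approximation; when the input varies on the time scale of the system or faster, the state never settles onto the frozen-input invariant set, and the error becomes comparable to the input amplitude.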
When
considering inputs as variables controlled by other systems, rather than as
bifurcation parameters, a total system can be viewed as an IFS. Pollack (1991)
used recurrent neural networks for the system under study and incorporated the
external world in the form of varying inputs. In a dynamic memory model, we
used recurrent neural networks with inhibitory neurons for the model brain
system and modeled probabilistic synapses as varying inputs. This model appears
to contain a Hopfield spin-glass-like model, since it is essentially reduced to
a Hopfield net when the probability characterizing these synapses approaches
the inverse of the system size. Even in the neighborhood of this value, our net
is dynamically equivalent to the Hopfield net, as Banerjee points out. Despite this fact, there are dynamics
embedded in our model that are essentially different from any dynamics
exhibited by the Hopfield net. In particular, our model exhibits a chaotic
transition between far-from-equilibrium quasi-stationary states. (It should be
noted that this is not a transition from an equilibrium state.) When the
probability is increased to a certain value, chaotic itinerancy appears. In our chaos-driven
contracting system, the model brain is a stable network and varying inputs are
provided by a chaotic dynamical system which exhibits high-dimensional chaotic
itinerancy or low-dimensional chaos with a restricted grammar. This grammar is
restricted in the sense that it possesses forbidden symbol sequences. We regard
this as a skeletal mathematical model of the olfactory system and the
hippocampus. Other studies in which inputs are treated as variables controlled
by other dynamical systems have recently been published by Gohara and colleagues
(Gohara and Okuyama 1999; Gohara and Okuyama 1999; Gohara et al. 2000; Yamamoto
and Gohara 2000; Sato and Gohara 2000).
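The idea of a chaos-driven contracting system viewed as an IFS can be sketched in one dimension. The example below is a toy under stated assumptions, not the hippocampal or olfactory model itself: the logistic map stands in for the chaotic driver, its trajectory is reduced to binary symbols, and each symbol selects one of two affine contractions with ratio 1/3. All function names are hypothetical.

```python
def logistic_symbols(z0, n):
    """Binary symbol sequence from the logistic map z -> 4z(1 - z) (assumed driver)."""
    z, out = z0, []
    for _ in range(n):
        z = 4.0 * z * (1.0 - z)
        out.append(1 if z > 0.5 else 0)
    return out

def drive(symbols, x0=0.5, a=1 / 3):
    """Apply the contraction x -> a*x + (1 - a)*s for each input symbol s."""
    x = x0
    for s in symbols:
        x = a * x + (1 - a) * s
    return x

def decode(x, depth, a=1 / 3):
    """Read back the last `depth` input symbols from the state, most recent first."""
    recovered = []
    for _ in range(depth):
        s = 1 if x > 0.5 else 0        # which third of [0, 1] the state lies in
        recovered.append(s)
        x = (x - (1 - a) * s) / a      # invert the contraction that was applied
    return recovered

symbols = logistic_symbols(0.3, 30)
state = drive(symbols)
recent_history = decode(state, 8)
```

The states reached by the driven system lie on a middle-third Cantor set, and the position of `state` within the hierarchy of subintervals encodes the recent input history: the most recent symbol fixes the coarsest interval, the one before it the next level, and so on. If the driver has forbidden symbol sequences, the corresponding subintervals are never visited.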
As an
important factor other than inputs that can influence the system, Banerjee discusses synaptic modulation.
For a reason similar to that in the case of inputs, it is not feasible to model
this as a bifurcation parameter when one wishes to understand the mechanism of
nonstationary and itinerant behavior. Only if one tries to understand the
system's dynamics as consisting of the change undergone by invariant sets can
inputs and synaptic modulation be treated as bifurcation parameters.
R2.3
Landscape lacks a reality
Freeman claims that attractor landscapes in the
olfactory system are recreated in each inhalation period. Freeman identified this recreation as the olfactory flexibility. He
criticizes our model as being too rigid and not allowing for the change of
phase space structure. A similar criticism is also given by Ikegami & Tani. However, despite
this criticism, we assert that our memory dynamics model does indeed exhibit
structural change of phase space under Hebbian learning. With regard to
this point, Quoy, Banquet & Daucé
questioned how the dynamic landscape changes via Hebbian learning. We now
briefly explain this process. The learning of new patterns alters the
transition of memories in such a way that new memories are incorporated into a
sequence of memories which appear dynamically to display chaotic itinerancy. In
this way, a new sequence of memories is created.
Hebbian
learning classifies the closeness of input patterns in the following way. In a
conventional associative memory model, a new input pattern is placed within the
basin of a certain attractor. Here, some attractors represent memories
and others are parasitic. If that pattern is learned, the basin structure
becomes complex due to the formation of a new basin (see also Amari and Maginu
1988). In our dynamic memory model, there is no conventional (geometric)
attractor, and hence no conventional basin is present. In place of such a
basin, at least one hole is present, which links each memory representation to
all the others. We have not obtained a mathematical proof of whether riddled
basins appear in the present model. We have not found symmetry in our model
like that which Breakspear & Friston,
and Rowe have discovered. It is,
however, certain that a similar structure to that of riddled basins has been
observed in numerical studies. This is because, when transitions between
memories are allowed, Hebbian learning also acts along the transition paths;
that is, the transition paths are also reinforced. The closeness of input
patterns can thus be defined in terms of temporal order, since the transition
occurs between patterns with a large overlap. In this way, as more patterns are
learned, the phase space structure becomes more complex.
In the
manner of Freeman, here we would
like to use a metaphor. Imagine we are observing a stream at a fixed position.
Then, we always observe different water molecules at each time, even in
Escher's waterfall chain. For this reason, we cannot find invariance at such a
level. On the other hand, a river possesses certain structures at different
levels -- from mesoscopic to macroscopic -- as a flow of water. We may be able
to find some invariance at such levels and we may recognize universality within
the continually changing behavior. In contrast to Freeman's intended
demonstration in his allusion to Escher, we think Escher's chain demonstrates a
method of representing the creation of new `quality', even though the structure
appears static. Geometric impossibility embedded into this static structure
forces us to change the viewpoint from which we consider the picture and
enables us to find new `quality' hidden in the structure; for instance, the
waterfall chain may provide us with a hint about the four-dimensional `qualia'
of the scene. New `qualia' at a mesoscopic or a macroscopic level, which is not
manifested at a microscopic level, might be created in the same way, as Dinse discussed.
Freeman and Quoy, Banquet & Daucé both use the term ``landscape": Freeman in
the expression ``attractor landscape" and Quoy, Banquet & Daucé in the
expression ``dynamic landscape". We think the use of this word is
misleading with regard to both our model and the KIII model, and maybe also
with regard to other far-from-equilibrium dynamic systems. Concerning this
point, we give the following discussion from the general theory of nonlinear
dynamical systems (Kaneko and Tsuda 2001). If there are at least two extremely
different time scales characterizing the system in question, then the system's
behavior can be described by dynamics on a static landscape and its dynamic
modulation. Here, the landscape can appear to be a rugged landscape. Such an
extreme separation of time scales is often observed in nonlinear systems.
However, no evidence for such a separation of time scales has been found in a
very flexible system such as the olfactory system that Freeman studies. In such
flexible systems, the ``landscape" cannot exist. Thus the statement that the
``landscape is recreated" is misleading. If we interpret Freeman's intention
correctly, we may be able to describe this as something like the ``epigenetic
landscape" proposed by Waddington. However,
this cannot be described by anything that could be considered a
``landscape". (Although, when considering the dynamic behavior of the entire
process of development, it might be possible a posteriori to account for this
development in terms of a landscape.) Describing a system with a landscape is
inconsistent with the flexibility of the system, and the concept of a landscape
does not apply to the flexible brain.
R2.4
Action is contained implicitly in probability terms
Kay points out that our model lacks an action
term, and for this reason she suggests that we introduce a somatosensory
system. She asserts that by doing this, interfacial dynamics may emerge. This
is closely related to the causality problem considered by Ikegami & Tani, context cues considered by Heath, and falsifiability considered by Rowe & Wright. Kay
is correct that our model does not explicitly contain an action term. However,
the model implicitly contains such a term. A typical mathematical model in
which an action term exists implicitly is given by Samuel Karlin (Karlin 1953).
He formalized the situation in which a living system with an internal state,
expressed by a variable x, must decide at a certain time on a certain action i
among many possibilities. Let p_i(x) be the probability of choosing an action i,
where x is the state of the living system. We consider the state of the system
to be determined by a dynamical system. As the result of the choice of an
action, the state must change in accordance with this action. Thus the state
evolves as a parametrized dynamical system, F_{p_i(x)}(x). Since p_i(x) depends
on the state x, a change in the state causes the probability for the choice of
the succeeding action to change also. Karlin's formulation gives the first
example of an IFS.
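As a minimal sketch of such a state-dependent scheme, the following code iterates two contractions on the unit interval, choosing between them with a probability that depends on the current state. The particular maps F and the probability function p1 are illustrative assumptions and are not taken from Karlin's paper or from our model. With this choice, every state visited lies outside the open middle third of the interval, which is the first stage in the construction of a Cantor set.

```python
import random

def F(i, x):
    """Two hypothetical contractions on [0, 1]; the chosen action i selects one."""
    return x / 3.0 if i == 0 else x / 3.0 + 2.0 / 3.0

def p1(x):
    """Hypothetical state-dependent probability of choosing action 1."""
    return 0.25 + 0.5 * x

def run(x0, steps, seed=0):
    """Iterate the state; each step draws an action with probability p1(state)."""
    rng = random.Random(seed)
    x, orbit, actions = x0, [], []
    for _ in range(steps):
        i = 1 if rng.random() < p1(x) else 0  # the decision depends on the state
        x = F(i, x)                            # the state changes with the action
        actions.append(i)
        orbit.append(x)
    return orbit, actions

orbit, actions = run(0.5, 1000)
```

Because the new state determines the next probability, the action feeds back on the subsequent choice without any explicit action variable appearing in the state equation.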
The
system described above exhibits a stochastic renewal of dynamics, since the
dynamical system governing the development of the state depends on the action
chosen. If the probability function for the choice of the action is described
by a certain chaotic dynamical system, this type of decision making can be
described by a skew product transformation. In this case, the feedback effect
of the action on the state of the system is implicitly taken into account. We
believe that the feedback from the environment as influenced by the system's action
is thus implicit. As stated in Section R2.1, this framework yields
`coupled' systems with characteristics that differ from those typically seen in
what Breakspear & Friston
present as symmetrically coupled nonlinear oscillators. There may be a level at
which brain activity can be described by coupled nonlinear oscillators, but it
is doubtful that a symmetric coupling system would be useful in the modeling of
actual brain activity. In general, the forward connections in the brain are
related to sensory information processing, while the backward connections are
related to the context, that is, the intention, motivation, situation,
condition, etc. The context may appear to be a cue code for sensory
information. The key factor is the existence of a type of `connection'. In the
brain, feedforward and feedback connections differ in type. For
this reason, it is important to study the effects of skew product
transformations.
Because
the chaotic behavior found in the olfactory bulb (OB) is caused by the feedback
connections from the prepyriform cortex (PPC) possessing contraction dynamics,
the presence of physical coupling is likely, as Kay mentions. We would like to know what the feedback is in such a
case. Damped oscillations are enhanced and then become chaotic in the OB.
According to Freeman, this happens only in a motivated condition, such as the
hunger state of an animal. Hence, the feedback to the OB is a motivational
signal. This situation of the `coupling' can be realized in the following
dichotomy. The input-output function of the OB is chosen to be F1 in the
presence of motivation, and chosen to be F2 in the absence of motivation, where
chaos is assumed not to exist. Then, the main dynamics in the PPC appear as the
process of chaos-driven contraction dynamics.
Taking
into account the stimulus-induced stochastic release of synaptic vesicles,
whose physiological significance is correctly pointed out by Liljenström, contrary to the claims of Freeman and Breakspear & Friston, we considered the metaphor of `neuronal
decision making'. One can extend the present model to include the state
dependence of the probabilities for the choice of action. This is a topic for
future study. Karlin investigated ergodicity and the convergence of the
distribution, assuming a simple form for the state dependence of the
probabilities, and showed as a special case that the limiting distribution is a
singular distribution on the Cantor set. Later, Norman (1968) demonstrated a
convergence theorem in stochastic learning models. Bressloff and Stark applied
Norman's idea to the dynamics and learning in neural networks in a series of
works (Stark 1991; Bressloff and Stark 1992). Thus, our model can be viewed as
a model of action-driven (though yet uniform) dynamic memory and perception.
R2.5
What is the relation between the model and reality?
One
common type of criticism was made by Foster,
among others. Essentially, this criticism is that the theory is mathematical,
but neither psychological nor physiological. This is why we present our theory
for a dynamic brain from a different viewpoint. As we emphasized above,
especially in Section R1, it is
important to seriously consider the levels of a model. Most commentators
neglect this point. Modeling from an overly physiological point of view results
in a theory that lacks explanatory power for cognition, and modeling from an
overly psychological point of view results in a theory that lacks predictive
power for the mechanism of cognitive processes that should be related to brain
activity, as long as we consider the mind to be a physical phenomenon. At a
certain time in scientific history, those in the field of artificial
intelligence neglected brain activity, especially neurophysiological facts.
Perhaps they believed that the physiological nature of the brain need not be
studied for a full understanding of cognition. On the other hand, people who
have studied neural network models have tended to neglect symbol manipulation.
Perhaps they did/do not realize how something expressed symbolically could
possess a neurophysiological basis. Then, the connectionist approach proposed
neural networks that can treat symbol manipulation through its dynamics. This
was epoch-making. However, it seems that connectionists have not yet found an
adequate language system, whose importance we emphasized in Section R1.
Given that most approaches do not provide an adequate language system to
bridge the psychological and physiological levels in understanding the brain
and mind, we have chosen a mathematically interpretative direction
of study. In particular, we have chosen in this article a high-dimensional
chaotic dynamical system as one possible explanatory and predictive language.
Recently,
psycho-physiological experiments have been conducted on various areas of the
brain. In these experiments, a cognitive task is performed by an animal or a
person, and while it is being performed the activity of neurons or neural
assemblies is monitored. Then, neural correlates are investigated. This
represents a promising direction of study, but has the serious weak point that
neural correlates must be interpreted in terms of natural language, taking into
account the meaning of the task and the neural activity. Moreover, there might
be an `experimenter effect'. This is not surprising, since the object of
experiment is a very complex system.
We have proposed a formal mathematical theory to analyze the tasks themselves
that are performed in these experiments (Tsuda and Hatakeyama 2001; Hatakeyama and Tsuda
2001). We are now studying the establishment of a formalism for such
experiments and attempting to construct a method of extracting the immanent
chaotic dynamics of neural systems exhibiting cognition. We point out that
Descartes' principles of thoughts (Descartes 1701) should still be useful in
our attempt to gain a deeper understanding of the brain and mind.
The Lorenz model for unpredictable and nonperiodic atmospheric motion is also
relevant to the present discussion (Lorenz 1963; 1991). Following Saltzman's
observation (Saltzman 1962), Lorenz derived three-dimensional ordinary
differential equations for the purpose of describing atmospheric circulations,
and he found chaotic motion resulting from the instability of convective
solutions. However, the chaotic motion he found, which is called Lorenz chaos,
has never been observed in real atmospheric motion. Apparently, therefore, his
model does not simulate the real turbulent motion of the atmosphere. Why, then,
did the Lorenz model have such a strong scientific impact (much stronger than
that of the conjectured `butterfly effect', which alleges that a butterfly flapping
its wings in China can drastically change the weather in New York)? It should
be noted that this impact stems neither from the falsifiability nor from the
provability of this model. In fact, this impact is not due to the ability of
this model to correctly simulate physical phenomena. Rather, this impact is due
to the fact that his chaotic model displays the essence of atmospheric motion,
its immanent chaotic dynamics. A similar type of modeling is seen in Kaneko's
series of studies of complex phenomena in terms of coupled map lattices (CML)
and globally coupled maps (GCM) (Kaneko and Tsuda, 2001 and references cited
therein). We believe that this way of capturing certain features of reality (or
it might be better to use the term ``actuality" in place of ``reality",
according to Bin Kimura), some of which may be hidden but can emerge
in observation with an adequate language, is effective and possesses
explanatory and predictive power at a level that differs from that of
physiologically realistic models, like the KIII model that Kozma recently
developed. The underlying important point in this discussion is that we believe
there is strong evidence that chaotic dynamics exist in living brains, as
Liljenström, Mandell & Selz, and Rowe & Wright have suggested.
Given
the present situation with regard to a theory, Liljenström's suggestion that the mechanism of emergent properties
should be discriminated from observed behavior itself is crucial for maintaining
the reliability of theory. If an effect of macroscopic activity on activity at
the cell level and/or molecular level emerges, through the mechanism of
macroscopically emergent properties, a qualitative theory could be directly
tested in the laboratory. As Molnár
suggests, the discovery of an unbiased method to describe the potential
functional significance of high-dimensional chaotic or stochastic behavior will
help to further the development of a qualitative theory.
R3.
Poor man's chaotic itinerancy and chaotic code
R3.1
Mechanism of chaotic itinerancy
Many
commentators have reported dynamic behavior similar to chaotic itinerancy (CI).
(Rowe supplies many references on
chaotic dynamical systems which generate phenomena similar to chaotic itinerancy.
Komuro has investigated a possible mechanism of CI in some mathematical
framework (Komuro 1998; 1999).) We have described CI as chaotic transition
dynamics resulting from a weak instability of Milnor-type attractors, that is,
a chaotic transition among attractor ruins; before such an instability
arises, a certain complex phase space structure similar to a riddled basin
appears. Érdi, Breakspear & Friston and
Rowe inquired about the structural
conditions of the emergence of CI. Breakspear
& Friston particularly emphasize the significance of symmetry in the
emergence of Milnor attractors and a riddled basin. (They corrected our
citation of works on the riddled basin. As they point out, the first paper on
the riddled basin is that of Alexander et al., 1992. The paper by Grebogi et
al., 1987, which we cite in the target article, is concerned with fractal basin
boundaries multi-dimensionally intertwined on arbitrarily fine scales.) Since
symmetrically coupled systems like globally coupled maps (GCM) possess certain
symmetries, such systems have been studied thoroughly. As Breakspear
& Friston point out, studies of the Milnor attractor have been carried
out most actively in the context of symmetrical systems. Typical studies of
this kind are reported in a series of works by Ashwin and his colleagues
(Ashwin 2000).
However, as Kaneko showed (1998), symmetrical coupling is not a necessary
condition for the emergence of Milnor attractors, since they also appear in GCM
systems without such symmetry.
Let us assume that a dynamical system f: M -> M, where M is the phase space,
commutes with a certain group action q: M -> M on M; that is, fq = qf. Let S(q)
be the set of points fixed by the action q: S(q) = {x | qx = x}. Then S(q) is
invariant under f, that is, f(S(q)) is contained in S(q), because for any x in
S(q) we have qx = x and hence q(fx) = f(qx) = f(x). When a dynamical system
possesses this type of symmetry, its
effective dimensionality can be drastically reduced, and as a result the
detailed structure of its invariant sets can be investigated. In this respect,
the assertion concerning symmetry made by Breakspear
& Friston is very relevant with regard to the mechanism responsible for
Milnor attractors and riddled basins. However, such symmetrical systems are not
characteristic of the brain, as networks of neurons in the brain are
asymmetrically coupled. Nevertheless, it is interesting to consider what type
of symmetry could be present in our asymmetrically coupled neural network and
how this symmetry, if it exists, could affect the potential invariant sets.
Also, we note that the noise effect is crucial in CI-like
transitions, since neural systems in the brain exist in a noisy environment. As
Rowe points out, it is important to
note that depending on the type of Milnor attractor in question, the stability
with respect to noise differs. In relation to this, it should be noted that
noise can induce basin riddling even after a blowout bifurcation, that is, even
in the presence of a transversely positive Lyapunov exponent (Lai and Grebogi
1996).
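The invariance of the fixed-point set under a symmetric map, as used in the argument above, can be checked numerically in a small example. The sketch below uses a globally coupled logistic map with four elements and takes the exchange of two elements as the group action; the parameter values a and eps and the initial conditions are arbitrary illustrative assumptions. It verifies that the map commutes with the exchange and that a state in which the two exchanged elements are equal keeps this property under iteration.

```python
def gcm_step(x, a=1.8, eps=0.1):
    """One step of a globally coupled logistic map (illustrative parameters)."""
    local = [1.0 - a * v * v for v in x]
    mean = sum(local) / len(local)
    return [(1.0 - eps) * v + eps * mean for v in local]

def swap(x, i=0, j=1):
    """Group action q: exchange elements i and j."""
    y = list(x)
    y[i], y[j] = y[j], y[i]
    return y

def close(u, v, tol=1e-12):
    """Componentwise comparison with a small tolerance."""
    return all(abs(p - q) <= tol for p, q in zip(u, v))

# The map commutes with the exchange: f(qx) = q(fx).
x = [0.3, -0.2, 0.7, 0.1]
commutes = close(gcm_step(swap(x)), swap(gcm_step(x)))

# A state fixed by q (first two elements equal) stays in the fixed set.
s = [0.4, 0.4, -0.5, 0.9]
stays_in_fixed_set = True
for _ in range(50):
    s = gcm_step(s)
    stays_in_fixed_set = stays_in_fixed_set and s[0] == s[1]
```

An asymmetrically coupled network would in general fail the first check, which is why the question of residual symmetry in such networks is not trivial.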
Feudel
et al found a CI-like phenomenon in the double rotor system with small
amplitude noise (Feudel et al 1998). In this system many periodic orbits
coexist. Among these, the higher periodic orbits possess very tiny basins which
disappear under the influence of noise, leaving only the low periodic orbits.
This situation is similar to that which Kozma and Freeman found in the KIII model.
Due to fractal basin boundaries, long chaotic transients appear before the
system falls into a periodic orbit. Orbits are trapped for some time in the
vicinity of periodic attractors, but eventually are kicked by noise into the
fractal boundary region.
Figure 5 in the target article shows the presence of the simplest Milnor
attractor and also presents a model describing our simulation results, namely
empirically determined quasi-one-dimensional return maps. Mandell
& Selz treat the situation shown in Fig. 5 in the target article as a
bifurcation point of tangent bifurcations. In this treatment, for a parameter a,
in the case a < a_c, where a_c is a bifurcation point, there exists a pair of
stable and unstable fixed points (this resembles a saddle-node pair), and for
a > a_c no fixed points exist and chaotic behavior appears, so that the system
at a = a_c is structurally unstable. This is not the case we consider. In our
case, this
one-dimensional map representation is a projection of high-dimensional
dynamics. All fixed points, each representing a different memory, are reduced
to two critical points. Furthermore, in our dynamic memory model, this critical
situation is robust with respect to changes of the system's parameters, such as
the strength of synaptic connections, the steepness of the input-output
function of the neurons, and assigned probabilities, within the regions in which
chaotic itinerancy occurs. We have found evidence through network simulations
that suggests the possibility of such a critical system becoming structurally
stable. One such possibility is realized through the appearance of structurally
stable heteroclinic cycles (Guckenheimer and Holmes 1988; May and Leonard;
Chawanya 1995; 1997; Nishiura and Ueyama 2001). Because the appearance of
structurally stable heteroclinic cycles requires differentiable vector fields
that are equivariant with respect to a symmetry group, whether our case
corresponds to such an ideal case is unknown. Our assertion is that the
essential dynamics may be due to indifferent fixed points, not hyperbolic fixed
points. The appearance of non-hyperbolicity yields characteristics of
nonstationary statistics, such as a long time tail of the correlations (Yuri
2000, and references cited therein).
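For comparison, the tangent-bifurcation picture attributed above to Mandell & Selz can be made concrete with a one-dimensional normal form. The sketch below assumes the map f(x) = x + x^2 + (a - a_c), which is a standard textbook form chosen here for illustration and is not the return map of Fig. 5. It classifies the fixed points by the slope f'(x) = 1 + 2x: a stable and an unstable fixed point for a < a_c, a single indifferent fixed point (slope exactly 1) at a = a_c, and none for a > a_c.

```python
import math

def fixed_points(a, a_c=0.0):
    """Fixed points of the assumed normal form f(x) = x + x**2 + (a - a_c)."""
    mu = a - a_c
    if mu > 0:
        return []            # past the bifurcation: no fixed points
    if mu == 0:
        return [0.0]         # at the bifurcation: one degenerate fixed point
    r = math.sqrt(-mu)
    return [-r, r]

def slope(x):
    """Derivative f'(x) = 1 + 2x, which decides linear stability."""
    return 1.0 + 2.0 * x

def classify(a, a_c=0.0):
    """Label each fixed point as stable, unstable, or indifferent."""
    labels = []
    for x in fixed_points(a, a_c):
        s = abs(slope(x))
        if s < 1:
            labels.append("stable")
        elif s > 1:
            labels.append("unstable")
        else:
            labels.append("indifferent")
    return labels
```

In this normal form the indifferent fixed point exists only at the single parameter value a = a_c, whereas in our model the corresponding critical situation persists over a range of parameters.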
From the results of studies on several types of neural networks with different
structures (Körner et al 1987; 1991), the empirically determined conditions for
CI are as follows. (1) The presence of networks, such as recurrent neural
networks, which guarantees the coexistence of attractors. (2) The presence of a
mechanism causing the neutral stability of attractors. It is by this mechanism
that Milnor attractors are generated. (3) The presence of perturbations that
weakly destroy such an attractor. These conditions are not well-suited for the
appearance of CI, and for this reason, mathematically detailed studies are
needed for a deeper understanding of this mechanism.
R3.2
Ubiquitous chaotic itinerancy
Many commentators discussed transition phenomena similar to CI. Many CI-like
phenomena other than those we consider in the target article have been studied.
Breakspear & Friston assert the
significance of chaotic transience. Rowe
suggests the possibility of heteroclinic cycles in CI-like phenomena, and
emphasizes the significance of heteroclinic cycles in neural networks. Banerjee discusses a topological
attractor as representing the overall dynamics of coupled Milnor-type
attractors in his spiking neuron model. This topological attractor is identical
to an itinerant attractor. Kowalik applies
the name, ``self-reanimating chaos", to a transition between weakly
barriered chaos and quasi-periodic oscillations. Borisyuk hypothesizes that CI-like activity in neural assemblies
may be describable as behavior of a dynamical system with a time-dependent
coefficient. In relation to Borisyuk's
idea, we constructed a simple model consisting of unidirectionally coupled
chaotic systems with distinct time scales (Okuda and Tsuda 1994). When a fast
system forces a slow system, the slow system usually becomes simply noisy. This
could be used to simulate the motion in a dynamical system with noise.
Conversely, when a slow system forces a fast one, CI-like behavior often
appears. This may correspond to the slow modulation of a certain parameter of a
dynamical system. It might also be similar to the CI-like behavior observed by Mandell & Selz in neural systems.
Among
other systems, CI-like phenomena in random recurrent neural networks, which
were discovered by Quoy, Banquet &
Daucé, are very interesting. Their system used for robot navigation control
can learn both patterns and pattern sequences. CI-like phenomena appear in this
system when the input signal and the inner signal are mismatched. This behavior
and function of chaos and CI-like high-dimensional activity are very similar to
those Tani found in his robot control system (Tani 1998). On a related note, Breakspear & Friston suggest the
involvement of NMDA channels in the neural mechanism causing the relatively
rapid change of attractors. They further predict that if the phase space
includes many saddles, ``typical orbits" will shadow a saddle and that
this may be realized in monoamine-mediated changes of functional synaptic
coupling. This prediction is worth checking. However, one question arises: Does
the phenomenon of irregular transitory orbits accompanied by a saddle network
that can be shadowed by typical orbits belong to the same class of statistical
behavior as CI orbits? Mandell & Selz (1993) found that noise increases the
residence time of orbits in the neighborhoods of unstable states, and they
actually reported observing this effect in the hippocampus. Since
NMDA channels in the hippocampus are responsible for LTP, this noise effect
might guarantee the structural stability of transitory dynamics through
noise-induced shadowing.
As
described above, CI-like phenomena have been found in many neural systems. Most
researchers are mainly concerned with the topological similarity of these
phenomena, but the characteristics that we have asserted to be important are as
follows. (1) The appearance of many Lyapunov exponents that are approximately
zero but have large fluctuations. (2) The presence of nonstationary statistics,
so that a convergence theorem might not hold. These observations regarding the statistics
of the CI in our network model indicate the non-existence of shadowing of both
individual orbits and attractors. Sauer has identified this CI characteristic
and proposed this non-existence as a definition of CI (Sauer 2000; Sauer et al
1997; Dawson et al 1994; Grebogi et al 1990).
R3.3
Chaotic code
In the
target article, we stressed the functional significance of a certain class of
chaos and networks. The required characteristic for the functional significance
is information mixing due to large fluctuations of information flow (Matsumoto
and Tsuda, 1985; 1987; 1988; Nicolis and Tsuda, 1985; 1989). This class of
chaos should appear as intermittent activity. A network displaying this class
of chaos can preserve input information in its dynamic activity. Thus, such a
network may provide a dynamic mechanism of working memory, which should be
arbitrarily long term. CI possesses the same characteristic. Furthermore, as
proposed in the target article, CI consists of high-dimensional transitory
dynamics which may provide a dynamic mechanism for linking memories. The
linking of memories is necessary for categorization and perceptual drifts. Here
let us recall the criticism made by Ikegami
& Tani that since memory dynamics should be restricted by semantics and
causalities under ``embodied conditions" through behavior, it is not
possible to simulate memory dynamics only with CI, which does not have a clear
correspondence to the real world. This criticism seems to be worth considering.
In thinking about ``embodied conditions", studies with machines, like robots,
are very important. However, we should not overlook the fact that the world
robots are experiencing is not real, but man-made, in which the experimenter's
intention has been built in advance. A theory based on such biased experience
of robots leads us to over-interpretation.
It is
important to inquire into the nature of the neural mechanism of chaotic
activity, as Érdi points out. In
this regard, we identified three distinct situations (Tsuda, 1991):
(1)
chaotic activity at one level results from chaotic activity existing at a lower
level;
(2)
chaotic activity at one level is independent of that at the lower levels, and
rather it results from damped oscillations enhanced by feedback from activity
at higher levels;
(3)
chaotic activity at one level results from a self-organization at the lower
level.
A
representative model for each of the above situations has been investigated:
Kaneko's CML and GCM for case (1), Freeman's KIII model for (2), and our
dynamic memory model for (3).
Contrary to the assertion of Mandell & Selz, chaotic dynamical systems can be
viewed as computation machines. In general, the expanding dynamics can be used
to ``read" the information given initially or as an input. For instance, let us
consider the discrete dynamics defined by the function f(x) = 2x, where x is a
real number represented by a binary expansion. This type of dynamics is
equivalent to a shift dynamics in which the binary point is shifted one place
from left to right per iteration of the dynamics. In contrast, contracting
dynamics can be used to ``write" the information. For instance, the discrete
dynamics defined by the function g(x) = x/2, where x is a real number
represented by a binary expansion, is equivalent to shift dynamics in which the
binary point is shifted one place from right to left per iteration. Usually, in
chaotic dynamics these two types of dynamics
appear alternately, and on average the process of `readout' of the information
given in the initial distribution is dominant. This situation corresponds to
the presence of a positive Lyapunov exponent. The function of chaotic dynamics as
a computation machine can be realized in the case that the expanding and
contracting dynamics are embedded by cut and paste operations in each
eigen-direction, as is seen in Moore's generalized shift (Moore 1990; 1991),
and also in the case that these two kinds of dynamics are well separated along
each eigen-direction, as is seen in Smale's horseshoe map (Smale 1967). In
particular, in the former case, a Turing machine can be embedded at each point
in the phase space of a generalized shift map. In this respect, a generalized
shift can be viewed as a universal Turing machine.
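The reading and writing roles of expanding and contracting dynamics can be illustrated with exact rational arithmetic. In the sketch below, the expanding map is taken modulo 1 so that the state stays in the unit interval, and the contracting map is extended to g_b(x) = (x + b)/2 so that a chosen digit b is inserted; both modifications are assumptions made for the illustration. Writing a bit string with the contractions and then reading it back with the expanding map recovers the string.

```python
from fractions import Fraction

def read_bits(x, n):
    """Expanding map x -> 2x mod 1: each step exposes the next binary digit."""
    bits = []
    for _ in range(n):
        x = 2 * x
        bit = int(x >= 1)   # the digit that crossed the binary point
        bits.append(bit)
        x -= bit
    return bits

def write_bits(bits, x=Fraction(0)):
    """Contracting maps x -> (x + b)/2: each step pushes one digit b into x."""
    for b in reversed(bits):
        x = (x + b) / 2
    return x

message = [1, 0, 1, 1, 0, 0, 1]
x = write_bits(message)        # x = 0.1011001 in binary
recovered = read_bits(x, len(message))
```

Exact fractions are used because floating-point numbers carry only a finite number of binary digits, which the expanding map would exhaust after a few dozen iterations.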
An essential feature of the horseshoe map as a chaotic dynamical system is
described by the transformations
f(x, y) = (2x, ay) (for 0 < x < 1/2) and f(x, y) = (2 - 2x, 1 - ay) (for 1/2 < x < 1),
where 0 < a < 1/2.
Here, the dynamics of x are expanding, chaotic dynamics that are independent
of y, and the dynamics of y, which consist of two types of contracting
dynamics, depend on x. A horseshoe map is the simplest example of a
chaos-driven contracting system. The x variable is responsible for reading the
information provided by the initial conditions, and the read-out of this
information is written in the dynamics in the y direction. Actually, in the
contracting case, 0 < a < 1/2, a Cantor set is generated along the y direction.
This observation led us to the study of Cantor coding in chaos-driven
contracting systems. In neural systems, unidirectional coupling usually
produces an overlapped IFS, in which part of the information is lost. In a
totally disconnected IFS, this loss of information does not occur, and thus in
this case coding and decoding have a one-to-one correspondence (see also
Aihara & Ryue).
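A minimal sketch of this chaos-driven contraction, using the piecewise-linear transformations given above, is the following. The value a = 1/3, the initial conditions, and the use of exact rational arithmetic are illustrative assumptions. The x dynamics select which contraction acts on y at each step; since the two contractions have disjoint images for a < 1/2, the sequence of branches visited by x can be recovered from the final value of y alone.

```python
from fractions import Fraction

A = Fraction(1, 3)  # contraction rate, 0 < a < 1/2 (illustrative choice)

def step(x, y):
    """One iteration of the map; also returns which branch was taken."""
    if x < Fraction(1, 2):
        return 2 * x, A * y, 0          # left branch: y -> a*y
    return 2 - 2 * x, 1 - A * y, 1      # right branch: y -> 1 - a*y

def encode(x, y, n):
    """Drive y by the chaotic x dynamics for n steps."""
    symbols = []
    for _ in range(n):
        x, y, s = step(x, y)
        symbols.append(s)
    return y, symbols

def decode(y, n):
    """Recover the last n branch symbols from y by inverting the contractions."""
    symbols = []
    for _ in range(n):
        if y <= A:
            symbols.append(0)
            y = y / A
        else:
            symbols.append(1)
            y = (1 - y) / A
    return symbols[::-1]

# Arbitrary rational start with an odd denominator, so x never equals 1/2.
x0 = Fraction(123456789, 1000000007)
y_final, driven = encode(x0, Fraction(1, 2), 20)
decoded = decode(y_final, 20)
```

If the images of the two contractions overlapped, the branch test in the decoding step would be ambiguous and the one-to-one correspondence would be lost.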
Borisyuk and Érdi asked about the advantage of chaotic coding. As mentioned in
the target article, the advantage of Cantor coding is the ability to encode and
decode a large amount of information hierarchically in some finite region of
phase space, that
is, with a restricted activity level. In other words, a set of temporal
patterns with infinite length can be hierarchically embedded, in principle.
This coding is robust with respect to noise up to some depth. In the
hippocampus, embedding of a large amount of information with an extremely long
code for a short period is not necessary, and hence this coding is realistic,
even in a noisy environment. Hierarchical embedding in terms of Cantor coding
in the hippocampus may represent the emergence of a grammar concerning the time
order of events. In CA1 or PPC, the neural activity changes in a short time, on
the order of 100 msec. This implies that Cantor sets can only be observed by
the superposition of snapshots of activity during an interval of approximately
100 msec. The functional significance of the metric of Cantor coding, about
which Heath inquires, lies in the
identification of the closeness of episodes as the closeness of codes. Through
the introduction of such a metric, we can realize that any code in a code
sequence can be a cue signal for the association of episodes.
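The role of the metric can be sketched with a simple code on the middle-third Cantor set. The encoding rule below, and the reading of ``closeness of episodes" as the number of most recent symbols that two sequences share, are assumptions made only for this illustration. Under these assumptions, two sequences that agree on their last k symbols but differ on the one before have codes whose distance lies between 3^-(k+1) and 3^-k.

```python
from fractions import Fraction

def cantor_code(episode):
    """Map a 0/1 sequence (oldest symbol first) to a point of the Cantor set."""
    y = Fraction(0)
    for s in episode:
        y = (y + 2 * s) / 3   # contraction selected by the symbol s
    return y

def shared_recent(e1, e2):
    """Number of most recent symbols on which two episodes agree."""
    k = 0
    for a, b in zip(reversed(e1), reversed(e2)):
        if a != b:
            break
        k += 1
    return k

e1 = [0, 1, 1, 0, 1]
e2 = [1, 0, 1, 0, 1]
k = shared_recent(e1, e2)
distance = abs(cantor_code(e1) - cantor_code(e2))
```

The distance between codes therefore orders pairs of sequences by the depth to which their recent histories coincide.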
Raffone & van Leeuwen demonstrated one merit of chaotic
coding by showing that a flexible synchrony of chaotic neural activity is more
effective than a stable synchrony of periodic activity. They propose to use
this effectiveness to solve the binding problem. Friston (1997) also discussed
the significance of transient coding, which is associated with a transient
motion, and he confirmed its existence in some functional-MEG data. These are
nice realizations of our idea that the dynamic link of memories in terms of
chaos and CI may provide a means of flexible information processing in
perception (Tsuda, 1993; 1996; Kaneko and Tsuda, 2001). The ``binding" of
features shared by different objects through the synchrony of chaotic
oscillations should inevitably generate an alternation of synchronized and
desynchronized states. This alternation activity should be CI-like transitory
dynamics. The strengths of interactions among oscillations determine
synchronization. In opposition to this, chaos is effective for causing rapid
desynchronization, because of its characteristic exponential divergence of
nearby orbits. Contrary to the assertion of Raffone & van Leeuwen, we still think that the binding problem
is only a pseudo-problem. To solve the binding problem, people have used spike
coincidence and neural oscillations, that is, temporal information, because
rate coding fails for this problem. It is not yet clear if the cause of this
problem is spike coincidence or neural oscillations. This is something of a
chicken-and-egg problem. If an oscillation is periodic, or binding is created
by the coincidence of feature-detecting neurons, nonflexible operations and
even combinatorial explosion cannot be avoided. Ironically, in such a
nonflexible case, the concept of ``binding" is appropriate. In order to
avoid this difficulty, and to make ``binding" functional, we must abandon
the concept of bound feature(s). If we use chaotic oscillations, a flexible
synchrony can appear. In such a case, the ``binding" process will proceed
in the neural dynamics without the help of feature-detecting neurons. The term
``binding" cannot be an element of an adequate language system.
R3.4
Real chaos?
Borisyuk, Freeman, Kowalik, Liljenström and Molnár point out the difficulty to discriminate high-dimensional
chaos from noise. With regard to
this, we first note that the chaos analysis of experimental data is still at an
immature level. We believe that
there will be great development of chaotic dynamical systems analysis in the
future. Before the discovery of deterministic chaos, the analysis of random
phenomena was commonly carried out by first finding the probability
distribution of an appropriate random variable and then calculating average
values and fluctuations of observables using this distribution. The true fluctuations can be approximated
by calculating the second, third, and (if necessary) higher order moments of
the distribution. Also, in time-series analyses, the autoregression method has
been used in linear prediction theory. Recently, Okabe et al. proposed a new
statistical method that includes nonlinear filters, which has proved to be
effective when the data is stationary (Okabe and Nakano 1991; Okabe and Inoue
1994; Okabe and Yamane 1998; Okabe and Kaneko 2000). However, because the
discovery of deterministic chaos implies that a certain class of random
phenomena can be described by a deterministic rule, such as that provided by a
dynamical system on some smooth manifold, it has come to be believed that many
types of random phenomena result from deterministic chaos and their randomness
originates from a nonlinear transformation of phase space, and further that a random
time series can be regarded as the projection of orbits on a manifold onto the
real axis. Unfortunately, however, chaos analysis in its present form, and
especially the embedding technique, is feasible only for relatively
low-dimensional dynamical systems. It is ineffective for extremely
high-dimensional cases and also in the presence of nonstationarity.
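The embedding technique mentioned above reconstructs a state space from a scalar time series by means of delay coordinates. The following is a minimal sketch of that step only. The function name `delay_embed` and the parameters `dim` (embedding dimension) and `tau` (delay) are illustrative assumptions; in practice both parameters must be chosen by the analyst, and the choice becomes unreliable in exactly the high-dimensional and nonstationary cases noted above.

```python
def delay_embed(series, dim, tau):
    """Return delay vectors (s[t], s[t + tau], ..., s[t + (dim - 1) * tau])."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return [
        tuple(series[t + k * tau] for k in range(dim))
        for t in range(n_vectors)
    ]


# Usage: a short scalar record embedded in three dimensions with unit delay.
record = [0, 1, 2, 3, 4]
vectors = delay_embed(record, dim=3, tau=1)
print(vectors)
```

Each additional embedding dimension shortens the usable record by `tau` samples, which is one practical reason the method is limited to relatively low-dimensional systems.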
Given
this present situation of our understanding of chaos, Kowalik states that there is no strict limit between noise and
high-dimensional deterministic chaos in the sense that we are not able to
clearly distinguish between these. However, before jumping to a conclusion in
this regard, it is prudent to note Liljenström's
observation that chaos is predictable over short time scales, while noise is
unpredictable over any time scale, but no discrimination can be made over long
time scales. Concerning this point, it is also important to note that in
statistical physics, the hypothesis of molecular chaos at a microscopic level
is necessary to derive the velocity distribution of an ideal gas as a
macroscopic quantity. The presence of molecular chaos guarantees the ergodicity
of the system. With respect to this velocity distribution, physical properties
of a gas can be expressed as an average plus a variance. This method is very
often formally applied to other stochastic phenomena. Usually in such
treatments, the average term is viewed as a deterministic component and the
variance term as noise, equivalent to molecular chaos. Since a biological
system is not a Hamiltonian system but a dissipative system, we are concerned with
far-from-equilibrium conditions. To maintain such a system in a far-from-equilibrium
state, an external source of energy is necessary. Therefore,
a far-from-equilibrium system is generally maintained at a high energy level.
Under such conditions, aperiodic and unpredictable behavior of the averaged
deterministic component is often observed. In order to discriminate this
deterministic random behavior from the molecular chaos, physicists have
referred to the former as ``deterministic chaos" or ``macroscopic
chaos". Since these chaotic states appear in a far-from-equilibrium
system, deterministic chaos should have a much greater power than noise. During
the early stage of the study of deterministic chaos, the indicator of such
chaos used in experiments was the power spectrum. There are two merits of using
the power spectrum to discriminate chaos from noise. First, while both chaos
and noise have continuous spectra, the power of chaos is much greater than that
of noise, which is almost negligible. Second, since chaos appearing in
dynamical systems is generated by bifurcations, one can ensure the existence of
chaos through the change of control parameters. From this standpoint, enhanced
noise can be interpreted as resulting from chaos.
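Liljenström's observation, that chaos is predictable over short time scales whereas noise is not, and that the two cannot be told apart over long time scales, can be illustrated with a simple nearest-neighbour forecast. This is a sketch under stated assumptions: the logistic map stands in for a low-dimensional chaotic source, uniform pseudo-random numbers stand in for noise, and the function names are hypothetical. It is not a method proposed in the commentaries, and it does not address the high-dimensional case.

```python
import random


def logistic_series(n, x0=0.2, r=4.0):
    """Scalar time series from the logistic map (assumed chaotic source)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def nn_forecast_error(series, n_train, horizon):
    """Mean absolute error of a nearest-neighbour forecast `horizon` steps ahead.

    For each test point, the closest value in the training segment is found,
    and the value that followed it `horizon` steps later is the prediction.
    """
    train = series[:n_train]
    errors = []
    for t in range(n_train, len(series) - horizon):
        # Neighbours must have a known future inside the training segment.
        j = min(range(n_train - horizon),
                key=lambda i: abs(train[i] - series[t]))
        errors.append(abs(train[j + horizon] - series[t + horizon]))
    return sum(errors) / len(errors)


chaos = logistic_series(1200)
rng = random.Random(0)
noise = [rng.random() for _ in range(1200)]

chaos_short = nn_forecast_error(chaos, 1000, 1)
chaos_long = nn_forecast_error(chaos, 1000, 20)
noise_short = nn_forecast_error(noise, 1000, 1)
print("chaos, 1 step ahead:  ", chaos_short)
print("chaos, 20 steps ahead:", chaos_long)
print("noise, 1 step ahead:  ", noise_short)
```

In this toy setting the one-step error for the chaotic series is small, while the twenty-step error is of the same order as the error for noise, which is the pattern the observation describes.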
It
should be further noted here that most common methods of experimental data
analysis are problematic. In order to determine whether the neural activity
observed in any given case is described by CI, the measurement of neural
activity over a long time is necessary. Interestingly, the data found in long-term measurements of neural
activity usually exhibit nonstationarity. This is in contrast to the case of
shadowing of an entire attractor, i.e. a set of orbits, in which the long-term
nature of a measurement implies stationarity. When we wish to study
nonstationary neural phenomena experimentally, is it possible to use
conventional methods of measurement and analysis? When we wish to observe the
neural mechanism corresponding to a single act, are data obtained as the
average of neural activity measured over many trials, such as a firing rate or
correlation coefficients, meaningful? If so, what is the assumed condition? To
use statistical quantities under the assumption of a stationary process
reflects the belief that a single time series of neural activity is meaningless
or that such a time series possesses ergodicity. However, ergodicity is
unlikely to hold for behavior-related neural activity. Therefore, people who
attempt to use (stationary) statistical quantities in effect deny the
meaningfulness of a single time series of neurons or neuron assemblies. But a
single time series of neural activity has been observed to be associated with a
single act in the laboratory, and hence it is known that such an activity is
indeed meaningful. It would thus seem that we have to invent a new dynamical
systems analysis which is able to treat high-dimensional and/or nonstationary
data.
R4.
Dynamic brain revisited
R4.1
Multiple codes
As Dinse points out, a cortical
``module" is flexible enough to be able to adapt to rapid changes in the
environment, allowing for the link between fast time scales on the order of
msec and time scales of learning. Here, the alternation of synchronization and
desynchronization of the activity of neuron assemblies often appears,
associated with this adaptation process. This chaotic alternation between
synchronization and desynchronization could be described by CI. In this case,
the output resulting from an input is determined by the internal dynamics,
which are not fixed as a rigid input-output relation or a stimulus-response
relation, but change flexibly, in a manner that depends on the outputs (see Freeman, Kay, and Kozma). Thus there is a feedback of the action from the environment
to the system that generates internal dynamics. Our idea is that any feedback
represents code at some level. We think that this might be one origin of
multiple codes in neurons or neuron assemblies. In general, there cannot be a
feedback process existing in a hierarchical information processing system. If
there were some feedback, an originally hierarchical structure would be broken,
resulting in multiple codes. Foster
introduced John's works on the interactions of coherent ensembles in neural
cell assemblies. Here we briefly introduce Sakurai's series of works on
multiple codes based on neural cell assemblies.
Sakurai
studied the hippocampal and temporal cortical neuron activity exhibited during
the performance of simple auditory, simple visual, and configural
auditory-visual discrimination tasks. He found behavior-correlated activity of
neurons, which emerged as task related. It was found that approximately one
third of the task-related neurons overlap.
A
single neuron's activity represents the difference between stimuli to be
memorized and stimuli to be discriminated in a given task. However, cell
assemblies that arise through functional connections between neurons are
necessary in order to represent the difference between kinds of tasks. He
called this sharing of roles between individual neurons and neuron assemblies
``dual coding". From this viewpoint of cell assemblies, the function of a
single neuron is not fixed, but changes flexibly depending on its relations
with other neurons. A single neuron can belong to many different cell
assemblies, and for this reason, a single neuron can represent different
functions in manners related to task, purpose, the functions of other neurons,
etc.
We
believe that by taking into account macro-action, as Kay suggests, the existence and function of multiple codes will
become clearer. Iwamura and Tanaka (1978) report the
discovery of active touch-related neurons in the somatosensory cortex of
monkey. These neurons become active only when a monkey holds an object that it
has come to possess through its own action; that is, such neurons do not
respond when an object is placed in its hands.
These
findings show that the presence of feedback from behavioral levels to
individual neurons and neuron assemblies generates multiple codes at neuronal
levels. We emphasize again that feedback signals carry codes corresponding to
action, not action itself.
R4.2
Dynamic memory
We
have proposed a dynamic memory model for episodic memory and also for olfactory
perception. Here, for the first time, following Foster, we present the definitions of episodic memory, semantic
memory and working memory, which we envisage in the target article. Our
definitions of semantic and episodic memories basically follow Tulving
(Tulving, 1972), but we have added a new perspective. Declarative memory is classified into
two categories, semantic memory and episodic memory. Semantic memory is memory
consisting of general knowledge. Semantic memory is apparently separated from
the spatio-temporal causality of events occurring in our daily experience. The
database in a computer is similar to semantic memory in this sense. However,
since knowledge is essentially internal (Gernert, 1996), semantic memory may be
represented in a manner that depends on the internal dynamics, and hence can
change, while a database is external and fixed.
Episodic
memory is that concerning individual experience in the spatio-temporal context.
This individual experience includes ``future memory" consisting of plans
for future actions (Meacham and Leiman 1982; Tsukada 1992). Meacham and Leiman
call this ``prospective remembering". Individual experience is, in general,
memorized chronologically, but it can be memorized according to causality if a
mechanism operates that precedes consistency in experienced events, a mechanism
in which the prefrontal cortex participates and by which the hippocampal
dynamics can be influenced. In our dynamic memory model, we treated this type
of causality using a chaotic rule generated through the interactions between
internal dynamics and external information (acquisition of knowledge).
Generated CI provides a flexible grammar for linking memories of events, in
which highly correlated memories are linked. In order to develop the theory and
the model in the manner in which Ikegami
& Tani suggest, introducing an explicit state dependence of the
assigned probabilities and a mechanism of learning probabilities should be
helpful (see also R2.4).
We
follow Baddeley's definition (1986) of working memory. According to Baddeley,
working memory consists of a consciousness-related system providing a procedure of
obtaining knowledge and a temporal storage of knowledge, which is necessary for
performing complex cognitive tasks. For this reason, Baddeley calls this
``active memory".
In the
target article, semantic memories are assumed to be represented as dispersed
spatial patterns in the network. We represent them by dynamical fixed points of
the Milnor type. A weakly collapsed Milnor attractor can form an attractor
ruin. Since we assume that a chain
of knowledge associated with experienced events forms an individual episode, we
consider episodic memory to consist of a chain of semantic memories. We have
cited two possibilities: that in which a code sequence representing chaotic
orbits which link events is embedded in Cantor sets, and that in which a series
of events is embedded in Cantor sets. In either case, the Cantor coding of a
chain of knowledge is equivalent to the decoding of an episode.
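How a sequence can be embedded in a Cantor set may be pictured with a small iterated function system. The sketch below is an illustration under explicit assumptions: a binary alphabet, two contractions of ratio 1/3, and the function names `encode` and `decode` are all hypothetical choices made here, and they are not the CA3-CA1 model of the target article. Each input symbol selects one contraction, so the final state falls into a cluster of the Cantor set whose hierarchical address is the recent input history.

```python
def encode(symbols, x0=0.5):
    """Drive two contractions with a binary sequence and return the final state.

    Symbol 0 applies x -> x / 3; symbol 1 applies x -> x / 3 + 2 / 3.
    """
    x = x0
    for s in symbols:
        if s not in (0, 1):
            raise ValueError("symbols must be 0 or 1")
        x = x / 3.0 + (2.0 / 3.0) * s
    return x


def decode(x, depth):
    """Read back the last `depth` symbols from the state, most recent first."""
    out = []
    for _ in range(depth):
        s = 1 if x > 0.5 else 0      # which third of the interval x lies in
        out.append(s)
        x = 3.0 * x - 2.0 * s        # undo the contraction that was applied
    return out


episode = [1, 1, 0, 1, 0, 0]
state = encode(episode)
print("state:", state)
print("decoded (most recent first):", decode(state, len(episode)))
```

Two sequences that share their last k symbols end within a distance of 3^(-k) of each other, so coarse levels of the hierarchy identify the most recent events and finer levels identify earlier ones.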
Ikegami & Tani addressed an important question
concerning a seemingly paradoxical feature of memory structure. On one hand,
memory structure appears stable, but on the other hand memory dynamics are
chaotic. We do not think that this is a paradox. Dynamics that are unstable in
the usual sense are not always unstable from the information theoretical point
of view. In chaos with a non-uniform invariant measure that is absolutely continuous
with respect to the Lebesgue measure, and in transitory dynamics like CI, there exist
quantities that remain stable in the unstable dynamics of orbits. One such
quantity is the difference between the Kullback information before and after
applying the Perron-Frobenius operator. In the case with uniform invariant
measure, this quantity is equivalent to the maximum Lyapunov exponent but in
the case with non-uniform invariant measure, it is related to the fluctuations
of Lyapunov exponents. In the latter case, the time-dependent mutual
information provides an appropriate quantification of such fluctuations. The
slow decay of the mutual information in time reflects an information mixing,
which ensures the conservation of information content through the dynamics in coupled
non-uniform chaos (see also Section R3.3). This implies that input
information repeatedly appears and disappears in each local element but is
globally maintained. A coupled system can then be viewed as an information
channel, though its dynamics are chaotic. In other words, the inputs can be
extracted as outputs, even though the state of the channel is chaotic. We
believe that the appearance and disappearance of the information in places over
the system carries meaning.
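The fluctuations of Lyapunov exponents referred to above can be made concrete with finite-time exponents, that is, local expansion rates averaged over short windows of an orbit. The sketch below is an illustration only: the logistic map at parameter 4 is assumed as an example of chaos with a non-uniform, absolutely continuous invariant measure, and the function name and window sizes are hypothetical. It does not compute the Kullback information difference or the mutual information discussed in the text.

```python
import math
import statistics


def finite_time_exponents(x0, window, n_windows, burn_in=100):
    """Finite-time Lyapunov exponents of x -> 4 x (1 - x) over successive windows."""
    x = x0
    for _ in range(burn_in):             # discard the initial transient
        x = 4.0 * x * (1.0 - x)
    exponents = []
    for _ in range(n_windows):
        total = 0.0
        for _ in range(window):
            slope = abs(4.0 * (1.0 - 2.0 * x))
            total += math.log(max(slope, 1e-300))   # guard against log(0)
            x = 4.0 * x * (1.0 - x)
        exponents.append(total / window)
    return exponents


local = finite_time_exponents(0.3, window=20, n_windows=200)
mean_exponent = statistics.fmean(local)
spread = statistics.pstdev(local)
print("mean of finite-time exponents:", mean_exponent)
print("ln 2:                         ", math.log(2.0))
print("standard deviation:           ", spread)
```

The mean is close to ln 2, the known Lyapunov exponent of this map, while the window-to-window spread is non-zero. For a map with constant slope, such as the tent map with uniform invariant measure, every window would give exactly ln 2 and the spread would vanish.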
Many
theories and models of learning and memory have been proposed. However, in
general, such models lack an explicit coding scheme and a description of its relation
to neurodynamics. McClelland, McNaughton and O'Reilly (1995) have discussed the
details of a possible mechanism for the consolidation of memory. For this
reason they are concerned with temporally graded retrograde amnesia, which
typically appears in patients with hippocampal lesions like HM. The presence of
temporally graded retrograde amnesia indicates a consolidation of memory based
on a continual interaction between the hippocampal system and the neocortical
system. We have constructed a model for a CA3-CA1 interacting system, which we
present in a separate paper (Tsuda and Kuroda, 2001; Kuroda and Tsuda, 2001).
In this model, cholinergic and
GABAergic innervations are introduced, and Cantor coding is found. The new
point in such models is that the phase of theta rhythms may control whether the
dynamics in CA3 become CI-like dynamics or stable attractor dynamics. In
connection with the comments of Foster and Heath, we wish to consider the method
by which a stimulus sequence is recalled by use of Cantor codes. There could be
two types of recall, direct and indirect. When a person experiences some
events, sensory stimuli enter CA3 from the entorhinal cortex, with the
influence of internal dynamics, and as a result, CA3 begins to display the
associative dynamics, such as CI. Then, a Cantor code is retrieved from this
partial sequence of events in a manner that depends on the length of the
sequence of events. In this way, the recall of an episode from partial
information is possible. This is the usual situation in the recall of a stimulus
sequence. Another type of recall that can occur is here called the ``Proust
phenomenon". In ``À la recherche du temps perdu" by Marcel Proust,
the character Marcel suddenly recalls an episode that had been forgotten when
he puts a madeleine soaked in black tea into his mouth. We all experience this
type of recall of episodes in our daily life. We offer a hypothesis about the
mechanism of the ``Proust phenomenon" in which we employ Cantor coding. The
distinguishing characteristic of the ``Proust phenomenon" is that a specific
stimulus, which previously had no relation to any episodic memories, triggers a
complete recall of some episode. In this situation, CA3 cannot be stimulated
directly by such a stimulus, but rather, it must be the case that CA1 receives direct stimulation from
the entorhinal cortex. A direct perforant path from the entorhinal cortex to
CA1 allows for this. Since in CA1 a code sequence is embedded in a cluster of a
Cantor set, a certain level of this Cantor cluster contains a single code
corresponding to a stimulus which is created in the sensory cortices or the
entorhinal cortex. Then, at such a level, a code sequence can be evoked in CA1
or in the neocortex through the temporal evocation of a trace of the code
sequence in CA1. This hypothesis is consistent with hypothesis 2 in Treves and
Rolls (1994), where they state that the perforant path may be involved in the
carrying of a cue signal that can initiate the retrieval of an episode.
We
here emphasize that a feedback signal in the brain should consist of a code, so
that anatomical couplings do not imply the usual formalization of synaptic
connections, nor the usual formalization of oscillation couplings. What we
envisage is as follows. We believe that the above described situation holds in
the connections from the prepyriform cortex to the olfactory bulb, and also in
those from CA1 to CA3 through the neocortex and the entorhinal cortex, in both
of which we hypothesize the formation of Cantor coding, described by
dx/dt = F(x) + c h_y,   dy/dt = G(y) + w x,
where h_y = 0 if y is included in a certain level of cluster, and h_y = 1
otherwise.
Acknowledgement
We
would like to express our sincere thanks to Takao Namiki for critically reading
the manuscript of this response paper. We also thank Michael Breakspear and
Karl Friston, for kindly answering a basic question we asked them on the
symmetry they used in their commentary, and Jaroslav Stark, for informing us of
many related papers on IFS.
References
Aihara,
K., Takabe, T., and Toyoda, M. (1990) Chaotic neural networks. Physics Letters
A 144: 333-340.
Amari,
S. and Maginu, K. (1988) Statistical neurodynamics of associative memory.
Neural Networks 1: 63-73.
Alexander,
J., Yorke, J., You, Z., and Kan, I. (1992) Riddled basins. International
Journal of Bifurcation and Chaos 2: 795-813.
Arbib,
M. A., Érdi, P. and Szentágothai, J. (1998) Neural Organization. Structure,
Function, and Dynamics. A Bradford Book, The MIT Press, Cambridge, London.
Arbib,
M. A. and Hesse, M. B. (1986) The Construction of Reality. Cambridge University
Press, London.
Baddeley,
A. D. (1986) Working Memory. Clarendon Press, Oxford.
Blomfield,
S. and Marr, D. (1970) How the cerebellum may be used. Nature 227: 1224-1228.
Bressloff,
P. C. and Stark, J. (1992) Analysis of associative reinforcement learning in
neural networks using iterated function systems. IEEE Transactions on Systems,
Man, and Cybernetics 22: 1348-1360.
Chawanya,
T. (1995) A new type of irregular motion in a class of game dynamics systems.
Progress of Theoretical Physics 94: 163-179.
Chawanya,
T. (1997) Coexistence of infinitely many attractors in a simple flow. Physica D
109: 201-241.
Dawson,
S., Grebogi, C., Sauer, T., and Yorke, J. (1994) Obstructions to shadowing when
a Lyapunov exponent fluctuates about zero. Physical Review Letters 73:
1927-1930.
Descartes,
R. (1701) Regulae ad directionem ingenii. In: Oeuvres de Descartes, publiées par
Ch. Adam et P. Tannery, 1701, Amsterdam (Japanese translation by M. Noda,
Iwanami publ., 1973, 22nd edition).
Érdi,
P. (1996) The brain as a hermeneutic device. BioSystems 38:179-189.
Érdi,
P. and Tsuda, I. (2001) Hermeneutic approach to the brain: Process versus
device? Theoria et Historia Scientiarum vol. VI (in press).
Eriksson,
P. S., Perfilieva, E., Björk-Eriksson, T., Alborn, A.-M., Nordborg, C., Peterson,
D. A., and Gage, F. H. (1998) Neurogenesis in the adult human hippocampus.
Nature Medicine 4: 1313-1317.
Feudel, U., Grebogi, C., Poon, L., a