Author's Response

The Plausibility of a Chaotic Brain Theory

 

Ichiro Tsuda

 

Department of Mathematics, Graduate School of Science,

Hokkaido University, Sapporo, 060-0810, Japan

tsuda@math.sci.hokudai.ac.jp    

 

Abstract: We consider the significance of high-dimensional transitory dynamics in the brain and mind. In particular, we highlight the roles of high-dimensional chaotic dynamical systems as an ``adequate language" (Gelfand 1989), which should possess both explanatory and predictive power of description. We discuss the methods of description of dynamic behavior of the brain. These methods have been adopted to capture the averaged or deterministic complexity, and further to allow for discussion of a new approach to capture the complexity of the deviation from such an averaged complexity and also the complexity of interactive modes. We also give arguments in defense of our models for dynamic memory with chaotic itinerancy and Cantor coding. In addition, we give discussion with regard to the reality that a model of the brain and mind should reflect.

 

Key words: Adequate language; hermeneutic theory; dynamic aspects of the brain; asymmetry; feedback code; skew product transformations; dynamic memory

 

 

R1. What would a theory of the brain be like?

 

R1.1 Why hermeneutics?

 

As Érdi correctly points out, the brain is a hermeneutic device in the sense that it interprets the world through sensory information processing, and for us to understand the brain, we must develop some way of interpreting its activity. The brain not only receives and processes external information but also creates a new ``reality" and ``actuality". According to Bin Kimura (1998; 2000), ``reality" is a kind of sensation that can be objectively understood, while ``actuality" is a more subjective experience based on sensation relating to action and behavior. Thanks to the revolutionary findings that Freeman and his colleagues have accumulated over more than 40 years, we now know with certainty the following: animals do not directly respond to external stimuli but rather respond to internal images they create, and animals' perception is the result of an autonomic interpretation process. In the case of human cognition, a more direct interpretation process must exist. In this case, interpretation is a recursive process evolving in time that acts between pre-understanding and perceived information. A person's pre-understanding should be altered in accordance with perceived signals, while the perceived information will also change in a manner that depends on the change in the pre-understanding.

 

In order to understand the global function of the brain we must learn how to interpret brain activity. Many people have attempted in the laboratory to find a neuronal representation of information, assuming the existence of neuronal correlates to cognition. However, it is not possible to obtain such a representation without knowing precisely the actual effects experienced by neurons or neural assemblies during the performance of the task in question. On the other hand, such actual effects themselves are the object of study.

 

In order to observe how an artificial brain creates a new actuality in the sense of Kimura, Ikegami & Tani (see also Tani, 1998) attempted to interpret with cognitive language the interaction between perception and action manifested in the behavioral self-control of robots. Although it is still controversial whether or not the interpretation provided by Ikegami & Tani regarding the perception and behavior of a robot with a recurrent neural network (RNN) is plausible, we consider it to be one study in one possible useful direction. The work of Quoy, Banquet & Daucé on robot navigation control utilizing random RNNs also represents a promising direction. Actually, these works can be viewed as implementations of a hermeneutic theory of the brain (Blomfield and Marr 1970; Marr 1982; Tsuda 1984; Winograd and Flores 1986; Arbib and Hesse 1986; Tsuda 1991; Érdi 1996; Arbib, Érdi and Szentágothai 1998; Érdi and Tsuda 2001).

 

Raffone & van Leeuwen criticize our theory as a holistic theory, expressing opposition to the statements in our target article that information representation in the brain is dynamically realized as a whole, and that the precise nature of neural elements, such as neurons and cortical sub-areas, is irrelevant. They interpret our theory as a top-down theory and thus as a hermeneutic theory. However, as Érdi correctly states and as we emphasized above, a hermeneutic theory is plausible and even provides an adequate language system (Gelfand 1989) that is sufficient to express an understanding of brain function in terms of a relation of physiological phenomena to cognitive and psychological phenomena. A proper theory of the brain must be a hermeneutic theory. Raffone & van Leeuwen are correct to point out what their model suggests: in early vision, the processing of local features dominates, and global structure is processed later. This local dominance, however, exists only under the condition of tabula rasa in the sense of Locke. After the development of learning, the processing of global structure can become dominant. This situation is clearly seen in the cognitive process of inference in a certain language game that I introduced previously (Tsuda 1991; Kaneko and Tsuda 2001) as a Shannon test. Shannon invented this game when he estimated the information content (the number of bits) contained in one (English) word. In this game, there are two people, A and B. In the beginning, A has a sentence in mind, of which B has not been informed. B attempts to determine this sentence and does so by first attempting to determine the first word, and hence the successive letters. This is done by asking a series of questions to which A can give only Yes and No answers. Early in the game, B can only ask questions randomly, in a bottom-up form of processing. As B's knowledge increases, however, a top-down form of processing that relies on B's context-dependent judgment becomes increasingly effective.
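The contrast between the two forms of processing in this game can be illustrated with a small sketch. It is not the procedure of Shannon or of the cited works. It assumes that B asks only questions of the form "is the next letter X?", that the bottom-up player asks them in a fixed alphabetical order (a stand-in for questioning without knowledge), and that the top-down player orders the questions by bigram counts learned from earlier text. The function names and the toy corpus are hypothetical.

```python
from collections import Counter, defaultdict
import string

# Letters plus the space character; sentences are assumed to use only these.
ALPHABET = string.ascii_lowercase + " "

def questions_bottom_up(sentence):
    """Count yes/no questions when B ignores context and always asks
    'is it a?', 'is it b?', ... in the same fixed order."""
    return sum(ALPHABET.index(ch) + 1 for ch in sentence)

def train_bigrams(corpus):
    """Collect counts of which letter follows which (B's acquired knowledge)."""
    table = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        table[prev][nxt] += 1
    return table

def questions_top_down(sentence, table):
    """Count yes/no questions when B asks about the most likely letter first,
    given the previous letter (context-dependent judgment)."""
    total = 0
    prev = " "  # assumption: a sentence starts as if preceded by a space
    for ch in sentence:
        seen = table[prev]
        # Most frequent successors first; ties broken alphabetically.
        order = sorted(ALPHABET, key=lambda c: (-seen[c], c))
        total += order.index(ch) + 1
        prev = ch
    return total
```

With a table trained on text resembling the hidden sentence, `questions_top_down` needs fewer questions than `questions_bottom_up`, which is the sense in which context makes the top-down strategy increasingly effective.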

 

Let us further discuss briefly the uncertainty regarding the understanding of neural elements. Dinse has described recent developments in the study of dynamic receptive fields (dynamic-RFs). Population level activity in early sensory cortices expressed with respect to coordinates of the stimulus space has been studied, and dynamic population-RFs have been constructed (Jancke et al. 1999). Dinse has discussed the possibility that these two types of RFs play a functional role, influenced by the structure and function of surrounding neural networks in early sensory cortices. In other words, the difference between spatio-temporal dynamics in dynamic-RFs and those in dynamic population-RFs clearly reflects the dependence of the identity of the neural element or functional unit on the nature of the information processing. We would also like to mention Sakurai's series of studies on the flexible coding and decoding of cortical neurons (Sakurai 1996; 1998; 1999) as well as the work of Arieli et al., which both Dinse and we (in the target article) cite. These studies clearly show the dependence of neuron activity on the activity of the system as a whole or at least of the surrounding networks on a large scale. Sakurai found that the type of activity displayed by individual neurons is correlated with global behavior, which was characterized according to different tasks performed by the experimental subject. Both Kay's comment and Freeman's finding on the olfactory bulb support the existence of such a relation, although Foster argues against it. The fundamental fact underlying our point of view is that the identity of the neuronal element cannot be known a priori.

 

In addition to the points raised above, adult neurogenesis (Sakakibara and Okano 1997; Kempermann et al 1997; Eriksson et al 1998; Pincus et al 1998) in wide areas of the brain, in particular, in the olfactory system, the hippocampus, and even the neocortex may support the hypothesis of the variability of neural representation due to dynamic reorganization of neural networks. The term `as a whole' that we use in the target article expresses the variability of the nature of the functional unit in the sense that what acts as the `unit' may change in a manner that depends on interrelations within the surrounding network and also on the process of functional manifestation. Therefore, the word `holistic' is not appropriate to characterize our theory. A term like `relationally dynamic' would be more appropriate.

 

 

R1.2 The method of description

 

Freeman has proposed a mesoscopic level description in order to identify the level at which functional dynamics emerge. Through his own studies and other evidence, he has found that this level is not that of a single neuron, that is, the microscopic level. In conventional phase transitions studied in physics, many degrees of freedom at a microscopic level begin to become correlated with each other in some critical regime of the system's parameter(s), and as a result macroscopic order that can be described by a few degrees of freedom emerges. These few degrees of freedom are represented by so-called order parameters. However, in some complex systems, the dynamics of these order parameters can be complex and even chaotic, though chaotic behavior in this case is confined to a low-dimensional attractor. In more complex systems, the identity of the quantities that act as the order parameters may change in space and time. In this case, the macroscopic description loses its descriptive power. It is natural to consider an intermediate level between the microscopic and the macroscopic levels as a level of description where dynamically complex behavior can be captured. Physicists call this level the `mesoscopic' level. Freeman borrows this idea. Freeman's use of the term `mesoscopic' in the description of input-driven chaotic behavior in the olfactory system is appropriate, because in the brain, dynamically transient motion is generic, as Breakspear & Friston, Dinse, Freeman, Heath, and Kay correctly point out in their commentaries.

 

Several methods have been employed in the construction of scientific theories. Before discussing the method we propose for the modeling of the brain, we give some general discussion. It should be noted, though, that the distinction between the methods described below might be controversial, and many other ways of distinguishing such methods based on different philosophical viewpoints are possible. However, since we believe there is a difference between the methods of constructing theories in the study of the brain and in the study of physics, we feel that the manner of thinking we use here is useful.

 

One method employed in the construction of theories is that which begins from `first principles'. Here, by a `first principle' we mean a hypothesis or an axiom on which a theory is based. An example of a theory obtained using this method is Newtonian dynamics. However, in the situation that proper first principles cannot be identified or when a theory based on first principles is not feasible, a `phenomenological' method is adopted. The description of fluids using the Navier-Stokes equations is a typical example of such a method. Thermodynamics is also such a `phenomenological' theory. By a `phenomenological' theory we here mean a theory that is based on experiential rules. In neural systems, a useful method of modeling that seems to be based on something very different from both a method of first principles and a phenomenological method has been proposed. This method consists of modeling in terms of the Hodgkin-Huxley (H-H) equations (Hodgkin and Huxley 1952). The distinguishing characteristic of this method is that it includes a set of inductively derived equations that explicitly include experiential equations. We call this type of model a `semi-experiential model'. Banerjee's model of spiking neurons is at this level. Also, Aihara & Ryeu have studied the chaotic neuron model constructed by Aihara et al. (Aihara, Takabe and Toyoda 1990; see also Fig. 9 in the target article). This chaotic neuron model is a kind of abstraction of periodically forced Hodgkin-Huxley equations.
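For reference, the H-H equations in their usual textbook form are reproduced below. The notation is the conventional one and is not taken from the target article: V is the membrane potential, C_m the membrane capacitance, I_ext an applied current, and m, h, n the gating variables.

```latex
C_m \frac{dV}{dt} = I_{\mathrm{ext}}
  - \bar{g}_{\mathrm{Na}}\, m^{3} h\, (V - E_{\mathrm{Na}})
  - \bar{g}_{\mathrm{K}}\, n^{4} (V - E_{\mathrm{K}})
  - g_{L}\, (V - E_{L}),
\qquad
\frac{dx}{dt} = \alpha_x(V)\,(1 - x) - \beta_x(V)\, x,
\quad x \in \{m, h, n\}.
```

The current-balance equation has the form of a physical law, whereas the rate functions \alpha_x(V) and \beta_x(V) were obtained by fitting to voltage-clamp measurements. This mixture is what the term `semi-experiential' is intended to capture.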

 

It is probably true that among models which possess a physiological background no generic model other than the H-H equations has been found to date. Freeman's population model simulates many types of dynamic behavior in the olfactory system, as described by Freeman and Kozma in their commentary. Through studies of population models like the KIII model, one can extract the essence of the dynamics that might be embedded in various areas of the brain. Whether this type of model can be directly applied also to cortical systems other than the olfactory system is still unclear. Therefore, it is still not known if we can use a KIII-type model as a generic model applicable to all areas of the brain. Nevertheless, we believe that population models, rather than types of models such as coupled H-H equations, are more suited to describe (macroscopic) functional manifestations, such as perception and cognition. It is important to determine the proper variable at a mesoscopic level which can be used to make a bridge between the physiological level and the psychological level. In other words, it is important to determine the `adequate language'. It should be noted that an electric potential or a sequence of impulses as such cannot be considered a proper variable for the description of cognition.

 

Instead of a direct use of a KIII-type model, we have considered another mesoscopic description in the target article. This description is based on the realization of self-organization in memory through chaotic dynamics.

 

We have also investigated a general method of study for complex dynamical systems, which will, we believe, provide a high-power description also for the study of the brain and mind (see also Section R2). The basic steps of this method are as follows (Kaneko and Tsuda 2001):

(a) find structural changes of complex behavior from both static and dynamic viewpoints by means of dissecting phase space;

(b) find universality, reconstructing structures and relationships, immanent in various types of complex phenomena;

(c) construct an artificial system, based on the fundamental conceptual elements of the universal properties;

(d) construct a model that describes both top and bottom levels from an intermediate level which is neither macroscopic nor microscopic;

(e) construct an adequate language system sufficient to describe complex systems, based on a mathematical theory for treating the complex dynamics and processes;

(f) acquire new intuition by formulating counter-intuitive situations and by observing the simulated variety of complex phenomena.

 

In the above described procedure, the method of modeling mentioned in (d) corresponds to the mesoscopic description discussed by Dinse and Freeman. The present attempt to construct a theory of the brain and the modeling given in the target article represent a realization of this generic method. The above steps (e) and (f) are deeply related to hermeneutics. It is important to realize the existence of a dual purpose regarding the predictive and the explanatory power of a model (Gernert 1998). The above described generic method possesses such a duality. This dual purpose, proposed by Gernert, is also alluded to by Ikegami & Tani. They correctly assert the importance of dynamical systems as a tool or descriptive language, which is thought to possess a stronger descriptive power than natural language. We agree with Ikegami & Tani, in particular with regard to the point that high-dimensional dynamical systems including iterated function systems (IFS) may provide an appropriate descriptive language for the brain. Certainly physiological terminology itself cannot be considered as possessing explanatory power regarding cognitive phenomena, just as psychological terminology itself cannot be considered as possessing predictive power regarding physiological phenomena. We wish to obtain a third language, a language system that is capable of thoroughly describing brain and mind. At this time, of course, our chaotic dynamical systems terminology is still too primitive to realize such a goal.

 

Quoy, Banquet & Daucé suggest a similar perspective, inquiring about the standpoint of chaos theory concerning behaviorism (stimulus-response) and cognitivism (mental representation). For the modeling of animal experiments involving higher functions, most of which consist of a type of stimulus-response, we have attempted to interpret the internal states of the brain as mathematical functions or distributions by observing the stimulus-response relations. We employ high-dimensional chaotic dynamical systems for this inferred internal representation. In order to decrease the ambiguity of an interpretation of this type of animal experiment, we have constructed (Tsuda and Hatakeyama 2001; Hatakeyama and Tsuda 2001) a formal theory of the structure of task-related functional manifestation. We have applied this theory to a series of experiments conducted by Sakagami et al. (Sakagami and Niki 1994; Sakagami and Tsutsui 1999). The theory was able to predict all possible types of discrimination of stimuli and conditions which can be represented by neurons found in the prefrontal cortex. We believe that chaotic dynamical systems can be used to represent the neural correlates of cognitive processes that can be detected by mesoscopic level measurements, such as fMRI, optical recordings, and (local) electroencephalography. If a neuronal dynamical system possesses point attractors and limit cycles only, this neural system lacks adaptability to varying environments. Thus it will inevitably become destabilized if we attempt to use it to model a full range of animal behavior. The emergent dynamics capable of providing this adaptability should possess a moderate stability that may be a global stability. Our idea is to use chaotic itinerancy as a means of guaranteeing both stability and adaptability.

 

 

R2. Reality of the model

 

R2.1 Falsifiability of the theory

 

In our attempt to find a new method of understanding the brain and mind at a mesoscopic level, we face several difficult problems. First, a hermeneutic theory seems to lack the falsifiability demanded by Popper as a minimum condition that a scientific theory must satisfy, since such a theory can develop self-consistently through the evolution of the pre-understanding, allowing for a self-consistent interpretation to be reached. However, a hermeneutic theory consists of components, each of which could itself consist of a quantitative model and its predictions, and these, rather than the theory as a whole, can possess falsifiability. From another standpoint, we could construct a theory -- something that could be termed a `qualitative model' -- for the purpose of providing a plausible story of neurons or neuron assemblies. Such a theory should be constructed to allow us to reach a more proper and deeper understanding of the brain and mind. Thus, as Rowe & Wright state, a quantitative model like theirs can lead to elemental models supplying such a qualitative theory. For example, the PDE and coupled ODEs which constitute an elemental level of Wright's theory (2000) of brain activity and its cognitive function can possess falsifiability, while the whole theory can be justified as a hermeneutic theory.

 

In biology, it has been asserted that the correspondence between structure and function is crucial (Szentágothai and Érdi 1989; Li and Hopfield 1989). Following this assertion, we have attempted to construct a structure-based model of biological function. Actually, we constructed a skeletal model, based on anatomical data that were collected in detailed studies over thirty years by Szentágothai. In the modeling, we hypothesized that a type of structure that is common to various areas possesses a common function, and a structure specific to a given area possesses a specific function (see also Li and Hopfield 1989). Both models presented in the target article were constructed according to such a principle, and thus these models are examples of a kind of skeletal model.

 

Heath has proposed a cognitive model consisting of dynamic neural networks, which could possess a predictive power. He also discusses the possibility of converting the principles given in Section 3.5 of the target article into predictions that can be verified.

 

One of the characteristics observed in chaotically itinerant behavior is a long time tail of the time-dependent mutual information. This tail often exhibits an algebraic damping. This characteristic exists even when noise is present, because the frequency of stagnant motion in the vicinity of an attractor ruin cannot be decreased by such a perturbation. The presence of a long time tail indicates the presence of recurrence of similar dynamic behavior in the evolution of the system, and hence it may provide a mechanism capable of producing short-term memory, like working memory. Nicolis and Tsuda (1985) proposed a feasible mechanism for the magic number seven plus or minus two in terms of chaotic dynamics with large fluctuations, and further demonstrated (Nicolis and Tsuda 1989) that these long-range correlations may lead to a universal power law known in the study of natural languages as the Zipf law. The presence of recurrence of similar dynamic behavior can work effectively when an episode is embedded in the CA1 region by the use of Cantor coding. During a period of approximately 100-200 msec cortical-hippocampal loop time, only a few events in an episode will be able to be embedded in a Cantor set. This loop time would not be sufficient for the transformation of an episode from a short-term memory to a long-term memory. Some kind of recursive dynamic behavior may facilitate this type of transformation.
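To make the term `algebraic damping' concrete, the standard definition of the time-dependent mutual information and the two kinds of decay being contrasted can be written as follows. The symbols are introduced here for illustration only: P_\tau(a,b) is the joint probability of finding the system in coarse-grained state a at some time and in state b a time \tau later, and \alpha and \tau_0 are unspecified positive constants.

```latex
I(\tau) = \sum_{a,b} P_\tau(a,b)\,\log\frac{P_\tau(a,b)}{P(a)\,P(b)},
\qquad
I(\tau) \sim \tau^{-\alpha} \ \ \text{(algebraic, long time tail)}
\quad\text{versus}\quad
I(\tau) \sim e^{-\tau/\tau_0} \ \ \text{(exponential)}.
```

An exponential decay has a characteristic time \tau_0 beyond which the past is effectively forgotten, whereas a power law has no such scale, which is why the long tail is associated with the retention of earlier states.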

 

At this point, we must consider the fact that a memory is not independent of its cortical context, as Heath points out. Therefore, taking into account context cues in studying the process of memory dynamics and formation is essential. It is certain that our present model lacks this feature. Although the significance of context cues has been emphasized by many authors, no mathematical model that is capable of incorporating them has yet been proposed, as far as we know. We believe that such context cues are input into other lower cortical areas in the more abstract form of codes rather than raw sensory information. Usually, this input corresponds to a feedback signal. For proper modeling, cortical neural activity representing codes must differ from that representing raw sensory information. In our hippocampal model, CA3 activity consists of chaotic itinerancy, but CA1 activity does not. This is because Cantor coding is carried out in the cross section on which CA3 chaotic activity is constant. Coding hierarchy is generally limited only by the nonlinearity of the chaos that provides a grammar of chaotic motion. This limitation can be observed in the present model (see also Aihara & Ryeu). If code signals strongly affect the chaotic behavior in CA3, the Cantor coding will be fragile, and this calls into question its realism, as pointed out by Érdi, Freeman and Kay in the case with feedback. We believe that the feedback signal is generally different in quality from the feedforward signal. Thus we doubt that the feedforward and the feedback connections can be thoroughly described in the same form as in coupled oscillators.

 

It would be very useful for further development of the study of dynamic memory to identify those features of our model with stochastic renewal and Heath's model with chaos control that are similar and those features that are different, since they have a similar structure of the interacting `modules' (see also Heath 2000).

 

 

R2.2 Could inputs and modifiable synapses be a bifurcation parameter?

 

The brain is an open system in both energetic and informatic senses. With respect to energy, the brain is a far-from-equilibrium system, since it is maintained in a high energy state by the influx and outflux of energy and matter. With respect to information, the brain receives external information and dictates action on the environment in response to this information. However, contrary to Banerjee's claim, we assert that such inputs should not take the form of bifurcation parameters.

 

Banerjee's observation concerning transitory dynamics made with regard to his treatment of spiking neurons is correct. This observation is that the attractor created in any cortical `column' is continually influenced by neighboring `columns', subcortical areas, and the environment. For this reason, this attractor changes or disappears, and a new attractor is created. This is the nature of the transitory dynamics characterizing the system. On the basis of this observation, Banerjee studied these transitory dynamics using a treatment in which the inputs are represented by a bifurcation parameter. Quoy, Banquet & Daucé also consider inputs as bifurcation parameters. Although we appreciate the models proposed by Banerjee and by Quoy et al., we are skeptical of their assumption that inputs can be treated as bifurcation parameters.

 

We now consider the situation in which a system receives inputs from other systems. In the case that invariant sets like attractors are present in phase space, the change of such invariant sets in parameter space can be described as a bifurcation. To treat an input as a bifurcation parameter is equivalent to assuming the presence of such invariant sets, and hence this treatment becomes feasible only when the inputs change extremely slowly, compared with the system's dynamics. This is not a valid assumption in a dynamic system like the brain or any of its subsystems in which the input rapidly varies. It is crucial for understanding a brain of this nature to investigate it as a dynamical system influenced by varying inputs that may be produced by other dynamical systems either with or without noisy perturbations. By viewing inputs as originating from other systems, rather than as bifurcation parameters internal to the system, the dynamic behavior of the model can be better characterized.
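The role of the time-scale condition can be seen in a toy example, which is an illustration only and not one of the models discussed here. It assumes a logistic map x -> r x (1 - x) whose parameter r plays the part of the `input' and is modulated sinusoidally within a range where the map has a stable fixed point 1 - 1/r. The function names, parameter values, and the choice of map are all assumptions made for this sketch.

```python
import math

def fixed_point(r):
    """Stable fixed point of x -> r*x*(1-x) for a frozen parameter 1 < r < 3."""
    return 1.0 - 1.0 / r

def mean_tracking_error(period, steps=4000, discard=1000, r0=2.0, amp=0.5):
    """Average distance between the state and the attractor that would exist
    if the current input value were frozen (the quasi-static picture)."""
    x = 0.5
    total = 0.0
    count = 0
    for n in range(steps):
        # The input varies between r0 - amp and r0 + amp with the given period.
        r = r0 + amp * math.sin(2.0 * math.pi * n / period)
        if n >= discard:  # skip the initial transient
            total += abs(x - fixed_point(r))
            count += 1
        x = r * x * (1.0 - x)
    return total / count

slow = mean_tracking_error(period=1000)  # input much slower than relaxation
fast = mean_tracking_error(period=4)     # input as fast as the dynamics
print(slow, fast)
```

When the input is slow, the state stays close to the frozen-parameter attractor, so describing the motion as a drift through a bifurcation diagram is reasonable. When the input varies on the time scale of the map itself, the state is rarely near that attractor, and the invariant sets of the frozen system no longer describe the behavior.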

 

When considering inputs as variables controlled by other systems, rather than as bifurcation parameters, a total system can be viewed as an IFS. Pollack (1991) used recurrent neural networks for the system under study and incorporated the external world in the form of varying inputs. In a dynamic memory model, we used recurrent neural networks with inhibitory neurons for the model brain system and modeled probabilistic synapses as varying inputs. This model appears to contain a Hopfield spin-glass-like model, since it is essentially reduced to a Hopfield net when the probability characterizing these synapses approaches the inverse of the system size. Even in the neighborhood of this value, our net is dynamically equivalent to the Hopfield net, as Banerjee points out. Despite this fact, there are dynamics embedded in our model that are essentially different from any dynamics exhibited by the Hopfield net. In particular, our model exhibits a chaotic transition between far-from-equilibrium quasi-stationary states. (It should be noted that this is not a transition from an equilibrium state.) When the probability is increased to a certain value, chaotic itinerancy appears. In our chaos-driven contracting system, the model brain is a stable network and varying inputs are provided by a chaotic dynamical system which exhibits high-dimensional chaotic itinerancy or low-dimensional chaos with a restricted grammar. This grammar is restricted in the sense that it possesses forbidden symbol sequences. We regard this as a skeletal mathematical model of the olfactory system and the hippocampus. Other studies in which inputs are treated as variables controlled by other dynamical systems have recently been published by Gohara and colleagues (Gohara and Okuyama 1999; Gohara and Okuyama 1999; Gohara et al 2000; Yamamoto and Gohara 2000; Sato and Gohara 2000).
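The idea of a chaos-driven contracting system can be sketched in its simplest form. This is a toy version and not the hippocampal model of the target article. It assumes a one-dimensional stable system consisting of two affine contractions with ratio 1/3, driven by a binary symbol sequence obtained from the logistic map at r = 4. That driver has no forbidden sequences, so the restricted grammar mentioned above is not represented. All names and parameter values are hypothetical.

```python
def logistic_symbols(n, x0=0.3, r=4.0):
    """Binary symbol sequence from a chaotic driver (1 if the state exceeds 0.5)."""
    symbols = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        symbols.append(1 if x > 0.5 else 0)
    return symbols

def drive_contracting_system(symbols, y0=0.0, a=1.0 / 3.0):
    """Apply y -> a*y + (1-a)*s for each input symbol s.
    Each map is a contraction, so the driven system itself is stable."""
    y = y0
    for s in symbols:
        y = a * y + (1.0 - a) * s
    return y

def decode_history(y, depth, a=1.0 / 3.0):
    """Read the most recent `depth` input symbols back out of the state y.
    With a = 1/3 the states lie on the middle-third Cantor set, so the
    symbol at each level is decided by which side of 0.5 the state is on."""
    recent = []
    for _ in range(depth):
        s = 1 if y > 0.5 else 0
        recent.append(s)
        y = (y - (1.0 - a) * s) / a  # undo one contraction step
    return recent  # most recent symbol first

symbols = logistic_symbols(200)
y = drive_contracting_system(symbols)
print(decode_history(y, 10))
print(symbols[::-1][:10])
```

The final state of the stable system is a single number, yet its position on the Cantor set records the order of the recent inputs. This is the sense in which a contracting system driven by chaotic inputs can code temporal sequences hierarchically.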

 

As an important factor other than inputs that can influence the system, Banerjee discusses synaptic modulation. For a reason similar to that in the case of inputs, it is not feasible to model this as a bifurcation parameter when one wishes to understand the mechanism of nonstationary and itinerant behavior. Only if one tries to understand the system's dynamics as consisting of the change undergone by invariant sets can inputs and synaptic modulation be treated as bifurcation parameters.

 

 

R2.3 Landscape lacks a reality

 

Freeman claims that attractor landscapes in the olfactory system are recreated in each inhalation period. Freeman identified this recreation as the olfactory flexibility. He criticizes our model as being too rigid and not allowing for the change of phase space structure. A similar criticism is also given by Ikegami & Tani. However, despite this criticism, we assert that our memory dynamics model does indeed exhibit structural change of phase space under the Hebbian learning. With regard to this point, Quoy, Banquet & Daucé questioned how the dynamic landscape changes via Hebbian learning. We now briefly explain this process. The learning of new patterns alters the transition of memories in such a way that new memories are incorporated into a sequence of memories which appear dynamically to display chaotic itinerancy. In this way, a new sequence of memories is created.

 

Hebbian learning classifies the closeness of input patterns in the following way. In a conventional associative memory model, a new input pattern is placed within the basin of a certain attractor. Here, some attractors are memory representations and others are parasitic ones. If that pattern is learned, the basin structure becomes complex due to the formation of a new basin (see also Amari and Maginu 1988). In our dynamic memory model, there is no conventional (geometric) attractor, and hence no conventional basin is present. In place of such a basin, at least one hole is present, which links each memory representation to all the others. We have not obtained a mathematical proof of whether riddled basins appear in the present model. We have not found symmetry in our model like that which Breakspear & Friston, and Rowe have discovered. It is, however, certain that a similar structure to that of riddled basins has been observed in numerical studies. This is because, when transitions between memories are allowed, Hebbian learning also acts along the transition paths; that is, the transition paths are also reinforced. The closeness of input patterns can thus be defined in terms of temporal order, since the transition occurs between patterns with a large overlap. In this way, as more patterns are learned, the phase space structure becomes more complex.

 

In the manner of Freeman, here we would like to use a metaphor. Imagine we are observing a stream at a fixed position. Then, at each moment we observe different water molecules, even in Escher's waterfall chain. For this reason, we cannot find invariance at such a level. On the other hand, a river possesses certain structures at different levels -- from mesoscopic to macroscopic -- as a flow of water. We may be able to find some invariance at such levels and we may recognize universality within the continually changing behavior. In contrast to Freeman's intended demonstration in his allusion to Escher, we think Escher's chain demonstrates a method of representing the creation of new `quality', even though the structure appears static. Geometric impossibility embedded into this static structure forces us to change the viewpoint from which we consider the picture and enables us to find new `quality' hidden in the structure. For instance, the waterfall chain may provide us with a hint about the four-dimensional `qualia' of the scene. New `qualia' at a mesoscopic or a macroscopic level, which are not manifested at a microscopic level, might be created in the same way, as Dinse discussed.

 

Freeman and Quoy, Banquet & Daucé both use the term ``landscape": Freeman in the expression ``attractor landscape" and Quoy, Banquet & Daucé in the expression ``dynamic landscape". We think the use of this word is misleading with regard to both our model and the KIII model, and maybe also with regard to other far-from-equilibrium dynamic systems. Concerning this point, we give the following discussion from the general theory of nonlinear dynamical systems (Kaneko and Tsuda 2001). If there are at least two extremely different time scales characterizing the system in question, then the system's behavior can be described by dynamics on a static landscape and its dynamic modulation. Here, the landscape can appear to be a rugged landscape. Such an extreme separation of time scales is often observed in nonlinear systems. However, no evidence for such a separation of time scales has been found in very flexible systems like the olfactory system that Freeman studies. In such flexible systems, the ``landscape" cannot exist. Thus the statement that the ``landscape is recreated" is misleading. If we interpret Freeman's intention correctly, we may be able to describe this as something like the ``epigenetic landscape" proposed by Waddington. However, this cannot be described by anything that could be considered a ``landscape". (Although, when considering the dynamic behavior of the entire process of development, it might be possible a posteriori to account for this development in terms of a landscape.) Describing a system with a landscape is inconsistent with the flexibility of the system, and the concept of a landscape does not apply to the flexible brain.

 

 

R2.4 Action is contained implicitly in probability terms

 

Kay points out that our model lacks an action term, and for this reason she suggests that we introduce a somatosensory system. She asserts that by doing this, interfacial dynamics may emerge. This is closely related to the causality problem considered by Ikegami & Tani, context cues considered by Heath, and falsifiability considered by Rowe & Wright. Kay is correct that our model does not explicitly contain an action term. However, the model implicitly contains such a term. A typical mathematical model in which an action term exists implicitly is given by Samuel Karlin (Karlin 1953). He formalized the situation in which a living system with an internal state that can be expressed by a variable x must make a decision to choose a certain action i among many possibilities at a certain time. Let p_i(x) be the probability of choosing an action i, where x is the state of the living system. We consider this process to be described by a dynamical system. That is, the state of the system is determined by a dynamical system. As the result of the choice of an action, the state must change in accordance with this action. Thus the state evolves as a parametrized dynamical system, F_{p_i(x)}(x). Since p_i(x) depends on the state x, a change in the state causes the probability for the choice of the succeeding action to change also. Karlin's formulation gives the first example of an iterated function system (IFS).
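
The structure of this formulation can be sketched numerically. The following is a minimal illustration only, not Karlin's original model: the two contraction maps, the contraction rate A, and the linear form of the probability p1(x) are all hypothetical choices made here for concreteness. It shows how the choice of an action depends on the state, and how the chosen action in turn renews the state.

```python
import random

A = 0.4  # hypothetical contraction rate; A < 1/2 keeps the two images disjoint


def p1(x):
    """Hypothetical state-dependent probability of choosing action 1."""
    return 0.2 + 0.6 * x


def step(x, rng):
    """Choose an action with a state-dependent probability, then update the state."""
    if rng.random() < p1(x):
        return 1, 1.0 - A + A * x  # action 1: the state contracts towards 1
    return 0, A * x                # action 0: the state contracts towards 0


def simulate(x0, n, seed=0):
    """Iterate the action-driven dynamics n times from the initial state x0."""
    rng = random.Random(seed)
    x, actions, states = x0, [], []
    for _ in range(n):
        i, x = step(x, rng)
        actions.append(i)
        states.append(x)
    return actions, states


if __name__ == "__main__":
    actions, states = simulate(0.5, 20)
    for i, x in zip(actions, states):
        print(i, round(x, 6))
```

Under these assumptions, every state after the first step lies in [0, A] or [1 - A, 1], so the orbit is confined to successively finer pieces of a Cantor-like set, whichever actions happen to be chosen.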

 

The system described above exhibits a stochastic renewal of dynamics, since the dynamical system governing the development of the state depends on the action chosen. If the probability function for the choice of the action is described by a certain chaotic dynamical system, this type of decision making can be described by a skew product transformation. In this case, the feedback effect of the action on the state of the system is implicitly taken into account. We believe that the feedback from the environment as influenced by the system's action is thus implicit. As stated in Section R2.1, this framework yields `coupled' systems with characteristics that differ from those typically seen in what Breakspear & Friston present as symmetrically coupled nonlinear oscillators. There may be a level at which brain activity can be described by coupled nonlinear oscillators, but it is doubtful that a symmetric coupling system would be useful in the modeling of actual brain activity. In general, the forward connections in the brain are related to sensory information processing, while the backward connections are related to the context, that is, the intention, motivation, situation, condition, etc. The context may appear to be a cue code for sensory information. The key factor is the existence of a type of `connection'. In the brain, feedforward and feedback connections differ in type. For this reason, it is important to study the effects of skew product transformations.

 

Because the chaotic behavior found in the olfactory bulb (OB) is caused by the feedback connections from the prepyriform cortex (PPC) possessing contraction dynamics, the presence of physical coupling is likely, as Kay mentions. We would like to know what the feedback is in such a case. Damped oscillations are enhanced and then become chaotic in the OB. According to Freeman, this happens only in a motivated condition like in a hunger state of an animal. Hence, the feedback to the OB is a motivational signal. This situation of the `coupling' can be realized in the following dichotomy. The input-output function of the OB is chosen to be F1 in the presence of motivation, and chosen to be F2 in the absence of motivation, where chaos is assumed not to exist. Then, the main dynamics in the PPC appear as the process of chaos-driven contraction dynamics.

 

Taking into account the stimulus-induced stochastic release of synaptic vesicles, whose physiological significance is correctly pointed out by Liljenström, contrary to the claims of Freeman and Breakspear & Friston, we considered the metaphor of `neuronal decision making'. One can extend the present model to include the state dependence of the probabilities for the choice of action. This is a topic for future study. Karlin investigated ergodicity and the convergence of the distribution, assuming a simple form for the state dependence of the probabilities, and showed as a special case that the limiting distribution is a singular distribution on the Cantor set. Later, Norman (1968) demonstrated a convergence theorem in stochastic learning models. Bressloff and Stark applied Norman's idea to the dynamics and learning in neural networks in a series of works (Stark 1991; Bressloff and Stark 1992). Thus, our model can be viewed as a model of action-driven (though yet uniform) dynamic memory and perception.

 

 

R2.5 What is the relation between the model and reality?

 

One common type of criticism was made by Foster, among others. Essentially, this criticism is that the theory is mathematical, but neither psychological nor physiological. This is why we present our theory for a dynamic brain from a different viewpoint. As we emphasized above, especially in Section R1, it is important to seriously consider the levels of a model. Most commentators neglect this point. Modeling from an overly physiological point of view results in a theory that lacks explanatory power for cognition, and modeling from an overly psychological point of view results in a theory that lacks predictive power for the mechanism of cognitive processes that should be related to brain activity, as long as we consider the mind to be a physical phenomenon. At a certain time in scientific history, those in the field of artificial intelligence neglected brain activity, especially neurophysiological facts. Perhaps they believed that the physiological nature of the brain need not be studied for a full understanding of cognition. On the other hand, people who have studied neural network models have tended to neglect symbol manipulation. Perhaps they did/do not realize how something expressed symbolically could possess a neurophysiological basis. Then, the connectionist approach proposed neural networks that can treat symbol manipulation through its dynamics. This was epoch-making. However, it seems that connectionists have not yet found an adequate language system, whose importance we emphasized in Section R1.

 

Given that most approaches do not provide an adequate language system to bridge the psychological and physiological levels in understanding the brain and mind, we have chosen a mathematically interpretative direction of study. In particular, we have chosen in this article a high-dimensional chaotic dynamical system as one possible explanatory and predictive language.

 

Recently, psycho-physiological experiments have been conducted on various areas of the brain. In these experiments, a cognitive task is performed by an animal or a person, and while it is being performed the activity of neurons or neural assemblies is monitored. Then, neural correlates are investigated. This represents a promising direction of study, but has the serious weak point that neural correlates must be interpreted in terms of natural language, taking into account the meaning of the task and the neural activity. Moreover, there might be an `experimenter effect'. This is not surprising, since the object of experiment is a very complex system.

 

We have proposed a mathematical formal theory to analyze the task performed in these experiments itself (Tsuda and Hatakeyama 2001; Hatakeyama and Tsuda 2001). We are now studying the establishment of a formalism for such experiments and attempting to construct a method of extracting the immanent chaotic dynamics of neural systems exhibiting cognition. We point out that Descartes' principles of thoughts (Descartes 1701) should still be useful in our attempt to gain a deeper understanding of the brain and mind.

 

The Lorenz model for unpredictable and nonperiodic atmospheric motion is also relevant to the present discussion (Lorenz 1963; 1991). Following Saltzman's observation (Saltzman 1962), Lorenz derived three-dimensional ordinary differential equations for the purpose of describing atmospheric circulations, and he found chaotic motion resulting from the instability of convective solutions. However, the chaotic motion he found, which is called Lorenz chaos, has never been observed in real atmospheric motion. Apparently, therefore, his model does not simulate the real turbulent motion of the atmosphere. Why, then, did the Lorenz model have such a strong scientific impact (much stronger than that of the conjectured `butterfly effect', which alleges that a butterfly flapping its wings in China can drastically change the weather in New York)? It should be noted that this impact stems neither from the falsifiability nor from the provability of this model. In fact, this impact is not due to the ability of this model to correctly simulate physical phenomena. Rather, this impact is due to the fact that his chaotic model displays the essence of atmospheric motion, its immanent chaotic dynamics. A similar type of modeling is seen in Kaneko's series of studies of complex phenomena in terms of coupled map lattices (CML) and globally coupled maps (GCM) (Kaneko and Tsuda, 2001 and references cited therein). We believe that this way of capturing certain features of reality (or it might be better to use the term ``actuality" in place of ``reality", according to Bin Kimura), some of which may be hidden but can emerge in observation with an adequate language, is effective and possesses explanatory and predictive power at a level that differs from that of physiologically realistic models, like the KIII model that Kozma recently developed. The important point underlying this discussion is that we believe there is strong evidence that chaotic dynamics exist in living brains, as Liljenström, Mandell & Selz, and Rowe & Wright have suggested.

 

Given the present situation with regard to a theory, Liljenström's suggestion that the mechanism of emergent properties should be discriminated from observed behavior itself is crucial for maintaining the reliability of theory. If an effect of macroscopic activity on activity at the cell level and/or molecular level emerges, through the mechanism of macroscopically emergent properties, a qualitative theory could be directly tested in the laboratory. As Molnár suggests, the discovery of an unbiased method to describe the potential functional significance of high-dimensional chaotic or stochastic behavior will help to further the development of a qualitative theory.

 

 

R3. Poor man's chaotic itinerancy and chaotic code

 

R3.1 Mechanism of chaotic itinerancy

 

Many commentators have reported dynamic behavior similar to chaotic itinerancy (CI). (Rowe supplies many references on chaotic dynamical systems which generate phenomena similar to chaotic itinerancy. Komuro has investigated a possible mechanism of CI in some mathematical framework (Komuro 1998; 1999).) We have described CI as chaotic transition dynamics resulting from a weak instability of Milnor-type attractors, that is, a chaotic transition among attractor ruins, and before such an instability arises, a certain complex phase space structure similar to a riddled basin appears. Érdi, Breakspear & Friston and Rowe inquired about the structural conditions of the emergence of CI. Breakspear & Friston particularly emphasize the significance of symmetry in the emergence of Milnor attractors and a riddled basin. (They corrected our citation of works on the riddled basin. As they point out, the first paper on the riddled basin is that of Alexander et al., 1992. The paper by Grebogi et al., 1987, which we cite in the target article, is concerned with fractal basin boundaries multi-dimensionally intertwined on arbitrarily fine scales.) Since symmetrically coupled systems like globally coupled maps (GCM) possess certain symmetries, such systems have been studied thoroughly. As Breakspear & Friston point out, studies of the Milnor attractor have been carried out most actively in the context of symmetrical systems. Typical such studies are reported in a series of works by Ashwin and his colleagues (Ashwin 2000). However, as Kaneko showed (1998), symmetrical coupling is not a necessary condition for the emergence of Milnor attractors, since they also appear in GCM systems without such symmetry.

 

Let us assume that a dynamical system f: M->M, where M is the phase space, commutes with a certain group action q: M->M on M; that is, fq = qf. Let S(q) be an invariant set under the action q: S(q) = {x | qx = x}. Then S(q) is invariant under f, that is, f(S(q)) is contained in S(q), because for any x in S(q) we have f(qx) = f(x), and hence q(fx) = f(qx) = f(x). When a dynamical system possesses this type of symmetry, its effective dimensionality can be drastically reduced, and as a result the detailed structure of its invariant sets can be investigated. In this respect, the assertion concerning symmetry made by Breakspear & Friston is very relevant with regard to the mechanism responsible for Milnor attractors and riddled basins. However, such symmetrical systems are not characteristic of the brain, as networks of neurons in the brain are asymmetrically coupled. Nevertheless, the questions of what type of symmetry could be present in our asymmetrically coupled neural network and how this symmetry, if it exists, could affect the potential invariant sets are interesting to consider. Also, we note that the noise effect is crucial in CI-like transitions, since neural systems in the brain exist in a noisy environment. As Rowe points out, it is important to note that depending on the type of Milnor attractor in question, the stability with respect to noise differs. In relation to this, it should be noted that noise can induce basin riddling even after a blowout bifurcation, that is, even in the presence of a transversely positive Lyapunov exponent (Lai and Grebogi 1996).

 

Feudel et al found a CI-like phenomenon in the double rotor system with small amplitude noise (Feudel et al 1998). In this system many periodic orbits coexist. Among these, the higher periodic orbits possess very tiny basins which disappear under the influence of noise, leaving only the low periodic orbits. This situation is similar to that in the KIII model, which Kozma and Freeman found. Due to fractal basin boundaries, long chaotic transients appear before the system falls into a periodic orbit. Orbits are trapped for some time in the vicinity of periodic attractors, but eventually are kicked by noise into the fractal boundary region.

 

Figure 5 in the target article shows the presence of the simplest Milnor attractor and also presents a model to describe our simulation results, empirically determined quasi one-dimensional return maps. Mandell & Selz treat the situation shown in Fig. 5 in the target article as a bifurcation point of tangent bifurcations. In this treatment, for a parameter a, in the case a < ac, where ac is a bifurcation point, there exists a pair of stable and unstable fixed points (this resembles a saddle-node pair), and for a > ac no fixed points exist and chaotic behavior appears, so that the system at a = ac is structurally unstable. This is not the case we consider. In our case, this one-dimensional map representation is a projection of high-dimensional dynamics. All fixed points, each representing a different memory, are reduced to two critical points. Furthermore, in our dynamic memory model, this critical situation is robust with respect to changes of the system's parameters, such as the strength of synaptic connections, the steepness of the input-output function of the neurons, and assigned probabilities, within the regions in which chaotic itinerancy occurs. We have found evidence through network simulations that suggests the possibility of such a critical system becoming structurally stable. One such possibility is realized through the appearance of structurally stable heteroclinic cycles (Guckenheimer and Holmes 1988; May and Leonard; Chawanya 1995; 1997; Nishiura and Ueyama 2001). Because the appearance of structurally stable heteroclinic cycles requires differentiable vector fields that are equivariant with respect to a symmetry group, whether our case corresponds to such an ideal case is unknown. Our assertion is that the essential dynamics may be due to indifferent fixed points, not hyperbolic fixed points. The appearance of non-hyperbolicity yields characteristics of nonstationary statistics, such as a long time tail of the correlations (Yuri 2000, and references cited therein).
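
The tangent-bifurcation picture attributed to Mandell & Selz can be made concrete with a textbook normal form. The sketch below is an illustration only; the quadratic map and the default value of a_c are hypothetical choices and are not the empirically determined return map of Fig. 5 in the target article. It lists the fixed points and their slopes on either side of the bifurcation point, and shows that exactly at a = a_c the single fixed point is indifferent (slope 1).

```python
import math


def return_map(x, a, a_c=0.0):
    """Hypothetical normal form of a tangent bifurcation: x -> x + x^2 + (a - a_c)."""
    return x + x * x + (a - a_c)


def fixed_points(a, a_c=0.0):
    """Return (x, slope) pairs for the fixed points, sorted by x.

    Fixed points solve x^2 = a_c - a, so they exist only for a <= a_c.
    The slope of the map at x is 1 + 2x: below 1 on the left (stable),
    above 1 on the right (unstable), and exactly 1 at the bifurcation.
    """
    d = a_c - a
    if d < 0:
        return []
    r = math.sqrt(d)
    xs = [0.0] if r == 0.0 else [-r, r]
    return [(x, 1.0 + 2.0 * x) for x in xs]


if __name__ == "__main__":
    for a in (-0.01, 0.0, 0.01):
        print(a, fixed_points(a))
```

In this normal form the indifferent fixed point exists only at the single parameter value a = a_c, which is the structural instability referred to above; the claim made for the dynamic memory model is that a comparable critical situation persists over a range of parameters.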

 

From the results of studies on several types of neural networks with different structures (Körner et al 1987; 1991), the empirically determined conditions for CI are as follows. (1) The presence of networks, such as recurrent neural networks, which guarantee the coexistence of attractors. (2) The presence of a mechanism causing the neutral stability of attractors. It is by this mechanism that Milnor attractors are generated. (3) The presence of perturbations that weakly destroy such an attractor. These conditions are not well-suited for the appearance of CI, and for this reason, mathematically detailed studies are needed for a deeper understanding of this mechanism.

 

 

R3.2 Ubiquitous chaotic itinerancy

 

Many commentators discussed transition phenomena similar to CI. Many CI-like phenomena other than those we consider in the target article have been studied. Breakspear & Friston assert the significance of chaotic transience. Rowe suggests the possibility of heteroclinic cycles in CI-like phenomena, and emphasizes the significance of heteroclinic cycles in neural networks. Banerjee discusses a topological attractor as representing the overall dynamics of coupled Milnor-type attractors in his spiking neuron model. This topological attractor is identical to an itinerant attractor. Kowalik applies the name ``self-reanimating chaos" to a transition between weakly barriered chaos and quasi-periodic oscillations. Borisyuk hypothesizes that CI-like activity in neural assemblies may be describable as the behavior of a dynamical system with a time-dependent coefficient. In relation to Borisyuk's idea, we constructed a simple model consisting of unidirectionally coupled chaotic systems with distinct time scales (Okuda and Tsuda 1994). When a fast system forces a slow system, the slow system usually becomes simply noisy. This could be used to simulate the motion in a dynamical system with noise. Conversely, when a slow system forces a fast one, CI-like behavior often appears. This may correspond to the slow modulation of a certain parameter of a dynamical system. It might also be similar to the CI-like behavior observed by Mandell & Selz in neural systems.

 

Among other systems, CI-like phenomena in random recurrent neural networks, which were discovered by Quoy, Banquet & Daucé, are very interesting. Their system used for robot navigation control can learn both patterns and pattern sequences. CI-like phenomena appear in this system when the input signal and the inner signal are mismatched. This behavior and function of chaos and CI-like high-dimensional activity are very similar to those Tani found in his robot control system (Tani 1998). On a related note, Breakspear & Friston suggest the involvement of NMDA channels in the neural mechanism causing the relatively rapid change of attractors. They further predict that if the phase space includes many saddles, ``typical orbits" will shadow a saddle and that this may be realized in monoamine-mediated changes of functional synaptic coupling. This prediction is worth checking. However, one question arises: Does the phenomenon of irregular transitory orbits accompanied by a saddle network that can be shadowed by typical orbits belong to the same class of statistical behavior as CI orbits? Mandell & Selz (1993) found that noise increases the residence time of orbits in the neighborhoods of unstable states, and they actually reported observing this in the hippocampus. Since NMDA channels in the hippocampus are responsible for LTP, this noise effect might guarantee the structural stability of transitory dynamics through noise-induced shadowing.

 

As described above, CI-like phenomena have been found in many neural systems. Most researchers are mainly concerned with the topological similarity of these phenomena, but what we have asserted as their important characteristics are as follows. (1) The appearance of many approximately zero Lyapunov exponents, but with large fluctuations. (2) The presence of nonstationary statistics, so that convergence theorems might not hold. These observations regarding the statistics of the CI in our network model indicate the non-existence of shadowing of both individual orbits and attractors. Sauer has identified this CI characteristic and proposed this non-existence as a definition of CI (Sauer 2000; Sauer et al 1997; Dawson et al 1994; Grebogi et al 1990).

 

 

R3.3 Chaotic code

 

In the target article, we stressed the functional significance of a certain class of chaos and networks. The required characteristic for the functional significance is information mixing due to large fluctuations of information flow (Matsumoto and Tsuda, 1985; 1987; 1988; Nicolis and Tsuda, 1985; 1989). This class of chaos should appear as intermittent activity. A network displaying this class of chaos can preserve input information in its dynamic activity. Thus, such a network may provide a dynamic mechanism of working memory, which should be arbitrarily long term. CI possesses the same characteristic. Furthermore, as proposed in the target article, CI consists of high-dimensional transitory dynamics which may provide a dynamic mechanism for linking memories. The linking of memories is necessary for categorization and perceptual drifts. Here let us recall the criticism made by Ikegami & Tani that since memory dynamics should be restricted by semantics and causalities under ``embodied conditions" through behavior, it is not possible to simulate memory dynamics only with CI, which does not have a clear correspondence to the real world. This criticism seems to be worth considering. In thinking about ``embodied conditions", studies with machines, like robots, are very important. However, we should not overlook the fact that the world robots are experiencing is not real, but man-made, with the experimenter's intention built in advance. A theory based on such biased experience of robots leads us to over-interpretation.

 

It is important to inquire into the nature of the neural mechanism of chaotic activity, as Érdi points out. In this regard, we identified three distinct situations (Tsuda, 1991):

(1) chaotic activity at one level results from chaotic activity existing at a lower level;

(2) chaotic activity at one level is independent of that at the lower levels, and rather it results from damped oscillations enhanced by feedback from activity at higher levels;

(3) chaotic activity at one level results from a self-organization at the lower level.

 

A representative model for each of the above situations has been investigated: Kaneko's CML and GCM for case (1), Freeman's KIII model for (2), and our dynamic memory model for (3).

 

Contrary to the assertion of Mandell & Selz, chaotic dynamical systems can be viewed as computation machines. In general, the expanding dynamics can be used to ``read" the information given initially or as an input. For instance, let us consider the discrete dynamics defined by the function f(x) = 2x, where x is a real number. Here, the variable x is represented by a binary expansion. This type of dynamics is equivalent to a shift dynamics in which the binary point is shifted one place from left to right per iteration of the dynamics. Contrastingly, contracting dynamics can be used to ``write" the information. For instance, the discrete dynamics defined by the function g(x) = x/2, where x is a real number represented by a binary expansion, is equivalent to shift dynamics in which the binary point is shifted one place from right to left per iteration. Usually, in chaotic dynamics these two types of dynamics appear alternately, and on average the process of `readout' of the information given in the initial distribution is dominant. This situation corresponds to the presence of a positive Lyapunov exponent. The function of chaotic dynamics as a computation machine can be realized in the case that the expanding and contracting dynamics are embedded by cut and paste operations in each eigen-direction, as is seen in Moore's generalized shift (Moore 1990; 1991), and also in the case that these two kinds of dynamics are well separated along each eigen-direction, as is seen in Smale's horseshoe map (Smale 1967). In particular, in the former case, a Turing machine can be embedded at each point in the phase space of a generalized shift map. In this respect, a generalized shift can be viewed as a universal Turing machine.
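
The equivalence between these maps and shifts of the binary expansion can be checked directly. The following minimal sketch uses exact rational arithmetic to avoid rounding; the helper names binary_digits, expand and contract, and the sample value 11/32, are chosen here purely for illustration.

```python
from fractions import Fraction


def binary_digits(x, n):
    """First n binary digits of the fractional part of a non-negative Fraction x."""
    x = x - (x.numerator // x.denominator)  # drop the integer part
    digits = []
    for _ in range(n):
        x *= 2
        d = x.numerator // x.denominator    # next digit, 0 or 1
        digits.append(d)
        x -= d
    return digits


def expand(x):
    """Expanding dynamics f(x) = 2x: moves the binary point one place to the right."""
    return 2 * x


def contract(x):
    """Contracting dynamics g(x) = x/2: moves the binary point one place to the left."""
    return x / 2


if __name__ == "__main__":
    x = Fraction(11, 32)                     # 0.01011 in binary
    print(binary_digits(x, 5))               # [0, 1, 0, 1, 1]
    print(binary_digits(expand(x), 5))       # [1, 0, 1, 1, 0]
    print(binary_digits(contract(x), 5))     # [0, 0, 1, 0, 1]
```

Each application of expand brings the next digit of the initial condition to the front, which is the sense in which expansion ``reads" information; each application of contract pushes the existing digits to finer scales, leaving room for a new leading digit to be ``written".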

 

An essential feature of the horseshoe map as a chaotic dynamical system is described by the transformations

f(x, y) = (2x, ay) (for 0 < x < 1/2) and f(x, y) = (2 - 2x, 1 - ay) (for 1/2 < x < 1), where 0 < a < 1/2.

Here, the dynamics of x are expanding, chaotic dynamics that are independent of y, and the dynamics of y, which consist of two types of contracting dynamics, depend on x. A horseshoe map is the simplest example of a chaos-driven contracting system. The x variable is responsible for reading the information provided by the initial conditions, and the read-out of this information is written in the dynamics of the y direction. Actually, in the contracting case, 0 < a < 1/2, a Cantor set is generated along the y direction. This observation led us to the study of Cantor coding in chaos-driven contracting systems. In neural systems, unidirectional coupling usually produces an overlapped IFS, in which information is lost. In a totally disconnected IFS, this loss of information does not occur, and thus in this case coding and decoding have a one-to-one correspondence (see also Aihara & Ryue).
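
The one-to-one correspondence in the totally disconnected case can be illustrated with the transformations given above. In the sketch below, the value a = 0.3, the initial value y0 = 0.5, and the function names are assumptions made for illustration; the symbol 0 stands for a visit of x to (0, 1/2) and the symbol 1 for a visit to (1/2, 1). The sequence of symbols is written into y by the two contractions and can be read back from y alone.

```python
from itertools import product

A = 0.3  # hypothetical contraction rate with 0 < A < 1/2


def step(x, y, a=A):
    """One iteration of the piecewise-linear horseshoe-type map given in the text."""
    if x < 0.5:
        return 2.0 * x, a * y
    return 2.0 - 2.0 * x, 1.0 - a * y


def encode(symbols, y0=0.5, a=A):
    """Write a 0/1 symbol sequence into y by applying the selected contractions."""
    y = y0
    for s in symbols:
        y = a * y if s == 0 else 1.0 - a * y
    return y


def decode(y, n, a=A):
    """Read back the last n symbols from y, most recent symbol first."""
    out = []
    for _ in range(n):
        if y <= a:                 # image of branch 0 is [0, a]
            out.append(0)
            y = y / a
        else:                      # image of branch 1 is [1 - a, 1]
            out.append(1)
            y = (1.0 - y) / a
    return out


if __name__ == "__main__":
    for s in product((0, 1), repeat=3):
        y = encode(s)
        print(s, round(y, 6), decode(y, 3))
```

Because the two images [0, a] and [1 - a, 1] do not overlap when a < 1/2, distinct symbol sequences of a given length land in distinct subintervals, and decoding recovers the sequence in reverse temporal order. For a > 1/2 the images overlap and this inversion is no longer well defined.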

 

Borisyuk and Érdi asked about the advantage of chaotic coding. As mentioned in the target article, the advantage of Cantor coding is the ability to encode and decode a large amount of information hierarchically in some finite region of phase space, that is, with a restricted activity level. In other words, a set of temporal patterns with infinite length can be hierarchically embedded, in principle. This coding is robust with respect to noise up to some depth. In the hippocampus, embedding of a large amount of information with an extremely long code for a short period is not necessary, and hence this coding is realistic, even in a noisy environment. Hierarchical embedding in terms of Cantor coding in the hippocampus may represent the emergence of a grammar concerning the time order of events. In CA1 or PPC, the neural activity changes in a short time, on the order of 100 msec. This implies that Cantor sets can only be observed by the superposition of snapshots of activity during an interval of approximately 100 msec. The functional significance of the metric of Cantor coding, about which Heath inquires, lies in the identification of the closeness of episodes as the closeness of codes. Through the introduction of such a metric, we can realize that any code in a code sequence can be a cue signal for the association of episodes.

 

Raffone & van Leeuwen demonstrated one merit of chaotic coding by showing that a flexible synchrony of chaotic neural activity is more effective than a stable synchrony of periodic activity. They propose to use this effectiveness to solve the binding problem. Friston (1997) also discussed the significance of transient coding, which is associated with a transient motion, and he confirmed its existence in some functional-MEG data. These are nice realizations of our idea that the dynamic link of memories in terms of chaos and CI may provide a means of flexible information processing in perception (Tsuda, 1993; 1996; Kaneko and Tsuda, 2001). The ``binding" of features shared by different objects through the synchrony of chaotic oscillations should inevitably generate an alternation of synchronized and desynchronized states. This alternation activity should be CI-like transitory dynamics. The strengths of interactions among oscillations determine synchronization. In opposition to this, chaos is effective for causing rapid desynchronization, because of its characteristic exponential divergence of nearby orbits. Contrary to the assertion of Raffone & van Leeuwen, we still think that the binding problem is only a pseudo-problem. To solve the binding problem, people have used spike coincidence and neural oscillations, that is, temporal information, because rate coding fails for this problem. It is not yet clear if the cause of this problem is spike coincidence or neural oscillations. This is something of a chicken-and-egg problem. If an oscillation is periodic, or binding is created by the coincidence of feature-detecting neurons, nonflexible operations and even combinatorial explosion cannot be avoided. Ironically, in such a nonflexible case, the concept of ``binding" is appropriate. In order to avoid this difficulty, and to make ``binding" functional, we must abandon the concept of bound feature(s). If we use chaotic oscillations, a flexible synchrony can appear. In such a case, the ``binding" process will proceed in the neural dynamics without the help of feature-detecting neurons. The term ``binding" cannot be an element of an adequate language system.

 

 

R3.4 Real chaos?

 

Borisyuk, Freeman, Kowalik, Liljenström and Molnár point out the difficulty of discriminating high-dimensional chaos from noise. With regard to this, we first note that the chaos analysis of experimental data is still at an immature level. We believe that there will be great development of chaotic dynamical systems analysis in the future. Before the discovery of deterministic chaos, the analysis of random phenomena was commonly carried out by first finding the probability distribution of an appropriate random variable and then calculating average values and fluctuations of observables using this distribution. The true fluctuations can be approximated by calculating the second, third, and (if necessary) higher order moments of the distribution. Also, in time-series analyses, the autoregression method has been used in linear prediction theory. Recently, Okabe et al. proposed a new statistical method that includes nonlinear filters, which has proved to be effective when the data are stationary (Okabe and Nakano 1991; Okabe and Inoue 1994; Okabe and Yamane 1998; Okabe and Kaneko 2000). However, because the discovery of deterministic chaos implies that a certain class of random phenomena can be described by a deterministic rule, such as that provided by a dynamical system on some smooth manifold, it has come to be believed that many types of random phenomena result from deterministic chaos, that their randomness originates from a nonlinear transformation of phase space, and further that a random time series can be considered a projection of orbits on a manifold onto the real axis. Unfortunately, however, chaos analysis in its present form, and especially the embedding technique, is feasible only for relatively low-dimensional dynamical systems. It is ineffective for extremely high-dimensional cases and also in the presence of nonstationarity.

 

Given this present situation of our understanding of chaos, Kowalik states that there is no strict boundary between noise and high-dimensional deterministic chaos, in the sense that we are not able to clearly distinguish between them. However, before jumping to a conclusion in this regard, it is prudent to note Liljenström's observation that chaos is predictable over short time scales, whereas noise is unpredictable over any time scale, so that the two cannot be discriminated over long time scales. Concerning this point, it is also important to note that in statistical physics, the hypothesis of molecular chaos at a microscopic level is necessary to derive the velocity distribution of an ideal gas as a macroscopic quantity. The presence of molecular chaos guarantees the ergodicity of the system. With respect to this velocity distribution, physical properties of a gas can be expressed as an average plus a variance. This method is very often formally applied to other stochastic phenomena. Usually in such treatments, the average term is viewed as a deterministic component and the variance term as noise, equivalent to molecular chaos. Since a biological system is not a Hamiltonian system but a dissipative system, we are concerned with far-from-equilibrium conditions. To maintain such a system in a far-from-equilibrium state, an external source of energy is necessary. Therefore, a far-from-equilibrium system is generally held at a high energy level. Under such conditions, aperiodic and unpredictable behavior of the averaged deterministic component is often observed. In order to discriminate this deterministic random behavior from molecular chaos, physicists have referred to the former as ``deterministic chaos" or ``macroscopic chaos". Since these chaotic states appear in a far-from-equilibrium system, deterministic chaos should have a much greater power than noise. During the early stage of the study of deterministic chaos, the indicator of such chaos used in experiments was the power spectrum. There are two merits of using the power spectrum to discriminate chaos from noise. First, while both chaos and noise have continuous spectra, the power of chaos is much greater than that of noise, which is almost negligible. Second, since chaos appearing in dynamical systems is generated by bifurcations, one can confirm the existence of chaos by varying control parameters. From this standpoint, enhanced noise can be interpreted as resulting from chaos.
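
The difference in short-term predictability noted above can be illustrated with a small numerical sketch. It compares nearest-neighbour forecasts for a chaotic series and for uniform random noise. The logistic map, the nearest-neighbour predictor, the library size, and the horizons are all assumptions made for illustration, and the example is low-dimensional, so it says nothing about how well the distinction survives in high-dimensional data.

```python
import random


def logistic_series(n, x0=0.3, r=4.0):
    """Return n successive values of the logistic map x -> r*x*(1-x)."""
    values = []
    x = x0
    for _ in range(n):
        values.append(x)
        x = r * x * (1.0 - x)
    return values


def nn_forecast_error(series, n_library, horizon):
    """Mean absolute error of nearest-neighbour forecasts `horizon` steps ahead.

    The first n_library samples serve as a library of past behaviour; each
    later sample is forecast by following its nearest library neighbour.
    """
    n_candidates = n_library - horizon  # neighbours whose future is known
    errors = []
    for t in range(n_library, len(series) - horizon):
        j = min(range(n_candidates), key=lambda i: abs(series[i] - series[t]))
        errors.append(abs(series[j + horizon] - series[t + horizon]))
    return sum(errors) / len(errors)


rng = random.Random(0)  # fixed seed so the noise series is reproducible
chaos = logistic_series(700)
noise = [rng.random() for _ in range(700)]

chaos_short = nn_forecast_error(chaos, n_library=500, horizon=1)
chaos_long = nn_forecast_error(chaos, n_library=500, horizon=12)
noise_short = nn_forecast_error(noise, n_library=500, horizon=1)
```

In this toy setting the one-step error for the chaotic series is small, while its twelve-step error and the one-step error for noise are both large, in line with the observation that the two cannot be separated by prediction over long horizons.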

 

It should be further noted here that most common methods of experimental data analysis are problematic. In order to determine whether the neural activity observed in any given case is described by CI, the measurement of neural activity over a long time is necessary. Interestingly, the data found in long-term measurements of neural activity usually exhibit nonstationarity. This is in contrast to the case of shadowing of an entire attractor, i.e. a set of orbits, in which the long-term nature of a measurement implies stationarity. When we wish to study nonstationary neural phenomena experimentally, is it possible to use conventional methods of measurement and analysis? When we wish to observe the neural mechanism corresponding to a single act, are data obtained as the average of neural activity measured over many trials, such as a firing rate or correlation coefficients, meaningful? If so, what is the assumed condition? To use statistical quantities under the assumption of a stationary process reflects the belief that a single time series of neural activity is meaningless or that such a time series possesses ergodicity. However, ergodicity is unlikely to hold for behavior-related neural activity. Therefore, people who attempt to use (stationary) statistical quantities in effect deny the meaningfulness of a single time series of neurons or neuron assemblies. But a single time series of neural activity has been observed to be associated with a single act in the laboratory, and hence it is known that such an activity is indeed meaningful. It would thus seem that we have to invent a new dynamical systems analysis which is able to treat high-dimensional and/or nonstationary data.
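
The point about trial averaging can be made concrete with a toy sketch in which ergodicity fails by construction. Each simulated trial settles into one of two states for its whole duration, so the average over trials does not resemble any single trial. The two-state process, the noise level, and all names are hypothetical; no neural data are modeled.

```python
import random


def simulate_trial(rng, n_samples=100, noise_sd=0.2):
    """One trial: a hidden state (+1 or -1) is chosen once and then held."""
    state = rng.choice([-1.0, 1.0])  # hypothetical state visited in this trial
    return [state + rng.gauss(0.0, noise_sd) for _ in range(n_samples)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


rng = random.Random(1)  # fixed seed for reproducibility
trials = [simulate_trial(rng) for _ in range(200)]

# Time average within each single trial (one number per trial).
time_averages = [mean(trial) for trial in trials]

# Average over trials, as in conventional trial-averaged statistics.
ensemble_average = mean(time_averages)
```

Here every single-trial time average is close to +1 or -1, while the average over trials is close to 0, a value that no individual trial takes. Statistical quantities computed under the ergodic assumption would therefore misrepresent each single time series in this example.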

 

 

R4. Dynamic brain revisited

 

R4.1 Multiple codes

 

As Dinse points out, a cortical ``module" is flexible enough to adapt to rapid changes in the environment, allowing for the link between fast time scales on the order of msec and time scales of learning. Here, the alternation of synchronization and desynchronization of the activity of neuron assemblies often appears in association with this adaptation process. This chaotic alternation between synchronization and desynchronization could be described by CI. In this case, the output resulting from an input is determined by the internal dynamics, which are not fixed as a rigid input-output relation or a stimulus-response relation, but change flexibly, in a manner that depends on the outputs (see Freeman, Kay, and Kozma). Thus there is a feedback of the action from the environment to the system that generates internal dynamics. Our idea is that any feedback represents a code at some level. We think that this might be one origin of multiple codes in neurons or neuron assemblies. In general, a feedback process cannot exist in a hierarchical information processing system. If there were some feedback, an originally hierarchical structure would be broken, resulting in multiple codes. Foster introduced John's work on the interactions of coherent ensembles in neural cell assemblies. Here we briefly introduce Sakurai's series of studies on multiple codes based on neural cell assemblies.

 

Sakurai studied the hippocampal and temporal cortical neuron activity exhibited during the performance of simple auditory, simple visual, and configural auditory-visual discrimination tasks. He found behavior-correlated activity of neurons, which emerged as task related. It was found that approximately one third of the task-related neurons overlap. A single neuron's activity represents the difference between stimuli to be memorized and stimuli to be discriminated in a given task. However, cell assemblies that arise through functional connections between neurons are necessary in order to represent the difference between kinds of tasks. He called this sharing of roles between individual neurons and neuron assemblies ``dual coding". From this viewpoint of cell assemblies, the function of a single neuron is not fixed, but changes flexibly depending on its relations with other neurons. A single neuron can belong to many different cell assemblies, and for this reason, a single neuron can represent different functions in manners related to task, purpose, the functions of other neurons, etc.

 

We believe that by taking into account macro-action, as Kay suggests, the existence and function of multiple codes will become clearer. A study by Iwamura and Tanaka (1978) reports the discovery of active touch-related neurons in the somatosensory cortex of the monkey. These neurons become active only when a monkey holds an object that it has come to possess through its own action; that is, such neurons do not respond when an object is placed in its hands.

 

These findings show that the presence of feedback from behavioral levels to individual neurons and neuron assemblies generates multiple codes at neuronal levels. We emphasize again that feedback signals carry codes corresponding to action, not action itself.

 

 

R4.2 Dynamic memory

 

We have proposed a dynamic memory model for episodic memory and also for olfactory perception. Here, for the first time, following Foster, we present the definitions of episodic memory, semantic memory and working memory, which we envisage in the target article. Our definitions of semantic and episodic memories basically follow Tulving (1972), but we have added a new perspective. Declarative memory is classified into two categories, semantic memory and episodic memory. Semantic memory is memory consisting of general knowledge. Semantic memory is apparently separated from the spatio-temporal causality of events occurring in our daily experience. The database in a computer is similar to semantic memory in this sense. However, since knowledge is essentially internal (Gernert, 1996), semantic memory may be represented in a manner that depends on the internal dynamics, and hence can change, while a database is external and fixed.

 

Episodic memory is that concerning individual experience in the spatio-temporal context. This individual experience includes ``future memory" consisting of plans for future actions (Meacham and Leiman 1982; Tsukada 1992). Meacham and Leiman call this ``prospective remembering". Individual experience is, in general, memorized chronologically, but it can be memorized according to causality if there operates a mechanism that gives precedence to consistency among experienced events, in which the prefrontal cortex participates and by which the hippocampal dynamics can be influenced. In our dynamic memory model, we treated this type of causality using a chaotic rule generated through the interactions between internal dynamics and external information (acquisition of knowledge). The generated CI provides a flexible grammar for linking memories of events, in which highly correlated memories are linked. In order to develop the theory and the model in the manner in which Ikegami & Tani suggest, introducing an explicit state dependence of the assigned probabilities and a mechanism of learning probabilities should be helpful (see also R2.4).

 

We follow Baddeley's definition (1986) of working memory. According to Baddeley, working memory consists of a consciousness-related system providing a procedure for obtaining knowledge and a temporary storage of knowledge, which is necessary for performing complex cognitive tasks. For this reason, Baddeley calls this ``active memory".

 

In the target article, semantic memories are assumed to be represented as dispersed spatial patterns in the network. We represent them by dynamical fixed points of the Milnor type. A weakly collapsed Milnor attractor can form an attractor ruin. Since we assume that a chain of knowledge associated with experienced events forms an individual episode, we consider episodic memory to consist of a chain of semantic memories. We have cited two possibilities: that in which a code sequence representing chaotic orbits which link events is embedded in Cantor sets, and that in which a series of events is embedded in Cantor sets. In either case, the Cantor coding of a chain of knowledge is equivalent to the decoding of an episode.
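
The idea of embedding a sequence of events in a Cantor set can be sketched with a toy iterated function system. Each event symbol applies a contracting map, so that the final state lies in a nested cluster whose position records the recent history. The two-symbol alphabet, the contraction ratio 1/3, the initial state, and the function names are illustrative assumptions; this is not the network model of the target article.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)


def encode(symbols, x0=Fraction(1, 2)):
    """Drive contracting maps by a symbol sequence; return the final state.

    Symbol 0 applies x -> x/3, symbol 1 applies x -> x/3 + 2/3, so the state
    ends up in a sub-interval of the middle-third Cantor construction.
    """
    x = x0
    for s in symbols:
        if s not in (0, 1):
            raise ValueError("symbols must be 0 or 1")
        x = THIRD * x + (Fraction(2, 3) if s == 1 else 0)
    return x


def decode(x, n_symbols):
    """Read back the last n_symbols events, most recent first."""
    history = []
    for _ in range(n_symbols):
        if x > Fraction(1, 2):
            history.append(1)
            x = 3 * x - 2  # invert x -> x/3 + 2/3
        else:
            history.append(0)
            x = 3 * x      # invert x -> x/3
    return history


episode = [1, 0, 0, 1, 1, 0]        # hypothetical sequence of events
code = encode(episode)
recovered = decode(code, len(episode))
recent_only = decode(code, 3)       # coarser level: only the last three events
```

Exact rational arithmetic is used so that decoding is not affected by rounding. Sequences that share their most recent events are mapped to nearby points, which is the hierarchical clustering that makes reading the code at a given level of the set equivalent to recovering a stretch of the episode.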

 

Ikegami & Tani addressed an important question concerning a seemingly paradoxical feature of memory structure. On one hand, memory structure appears stable, but on the other hand memory dynamics are chaotic. We do not think that this is a paradox. Dynamics that are unstable in the usual sense are not always unstable from the information theoretical point of view. In chaos with non-uniform invariant measure that is absolutely continuous with respect to the Lebesgue measure, and in transitory dynamics like CI, there exist quantities that remain stable in the unstable dynamics of orbits. One such quantity is the difference between the Kullback information before and after applying the Perron-Frobenius operator. In the case with uniform invariant measure, this quantity is equivalent to the maximum Lyapunov exponent, but in the case with non-uniform invariant measure, it is related to the fluctuations of Lyapunov exponents. In the latter case, the time-dependent mutual information provides an appropriate quantification of such fluctuations. The slow decay of the mutual information in time reflects an information mixing, which ensures the conservation of information content through the dynamics in coupled non-uniform chaos (see also Section R3.3). This implies that input information repeatedly appears and disappears in each local element but is globally maintained. A coupled system can then be viewed as an information channel, though its dynamics are chaotic. In other words, the inputs can be extracted as outputs, even though the state of the channel is chaotic. We believe that the appearance and disappearance of the information in places over the system carries meaning.
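
The fluctuations of Lyapunov exponents referred to above can be illustrated numerically for a map with non-uniform invariant measure. The sketch below averages log|f'(x)| over short windows along one orbit of the logistic map. The choice of map, the window length, and the names are assumptions made for illustration, and the sketch does not compute the Kullback information or the mutual information themselves.

```python
import math


def finite_time_exponents(n_steps, window, x0=0.3, r=4.0):
    """Average log|f'(x)| over consecutive windows along one orbit.

    f(x) = r*x*(1-x) is the logistic map, with derivative f'(x) = r*(1-2x).
    """
    exponents = []
    x = x0
    total = 0.0
    for step in range(1, n_steps + 1):
        slope = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = r * x * (1.0 - x)
        if step % window == 0:
            exponents.append(total / window)
            total = 0.0
    return exponents


local = finite_time_exponents(n_steps=5000, window=10)
global_exponent = sum(local) / len(local)   # long-time average
spread = max(local) - min(local)            # size of the fluctuations
```

For r = 4 the long-time average approaches ln 2, the known Lyapunov exponent of this map, while the windowed values scatter around it. In a map with uniform invariant measure and constant slope the windowed values would all coincide, which is the contrast drawn in the text.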

 

Many theories and models of learning and memory have been proposed. However, in general these models lack an explicit coding scheme and a description of its relation to neurodynamics. McClelland, McNaughton and O'Reilly (1995) have discussed the details of a possible mechanism for the consolidation of memory. For this reason they are concerned with temporally graded retrograde amnesia, which typically appears in patients with hippocampal lesions, such as HM. The presence of temporally graded retrograde amnesia indicates a consolidation of memory based on a continual interaction between the hippocampal system and the neocortical system. We have constructed a model for a CA3-CA1 interacting system, which we present in a separate paper (Tsuda and Kuroda, 2001; Kuroda and Tsuda, 2001). In this model, cholinergic and GABAergic innervations are introduced, and Cantor coding is found. The new point in such models is that the phase of theta rhythms may control whether the dynamics in CA3 become CI-like dynamics or stable attractor dynamics. In connection with the comments of Foster and Heath, we wish to consider the method by which a stimulus sequence is recalled by use of Cantor codes. There could be two types of recall, direct and indirect. When a person experiences some events, sensory stimuli enter CA3 from the entorhinal cortex, under the influence of internal dynamics, and as a result, CA3 begins to display associative dynamics, such as CI. Then, a Cantor code is retrieved from this partial sequence of events in a manner that depends on the length of the sequence of events. In this way, the recall of an episode from partial information is possible. This is the usual situation in the recall of a stimulus sequence. Another type of recall that can occur is here called the ``Proust phenomenon". In ``À la recherche du temps perdu," by Marcel Proust, the character Marcel suddenly recalls an episode which had been forgotten when he puts a madeleine soaked in black tea into his mouth. We all experience this type of recall of episodes in our daily life. We offer a hypothesis about the mechanism of the ``Proust phenomenon" in which we employ Cantor coding. The distinguishing characteristic of the ``Proust phenomenon" is that a specific stimulus, which previously had no relation to any episodic memories, triggers a complete recall of some episode. In this situation, CA3 cannot be stimulated directly by such a stimulus, but rather, it must be the case that CA1 receives direct stimulation from the entorhinal cortex. A direct perforant path from the entorhinal cortex to CA1 allows for this. Since in CA1 a code sequence is embedded in a cluster of a Cantor set, a certain level of this Cantor cluster contains a single code corresponding to a stimulus which is created in the sensory cortices or the entorhinal cortex. Then, at such a level, a code sequence can be evoked in CA1 or in the neocortex through the temporal evocation of a trace of the code sequence in CA1. This hypothesis is consistent with hypothesis 2 in Treves and Rolls (1994), where they state that the perforant path may be involved in the carrying of a cue signal that can initiate the retrieval of an episode.

 

We here emphasize that a feedback signal in the brain should consist of a code, so that anatomical couplings do not imply the usual formalization of synaptic connections, nor the usual formalization of oscillation couplings. What we envisage is as follows. We believe that the above-described situation holds in the connections from the prepyriform cortex to the olfactory bulb, and also in those from CA1 to CA3 through the neocortex and the entorhinal cortex, in both of which we hypothesize the formation of Cantor coding, described by

dx/dt = F(x) + c h_y,   dy/dt = G(y) + w x,

where h_y = 0 if y is included in a certain level of cluster, and h_y = 1 otherwise.
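
The structure of the coupled equations above, in which the feedback to x is a gate that depends only on whether y lies in a given cluster, can be sketched with a forward-Euler integration. The response does not specify F, G, or the cluster, so the scalar state variables, the linear relaxation used for F, the frozen dynamics used for G, the interval used as the cluster, and all parameter values below are hypothetical placeholders.

```python
def simulate(F, G, in_cluster, x0, y0, c, w, dt=0.01, n_steps=2000):
    """Forward-Euler integration of dx/dt = F(x) + c*h, dy/dt = G(y) + w*x.

    The gate h is 0 while y lies inside the cluster and 1 otherwise.
    """
    x, y = x0, y0
    for _ in range(n_steps):
        h = 0.0 if in_cluster(y) else 1.0
        dx = F(x) + c * h
        dy = G(y) + w * x
        x, y = x + dt * dx, y + dt * dy
    return x, y


def relax(x):
    """Hypothetical F: linear relaxation toward zero."""
    return -x


def frozen(y):
    """Hypothetical G: y does not move on its own."""
    return 0.0


def cluster(y):
    """Hypothetical cluster: a fixed interval of y values."""
    return -0.5 <= y <= 0.5


# y held inside the cluster: the gate is closed and x relaxes to 0.
x_inside, _ = simulate(relax, frozen, cluster, x0=1.0, y0=0.0, c=0.5, w=0.0)

# y held outside the cluster: the gate is open and x settles at c.
x_outside, _ = simulate(relax, frozen, cluster, x0=1.0, y0=2.0, c=0.5, w=0.0)
```

The sketch only shows that x receives a discrete signal about the cluster membership of y rather than the value of y itself, which is one way to read the statement that the feedback consists of a code.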

 

 

Acknowledgement

 

We would like to express our sincere thanks to Takao Namiki for critically reading the manuscript of this response paper. We also thank Michael Breakspear and Karl Friston, for kindly answering a basic question we asked them on the symmetry they used in their commentary, and Jaroslav Stark, for informing us of many related papers on IFS.

 

 

References

 

Aihara, K., Takabe, T., and Toyoda, M. (1990) Chaotic neural networks. Physics Letters A 144: 333-340.

 

Amari, S. and Maginu, K. (1988) Statistical neurodynamics of associative memory. Neural Networks 1: 63-73.

 

Alexander, J., Yorke, J., You, Z., and Kan, I. (1992) Riddled basins. International Journal of Bifurcation and Chaos 2: 795-813.

 

Arbib, M. A., Érdi, P. and Szentágothai, J. (1998) Neural Organization. Structure, Function, and Dynamics. A Bradford Book, The MIT Press, Cambridge, London.

 

Arbib, M. A. and Hesse, M. B. (1986) The Construction of Reality. Cambridge University Press, London.

 

Baddeley, A. D. (1986) Working Memory. Clarendon Press, Oxford.

 

Blomfield, S. and Marr, D. (1970) How the cerebellum may be used. Nature 227: 1224-1228.

 

Bressloff, P. C. and Stark, J. (1992) Analysis of associative reinforcement learning in neural networks using iterated function systems. IEEE Transactions on Systems, Man, and Cybernetics 22: 1348-1360.

 

Chawanya, T. (1995) A new type of irregular motion in a class of game dynamics systems. Progress of Theoretical Physics 94: 163-179.

 

Chawanya, T. (1997) Coexistence of infinitely many attractors in a simple flow. Physica D 109: 201-241.

 

Dawson, S., Grebogi, C., Sauer, T., and Yorke, J. (1994) Obstructions to shadowing when a Lyapunov exponent fluctuates about zero. Physical Review Letters 73: 1927-1930.

 

Descartes, R. (1701) Regulae ad directionem ingenii. In: Oeuvres de Descartes, publiées par Ch. Adam et P. Tannery, 1701, Amsterdam (Japanese translation by M. Noda, Iwanami publ., 1973, 22nd edition).

 

Érdi, P. (1996) The brain as a hermeneutic device. BioSystems 38: 179-189.

 

Érdi, P. and Tsuda, I. (2001) Hermeneutic approach to the brain: Process versus device? Theoria et Historia Scientiarum vol. VI (in press).

 

Eriksson, P. S., Perfilieva, E., Björk-Eriksson, T., Alborn, A.-M., Nordborg, C., Peterson, D. A., and Gage, F. H. (1998) Neurogenesis in the adult human hippocampus. Nature Medicine 4: 1313-1317.

 

Feudel, U., Grebogi, C., Poon, L., a