1 Preface

Location: House 6, Lecture Hall - Campus Nord
Time: Tue., 16:00-18:00
Contact: Email
Script: under construction

It is still a great challenge to come up with quantitative assertions about complex biological systems. Aiming for a functional understanding at the cell, tissue or organismic level typically leads to quite involved models. Analytical progress can be made with simplifying assumptions and the restriction to limiting cases.

This course will discuss a potpourri of mathematical methods ranging from analytical techniques to numerical methods and inform the participant on how to apply them to answer biological questions, and to have enormous fun in doing so.

The mathematical techniques encompass stochastic systems, numerical bifurcation analysis, information theory, perturbation theory and Fourier analysis. The biological examples are chosen mostly from neurobiology and sensory ecology, but also from bacterial communication and evolution.

1.1 (Dis)claimer

This scriptum shall evolve. It will have flaws (didactic ones and outright errors), but you, the reader, student, search bot, provide the selective pressure to improve it. Please write to the lecturer if you find fault of any kind with it.

2 The Floquet theory of the action potential

2.1 Periodic event

Of many natural systems we observe only some drastic events.

The clockwork-regular pulses of pulsars [@]

Action potentials
There is a voltage threshold. The dynamics is described by conductance-based equations.
Wake-up time
Melatonin level below threshold (suprachiasmatic nucleus)
Cell division
Gap phase 1 $\to$ synthesis $\to$ gap phase 2 $\to$ mitosis

2.2 Tonic spikes are limit cycles

Many interesting differential equations describing biological dynamics can be cast into the following form

  1. $\dot{\vec x} = \vec F(\vec x,\vec\theta) + \vec\eta(\vec x,t)$

where $\vec x$ is a vector of state variables, e.g., voltage and kinetic variables. $\vec F$ is the time-independent (homogeneous) flow field. $\vec\eta$ is a (often assumed small), possibly time-dependent (inhomogeneous) perturbation to the system. $\vec\theta$ are the system's parameters. The change of solutions in response to variations in the system's parameters will be studied later.

Example: Hodgkin & Huxley equations
From Kirchhoff's law of current conservation at the membrane follows a dynamics for the membrane voltage, $c\dot v+\sum_k^{K}I_k^\mathrm{ion}(v,\vec m_k)={I_{\mathrm{in}}}$. The first term is the capacitive (pseudo) current, followed by a sum of currents from voltage-dependent ion channels and the applied input current ${I_{\mathrm{in}}}$. The ion channel dynamics is governed by kinetic equations, i.e., chemical reactions between open and closed states. The reactions often follow first-order kinetics, ${\underline \tau}_k(v)\dot{\vec m}_k = {\underline M}_k^{(\infty)}(v) - \vec m_k$. ${\underline \tau}_k(v)$ is the matrix of kinetic time constants and ${\underline M}_k^{(\infty)}(v)$ are the steady-state activation curves. ${\underline \tau}$ and ${\underline M}^{(\infty)}(v)$ are typically diagonal if the channels are independent. These equations will be derived more formally in Sec. XX. The state vector is then $\vec x=[v,m_{11},...,m_{K1},...]^{\dagger}$. In the absence of perturbations Eq. (1) becomes a homogeneous equation

  1. $\dot{\vec x} = \vec F(\vec x,\vec\theta)$

and we assume the existence of a ${P}$-periodic limit cycle solution, ${\vec x_{\mathrm{LC}}}(t)={\vec x_{\mathrm{LC}}}(t+{P})$, also known as a periodic orbit.
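To make the example concrete, here is a minimal numerical sketch of Eq. (2) for the classical Hodgkin & Huxley model (the standard 1952 parameter values; the constant input current is an assumed value that puts the model into the tonically spiking regime), integrated with scipy:

    # Minimal sketch: tonic spiking (limit cycle) of the classical Hodgkin-Huxley
    # equations. Parameters are the standard 1952 values; the input current
    # I_in = 10 uA/cm^2 is an assumed value in the spiking regime.
    import numpy as np
    from scipy.integrate import solve_ivp

    C, gNa, gK, gL = 1.0, 120.0, 36.0, 0.3      # uF/cm^2, mS/cm^2
    ENa, EK, EL = 50.0, -77.0, -54.387          # mV
    I_in = 10.0                                  # uA/cm^2 (assumed)

    def rates(v):
        am = 0.1*(v+40)/(1-np.exp(-(v+40)/10)); bm = 4*np.exp(-(v+65)/18)
        ah = 0.07*np.exp(-(v+65)/20);           bh = 1/(1+np.exp(-(v+35)/10))
        an = 0.01*(v+55)/(1-np.exp(-(v+55)/10)); bn = 0.125*np.exp(-(v+65)/80)
        return am, bm, ah, bh, an, bn

    def F(t, x):
        """Homogeneous flow field F(x) of Eq. (2); state x = [v, m, h, n]."""
        v, m, h, n = x
        am, bm, ah, bh, an, bn = rates(v)
        I_ion = gNa*m**3*h*(v-ENa) + gK*n**4*(v-EK) + gL*(v-EL)
        return [(I_in - I_ion)/C, am*(1-m)-bm*m, ah*(1-h)-bh*h, an*(1-n)-bn*n]

    sol = solve_ivp(F, (0, 200), [-65, 0.05, 0.6, 0.32], max_step=0.05)
    v = sol.y[0]
    # crude spike detection (upward crossings of 0 mV) gives the orbit period P
    spikes = sol.t[1:][(v[1:] >= 0) & (v[:-1] < 0)]
    print("approximate period P =", np.diff(spikes)[-1], "ms")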

A neurobiological dogma states that action potentials follow the all-or-nothing principle1. This can be construed to mean that the exact shape of the action potential does not matter2 and that information is stored in the exact spike times3. It also means that the action potential should be stable under perturbations, but that inputs ought to be able to shift the occurrence of spikes in time. If you want, from the dogma and the intent to code into spike times follow two desiderata:

  1. The action potential must be stable in amplitude,
  2. yet neutrally stable in phase.

To wit, desideratum 2 just means that stimulus-induced phase shifts should neither decay nor blow up.

2.3 Limit cycle stability

Stability is probed by studying small perturbations to an invariant-set solution. Our invariant set is the limit cycle (periodic orbit) ${\vec x_{\mathrm{LC}}}$. Assuming there was a small perturbation to the system, the solution can be decomposed as

  1. $\vec x(t)={\vec x_{\mathrm{LC}}}(t)+\vec y(t)$

with $\forall t:\|\vec y(t)\|<\epsilon$ some “small” perturbation to the orbit. What “small”, i.e., $\epsilon$, means we do not want to say now, maybe later, let's see …

Assuming the perturbation was transient (only in initial conditions) and the system is homogeneous again we plug the Ansatz of Eq. (3) into Eq. (2) and get

$\frac{\mathrm{d}}{{\mathrm{d}}t}{\vec x_{\mathrm{LC}}}(t)+\dot{\vec y}(t)=\vec F({\vec x_{\mathrm{LC}}})+\underbrace{\nabla\vec F({\vec x_{\mathrm{LC}}})}_{{{{\underline J}}}({\vec x_{\mathrm{LC}}}(t))} \cdot \vec y(t)$

The Jacobi matrix evaluated on the limit cycle can be written as a function of time, ${{{\underline J}}}(t)=\nabla\vec F({\vec x_{\mathrm{LC}}}(t))$. Note that since the limit cycle solution is ${P}$-periodic, so is ${\underline J}(t)={\underline J}(t+{P})$.

Identifying the limit cycle solution above we are left with the first variational equation of Eq. (2)

  1. $\dot{\vec y}(t)={{{\underline J}}}(t)\vec y(t).$

Hence, one needs to study a linear system with periodic coefficients. One solution of Eq. (4) can be guessed; let us try the time derivative of the orbit, $\frac{{\mathrm{d}}{\vec x_{\mathrm{LC}}}}{{\mathrm{d}}t}$

  1. $\frac{\mathrm{d}}{{\mathrm{d}}t}\Big(\frac{{\mathrm{d}}{\vec x_{\mathrm{LC}}}}{{\mathrm{d}}t}\Big)=\frac{\mathrm{d}}{{\mathrm{d}}t}\vec F({\vec x_{\mathrm{LC}}})=\nabla\vec F({\vec x_{\mathrm{LC}}})\frac{\mathrm{d}}{{\mathrm{d}}t}{{\vec x_{\mathrm{LC}}}}={{{\underline J}}}(t)\frac{{\mathrm{d}}{\vec x_{\mathrm{LC}}}}{{\mathrm{d}}t}.$

So it is a solution all right, and it happens to be a ${P}$-periodic solution. This solution is called the Goldstone mode. But for arbitrary initial conditions not all solutions should be periodic.

Def: Floquet Ansatz
According to Floquet theory the solution to Eq. (4) can be written in the form of a ${P}$-periodic similarity matrix4 and a matrix exponential
  1. $\vec y(t)={\underline R}(t)e^{t{\underline \Lambda}}.$

For the proof please consult (Chicone 2006); here we are just going to work with this as an Ansatz. The constant matrix ${\underline \Lambda}$ is called the Floquet matrix.

Recall the matrix exponential

Def: Matrix Exp
Let ${\underline A}\in{I\!\!\!\!C}^{n\times n}$, then $\exp{\underline A}=\sum_{k=0}^\infty\frac1{k!}{\underline A}^k$. A useful corollary of this definition is that the eigenvectors of an exponentiated matrix are the same as those of the original, and the eigenvalues become exponentiated. If $\lambda_i$, $\vec w_i$ are the eigenvalue/eigenvector pairs of the matrix ${\underline A}$, i.e., ${\underline A}\vec w_i=\lambda_i\vec w_i$, then by using this identity $k$ times
  1. $e^{{\underline A}}\vec w_i=\left(\sum_{k=0}^\infty\frac1{k!}{\underline A}^k\right)\vec w_i=\sum_{k=0}^\infty\frac1{k!}\lambda_i^k\vec w_i=e^{\lambda_i}\vec w_i$
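A quick numerical check of this corollary, using scipy's expm on an arbitrary example matrix:

    # Check: expm(A) has the same eigenvectors as A and eigenvalues exp(lambda_i).
    # The 3x3 matrix is an arbitrary random example.
    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(0)
    A = rng.normal(size=(3, 3))
    lam, W = np.linalg.eig(A)            # A w_i = lambda_i w_i
    for i in range(3):
        lhs = expm(A) @ W[:, i]
        rhs = np.exp(lam[i]) * W[:, i]
        print(np.allclose(lhs, rhs))     # should print True for every eigenpair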

Inserting the Floquet Ansatz into Eq. (4) yields

$\dot{{\underline R}}e^{t{\underline \Lambda}}+{\underline R}(t){\underline \Lambda} e^{t{\underline \Lambda}}={\underline J}(t){\underline R}(t)e^{t{\underline \Lambda}}.$

Multiplying with $e^{-t{\underline \Lambda}}$ from the right results in a dynamics equation for the similarity matrix ${\underline R}$.

  1. $\dot{{\underline R}}={\underline J}(t){\underline R}(t)-{\underline R}(t){\underline \Lambda}.$

Remember that ${\underline R}$ was invertible, so one can also derive an equation for the inverse

  1. $\frac{\mathrm{d}}{{\mathrm{d}}t}{\underline R}^{-1}=-{\underline R}^{-1}\dot{{\underline R}}\,{\underline R}^{-1}={\underline \Lambda}\,{\underline R}^{-1}(t)-{\underline R}^{-1}(t){\underline J}(t).$
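Numerically, the Floquet matrix is usually obtained by integrating the variational equation, Eq. (4), together with the orbit over one period and diagonalising the resulting monodromy matrix ${\underline R}(P)e^{P{\underline \Lambda}}=e^{P{\underline \Lambda}}$ (since ${\underline R}(P)={\underline R}(0)={\underline I}$). The sketch below does this for the Stuart-Landau oscillator, an assumed stand-in example whose period and Floquet exponents (0 and $-2$) are known in closed form:

    # Sketch: Floquet multipliers from the variational equation, Eq. (4).
    # The Stuart-Landau oscillator (an assumed example) has the limit cycle
    # x_LC = (cos wt, sin wt) with period P = 2*pi/w and exponents 0 and -2.
    import numpy as np
    from scipy.integrate import solve_ivp

    w = 1.0
    P = 2*np.pi/w

    def F(x, y):
        return np.array([x - w*y - x*(x*x+y*y), w*x + y - y*(x*x+y*y)])

    def J(x, y):                       # Jacobian of F evaluated on the orbit
        return np.array([[1-3*x*x-y*y, -w-2*x*y],
                         [ w-2*x*y,     1-x*x-3*y*y]])

    def rhs(t, z):
        x, y = z[:2]
        Y = z[2:].reshape(2, 2)        # fundamental matrix of dot(y) = J(t) y
        return np.concatenate([F(x, y), (J(x, y) @ Y).ravel()])

    z0 = np.concatenate([[1.0, 0.0], np.eye(2).ravel()])   # start on the orbit
    sol = solve_ivp(rhs, (0, P), z0, rtol=1e-10, atol=1e-12)
    M = sol.y[2:, -1].reshape(2, 2)    # monodromy matrix M = Y(P)
    mults = np.linalg.eigvals(M)
    print("Floquet multipliers:", mults)                   # ~ [1, exp(-2P)]
    print("Floquet exponents:", np.log(np.abs(mults))/P)   # ~ [0, -2]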

2.3.1 Eigensystem of the Floquet matrix

The Floquet matrix, ${\underline \Lambda}$, is constant, though not necessarily symmetric. Hence it has a biorthogonal left and right eigensystem

  1. $\Lambda\vec w_i=\mu_i\vec w_i$ and $\vec z_i\Lambda=\mu_i\vec z_i$ with $\vec z_i\cdot\vec w_j=\delta_{ij}$

One can define a “rotated” eigensystem

Def: (“Rotated” eigensystem):
$\vec W_i(t)={\underline R}(t)\vec w_i$ and $\vec Z_i(t)=\vec z_i{\underline R}^{-1}(t)$, which ipso facto obey
  1. $\forall t:\vec Z_i(t)\cdot\vec W_j(t)=\delta_{ij}$.

$\vec W_i(t)$ and $\vec Z_i(t)$ are also called the Floquet modes.

If we project Eqs. (8) and (9) onto the eigenvectors and use these definitions we get

  1. $\frac{\mathrm{d}}{{\mathrm{d}}t}{\vec W_k}=({\underline J}(t)-\mu_k{\underline I})\vec W_k(t)$

and

  1. $\frac{\mathrm{d}}{{\mathrm{d}}t}{\vec Z_k}=(\mu_k{\underline I}-{\underline J}^{\dagger}(t))\vec Z_k(t).$

If one projects the general solution $\vec y(t)$ from Eq. (6) onto the adjoint Floquet modes

$\vec Z_k(t)\cdot\vec y(t)=\vec z_k{\underline R}^{-1}(t){\underline R}(t)e^{t{\underline \Lambda}}=\vec z_ke^{t{\underline \Lambda}}=\vec z_ke^{t\mu_k}$

If $\mu_k<0$ the perturbation decays exponentially in this rotated coordinate frame.

Poincare Section
bla

Note that if $\mu_0=0$, then according to Eq. (5)

  1. $\vec W_0(t)=\frac{\mathrm{d}}{{\mathrm{d}}t}{\vec x_{\mathrm{LC}}}(t)$

is the Goldstone mode, and from Eq. (11)

  1. $\vec Z_0(\phi)\cdot\frac{\mathrm{d}}{{\mathrm{d}}t}{\vec x_{\mathrm{LC}}}(\phi)=1$

2.4 Neutral dimension and phase shifts

The evolution of the phase of Eq. (1) is given by

$\frac{{\mathrm{d}}\phi}{{\mathrm{d}}t}=\nabla\phi(\vec x)\cdot\frac{{\mathrm{d}}\vec x}{{\mathrm{d}}t}$.

To first order this is

$\frac{{\mathrm{d}}\phi}{{\mathrm{d}}t}=\nabla\phi({\vec x_{\mathrm{LC}}})\cdot\frac{{\mathrm{d}}{\vec x_{\mathrm{LC}}}}{{\mathrm{d}}t} =\nabla\phi\cdot\vec F({\vec x_{\mathrm{LC}}})+\nabla\phi\cdot\vec\eta({\vec x_{\mathrm{LC}}},t).$

There are several ways to define a phase (Hilbert transform, linear interpolation, …). A desideratum could be to have a linear phase increase in the unperturbed case ($\vec\eta=0$), say $\phi(t)={f_0}t$. [… proto-phase]. From this desideratum it follows that one must have $\forall t:\nabla\phi\cdot\vec F({\vec x_{\mathrm{LC}}})={f_0}$. Given Eq. (16), this is easily achieved with the following identification

  1. $\nabla\phi=\vec Z_0(\phi){f_0}=\vec Z(\phi)$

The input-output (I/O) equivalent phase oscillator to Eq. (1) can then be written as

  1. $\dot\phi={f_0}+ \vec Z(\phi)\cdot \vec\eta(t)$

A spike train can be written as

  1. $y(t)=\sum_k\delta(t-{t^{\mathrm{sp}}}_k)=\sum_k\delta(\phi(t)-k)$
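A minimal simulation sketch of the phase oscillator, Eq. (17), and the spike train, Eq. (18): here the vector perturbation is reduced to a single scalar input, and the sinusoidal PRC and the noise amplitude are assumptions made purely for illustration:

    # Sketch: Euler-Maruyama integration of the I/O-equivalent phase oscillator
    # (Eq. 17) with an assumed PRC Z(phi) = 1 + sin(2*pi*phi) and white-noise
    # input; a spike (Eq. 18) is emitted whenever phi crosses an integer.
    import numpy as np

    rng = np.random.default_rng(1)
    f0, dt, T = 10.0, 1e-4, 2.0                 # natural rate 10/s, step, duration
    Z = lambda phi: 1.0 + np.sin(2*np.pi*phi)   # assumed phase response curve
    sigma = 0.5                                  # assumed input amplitude

    phi, spikes, t = 0.0, [], 0.0
    while t < T:
        eta = sigma*rng.normal()/np.sqrt(dt)     # white-noise input sample
        phi_new = phi + dt*(f0 + Z(phi % 1.0)*eta)
        if np.floor(phi_new) > np.floor(phi):    # phase passed an integer -> spike
            spikes.append(t)
        phi, t = phi_new, t + dt

    print(len(spikes), "spikes, mean rate =", len(spikes)/T, "1/s")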

3 Fourier theory

To proceed with the analysis we need some results from Fourier theory.

3.1 The Fourier base

One may think of the Fourier transform as a projection onto a new basis $e_\omega(t)=e^{{\mathrm{i}}\omega t}$. Define the projection of a function as

$F(\omega)={\langle}e_\omega\cdot f{\rangle}=\int_{-\infty}^\infty dt\,e_\omega^*(t)f(t)$

The inverse Fourier transform is a projection onto $e_t^*(\omega)$

$f(t)={\langle}e_t^*\cdot F{\rangle}=\int_{-\infty}^\infty d\omega\,e_t(\omega)F(\omega)$

3.2 Existence of the Fourier integral

Usually all transcendental pondering about the existence of mathematical objects is the subject of pure mathematics and should not bother us too much (we trust the work has been done properly). But in the case of the Fourier transform it motivates a subject we need to address, mainly because of $\delta$-functions, which are heavily used in theoretical neurobiology because they are reminiscent of action potentials, or indicate the time of such a spike event. What does it mean for the Fourier transform to exist? It means that the integrals involved in the definition of the Fourier transform converge to finite values. OK. So let us look at the magnitude of the transform of a function $f$. It can be upper-bounded by the triangle inequality for integrals

$|{\langle}e_\omega\cdot f{\rangle}|=\left|\int_{-\infty}^\infty dt\; e^{-{\mathrm{i}}\omega t}f(t)\right| \leqslant \int_{-\infty}^\infty\underbrace{|e^{-{\mathrm{i}}\omega t}|}_{=1} |f(t)|\;dt= \int_{-\infty}^\infty |f(t)|\;dt.$

This means that if one assumes the function $f(t)$ is absolutely integrable,

$\int_{-\infty}^\infty|f(t)|\;dt<\infty,$

then the Fourier integral exists – hurray. Of course, the same works for the integral involved in the inverse Fourier transform. Note that this is an implication in one direction. All functions satisfying absolute integrability are lumped together in $L^1({I\!\!R})$.

The bad news is that applying a Fourier transform to one of the members of $L^1({I\!\!R})$ can throw you out of it. And then? Well, luckily there is the set of Schwartz functions, which is closed under the Fourier transform.

3.2.1 Orthonormality

The issue that absolute integrability is insufficient already manifests when trying to Fourier transform its own basis, since $\int_{-\infty}^\infty dt\,|e^{{\mathrm{i}}\omega t}|=\infty$. Let's give it a name anyway

${\langle}e_\omega\cdot e_\nu{\rangle}=\int_{-\infty}^\infty dt\,e^{{\mathrm{i}}(\nu-\omega)t}=\delta(\nu-\omega)$

It follows that the delta function is symmetric

$\delta(\omega)=\int_{-\infty}^\infty dt\,e^{{\mathrm{i}}\omega t}=\int_{-\infty}^\infty dt\,e^{-{\mathrm{i}}\omega t}=\delta(-\omega)$

Note also that

  1. ${\langle}e_\omega\cdot 1{\rangle}=\delta(\omega)$

3.2.2 Convolution

The convolution of two functions is defined as

$h(t)=(f*g)(t)=\int_{-\infty}^\infty dr\, f(t-r)g(r)$

$H(\omega)=\int dt\,e^{-{\mathrm{i}}\omega t}\int dr\,f(t-r)g(r)=\int dr\,g(r)\int dt\,e^{-{\mathrm{i}}\omega(t+r)}f(t)$

$=\int dr\,g(r)e^{-{\mathrm{i}}\omega r}\int dt\,e^{-{\mathrm{i}}\omega t}f(t)=G(\omega)F(\omega)$
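A quick numerical check of the convolution theorem with the discrete Fourier transform (which realises circular convolution, hence the periodic indices in the explicit sum):

    # Check of the convolution theorem H = F*G using the DFT. The DFT implements
    # *circular* convolution, so we compare against an explicitly computed
    # circular convolution of two random sequences.
    import numpy as np

    rng = np.random.default_rng(2)
    N = 128
    f, g = rng.normal(size=N), rng.normal(size=N)

    # circular convolution h(t) = sum_r f(t-r) g(r), indices taken mod N
    h = np.array([sum(f[(t - r) % N]*g[r] for r in range(N)) for t in range(N)])

    H = np.fft.fft(f)*np.fft.fft(g)              # product of the transforms
    print(np.allclose(h, np.fft.ifft(H).real))   # True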

3.2.3 Derivative

What is $\delta'(t)$?

$(\delta'*f)(t)=\int\limits_{-\infty}^\infty\delta'(r)f(t-r)dr=[\delta(r)f(t-r)]_{r=-\infty}^\infty-\int\limits_{-\infty}^\infty\delta(r)\frac{\mathrm{d}}{{\mathrm{d}}r}f(t-r)dr=\int\limits_{-\infty}^\infty\delta(r)f'(t-r)dr=f'(t)$

Hence

  1. $(\delta'*f)(t)=f'(t)$

4 The continuum limit of a membrane patch

Motivation and aim

In this lecture ion channels are introduced as stochastic devices floating in the membrane of a nerve cell. It should motivate why the analysis techniques and models introduced in this lecture need to deal with fluctuations and noise. (Nerve) cells produce stochastic processes on several levels:

  1. The ion channels in their membrane stochastically jump between conformations, governed by Master equations.

  2. On a more macroscopic level, their membrane voltage fluctuations show properties of coloured noise, well described by diffusion processes and stochastic differential equations.

  3. The trains of action potentials they emit form point processes.

In the stationary state these processes can be subjected to spectral analysis.

Note that in 1952, the first equations describing membrane voltage dynamics were the deterministic rate equations by Hodgkin & Huxley (Hodgkin and Huxley 1952). Only in 1994 did Fox and Lu derive these equations from the continuum limit of an ensemble of stochastic ion channels (Fox and Lu 1994), essentially by doing a diffusion approximation.

Since then the nuances of applying diffusion approximations to neurons have been investigated (Linaro, Storace, and Giugliano 2011, Orio and Soudry (2012)) and reviewed (Goldwyn et al. 2011, Goldwyn and Shea-Brown (2011), Pezo, Soudry, and Orio (2014)).

4.1 The ion channel as a Markov model

Proteins change conformation on various triggering signals:

Such dynamics can be described by a finite state Markov model which is mathematically described as a Master equation.

Starting with a simple ion channel that has an open conformation, $O$, in which ions can pass (say K+ ions), and a closed state, $C$, which blocks ion flow

$O\underset{\beta}{\overset{\alpha}{\rightleftharpoons}}C$

We can define a vector of the probabilities of being open and closed, $\vec p=[p_O,p_C]^{\dagger}$, respectively.

An ionic current produced in the open state would be

$I^\mathrm{ion}=\gamma_{\mathrm K^+}N_O(E_{\mathrm K^+}-v)$

$\gamma_{\mathrm K^+}$ and $E_{\mathrm K^+}$ are the unitary conductance and the Nernst potential respectively. The average such current would be

${\langle}I^\mathrm{ion}{\rangle}=\gamma_{\mathrm K^+}Np_O(E_{\mathrm K^+}-v)$

where NN is the total number of channels in the membrane patch under consideration. But what about a particular stochastic realisation of the current, what about the fluctuations around the mean?

If we have $N$ channels, then the number $k$ of them being open is binomially distributed

$P_O(k)={N\choose k}p_O^kp_C^{N-k}$

In actuality the channels are part of the membrane dynamical system, where $\alpha$ and $\beta$ depend at least on $v$ and hence are not constant during a spike. We need an update rule for how to get from the probability of being open at time $t$ to the probability of being open at time $t+dt$. This is given by

$\begin{pmatrix}p_O(t+dt)\\p_C(t+dt)\end{pmatrix}=\begin{pmatrix}1-\alpha\,dt&\beta\,dt\\\alpha\,dt&1-\beta\,dt\end{pmatrix}\begin{pmatrix}p_O(t)\\p_C(t)\end{pmatrix}$

or in vector form

$\vec p(t+dt)=\left({\underline I}+{\underline Q}\, dt\right)\vec p(t)$

The infinitesimal limit is

$\frac{\mathrm{d}}{{\mathrm{d}}t}\vec p={\underline Q}\vec p$

With $p_C=1-p_O$ we can express the first row of this equation as

$\dot p_O=-\alpha p_O + \beta(1-p_O)$

or

$\tau\dot p_O=p_O^{(\infty)}-p_O$ with $\tau=\frac1{\alpha+\beta}$ and $p_O^{(\infty)}=\frac\beta{\alpha+\beta}$

which, for the initial condition $p_O(0)=0$, has the solution

$p_O(t)=p^{(\infty)}_O(1-e^{-t/\tau})$

Fourier transformation leads to

$\tau{\mathrm{i}}\omega\tilde p(\omega)=p^{(\infty)}_O-\tilde p(\omega)$

or

$\tilde p(\omega)=\frac{p^{(\infty)}}{1+{\mathrm{i}}\tau\omega}$

$|\tilde p(\omega)|^2=\tilde p(\omega)\tilde p^*(\omega)=\frac{\left(p^{(\infty)}\right)^2}{1+(\tau\omega)^2}$

a Lorentzian spectrum. The inverse Fourier transform yields the covariance function

$c(\tau)={\langle}p_O(t)p_O(t+\tau){\rangle}$
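The following sketch simulates the two-state channel under voltage clamp as a telegraph process and compares the estimated spectrum of the open/closed fluctuations with the Lorentzian shape derived above; the rates are assumed values and the overall prefactor depends on the Fourier convention:

    # Sketch: two-state channel O <-> C under voltage clamp as a telegraph process.
    # The spectrum of the fluctuations is compared to a Lorentzian ~ 1/(1+(tau*w)^2).
    # alpha (O->C, closing) and beta (C->O, opening) are assumed rates; the
    # spectral prefactor depends on the Fourier convention used in the text.
    import numpy as np
    from scipy.signal import welch

    rng = np.random.default_rng(3)
    alpha, beta = 40.0, 10.0          # 1/s, assumed
    dt, T = 1e-4, 50.0
    n = int(T/dt)

    state = np.empty(n)               # 1 = open, 0 = closed
    s = 1.0
    for i in range(n):                # small-dt approximation of the jump process
        if s == 1.0 and rng.random() < alpha*dt:
            s = 0.0
        elif s == 0.0 and rng.random() < beta*dt:
            s = 1.0
        state[i] = s

    tau = 1.0/(alpha+beta)
    p_inf = beta/(alpha+beta)
    print("empirical p_O =", state.mean(), " theory =", p_inf)

    f, Pxx = welch(state - state.mean(), fs=1/dt, nperseg=4096)
    lorentz = 1.0/(1.0 + (tau*2*np.pi*f)**2)
    lorentz *= Pxx[1]/lorentz[1]      # match the amplitude at the lowest frequency
    # Pxx and lorentz now have the same shape up to estimation noise.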

4.1.1 The nn-state channel

Let us try to calculate the statistics of the current originating from an $n$-state channel (just like in the two-state case). Why would one do this? The idea, later on, is to be able to find a continuous stochastic process that we can simulate and analyse easily.

Let $K(t)\in[1,...,n]$ be the realisation of the $n$-state Markov channel. For example a K+ channel with four subunits

$1 \underset{\beta}{\overset{4\alpha}{\rightleftharpoons}} 2 \underset{2\beta}{\overset{3\alpha}{\rightleftharpoons}} 3 \underset{3\beta}{\overset{2\alpha}{\rightleftharpoons}} 4 \underset{4\beta}{\overset{\alpha}{\rightleftharpoons}} 5$

Assuming the ion channel has nn conformations, of which one is conducting, let us further define

$G=\delta_{1K(t)}=\begin{cases}1&:K(t)=1\\0&:K(t)>1.\end{cases}$

The single channel current at a voltage clamp $v(t)=v$ is then

  1. $I(t)=\gamma G(t)(E-v).$

How does it evolve? Define $p_{i}(t)=P(K(t)=i)$ and $\vec p(t)=[p_1,...,p_n]^{\dagger}$, then

  1. $\frac{\mathrm{d}}{{\mathrm{d}}t}\vec p={\underline Q}\vec p$

With formal solution

$\vec p(t)=e^{{\underline Q}t}\vec p(0)={\underbrace{(e^{{\underline Q}})}_{{\underline M}}}^t\vec p(0)={\underline M}^t\vec p(0)$

Using the eigendecomposition ${\underline M}={\underline U}\,{\underline \Sigma}\,{\underline V}^{\dagger}$, with ${\underline V}^{\dagger}={\underline U}^{-1}$ containing the left eigenvectors as rows, the matrix power can be written as ${\underline M}^t={\underline U}\,{\underline \Sigma}^t{\underline V}^{\dagger}$. Or (recall Eq. (7))

  1. $\vec p(t)=\sum_{k=1}^n\vec u_k\vec v_k^{\dagger}e^{\nu_kt}\vec p(0)$.

If Eq. (22) has a stationary distribution in the $t\to\infty$ limit, then this must correspond to the eigenvalue $\nu_1=0$ (let us assume they are ordered). So $\frac{\mathrm{d}}{{\mathrm{d}}t}\vec p(\infty)=0$ means that $\vec p(\infty)$ is the right eigenvector $\vec u_1$ belonging to $\nu_1=0$: ${\underline Q}\vec u_1=\nu_1\vec u_1=0$. Therefore, the solution can be written as

  1. $\vec p(t)=\vec p(\infty)+\sum_{k=2}^n\vec u_k\vec v_k^{\dagger}e^{\nu_kt}\vec p(0)$.

The average channel current of Eq. (21) is

${\langle}I(t){\rangle}=\gamma(E-v)\sum_{k=1}^n p_k(t)\delta_{1k}=\gamma(E-v)\Big(p_1(\infty)+\sum_{k=2}^nu_{1k}v_{1k}p_1(0)e^{\nu_kt}\Big),$

which, if the chain is stable ($\nu_k<0\ \forall k>1$), has the steady state

${\langle}I{\rangle}=\lim_{t\to\infty}{\langle}I(t){\rangle}=\gamma(E-v)\,p_1(\infty)$.

The steady-state covariance $C(\Delta)=\lim_{t\to\infty}C_t(\Delta)$ in this case is

$C_t(\Delta)={\langle}I(t)I(t+\Delta){\rangle}-{\langle}I{\rangle}^2=\gamma^2(E-v)^2\sum_{j,k=1}^n\delta_{1j}\delta_{1k}\,p_j(t)p_k(t+\Delta)-{\langle}I{\rangle}^2$

Hence

$C_t(\Delta)=p_1(t)p_1(t+\Delta)-{\langle}I{\rangle}^2=\sum_{i,j=2}^n[\vec u_i \underbrace{\vec v_i^{\dagger}\vec v_j}_{=\delta_{ij}}\vec u_i^{\dagger}]_{11}e^{\nu_it+\nu_j(t+\Delta)}$

If $\nu_k<0\ \forall k>1$ then

  1. $C(\Delta)= \sum_{i=2}^nu_{1i} u_{1i}e^{\nu_i\Delta}$

is a sum of exponentials. The spectral density is a superposition of Lorentzians.

4.1.2 Simulation of a jump process

Do not track individual channels but the channel numbers. Starting in a particular state $\vec N=[N_1,...,N_n]$ at time $t$, the waiting time $\tau$ for staying in that state until $t+\tau$ is distributed as

$f(\tau)=\lambda e^{-\lambda\tau}$

The escape rate of the state (any reaction occurring) is

$\lambda=\sum_{k=1}^nN_ka_k$

where $a_k$ is the rate of leaving state $k$. For example, $a_3$ in the K+ channel is

$a_3=2\beta+2\alpha$

But which reaction did occur? Let $j$ be the reaction (not the state!). For example, there are 8 reactions in the K+ channel. The probability of occurrence associated with any one of them is

$p(j)=\frac{N_j\zeta_j}{\sum_{k=1}^{N_{\mathrm{reac}}}N_k\zeta_k}=N_j\zeta_j/\lambda$

In a computer algorithm one can draw the waiting time for the next reaction simply by generating a uniform random number $r_1\sim U(0,1)$ and then

$\tau\leftarrow\ln(r_1^{-1})/\lambda$

The reaction is determined with a second random number $r_2\sim U(0,1)$ via the cumulative probabilities

$P(j)=\sum_{k=1}^jp(k)$

$j\leftarrow\min\{j:P(j)\geqslant r_2\}$

while t < T
  r1 = rand
  r2 = rand
  tau = ln(1/r1) / lambda
  t = t + tau
  j = smallest j with P(j) >= r2
  apply reaction j to the channel numbers N
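A runnable sketch of this jump-process (Gillespie) simulation for an ensemble of channels of the five-state K+ scheme from Sec. 4.1.1; the rates and the channel count are assumed values:

    # Sketch: Gillespie simulation of N independent 5-state K+ channels,
    # tracking the channel numbers N_1..N_5 (state 1 is the conducting one).
    # alpha, beta and the channel count are assumed values (voltage clamp).
    import numpy as np

    rng = np.random.default_rng(4)
    alpha, beta = 20.0, 30.0                 # 1/s, assumed
    Nch = 100
    # reactions: (source state, target state, rate per channel)
    reactions = [(1, 2, 4*alpha), (2, 3, 3*alpha), (3, 4, 2*alpha), (4, 5, 1*alpha),
                 (2, 1, 1*beta),  (3, 2, 2*beta),  (4, 3, 3*beta),  (5, 4, 4*beta)]

    N = np.array([Nch, 0, 0, 0, 0])          # all channels start in state 1
    t, T = 0.0, 1.0
    times, open_count = [0.0], [N[0]]

    while t < T:
        a = np.array([N[src-1]*rate for src, dst, rate in reactions])  # propensities
        lam = a.sum()                        # total escape rate of the current state
        if lam == 0:
            break
        t += np.log(1/rng.random())/lam      # exponential waiting time with rate lam
        j = np.searchsorted(np.cumsum(a)/lam, rng.random())   # which reaction fired
        src, dst, _ = reactions[j]
        N[src-1] -= 1
        N[dst-1] += 1
        times.append(t); open_count.append(N[0])

    # time-weighted average of the number of open (state-1) channels
    avg = np.sum(np.diff(times)*np.array(open_count[:-1]))/times[-1]
    print("average number of open channels:", avg)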

4.2 Statistically equivalent diffusion process (Ornstein-Uhlenbeck process)

The jump process discussed in the previous sections is a continuous-time Markov process on a discrete domain, $K(t)\in{I\!\!N}$ with $t\in{I\!\!R}$. A diffusion process is a continuous-time Markov process on a continuous domain, $\eta(t)\in{I\!\!R}$ with $t\in{I\!\!R}$.

We are in search of a diffusion process such that

$I=\gamma\left(p_1(\infty) + \sum_{k=2}^n\eta_k(t)\right)(E-v)$

has the same first- and second-order statistics (in voltage clamp) as Eq. (21). Let us try

  1. $\tau_k(v)\dot\eta_k=-\eta_k+\sigma(v)\,\xi(t)$ where ${\langle}\xi(0)\xi(\Delta){\rangle}=\delta(\Delta)$

To solve it, use the Fourier Transform

${\mathrm{i}}\omega\tau\tilde\eta(\omega)=-\tilde\eta(\omega) + \sigma\,\chi(\omega)$

The placeholder symbol $\chi$ was introduced for the Fourier transform of the stochastic process $\xi(t)$. Rearranging yields

$\tilde\eta=\frac{\sigma\chi(\omega)}{1+{\mathrm{i}}\omega\tau}$

The spectrum is

$\tilde \eta(\omega)\tilde \eta^*(\omega)=\frac{\sigma^2\chi(\omega)\chi^*(\omega)}{1+(\tau\omega)^2}$

By definition $\chi(\omega)\chi^*(\omega)$ is the Fourier transform of the covariance function $\delta(\Delta)$, and from Eq. (19) this is one. Hence,

$\tilde \eta(\omega)\tilde \eta^*(\omega)=\frac{\sigma^2}{1+(\tau\omega)^2}$

Applying the inverse Fourier transform results in the correlation function

$C(t)=\frac{\sigma^2}\tau e^{-|t|/\tau}$.

A superposition of independent such OU processes

$\sum_{i=2}^n\eta_i(t)$

leads to a correlation function with the same structure as in Eq. (25). We identify $\tau_i=-1/\nu_i$ and $\sigma_i=u_{1i}$.

The idea of matching the second-order statistics can be formulated in a far more abstract way in terms of the Kramers-Moyal-van-Kampen expansion of the Master equation

$\dot p=\int w(x'\to x)p(x',t)-w(x\to x')p(x,t)\, dx'$

$\frac{\partial}{\partial t}p(x,t)=\sum_{n=1}^\infty\left(-\frac{\partial}{\partial x}\right)^n K_n(x,t)\,p(x,t)$

from

5 Information theory for the living

This theory was never meant to be used to describe living systems, in which meaning, i.e., the content of a message, actually matters. Information theory deals with optimal compression and lossless transmission of signals, irrespective of whether the content is relevant or just a TV programme.

Nonetheless, a look at a quantitative science of communication may be insightful.

Consult []

Since there is a relation between the PRCs introduced in XXX and the filtering properties of a neuron, one can seek to do the same for phase dynamics and ask which PRC would maximise information transmission. But first, one needs to establish how the PRC predicts a lower bound on the mutual information rate. We begin with a short review of the basic tenets of information theory. Within information theory, a neural pathway is treated as a noisy communication channel in which inputs are transformed into neuronal responses and sent on:

5.1 The communication process

The science of communication is concerned with at least two subtopics: $(i)$ the efficient representation of data (compression); and $(ii)$ the safe transmission of data through unreliable channels.

A source (the message) typically runs through the following processing sequence:

Source $\to$ Compress $\to$ Encode $\to$ Channel $\to$ Decode $\to$ Decompress $\to$ Receiver

One of the formal assertions of information theory is that these two problems can be addressed separately (without loss of generality or efficiency): meaning, first one compresses by removing redundancies; then one again adds failsafe redundancy to combat the unreliability.

However convenient for engineering, this does not mean that biological systems have to make use of this fact (source coding and channel coding could be very intertwined).

Also note that many of the mathematical results in information theory are bounds, inequalities not achievable in real systems. To get an idea of the mindset of information theory, check-out Shannon’s source coding theorem.

5.2 Source coding, data compression and efficient representation

Remember Morse code. Why does the character E only have a single symbol ($\cdot$), while the character Q ($--\cdot\,-$) has so many?

The idea could be that highly probable symbols should have the shortest representation, while unlikely ones may occupy more space. What we want to compress is called the source, and data compression depends on the source distribution. This sounds like a good idea for a biological system: do not spend resources on rare events. Well, not quite. Cicadas hear the sounds of their mating partners only once, at the very end of a possibly 19-year-long life.

Example (Genetic code):
Cells code 20 amino acids with words composed of a four-letter5 alphabet $A=\{$A, G, C, T$\}$. Words of length 3 are sufficient, while length 2 is not: $4^2=16<20<64=4^3$. If nature were to require only 16 amino acids, two-character words would be sufficient. Only four to drop: discarding the 4 least probable amino acids to occur in proteins, tryptophan 1.1%, methionine 1.7%, histidine 2.1% and cysteine 2.8%, having no code words for them would make for an error of 7.7%.

In general, we can define a risk $\delta$, i.e., the probability of not having a code word for a letter. Then, to be efficient, one should choose the smallest $\delta$-sufficient subset $S_\delta\subset A$ such that $p(x\in S_\delta)\geqslant1-\delta$.

Def. (Essential bit content):
For an alphabet, $S_\delta$, the essential bit content, i.e., the number of binary questions asked to identify an element of the set, is $H_\delta=\log_2|S_\delta|$

In the case of the genetic code the essential bit content for $\delta = 0.077$ is $H_{0.077}=\log_2 16=4$, or, if we use the base $b=4$ for the logarithm, it is 2, which is the length of the code words required for 16 amino acids.



Take an even simpler example: we have the alphabet $A=\{0,1\}$ with probability distributions A: $\{p_0,p_1\}=\{\frac12,\frac12\}$ and B: $\{p_0,p_1\}=\{\frac34,\frac14\}$. In figure~ you can see a plot of $\frac1NH_\delta$ over the allowed error $\delta$ for words of different length $N$.

For increasing $N$ the curves depend on $\delta$ to a lesser degree. Even more, they converge to a line around the entropy

  1. ${H}=-\sum_k p_k \log_2(p_k)$

which is $H_{\text A}=1$ and $H_{\text B}\approx0.81$.

This is closely related to the concept of the typical set $T^N$. Members of the typical set $(x_1,..,x_N)=\vec x\in T^N$ have $-\frac1N\log p({\vec x})\approx H(x)$. Hence their probability is $p({\vec x})\approx 2^{-N\,H(x)}$, the same for all of them, which implies that the cardinality of the typical set is about $2^{N\,H(x)}$. There holds a kind of law of large numbers (Asymptotic Equipartition Property, AEP) for the typical set, stating that for $N$ i.i.d. random variables ${\vec x}=(x_1,..,x_N)$ with $N$ large, ${\vec x}$ is almost certainly a member of $T^N$ ($p({\vec x}\in T^N)=1-\varepsilon$).

Shannon argued that one should therefore base the compression on the typical set and showed that we can achieve a compression down to NHNH bits.

Asymptotic Equipartition Property (AEP):
For i.i.d. random variables $-\frac1n\log p(X_1,..,X_n)=-\frac1n\sum_i\log p(X_i)\to-{\langle}\log p(X){\rangle}=H(X)$

Note that the i.i.d. assumption is not too restrictive, as we may represent correlated processes in terms of the i.i.d. coefficients of their transform (or the empirical counterparts).

The typical set $T^\varepsilon_n=\{(x_1,..,x_n):|-1/n\log p(x_1,..,x_n)-H(X)|<\varepsilon\}$ allows Shannon's source coding algorithm: the encoder checks whether the input sequence lies within the typical set; if yes, it outputs the index of the input sequence within the typical set; if not, the encoder outputs an arbitrary $n(H+\varepsilon)$-digit number.

By formalising this argument Shannon proved that compression rates up to the source entropy are possible. The converse, that compression below the entropy is impossible, is a bit more involved.
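A small numerical illustration of the AEP for source B from above: the empirical value $-\frac1N\log_2 p(\vec x)$ concentrates around $H\approx0.81$ as $N$ grows:

    # Sketch of the AEP: for i.i.d. sequences from source B ({p0,p1}={3/4,1/4})
    # the quantity -1/N log2 p(x) concentrates around the entropy H ~ 0.81 bits.
    import numpy as np

    rng = np.random.default_rng(5)
    p = np.array([0.75, 0.25])
    H = -(p*np.log2(p)).sum()                       # ~0.811 bits

    for N in (10, 100, 1000):
        x = rng.choice(2, size=(2000, N), p=p)      # 2000 sequences of length N
        logp = (x == 0).sum(axis=1)*np.log2(p[0]) + (x == 1).sum(axis=1)*np.log2(p[1])
        val = -logp/N
        print(N, "mean:", round(val.mean(), 3), "std:", round(val.std(), 3),
              "H:", round(H, 3))
    # the spread around H shrinks ~ 1/sqrt(N): almost all sequences become typical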

5.3 Channel coding

Here we consider a “memoryless” channel:

$\overset{\text{message}}{W}\overset{\text{encode}}{\to}X\ni x\to\overset{\text{noisy channel}}{p(y|x)}\to y\in Y\overset{\text{decode}}{\to}\overset{\text{est. message}}{\hat W}$

The rate $R$ with which information can be transmitted over a channel without loss is measured in $\frac{\text{bits}}{\text{transmission}}$ for a discrete-time channel or $\frac{\text{bits}}{\text{second}}$ for a continuous-time channel. Operationally, we wish all bits that are transmitted to be recovered with negligible probability of error.

A measure of information could be:

The average reduction in the number of binary-questions needed to identify xXx\in X before and after observing yYy\in Y.

This would just be the difference in entropies:

${M}(X;Y)={H}(X)-{H}(X|Y)= -\sum\limits_{x\in X} p(x)\log_2 p(x) + \sum\limits_{x\in X\atop y\in Y}p(x,y)\log_2p(x|y)$.

This goes by the name of mutual information or transinformation. Remember marginalisation

$p(x)=\sum_{y\in Y}p(x,y)$.

So the mutual information is

${M}=-\sum_{x\in X\atop y\in Y} p(x,y)\log_2 p(x) + \sum_{x\in X\atop y\in Y}p(x,y)\log_2\left(\frac{p(x|y)p(y)}{p(y)}\right)$

or

${M}(X;Y)=\sum_{x\in X\atop y\in Y} p(x,y)\log\frac{p(x,y)}{p(x)p(y)}$

From the complete symmetry of this quantity we can also write it as

${M}(X;Y)={H}(Y)-{H}(Y|X)$.

The following Figure illustrates how the mutual information is related to respective (conditional) entropies of the input and output ensemble.


We have heard that $I(X;Y)$ quantifies the statistical dependence of $X$ and $Y$, but how is that related to error-free communication?

$I(X;Y)$ depends on the input ensemble. To focus on the properties of the channel we can simply take an “optimal” input ensemble and define the channel capacity

$C=\max_{p(x)}I(X;Y).$

It will be left to the sender to actually find the optimal input statistics. Note that $I(X;Y)$ is a concave function ($\cap$) of $p(x)$ over a convex set of probabilities $\{p(x_i)\}$ (this is relevant for procedures like the Arimoto-Blahut algorithm for estimating $C$, sketched below), and hence a local maximum is a global maximum.

p = uniform distribution over X
while not converged
  q(x|y) = p(x) p(y|x) / sum_x' p(x') p(y|x')
  p(x) = normalise( exp( sum_y p(y|x) log q(x|y) ) )
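A runnable sketch of this Arimoto-Blahut iteration, checked against the binary symmetric channel, for which the capacity $1-H_2(\varepsilon)$ is known in closed form (the channel and its parameters are assumed test values):

    # Sketch of the Arimoto-Blahut iteration for the capacity C = max_p I(X;Y),
    # verified on the binary symmetric channel with flip probability eps.
    import numpy as np

    def blahut_arimoto(P_yx, tol=1e-12, max_iter=10000):
        """P_yx[x, y] = p(y|x); returns capacity in bits and the optimal input p(x)."""
        nx = P_yx.shape[0]
        p = np.full(nx, 1.0/nx)                      # start from the uniform input
        for _ in range(max_iter):
            q = p[:, None]*P_yx                      # joint p(x)p(y|x)
            q /= q.sum(axis=0, keepdims=True)        # posterior q(x|y)
            logr = (P_yx*np.log(q, where=q > 0, out=np.zeros_like(q))).sum(axis=1)
            r = np.exp(logr)
            r /= r.sum()                             # updated input distribution
            converged = np.max(np.abs(r - p)) < tol
            p = r
            if converged:
                break
        q = p[:, None]*P_yx
        py = q.sum(axis=0)                           # output distribution p(y)
        with np.errstate(divide="ignore", invalid="ignore"):
            I = np.nansum(q*np.log2(P_yx/py))        # I(X;Y) in bits
        return I, p

    eps = 0.1
    P = np.array([[1-eps, eps], [eps, 1-eps]])       # binary symmetric channel
    C, p_opt = blahut_arimoto(P)
    H2 = -(eps*np.log2(eps) + (1-eps)*np.log2(1-eps))
    print(C, 1 - H2, p_opt)                          # capacity matches 1 - H2(eps)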

Shannon established that this capacity indeed measures the maximum amount of error-free information that can be transmitted. A trivial upper bound on the channel capacity is

$C\leqslant\min\{\log|X|,\log|Y|\}$.

This is due to the maximum entropy property of the uniform distribution in the discrete case:

Example (Maximum entropy, discrete case):
For the derivative of the entropy from Eq. (27) one gets $\frac\partial{\partial p_i}{H}=-\log_2 p_i-1$, which (together with the normalisation constraint) leads to $p_i\propto e^{-1}\quad\forall i$. After normalisation one has $p_i=1/N$, so the uniform distribution maximises the entropy in the discrete case.

5.4 Information transmission (continuous case)

Information theory has long been in the discussion as an ecological theory upon which to judge the performance of sensory processing (Atick 1992). This led Joseph Atick, Horace Barlow and others to postulate using this theory to study how nervous systems adapt to the environment. The goal is to make quantitative predictions about what the connectivity in the nervous system and the structure of receptive fields should look like, for instance. In this sense, information theory was hoped to become the basis for an ecological theory of adaptation to environmental statistics (Atick 1992, Barlow (1961)).

Some of the signals in nature (and those applied in the laboratory) are continuous in time and alphabet. Does it make sense to extend the definition of the entropy as

  1. ${H}(x) =-\int {\mathrm{d}}x\,p(x)\ln p(x)$?

Maybe. Let us see how far one gets with this definition. It is called the differential entropy, by the way. Through quantisation it can be related back to the entropy of discrete alphabets.

If $p(x)$ is smooth, then one associates the probability of being in $i\Delta\leqslant x\leqslant(i+1)\Delta$ with

$p_i=p(x_i)\Delta = \int_{i\Delta}^{(i+1)\Delta}{\mathrm{d}}x\,p(x)$

The entropy of the quantised version is

${H}_\Delta=-\sum\limits_{i=-\infty}^\infty p_i\ln p_i=-\sum\limits_{i=-\infty}^\infty \Delta p(x_i)\ln(p(x_i)\Delta)$

$=-\sum\limits_{i=-\infty}^\infty \Delta p(x_i)\ln p(x_i)-\ln\Delta$

This is problematic as the second term goes to infinity for small quantisations. Formally, if $p(x)$ is Riemann integrable, then

$\lim_{\Delta\to0}\left({H}_\Delta+\ln\Delta\right)=-\int {\mathrm{d}}x\,p(x)\ln p(x)$

Since the infinitesimal limit is taken, we can relate $\Delta$ to the number of bits of the quantisation: an $n$-bit quantisation (of a unit range) has $\Delta=2^{-n}$, so that

$-\log_2\Delta= n$.

So an $n$-bit quantisation of a continuous random variable $x$ has entropy (in bits)

${H}(x)+n$.

With the mutual information being the difference of entropies the quantisation term vanishes.

A first example can be taken from rate coding6. In terms of our spike trains from Eq. (18), the instantaneous firing rate can be defined as

Example (Optimal rate coding in the retina (Laughlin 1981)):
In short, Simon Laughlin measured the distribution of light intensities (contrasts) that would impinge on the fly's compound eye in a natural environment: it was more or less normally distributed. He postulated that an efficient allocation of resources (firing rate) would be to spread more likely inputs over a broader range of firing rates. Why? It makes them easier to differentiate by upstream neurons. Via experiments he found that the fly's compound eye approximates the cumulative probability distribution of contrast levels in natural scenes.
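A sketch of the argument: if the contrast ensemble is (assumed) Gaussian, mapping contrasts through their own cumulative distribution function spreads the responses uniformly over the available firing-rate range:

    # Sketch of Laughlin's argument: for an assumed Gaussian contrast ensemble,
    # the response curve that equalises the use of the firing-rate range is the
    # cumulative distribution function of the contrasts.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(6)
    mu, sd = 0.0, 0.3                       # assumed natural-scene contrast statistics
    c = rng.normal(mu, sd, size=100000)     # contrast ensemble

    r_max = 100.0                           # assumed maximal firing rate (1/s)
    rate = r_max*norm.cdf(c, loc=mu, scale=sd)   # CDF as the contrast-response curve

    # the histogram of the used rates is approximately flat: every rate bin is
    # used equally often, a maximum entropy allocation for a bounded output
    counts, _ = np.histogram(rate, bins=10, range=(0, r_max))
    print(counts/counts.sum())              # ~0.1 in every bin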

Waterfilling argument
Let us pause and ask: What is the maximum entropy distribution for a continuous alphabet?

Example (Maxent, continuous with variance constraint):
Take a random variable $x\in{I\!\!R}$. It cannot be the uniform distribution as in the discrete case. In fact we need additional constraints. For example, one may ask for the maximum entropy distribution, $p(x)$, given a fixed mean, $\mu$, and variance, $\sigma^2$. Using Lagrange multipliers to formulate the constrained optimisation problem,

$C=H+\lambda_0\int{\mathrm{d}}x\,p(x)+\lambda_1\left(\int{\mathrm{d}}x\,p(x)x-\mu\right)+\lambda_2\left(\int{\mathrm{d}}x\,p(x)(x-\mu)^2-\sigma^2\right)$

one searches for the $p(x)$ that fulfils $\delta C/\delta p(x)=0$ and $\partial C/\partial\lambda_i=0$, where $\delta/\delta p(x)$ is the variational derivative. One finds

$p(x)=\exp(\lambda_0-1+\lambda_1x+\lambda_2(x-\mu)^2)$.

With the normalisation from $\partial C/\partial\lambda_i=0$ we get the normal distribution ($e^{\lambda_0-1}=1/\sqrt{2\pi\sigma^2}$, $\lambda_1=0$ and $\lambda_2=-1/(2\sigma^2)$).

What did we learn? Well

  1. $H(x)\leqslant H_\text{Gauss}(x)$.

If we consider the whole spike train $y(t)=\sum_k\delta(t-{t^{\mathrm{sp}}}_k)$ (see Eq. (18)) as the output, not just its mean input intensity and mean firing rate, we have a continuous, time-dependent stochastic process to deal with. Note that the definition of the spike train, if integrated, is related to the empirical distribution function.

If we average over infinitely many trials we get the instantaneous firing rate $r(t)={\langle}y(t){\rangle}_{y|x}$. We will give a mathematical definition of $r(t)$ later. Our communication channel looks like this:

$x(t)\in{I\!\!R}$ (input signal) $\to$ pathway $\to$ $y(t)\in{I\!\!R}$ (neural response)

The entropy rate of an ergodic process can be defined as the entropy of the process at a given time conditioned on its past realisations, in the limit of large time

  1. $H[x]=\lim_{t\to\infty}{H}[x_t|x_\tau:\tau<t].$

The amount of information a neural pathway transmits about an input signal $x(t)$ is measured by the mutual information rate,

  1. ${M}[x,y]={H}[y]-\underbrace{{H}[y|x]}_\text{encoding}= {H}[x]-\underbrace{{H}[x|y]}_\text{decoding},$

between the stochastic process, x(t)x(t), and the stochastic response process, y(t)y(t). The entropy rate H{H} measures the number of discriminable input or output states, either by themselves, or conditioned on other variables.

The mutual information rate, which is the difference between unconditional and conditional entropy rates, characterises the number of input states that can be distinguished upon observing the output. The response entropy rate ${H}[y]$, for instance, quantifies the number of typical responses per unit time, while ${H}[x|y]$ is a measure of the decoding noise in the model. If this noise is zero, then the mutual information rate is simply ${H}[x]$, provided that this is finite.

The conditional entropy rates H[y|x]{H}[y|x] and H[x|y]{H}[x|y], characterising the noise in the encoding and decoding model respectively, are each greater than zero. In information theory, these quantities are also called equivocation. Hence, both the stimulus and response entropy rates, H[x]{H}[x] and H[y]{H}[y], are upper bounds for the transmitted information.

Example (Info rate continuous discrete-time Gaussian process):
The OU process from Eq. (26) is an example of a Gaussian process. Take a discretised version, $\vec x=[x(t_1),x(t_2),...]$, of it such that

$p(\vec x)=|2\pi{\underline K}|^{-1/2}\exp\left(-\frac12(\vec x-\vec\mu)^{\dagger}{\underline K}^{-1}(\vec x-\vec\mu)\right)$

$H_n=\frac12\ln\left((2\pi)^n|{\underline K}|\right) + \frac n2$

This is similar to the quantisation problem. It might be reasonable to drop the $\frac n2$ term (sometimes this is done, sometimes not). For the one-dimensional case we have

$H=\frac12(1+\ln2\pi\sigma^2)$

or, if we drop the $\frac n2$ term (here $n=1$) and solve for the variance,

  1. $\sigma^2=\frac1{2\pi}e^{2H_\text{Gauss}(x)}$

Any practical consequences?

Def. (Estimation error):
For a random variable $x$ the estimation error of an estimator $\hat x$ is

${\langle}(x-\hat x)^2{\rangle}$

The best estimator is the mean, so the statisticians say. Therefore a lower bound to the estimation error is given by

  1. (xx^)2(xx)2=σ2=12πe2HGauss(x)12πe2H(x){\langle}(x-\hat x)^2{\rangle}\geqslant{\langle}(x-{\langle}x{\rangle})^2{\rangle}=\sigma^2 =\frac1{2\pi}e^{2H_\text{Gauss}(x)}\geqslant\frac1{2\pi}e^{2H(x)}.

The lasst inequality followed from Eq. (29).

Example (Info rate continuous continuous-time Gaussian process):

Up to an additive constant the entropy of a multivariate Gaussian was78

$H=\frac12\ln|{\underline K}|=\frac12\tr\ln{\underline K}=\frac12\tr\ln{\underline \Lambda}=\frac12\sum_k\ln\lambda_k$.

First let us observe the process forever, $\vec x=[x(-\infty),...,x(\infty)]$, a bi-infinite series with countable elements. The elements of the covariance matrix are ${\underline K}_{ij}=c(i\,dt-j\,dt)$. The orthogonal eigenfunctions of the continuous covariance operator on $t\in{I\!\!R}$ are the Fourier bases. It can be shown that in the continuum limit

$H=\int df\,\ln\lambda(f)$

The result is due to Kolmogorov see also (???,Golshani and Pasha (2010)).

5.5 Linear stimulus reconstruction and a lower bound on the information rate (decoding view)

Without a complete probabilistic description of the model the mutual information cannot be calculated. And even with a model, the involved integrals may not be tractable. At least two strategies to estimate it exist, though: either create a statistical ensemble of inputs and outputs by stimulation, followed by (histogram-based) estimation techniques for the mutual information; or find bounds on the information that can be evaluated more easily. In general, the estimation of mutual information from empirical data is difficult, as the sample size should be much larger than the size of the alphabet. Indeed, each element of the alphabet should be sampled multiple times so that the underlying statistical distribution can, in principle, be accurately estimated. But this prerequisite is often violated, so some techniques of estimating the information from data directly rely on extrapolation (???). The problem becomes particularly hairy when the alphabet is continuous or a temporal process has to be discretised, resulting in large alphabets.

Another approach, which will allow us to perform a theoretical analysis of phase dynamics, relies on a comparison of the neuronal “channel” to the continuous Gaussian channel (???, Cpt.~13), which is analytically solvable (Cover and Thomas 2012). The approach can be used to estimate the information transmission of neuronal models (???). Also experimental systems have been analysed in this way, e.g.:

  1. the spider’s stretch receptor (???);
  2. the electric sense of weakly electric fish (???) and paddle fish (???);
  3. the posterior canal afferents in the turtle (???).

It was proven that this method leads to a guaranteed lower bound on the actual information transmitted (???).

If one has experimental control over the stimulus ensemble, it can be chosen to be a Gaussian process with a flat spectrum up to a cutoff, so as not to introduce biases for certain frequency bands. The mutual information between stimulus $x(t)$ and response $y(t)$ can be bounded from below as

  1. ${M}[x,y] = {H}[x] - {H}[x|y] \geqslant {H}[x] - {H}_\text{gauss}[x|y],$

Here, ${H}_\text{gauss}[x|y]$ is the equivocation of a process with the same mean and covariance structure as the original decoding noise, but with Gaussian statistics. The conditional entropy of the stimulus given the response is also called reconstruction noise entropy. It reflects the uncertainty remaining about the stimulus when particular responses have been observed.

It turns out that the inequality in Eq. (33) also holds if the estimator is conditioned. Say from the output of the neuron we estimate its input

x^(t)=x^t[y].\hat x(t) = \hat x_t[y].

So if the process has a stationary variance

(x(t)x^(t))2x|yinfx^(x(t)x^(t))2x|y=(x(t)x(t)x|y)2x|y=e2Hgauss[x|y].{\langle}(x(t)-\hat x(t))^2{\rangle}_{x|y}\geqslant\inf\limits_{\hat x}{\langle}(x(t)-\hat x(t))^2{\rangle}_{x|y}={\langle}(x(t)-{\langle}x(t){\rangle}_{x|y})^2{\rangle}_{x|y}=e^{2{H}_\text{gauss}[x|y]}.

The second line uses the fact that in this case the optimal estimator is given by the conditional mean. We have the following bound on the equivocation

  1. ${H}[x|y] \leqslant {H}_\text{gauss}[x|y] \leqslant {{\textstyle\frac12}}\ln{\langle}(x(t)-\hat x(t))^2{\rangle}\leqslant\ln{\langle}n^2(t){\rangle},$

The deviation between stimulus and its estimate, n(t)=x(t)x^(t)n(t)=x(t)-\hat x(t), is treated as the noise process.

In order to obtain a tight bound, the estimator $\hat x(t)$ should be as close to optimal as possible. For the case of additional information given by the response of the neural system $y(t)$ to the process $x(t)$, the estimator should make use of it, $\hat x_t[y]$. For simplicity one can assume it is carried out by a filtering operation, $\hat x(t)= (f*y)(t)$, specified later (Gabbiani and Koch 1998). Like the whole system, the noise process is stationary, with power spectral density $P_{nn}(\omega)$, so that

${H}_\text{gauss}[x|y]\leqslant {{\textstyle\frac12}}\ln{\langle}n^2(t){\rangle}={{\textstyle\frac12}}\int_{-\infty}^\infty\frac{{\mathrm{d}}\omega}{2\pi}\ln P_{nn}(\omega).$

Together

  1. ${M}[x,y]\geqslant \frac12\int_{-\infty}^\infty\frac{{\mathrm{d}}\omega}{2\pi}\;\ln\left(\frac{P_{xx}(\omega)}{P_{nn}(\omega)}\right)$

So as to render the inequality in Eq. (35) as tight a bound as possible one should use the optimal reconstruction filter from $y$ to $\hat x$. In other words, it is necessary to extract as much information about $x$ from the spike train $y$ as possible.

The next step should be to find an expression for the noise spectrum, $P_{nn}(\omega)$, based on the idea of ideal reconstruction of the stimulus. As opposed to the forward filter, the reconstruction filter depends on the stimulus statistics (even without effects such as adaptation). We seek the filter $h$ that minimises the mean square error

  1. ${\langle}n^2(t) {\rangle}= {\langle}(x(t) - \hat x(t))^2{\rangle}$, with $\hat x(t)=\int{\mathrm{d}}\tau\,h(\tau)\, y(t-\tau)$.

Taking the variational derivative of the error w.r.t. the filter (coefficients) $h(\tau)$ and equating this to zero, one obtains the orthogonality condition for the optimal Wiener filter (???)

  1. ${\langle}n(t)\, y(t-\tau){\rangle}= 0$, $\forall\tau$.

Inserting the definition of the error, $n(t)=x(t)-\hat x(t)$, into Eq. (38) yields

${\langle}x(t)y(t-\tau){\rangle}-\int{\mathrm{d}}\tau_1\,h(\tau_1) {\langle}y(t-\tau_1)y(t-\tau){\rangle}= R_{xy}(\tau) - (h\ast R_{yy})(\tau) = 0$

In order to obtain hh we need to deconvolve the equation, which amounts to a division in the Fourier domain

  1. $P_{xy}(\omega)= H(\omega) P_{yy}(\omega)\Longrightarrow H^\text{opt}(\omega) = \frac{P_{xy}(\omega)}{P_{yy}(\omega)}$.

To compute the mutual information rate, we now calculate the full auto-correlation of the noise when the filter is given by Eq. (39). For an arbitrary filter h(t)h(t), we have

$R_{nn}(\tau)={\langle}n(t)n(t+\tau){\rangle}={\langle}n(t)x(t+\tau){\rangle}-\int{\mathrm{d}}\tau_1\,h(\tau_1) {\langle}n(t)y(t+\tau-\tau_1){\rangle}.$

When the orthogonality condition of Eq. (38) holds, the right-most term vanishes. Proceeding by expanding the first term algebraically leads to an expression for the noise correlations

$R_{nn}(\tau)={\langle}n(t)x(t+\tau){\rangle}=R_{xx}(\tau)-\int{\mathrm{d}}\tau_1\,h(\tau_1)R_{xy}(\tau-\tau_1).$

This expression can be Fourier transformed in order to obtain the required noise spectrum

$P_{nn}(\omega)=P_{xx}(\omega)-H(\omega)P_{xy}(\omega)=P_{xx}(\omega)-\frac{|P_{xy}(\omega)|^2}{P_{yy}(\omega)},$

where the definition of the optimal filter, Eq. (39), was utilised. This result can then be inserted into Eq. (36) to obtain the following well known bound on the information rate (???,lindner2005j:mi,holden1976b,stein1972j:coherenceInfo)

  1. $\mathcal M[x,y]\geqslant-\frac12\int_{-\omega_c}^{\omega_c}\frac{{\mathrm{d}}\omega}{2\pi}\ln\left(1-\frac{|P_{xy}(\omega)|^2}{P_{xx}(\omega)P_{yy}(\omega)}\right)$.

This information bound involves only spectra and cross-spectra of the communication channel's input and output processes, which are experimentally measurable in macroscopic recordings. The channel, in this case the neuron, can remain a black box. But since we can bridge the divide between microscopic, biophysical models and their filtering properties, we will, in the following section, derive the mutual information rates.

Def. (spectral coherence):
The expression in Eq. (40) is termed the squared signal response coherence
  1. ${c}^2(\omega)=\frac{|P_{xy}(\omega)|^2}{P_{xx}(\omega)P_{yy}(\omega)}$.

It quantifies the linearity of the relation between $x$ and $y$, in the sense that it equals 1 if there is no noise and a linear filter transforms input into output. Both nonlinearities and noise reduce the coherence. The coherence can be estimated from data using the FFT algorithm and spectrum estimation. It is implemented in the free software packages scipy and matplotlib.
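A sketch of this estimation with scipy.signal.coherence, applied to an assumed toy encoder (a linear filter plus additive noise, not a model from this script), together with a numerical evaluation of the information bound of Eq. (40):

    # Sketch: estimate the squared coherence c^2(f) with scipy and evaluate the
    # coherence-based lower bound on the information rate. The "neuron" is an
    # assumed toy encoder (first-order low-pass filter plus Gaussian noise).
    import numpy as np
    from scipy.signal import coherence, lfilter

    rng = np.random.default_rng(7)
    fs, T = 1000.0, 200.0                       # sampling rate (Hz), duration (s)
    n = int(fs*T)
    x = rng.normal(size=n)                      # broadband Gaussian stimulus
    y = lfilter([1.0], [1.0, -0.9], x)          # toy encoding: low-pass filtering
    y += 2.0*rng.normal(size=n)                 # encoding noise

    f, C2 = coherence(x, y, fs=fs, nperseg=1024)   # squared coherence c^2(f)
    # lower bound on the information rate in bits per second (rectangle rule
    # over the one-sided frequency axis)
    rate = -np.sum(np.log2(1.0 - C2))*(f[1] - f[0])
    print("coherence-based bound:", rate, "bits/s")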

What renders the coherence a useful quantity? While the cross-spectrum tells us whether stimulus and output have correlated power in a spectral band, the normalisation with the output auto-spectrum can be crucial. Say we find a particularly high power in $P_{xy}(\omega)$; this may not be related to the stimulus but could just be the intrinsic frequency of the neuron itself.

The coherence is a quantity that makes no mention of the explicit decoding filter; in fact it is symmetric in input and output, just as the mutual information is. This is beneficial because one can now take the encoding view in the next chapter.

6 Linear response filter

The stimulus spectral density is given by the environment or controlled by the experimental setup, while cross- and output spectra need to be measured or calculated from the model in question. In this lecture, cross-spectral and spike-train spectral densities are derived from phase oscillators, see Eq. (17), that are in turn derived from biophysical models. This means we do not treat the channel as a black box but assume a particular model.

The first quantity we need in order to calculate Eq. (41) is the cross-spectrum. On the one hand it is the Fourier transform of the cross-correlation; on the other it can be written as an average of Fourier transforms (FT and average are linear operations).

  1. $P_{yx}(\omega)={\langle}{\langle}{{\tilde y}}(\omega){{\tilde x}}^*(\omega){\rangle}_{y|x}{\rangle}_x= {\langle}{\langle}{{\tilde y}}(\omega){\rangle}_{y|x}{{\tilde x}}^*(\omega){\rangle}_x.$

What has happened here? The cross-spectrum can be obtained by averaging the Fourier-transformed quantities over trials and over the stimulus ensemble. The average can be split into the conditional average over trials, ${\langle}\cdot{\rangle}_{y|x}$, given a fixed, frozen stimulus, and the average over the stimulus ensemble, ${\langle}\cdot{\rangle}_x$. The former is essentially an average over the encoding noise (Chacron, Lindner, and Longtin 2004, Lindner, Chacron, and Longtin (2005)).
Observe that ${\langle}\tilde y(\omega){\rangle}_{y|x}$ is the Fourier transform of the trial-averaged firing rate conditional on a frozen stimulus

$r(t) = {\langle}y(t) {\rangle}_{y|x}.$

Thus, it is sufficient to derive a filter that maps input x(t)x(t) to a firing rate, not an individual spike train.

Def. (forward, encoding filter):
Let g(t)g(t) be the filter kernel that maps stimulus into instantaneous firing rate
  1. $r(t) = (g*x)(t) = \int_{-\infty}^t{\mathrm{d}}r\, g(r)x(t-r)$

The filter is causal, since it is implemented by a differential equation and the Laplace transform yields

  1. $R(s) = G(s)X(s)$,

where G(s)G(s) denotes the transfer function of the encoding filter.

With this definition the cross-spectrum is written as

  1. $P_{yx}(\omega)={\langle}{\langle}{{\tilde y}}(\omega){\rangle}_{y|x}{{\tilde x}}^*(\omega){\rangle}_x = G({\mathrm{i}}\omega)P_{xx}(\omega).$

This shows us that although we are computing the cross-spectrum of stimulus and spike train, the response filter $G({\mathrm{i}}\omega)$ for the instantaneous firing rate suffices. This simple relation reminds us of the fact that the cross-spectrum is not really a second-order quantity, but can be exactly determined by linear response theory. The spike train spectrum $P_{yy}(\omega)$, on the other hand, is truly a second-order quantity, viz. the Fourier transform of the auto-covariance, and cannot be related to the linear response filter without further approximations.

6.1 Instantaneous firing rate in the continuum limit

The instantaneous firing rate can be estimated via binning and trial averaging:

$4\Delta\,r(k\Delta)$: 0 3 1 0 0 0 0 3 1 0 0 0 1 2 1 0
Trial 1: 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0
Trial 2: 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 0
Trial 3: 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0
Trial 4: 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0
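In code, the binning estimate of the table reads as follows (the bin width $\Delta$ is an assumed value):

    # Sketch: instantaneous firing rate r(k*Delta) estimated by binning and trial
    # averaging, using exactly the 4 trials of the table above.
    import numpy as np

    Delta = 0.005                     # bin width in seconds (assumed)
    trials = np.array([[0,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0],
                       [0,0,1,0,0,0,0,1,0,0,0,0,0,0,1,0],
                       [0,1,0,0,0,0,0,0,1,0,0,0,0,1,0,0],
                       [0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,0]])

    counts = trials.sum(axis=0)          # the row labelled 4*Delta*r(k*Delta)
    r = counts/(trials.shape[0]*Delta)   # spikes per second in each bin
    print(counts)
    print(r)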

Two equivalent descriptions9 of Markov dynamics exist

  1. the path view,
  1. $\dot{x}= u(x,t) = f(x) + g(x)\xi(t)$;
  2. the ensemble view,

$\dot{p}(x,t)=\,?$

For (i) one can simulate many paths of a stochastic differential equation, with different initial conditions and noise realisations. Histograms can then provide the ensemble statistics. But it is also possible to find an evolution equation for the whole ensemble.

The relation between the two can be formally established by the

Def. (empirical measure):
Given a stochastic path realisation, $x(t)$, of Eq. (46), the empirical measure is
  1. $\varrho(y,t) = \delta(x(t)-y)$.

With all the lovely properties of a δ\delta-function.

The probability density

  1. $p(y,t) = {\langle}\varrho(y,t){\rangle},$

where the average is over all paths, $x(t)$, and therefore over all realisations of the noise process $\xi(t)$.

The chain rule yields

t𝜚(y,t)=ẋ(t)x𝜚(y,t)=y𝜚(y,t)u(x(t),t)\frac\partial{\partial t}\varrho(y,t)=\dot x(t)\frac\partial{\partial x} \varrho(y,t) =-\frac\partial{\partial y}\varrho(y,t)u(x(t),t)

Solving such a PDE involves time integration or other integral transformations (Fourier and Laplace transforms). Since

dyδ(x(t)y)f(y)=f(x(t))=dyδ(x(t)y)f(x(t))\int{\mathrm{d}}y\,\delta(x(t)-y)f(y)=f(x(t))=\int{\mathrm{d}}y\,\delta(x(t)-y)f(x(t))

Therefore

  1. t𝜚(y,t)=y𝜚(y,t)u(y,t)=yf(y)𝜚(y,t)yg(y)ξ(t)𝜚(y,t)\frac\partial{\partial t}\varrho(y,t)=-\frac\partial{\partial y}\varrho(y,t)u(y,t) =-\frac\partial{\partial y}f(y)\varrho(y,t)-\frac\partial{\partial y}g(y)\xi(t)\varrho(y,t)

Averaging on both sides results in

tp(y,t)=yf(y)p(y,t)yg(y)ξ(t)𝜚(y,t)\frac\partial{\partial t}p(y,t)=-\frac\partial{\partial y}f(y)p(y,t)-\frac\partial{\partial y}g(y){\langle}\xi(t)\varrho(y,t){\rangle}.

The correlation between a stochastic process ξ(t)\xi(t) and a nonlinear functional of it is given by the Novikov-Furutsu-Donsker formula

  1. {\langle}\xi(t)\varrho(y,t){\rangle}=\frac12{\langle}\frac{\delta\varrho}{\delta\xi(t)}{\rangle}=-\frac12\frac\partial{\partial y}g(y)p(y,t)

All together we have the

Def. (Fokker-Planck equation):
The FPE corresponding to Eq. (46) is
  1. tp(y,t)=12yg(y)yg(y)p(y,t)yf(y)p(y,t)\frac\partial{\partial t}p(y,t)=\frac12\frac\partial{\partial y}g(y)\frac\partial{\partial y}g(y)p(y,t)-\frac\partial{\partial y}f(y)p(y,t).

This is a diffusion equation. It can be written in the form

  1. tp(y,t)=yJ(y,t)\frac\partial{\partial t}p(y,t)=-\frac\partial{\partial y}J(y,t).

J(y,t)=f(y)p(y,t)12g(y)yg(y)p(y,t)J(y,t)=f(y)p(y,t)-\frac12g(y)\frac\partial{\partial y}g(y)p(y,t)

One needs boundary conditions and initial conditions to solve this PDE.
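A minimal finite-difference sketch of how such an initial-boundary value problem could be integrated (assuming numpy): an explicit Euler step of the flux form above, with the Stratonovich flux and reflecting (zero-flux) boundaries. Drift, noise amplitude, grid and time step are arbitrary choices, and the explicit scheme is only stable for sufficiently small dt.

```python
import numpy as np

def fpe_step(p, y, dt, f, g):
    """One explicit Euler step of dp/dt = -dJ/dy with J = f p - (1/2) g d/dy (g p),
    zero flux at both ends (reflecting boundaries)."""
    dy = y[1] - y[0]
    dgp = np.gradient(g(y) * p, dy)                   # d/dy (g p)
    J = f(y) * p - 0.5 * g(y) * dgp                   # probability flux at the nodes
    J_edge = np.concatenate(([0.0], 0.5 * (J[1:] + J[:-1]), [0.0]))
    return p - dt * np.diff(J_edge) / dy              # conserves total probability

y = np.linspace(-3, 3, 301)
p = np.exp(-(y - 1.0) ** 2 / 0.02); p /= np.trapz(p, y)   # narrow initial density
f_drift = lambda y: -y                                    # placeholder drift
g_diff = lambda y: 0.5 * np.ones_like(y)                  # placeholder noise amplitude
dt = 1e-4
for _ in range(20_000):                                   # integrate to t = 2
    p = fpe_step(p, y, dt, f_drift, g_diff)
```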

6.2 Phase flux = firing rate

For convenience rewrite the I/O-equivalent phase oscillator from Eq. (17) as

  1. ϕ̇=f0+z(ϕ)x(t)+σξ(t)\dot\phi={f_0}+z(\phi)x(t) + \sigma\,\xi(t).

Here, as opposed to Eq. (17), Z(ϕ)η(ϕ,t)\vec Z(\phi)\cdot\vec\eta(\phi,t) was split into the part that results from the presented stimulus, now denoted x(t)x(t), and the part that originates from, e.g., intrinsic noise. From Eq. (26) the perturbation vector has

η=(x(t)σ1(v(ϕ))ξ1(t))\vec\eta = \begin{pmatrix} x(t)\\\sigma_1(v(\phi))\xi_1(t)\\\vdots\end{pmatrix}.

As long as the intrinsic noise is fast enough compared to the stimulus an averaging can be applied10 to obtain an effective diffusion

σ2=01dϕiσi2(vLC(ϕ))\sigma^2 = \int_0^1{\mathrm{d}}\phi\sum_i\sigma_i^2(v_\mathrm{LC}(\phi)),

which enters Eq. (53). The benefit is that the corresponding Fokker-Planck equation

  1. tp(ϕ,t)=σ222ϕ2p(ϕ,t)ϕ(f0+Z(ϕ)x(t))p(ϕ,t)=ϕJ(ϕ,t)\frac\partial{\partial t}p(\phi,t)=\frac{\sigma^2}2\frac{\partial^2}{\partial\phi^2}p(\phi,t)-\frac{\partial}{\partial\phi}({f_0}+Z(\phi)x(t))p(\phi,t)=-\frac\partial{\partial\phi}J(\phi,t)

is tractable in a perturbation expansion. But first, remember the goal: identification of the forward filter g(t)g(t) in r(t)=tdrg(r)x(tr)r(t)=\int_{-\infty}^t{\mathrm{d}}r\,g(r)x(t-r).

The firing rate is the flux of the phase density through the firing threshold ϕ=1:

r(t)=J(1,t)=(f0+Z(ϕ)x(t))p(ϕ,t)σ22ϕp(ϕ,t)|ϕ=1r(t)=J(1,t)=({f_0}+Z(\phi)x(t))p(\phi,t)-\frac{\sigma^2}2\frac\partial{\partial\phi}p(\phi,t)\Big|_{\phi=1}

The equation is solved with the following perturbation ansatz

  1. p(ϕ,t)=p0(ϕ)+ipi(ϕ,t)p(\phi,t)=p_0(\phi) + \sum_i p_i(\phi,t),

with the normalisation requirement

  1. dϕp0(ϕ)=1\int{\mathrm{d}}\phi\,p_0(\phi)=1 and iIN,tIR:dϕpi(ϕ,t)=0\forall i\in{I\!\!N},t\in{I\!\!R}:\int{\mathrm{d}}\phi\,p_i(\phi,t)=0.

The pi(ϕ,t)p_i(\phi,t) are assumed to be small correction terms, given that the stimulus x(t)x(t) is also small.

At the same time one gets

J(ϕ,t)=J0(ϕ)+iJi(ϕ,t)J(\phi,t) = J_0(\phi) + \sum_iJ_i(\phi,t)

In perturbation theory one solves iteratively the equations of the same order

O(0)O(0):
For the lowest order the stimulus is set to x(t)=0x(t)=0 and the FPE is

ṗ0=σ222ϕ2p0f0ϕp0\dot p_0 = \frac{\sigma^2}2\frac{\partial^2}{\partial\phi^2}p_0-{f_0}\frac\partial{\partial\phi}p_0.

This equation is homogeneous so we find a time independent steady state solution

p0(ϕ)=1p_0(\phi)=1.

One may test this by back insertion. Here both the boundary conditions and Eq. (56) are enforced. The solution can be inserted into the definition of the flux to obtain the zeroth order flux

J0=f0p0=f0J_0 = f_0p_0 = f_0

The next order involves the time-dependent stimulus

O(x)O(x):
Note that multiplying two terms of order O(x)O(x) yields a term of order O(x2)O(x^2), which is discarded. One is left with

ṗ1=σ222ϕ2p1f0ϕp1p0x(t)ϕZ(ϕ)\dot p_1= \frac{\sigma^2}2\frac{\partial^2}{\partial\phi^2}p_1 -{f_0}\frac{\partial}{\partial\phi}p_1 -p_0x(t)\frac{\partial}{\partial\phi}Z(\phi)

To turn the PDE into an algebraic equation one can apply both a Fourier series expansion and the Laplace transform. For this the Laplace transform of the stimulus is denoted as X(s)X(s) and the periodic function is expanded as Z(ϕ)=k=ckei2πkϕZ(\phi)=\sum_{k=-\infty}^\infty c_ke^{{\mathrm{i}}2\pi k\phi}

sP1(k,s)=(2πkσ)22P1(k,s)f0i2πkP1(k,s)X(s)i2πkcksP_1(k,s)=-\frac{(2\pi k\sigma)^2}{2}P_1(k,s)-{f_0}{\mathrm{i}}2\pi kP_1(k,s)-X(s){\mathrm{i}}2\pi kc_k

Solving for P1(ϕ,s)P_1(\phi,s)

P1(ϕ,s)=k=P1(k,s)ei2πkϕ=k=i2πkckX(s)ei2πkϕs+(2πkσ)2/2+i2πkf0P_1(\phi,s)=\sum_{k=-\infty}^\infty P_1(k,s)e^{{\mathrm{i}}2\pi k\phi}=\sum_{k=-\infty}^\infty\frac{{\mathrm{i}}2\pi kc_kX(s)e^{{\mathrm{i}}2\pi k\phi}}{s+(2\pi k\sigma)^2/2 + {\mathrm{i}}2\pi k{f_0}}

For brevity define the pole νk=(2πkσ)2/2i2πkf0\nu_k=-(2\pi k\sigma)^2/2-{\mathrm{i}}2\pi k{f_0}

The first order flux is

J1(k,s)=f0P1(k,s)+ckX(s)i2πkσ22P1(k,s)J_1(k,s)={f_0}P_1(k,s)+c_kX(s)-\frac{{\mathrm{i}}2\pi k\sigma^2}2P_1(k,s)

=f0P1(k,s)+sνki2πkP1(k,s)i2πkσ22P1(k,s)={f_0}P_1(k,s)+\frac{s-\nu_k}{{\mathrm{i}}2\pi k}P_1(k,s)-\frac{{\mathrm{i}}2\pi k\sigma^2}2P_1(k,s)

and

i2πkJ1(k,s)=i2πkf0P1(k,s)+(2πkσ)22P1(k,s)+(sνk)P1(k,s)=sP1(k,s){\mathrm{i}}2\pi kJ_1(k,s)={\mathrm{i}}2\pi k{f_0}P_1(k,s)+\frac{(2\pi k\sigma)^2}2P_1(k,s)+(s-\nu_k)P_1(k,s)=sP_1(k,s)

J1(1,s)=k=scksνkX(s)J_1(1,s)=\sum_{k=-\infty}^\infty\frac{sc_k}{s-\nu_k}X(s)

Happily and consistently one finds

G(s)=k=scksνkG(s)=\sum_{k=-\infty}^\infty\frac{sc_k}{s-\nu_k}

The frequency response corresponds to evaluating the transfer function on the imaginary axis, G(iω)G({\mathrm{i}}\omega). The low frequency limit is

limω0G(iω)=c0=Z(ϕ)\lim_{\omega\to0}G({\mathrm{i}}\omega)=c_0={\langle}Z(\phi){\rangle}.

With ck=ak+ibkc_k=a_k+{\mathrm{i}}b_k, the high frequency limit is

limωG(iω)=k=ak=Z(0)\lim_{\omega\to\infty}G({\mathrm{i}}\omega)=\sum_{k=-\infty}^\infty a_k=Z(0).
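A small sketch evaluating this filter (assuming numpy), here for the SNLC-type PRC Z(ϕ)=1−cos2πϕ, i.e. c₀=1 and c₊₁=c₋₁=−1/2; the values of f₀ and σ are arbitrary. The printed values illustrate the two limits derived above.

```python
import numpy as np

f0, sigma = 10.0, 0.3                        # firing rate and effective noise (assumed)
c = {0: 1.0, 1: -0.5, -1: -0.5}              # Fourier coefficients of Z = 1 - cos(2*pi*phi)

def nu(k):
    """Pole nu_k = -(2 pi k sigma)^2/2 - i 2 pi k f0 from the text."""
    return -(2 * np.pi * k * sigma) ** 2 / 2 - 1j * 2 * np.pi * k * f0

def G(omega):
    """Linear response filter G(i omega) = sum_k i omega c_k / (i omega - nu_k)."""
    s = 1j * omega
    return sum(s * ck / (s - nu(k)) for k, ck in c.items())

w = 2 * np.pi * np.logspace(-2, 4, 400)
gain = np.abs(G(w))
print(G(1e-6), G(1e8))   # low-frequency limit ~ c0 = <Z>, high-frequency limit ~ Z(0) = 0
```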

7 Numerical continuation of fixpoints and orbits

In the past lectures functional consequences (e.g. filter properties) were derived from the phase response curve of neurons. The PRC's particular shape (e.g. its mean or its value at ϕ=0) has consequences for what the neuron can do computationally. Next we need to gain some insight into what a PRC looks like in particular dynamical regimes. These regimes are reached by changing some system parameter, e.g. the membrane's leak conductance or capacitance, the composition of ion channels or their kinetic properties.

Often numerical solutions to nonlinear ordinary differential equations are found by (forward) time-integration. An interesting alternative is to track a found solution through parameter space, for which the solution must be persistent. If it is not, then a bifurcation occurs and one observes a qualitative change of the solution.

For a book on bifurcation theory consult (Izhikevich 2007), and for numerical bifurcation analysis (Kielhöfer 2011).

7.1 Continuation of fixed points

Assume for

ẋ=f(x,p)\dot{\vec x}=\vec f(\vec x,p)

there is a steady state solution

  1. f(x,p)=0\vec f(\vec x, p)=0

The fixpoint solution x(p)\vec x(p) depends on the parameter pp. The existence of the solution as a function of the parameter is governed by the implicit function theorem.

Implicit function theorem
Consider a system of equations

\vec f(\vec x,p)=0, with \vec f\in I\!\!R^n, \vec x\in I\!\!R^n, p\in I\!\!R and \nabla_{x,p}\vec f\in I\!\!R^{n\times(n+1)}.

Let \vec f be smooth near \vec x. Then, if the Jacobian \nabla_x\vec f is nonsingular, there exists a unique, smooth solution family \vec x(p) such that

f(x(p),p)=0\vec f(\vec x(p),p)=0.

This establishes the existence of lines in a bifurcation diagram.
The method of continuation is a predictor-corrector method. In practice, assume the fixpoint is known for one particular parameter value p0p_0; then, for a small change in parameter, the solution is predicted.

Predictor step:
Taylor’s linearisation
  1. x(p+δp)x(p)+δpxp\vec x(p+\delta p)\approx\vec x(p)+\delta p\frac{\partial\vec x}{\partial p}.

To predict the new solution, the old solution is required and the derivative of the solution w.r.t. the parameter that is changed. How to compute the latter? Take the total derivative of Eq. (57) w.r.t. the parameter

(xf)xp+fp=0(\nabla_xf)\frac{\partial\vec x}{\partial p}+\frac{\partial\vec f}{\partial p}=0.

with formal solution

xp=(xf)1fp\frac{\partial\vec x}{\partial p}=-(\nabla_x\vec f)^{-1} \frac{\partial\vec f}{\partial p}.

If xf\nabla_x\vec f is full rank one can use some efficient linear algebra library to find the vector xp\frac{\partial\vec x}{\partial p} and back insert it into Eq. (58). For too large δp\delta p the predicted solution will be wrong. Yet, it is a good initial guess from which to find the correct version.

Corrector step:
Newton iterations to find the root of f(x,p)\vec f(\vec x,p)
  1. xn+1=xn(xf)1f(xn,p)\vec x_{n+1}=\vec x_n-(\nabla_x\vec f)^{-1}\vec f(\vec x_n,p)

Note that Newton's iterations are also obtained by linearisation

0=f(xn+1,p)=f(xn)+(xf)(xn+1xn)0=\vec f(\vec x_{n+1},p)=\vec f(x_n)+(\nabla_x\vec f)(\vec x_{n+1}-x_{n}),

which if solved for xn+1\vec x_{n+1} yields Eq. (59).

Convergence analysis of Newton iterations shows that with each Newton iteration the number of correct decimal places doubles. Hence a low number of iterations (3-7) often suffices to achieve sufficient numerical accuracy.

Example:
Say f(x,p)=x2+p21f(x,p)=\sqrt{x^2+p^2}-1, then the solution branches are x(p)=±1p2x(p)=\pm\sqrt{1-p^2}. The linear stability analysis xf=xx2+p2\partial_xf=\frac x{\sqrt{x^2+p^2}} shows that

x>0xf>0x>0\to\partial_xf>0\to unstable
x<0xf<0x<0\to\partial_xf<0\to stable

Also \partial_pf=\frac p{\sqrt{x^2+p^2}} and thus \partial_px=-\frac px=\frac{-p}{\pm\sqrt{1-p^2}}. What happens at p=±1p=\pm1 or x=0x=0?

Numerical continuation

Or more generally: What happens if we find that the condition number of f\nabla\vec f explodes?

In the example above the branch switches its stability and bends back in a fold or turning point. In general, folds can occur with and without stability change.
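A minimal sketch of the predictor-corrector idea on the example above (assuming numpy). It uses natural-parameter continuation, so it is expected to break down when ∂ₓf→0 near the fold at p=±1, which is exactly the situation that motivates arclength continuation; step size and iteration counts are arbitrary choices.

```python
import numpy as np

def f(x, p):   return np.sqrt(x**2 + p**2) - 1.0     # the example above
def fx(x, p):  return x / np.sqrt(x**2 + p**2)
def fp(x, p):  return p / np.sqrt(x**2 + p**2)

def continue_branch(x, p, dp=0.02, n_steps=40, newton_iter=5):
    """Natural-parameter predictor-corrector continuation of f(x,p) = 0."""
    branch = [(p, x)]
    for _ in range(n_steps):
        dxdp = -fp(x, p) / fx(x, p)        # tangent from implicit differentiation
        p, x = p + dp, x + dp * dxdp       # predictor: Taylor step in the parameter
        for _ in range(newton_iter):       # corrector: Newton iterations in x at fixed p
            x -= f(x, p) / fx(x, p)
        branch.append((p, x))
    return np.array(branch)

upper = continue_branch(x=1.0, p=0.0)      # follows x = +sqrt(1 - p^2) up to p = 0.8
# pushing n_steps/dp towards p = 1 makes fx -> 0 and the corrector ill-conditioned,
# i.e. the condition number explodes at the fold
```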

7.2 Local bifurcations: What happens if f\nabla\vec f is singular?

Bifurcation analysis is one big “case discrimination”.

7.2.1 Folds and one-dimensional nullspaces

Recall some definitions

Def. (nullspace, kernel):
The kernel or nullspace of a matrix J̲{\underline J} is

N(J̲)={xIRn|J̲x=0}N({\underline J})=\{\vec x\in I\!\!R^n|{\underline J}\vec x=0\}

Def. (range, image):
The range of a matrix J̲{\underline J} is

R(J̲)={y|xIRn:J̲x=y}R({\underline J})=\{\vec y|\exists x\in I\!\!R^n:{\underline J}\vec x=\vec y\}

Def:
Let QQ be the projection onto the range
Def:
Eigensystem at the fold. Let the Jacobian matrix be J̲=f(x(0),p(0)){\underline J}=\nabla\vec f(\vec x(0),p(0))

J̲rk=λkrk{\underline J}\vec r_k=\lambda_k\vec r_k and lkJ̲=λklk\vec l_k{\underline J}=\lambda_k\vec l_k.

So that the nullspace is spanned by \vec r_0 and the left nullspace by \vec l_0.

This section considers dimN(J̲)=1\dim N({\underline J})=1. Hence, the implicit function theorem is not applicable. From the example above it is apparent that x(p)=\vec x'(p)=\infty at the bifurcation. The problem can be circumvented by defining a new “arclength parameter”, p(s)p(s). The bifurcation branch is then a parametric curve, (x(s),p(s))(\vec x(s),p(s)). Without loss of generality the bifurcation is taken to be at s=0s=0.

If the Jacobian matrix J̲=f{\underline J}=\nabla\vec f is rank-deficient the Lyapunov-Schmidt reduction can be applied. Intuitively, the problem is reduced from a high-dimensional, possibly infinite-dimensional, one to one that has as many dimensions as the defect of f\nabla\vec f.

The nullspace is spanned by the eigenvector, r0r_0 of J̲{\underline J}, corresponding to the eigenvalue 0.

Assume that ff is twice differentiable w.r.t. x\vec x, then differentiate f(x(s),p(s))=0\vec f(\vec x(s),p(s))=0 w.r.t. ss and evaluate at s=0s=0

ddsf(x(s),p(s))=xfx(s)+pfp(s)=J̲x(s)+pfp(s)\frac{\mathrm{d}}{{\mathrm{d}}s}\vec f(\vec x(s),p(s))=\nabla_x\vec fx'(s)+\partial_p\vec fp'(s)={\underline J}\vec x'(s) + \partial_p\vec fp'(s)

Let H̲=xf{\underline H}=\nabla\nabla_x\vec f be the Hessian tensor.

\frac{{\mathrm{d}}^2}{{\mathrm{d}}s^2}\vec f={\underline H}\vec x'(s)\vec x'(s)+{\underline J}\vec x''(s)+2\,\partial_p\nabla_x\vec f\,\vec x'(s)\,p'(s)+\partial^2_p\vec f\,(p'(s))^2+\partial_p\vec f\,p''(s)=0

Projecting the first-derivative equation onto the left eigenvector \vec l_0 of the eigenvalue 0 gives \vec l_0\cdot\partial_p\vec f\,p'(0)=0; since \partial_p\vec f\not\in R({\underline J}) this implies p'(0)=0, and then {\underline J}\vec x'(0)=0, so \vec x'(0) can be normalised to \vec r_0. Projecting the second-derivative equation onto \vec l_0 at s=0 one finds

l0H̲r0r0+l0pfp(0)=0\vec l_0{\underline H}\vec r_0\vec r_0 + \vec l_0\partial_p\vec fp''(0)=0

or with pfR(J̲)\partial_p\vec f\not\in R({\underline J})

p(0)=l0H̲r0r0l0pfp''(0)=-\frac{\vec l_0{\underline H}\vec r_0\vec r_0}{\vec l_0\partial_p\vec f}.

This is a test function of whether the bifurcation is

  1. subcritical (p''(0)<0)
  2. supercritical (p''(0)>0)
  3. transcritical

Folds

Def:
Let PP be the projection onto the null space

There are several cases to be distinguished.

The fold:
If the rank deficiency is one-dimensional, dimN(J̲)=1\dim N({\underline J})=1.
The Andronov-Hopf:
If the rank deficiency is two-dimensional.

7.3 Stability exchange

At a simple fold one eigenvalue is zero. Study the eigensystem of J̲=f(x(s),p(s)){\underline J}=\nabla\vec f(\vec x(s),p(s)) near s=0s=0

J̲(r0+w(s))=λ(s)(r0+w(s)){\underline J}(\vec r_0+\vec w(s)) = \lambda(s)(\vec r_0+\vec w(s)).

With a bit of analysis

λ(0)=l0pf(x(0),p(0))p(0)\lambda'(0)=-\vec l_0\cdot\partial_p\vec f(\vec x(0),p(0))p''(0)

7.3.1 Extended system

Continue the extended system (linear and nonlinear)

f(x(p),p)=0\vec f(\vec x(p),p)=0

f(x,p)w=0\nabla\vec f(\vec x,p)\vec w=0

(f(x,p))𝖳z=0(\nabla\vec f(\vec x,p))^{{\mathsf{T}}}\vec z=0

7.4 Continuation of boundary value problems and periodic orbits

The same procedure as above can be applied to the continuation of periodic orbits and boundary value problems. Define the implicit function to encompass the time derivative

g(x,p)=ddtxTf(x,p)=0\vec g(\vec x,p)=\frac{\mathrm{d}}{{\mathrm{d}}t}\vec x-T\vec f(\vec x,p)=0, with t[0,1]t\in[0,1].

Then proceed as above. Note that the time derivative d/dt{\mathrm{d}}/{\mathrm{d}}t is a linear operator which has a matrix representation just like the Jacobian. In that sense

xddtx=ddtI̲\nabla_x\frac{\mathrm{d}}{{\mathrm{d}}t}\vec x=\frac{\mathrm{d}}{{\mathrm{d}}t}{\underline I}

8 PRC near the centre manifold

8.1 Dynamics in the centre manifold of a saddle-node

At arbitrary parameters the periodic solution of the adjoint of the first variational equation on the limit cycle yields the PRC. It can be calculated numerically with the continuation method. Near a bifurcation, however, if major parts of the dynamics happen in the centre manifold, the PRC can be calculated analytically. As an example take the saddle-node on limit cycle bifurcation (SNLC). The spike in this case is a homoclinic orbit to a saddle-node, which enters and leaves via the semi-stable (centre) manifold that is associated with the eigenvalue λ0=0\lambda_0=0 of the Jacobian at the saddle.

Saddle-node on a limit cycle (SNLC). The dynamics on the stable manifold is fast ẋ=λ1x\dot x=\lambda_1x, while the dynamics along the centre subspace is slow ẋ=bx2\dot x=bx^2.

  1. ẋ=f(x)+pg(x)\dot{\vec x}=\vec f(\vec x)+p\,\vec g(\vec x)

(cv̇τṅ)=(II(n,v)n(v)n){c\dot v\choose\tau\dot n}={I-I(n,v)\choose n_\infty(v)-n}

Let there be a saddle-node bifurcation at some value p0p_0.

Expanding the right-hand side around the saddle-node fixpoint, x0\vec x_0, yields (Ermentrout 1996)

f(x)=J̲(xx0)+H̲(xx0)(xx0)+...\vec f(\vec x)={\underline J}(\vec x-\vec x_0)+{\underline H}(\vec x-\vec x_0)(\vec x-\vec x_0)+...

Establish the eigensystem at the saddle-node J̲=f(x0){\underline J}=\nabla\vec f(\vec x_0)

lkJ̲=λklk\vec l_k{\underline J}=\lambda_k\vec l_k and J̲rk=λkrk{\underline J}\vec r_k=\lambda_k\vec r_k with ljrk=δjkl_j\cdot r_k=\delta_{jk}.

By assumption of a saddle-node the Jacobian has a simple zero with an associated eigenvector.

Def (centre subspace):
The subspace spanned by r0\vec r_0 is called the centre subspace or slow subspace.

Write the dynamics around the saddle-node as x(t)=x0+yr0\vec x(t) = \vec x_0 + y\vec r_0. Then Eq. (60) is

x0̇+ẏr0=yJ̲r0+pg(x0)+y2H̲r0r0\dot{\vec{x}_0}+\dot y\vec r_0=y{\underline J}\vec r_0+p\vec g(\vec x_0)+y^2{\underline H}\vec r_0\vec r_0.

Projecting this equation onto the left eigenvector l0\vec l_0 yields the isolated dynamics along the centre manifold:

  1. ẏ=a+by2\dot y = a + by^2 with a=pl0g(x0)a=p\vec l_0\cdot\vec g(\vec x_0) and b=l0H̲r0r0b=\vec l_0{\underline H}\vec r_0\vec r_0.
Note:
In the literature yy is often suggestively written as vv, assuming the quadratic dynamics is along the voltage dimension. However, it can be shown that the centre manifold of a conductance-based neuron is never parallel to the voltage dimension.

Solution of Eq (61)

The centre manifold is only tangential to the spiking limit cycle dynamics near the saddle. Although the proportion of time spent near the saddle is large, at some point the trajectories depart, so the reduction is only locally valid.

The formal solution of Eq. (61) with initial condition y(0)=y(0)=-\infty is

  1. y(t)=abtan(abtπ/2)y(t)=\sqrt{\frac ab}\tan(\sqrt{ab}\,t-\pi/2).

Away from the saddle-node the slow manifold accelerates again and the variable yy diverges as the argument of the tangent approaches π/2\pi/2. This blowing up is like a spike. So for the SNLC centre manifold one can think of the spike at y=y=\infty and a reset to y=y=-\infty. The time it takes from y=y=-\infty to y=y=\infty is finite and given by

Tp=πab{{T_\text{p}}}=\frac\pi{\sqrt{ab}}.

Note that for y(0)=0y(0)=0 it is

  1. y(t)=abtan(abt)y(t)=\sqrt{\frac ab}\tan(\sqrt{ab}\,t)

and

Tp=π2ab{{T_\text{p}}}=\frac\pi{2\sqrt{ab}}.

The bifurcation parameter enters in aa. If p\,\vec g(\vec x)=[c^{-1}I,0,0,...]^{{\mathsf{T}}}, then the firing rate has the typical square-root scaling

f0=1Tp=π1bl00c1(II0){f_0}=\frac1{{T_\text{p}}}=\pi^{-1}\sqrt{b\vec l_{00}c^{-1}(I-I_0)}.

Let us check this numerically in the practical.
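A minimal sketch of such a check (assuming numpy): the quadratic normal form ẏ=a+by² is integrated from a finite reset value to a finite “spike” value and the passage time is compared with T_p=π/√(ab). The reset/spike values, the step size and the values of a are arbitrary; the finite integration window introduces small corrections relative to the ±∞ limits.

```python
import numpy as np

def period_quadratic(a, b, y_reset=-50.0, y_spike=50.0, dt=1e-4):
    """Euler-integrate y' = a + b y**2 from reset to 'spike' and return the elapsed time."""
    y, t = y_reset, 0.0
    while y < y_spike:
        y += (a + b * y**2) * dt
        t += dt
    return t

b = 1.0
for a in [0.01, 0.05, 0.2, 1.0]:            # a plays the role of c^-1 (I - I0)
    T_num = period_quadratic(a, b)
    T_theory = np.pi / np.sqrt(a * b)       # exact for reset/spike at -/+ infinity
    print(a, T_num, T_theory)               # square-root scaling f0 = 1/T ~ sqrt(a)
```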

8.2 PRC near the centre manifold

PRCs of simple 1D reset-models like the quadratic dynamics of Eq. (61) can be calculated. The PRC is defined as the phase gradient (Winfree 2001; Kuramoto 1984) with respect to the state variables. In the unperturbed case we have ϕ̇=f0\dot\phi=f_0, where f0f_0 is the frequency. The PRC is

Z(\phi)=\frac{\partial\phi}{\partial y}=\dot\phi\frac{\partial t}{\partial y}=\frac{{f_0}}{\dot y}=\frac{{f_0}}{a+by^2(\phi/{f_0})}.

Inserting the solution with y(0)=y(0)=-\infty from above yields

Z(\phi)=\frac{{f_0}}{a+a\tan^2(\pi\phi-\pi/2)}=a^{-1}{f_0}\cos^2(\pi\phi-\pi/2)

=a^{-1}{f_0}\sin^2(\pi\phi)=\frac{{f_0}}{2a}(1-\cos(2\pi\phi)).

Note that aa depends on the bifurcation parameter, yet bb does not. Hence it may be preferable to write

Z_\mathrm{SNLC}(\phi)=\frac b{2\pi^2{f_0}}(1-\cos(2\pi\phi)).

A similar calculation at the saddle-node loop bifurcation yields

Z_\mathrm{SNL}(\phi)=\frac b{2\pi^2{f_0}}(1-\cos(\pi(1+\phi))).

9 Phase susceptibility to channel noise near saddle-node loop bifurcations

The deterministic part of the conductance-based model neuron is

{\dot v\choose\dot a_i}=\vec F(v,\vec a)

The Jacobian J̲=F{\underline J}=\nabla\vec F has the eigensystem lkJ̲=λklk\vec l_k{\underline J} = \lambda_k\vec l_k and J̲rk=λkrk{\underline J}\vec r_k = \lambda_k\vec r_k, with lkrj=δkj\vec l_k\cdot\vec r_j=\delta_{kj}.

Proposition 1
For neuron models at a saddle-node on invariant cycle bifurcation, i.e., with a simple zero eigenvalue, the centre (slow) manifold is given by

r0=(1ddva,k(v))\vec r_0={1\choose\frac{\mathrm{d}}{{\mathrm{d}}v}a_{\infty,k}(v)}.

For the usual strictly increasing activation curves, a,k(v)>0a_{\infty,k}'(v)>0, the centre manifold is not parallel to the voltage or any other state direction.

Proof 1
The special structure of the Jacobian in a conductance-based neuron model, a rank-1 updated diagonal matrix, allows for an explicit calculation of the eigenvector \vec r_0 corresponding to the eigenvalue λ0=0\lambda_0=0. This gives the direction of the centre manifold. At the stationary point this is r0=(1a,k(v))\vec r_0={1\choose a_{\infty,k}'(v)}. As long as the activation functions are strictly monotone functions of the voltage, a,k(v)>0a_{\infty,k}'(v)>0, the centre manifold has a non-zero component in this gating direction.

The PRC for perturbations along the centre manifold of the SNIC bifurcation has long been known. In order to understand perturbations originating from the gating dynamics of different ion channels we need the PRC for arbitrary perturbations.

Proposition 2
The vector of phase-response curves at a SNIC bifurcation is given by

Z(ϕ)=f12(1cos2πϕ)l0\vec Z(\phi)=f^{-\frac12}(1-\cos2\pi\phi)\vec l_0.

The PRCs in all dimensions have the same shape, but their relative scaling differs.

Proof 2
See (Ermentrout and Kopell 1986; Ermentrout 1996; Brown, Moehlis, and Holmes 2004; Ermentrout, Glass, and Oldeman 2012) for a derivation of the phase response curve along the centre manifold of the SNIC. Due to Prop. 1 all dimensions will have the typical quadratic slowing near the saddle-node and can therefore be arbitrarily slower than any other state direction. A perturbation ek\vec e_k along the kthk^\mathrm{th} state direction has, according to Prop. 1, always components along the centre manifold (l0ek)r0(\vec l_0\cdot\vec e_k)\vec r_0. All other directions are exponentially contractive and fast compared to the slow dynamics on the centre manifold. A perturbation p\vec p can therefore be decomposed into

j(ljp)rj\sum_j(\vec l_j\cdot\vec p)\vec r_j.

Along the stable manifolds, λj<0\lambda_j<0, all (ljp)eλt(\vec l_j\cdot\vec p)\sim e^{\lambda t}. Compared to the slow dynamics along the centre manifold the exponential reset back to the limit cycle is instantaneous and, hence, does not affect the phase. It is known that along the centre manifold the quadratic dynamics yields a phase model with a phase-response curve that has the functional form 1cos2πϕ1-\cos2\pi\phi.

Theorem 3
For both the SNIC and the SNL bifurcations, the peak of the PRC resides at the saddle-node, i.e.,

dZdϕ(ϕ)|ϕ=ϕSN=0\frac{dZ}{d\phi}(\phi)|_{\phi=\phi_\mathrm{SN}}=0.

Proof 3
In the limit f0f\to0 the isochron density can be made arbitrarily high near the saddle-node. The phase susceptibility in this region will be maximal.

The following holds not only for homoclinic orbits approaching the saddle-node via the semistable centre manifold (the SNIC case), but for any homoclinic orbit that approaches via any of the stable manifolds.

The Floquet modes are periodic solutions of the first variational system on the limit cycle

  1. Wk(ϕ)=(J̲(ϕ)λkI̲)Wk(ϕ)\vec W_k'(\phi)=({\underline J}(\phi)-\lambda_k{\underline I})\vec W_k(\phi) and Zk(ϕ)=Zk(ϕ)(λkI̲J̲(ϕ))\vec Z_k'(\phi)=\vec Z_k(\phi)(\lambda_k{\underline I}-{\underline J}(\phi))
Theorem 4
For homoclinic orbits attached to a saddle-node the tangent plane to the isochrons at the saddle is spanned by the stable manifolds of the saddle, i.e.
  1. J(ϕSN)Wk(ϕSN)=λkWk(ϕSN)J(\phi_\mathrm{SN})W_k(\phi_\mathrm{SN})=\lambda_kW_k(\phi_\mathrm{SN}).

Hence, Wk(ϕSN)\vec W_k(\phi_\mathrm{SN}) is an eigenvector to the Jacobian. The tangent space to the isochrons is thus T={rk:J̲SNrk=λkrkλk<0}T=\{\vec r_k : {\underline J}_\mathrm{SN}\vec r_k=\lambda_k\vec r_k\wedge\lambda_k<0\}. The PRC is then Z(ϕSN)T\vec Z(\phi_{\mathrm{SN}})\in T^\bot.

Proof 4
For this one shows that the linearised isochron at ϕ=ϕSN\phi=\phi_\mathrm{SN} is a solution to the right eigenvalue problem of the Jacobian at the saddle. Since according to Thm. 3 the maximum of Z\vec Z resides at ϕ=ϕSN\phi=\phi_\mathrm{SN}, Eq. (65) follows immediately from Eq. (64) with k=0k=0.

10 Event time series and phase descriptions

10.1 Synchronisation and phase locking

If biological oscillators interact, their emitted event time series may synchronise. Start with conductance-based models

  1. \dot{\vec x}_i=\vec f(\vec x_i) + \vec G(\vec x_i,\vec x_j)

and couple them with synapses

G(xi,xj)=gsyn(vj)(viEsyn)G(x_i,x_j)=g_\mathrm{syn}(v_j)(v_i-E_\mathrm{syn}).

In Eq (17) the I/O equivalent phase oscillator was derived. Take two phase oscillators ii and jj that are I/O equivalent to Eq (66)

ϕ̇i=fi+Z(ϕi)G(ϕi,ϕj)\dot\phi_i = f_i + Z(\phi_i)G(\phi_i,\phi_j)

ϕ̇j=fj+Z(ϕj)G(ϕj,ϕi)\dot\phi_j = f_j + Z(\phi_j)G(\phi_j,\phi_i)

Def (phase locking)
Two oscillators ii and jj are phase locked (a form of synchronisation) if their phase difference ψij(t)=ϕi(t)ϕj(t)\psi_{ij}(t)=\phi_i(t)-\phi_j(t) is constant in time. This implies there should be a fixpoint ψ̇=0\dot\psi=0.

Two small (n=5n=5) all-to-all coupled networks, showing a fixed phase relation (SNL) and no locking (SNIC).

Def (phase locking index)
Phase locking can be quantified by evaluating the following temporal average

Lij=ei(ϕi(t)ϕj(t))L_{ij}={\langle}e^{{\mathrm{i}}(\phi_i(t)-\phi_j(t))}{\rangle}.

The evolution of the phase difference is

  1. ψ̇=Δf+Z(ϕj+ψij)G(ϕj+ψij,ϕj)Z(ϕj)G(ϕj,ϕj+ψij)\dot\psi=\Delta f+Z(\phi_j+\psi_{ij})G(\phi_j+\psi_{ij},\phi_j)-Z(\phi_j)G(\phi_j,\phi_j+\psi_{ij}).
Def (frequency detuning)
The difference in intrinsic frequencies of the oscillators Δf\Delta f is called frequency detuning.

If the frequency detuning Δf\Delta f is small, then ψ\psi is a much slower variable than ϕi\phi_i and ϕj\phi_j. Therefore, the variable ϕj\phi_j in Eq. (67) traverses many periods before ψ\psi changes. In other words: ψ\psi “sees” only the average of ϕj\phi_j. One may apply averaging theory to define

H(ψ)=01dϕZ(ϕ+ψ)G(ϕ+ψ,ϕ)H(\psi)=\int_0^1{\mathrm{d}}\phi\,Z(\phi+\psi)G(\phi+\psi,\phi),

then

ψ̇=Δf+H(ψ)H(ψ)=Δf+Hodd(ψ)\dot\psi=\Delta f+H(\psi)-H(-\psi)=\Delta f+H^\mathrm{odd}(\psi).

Note that

HSNLCodd(ψ)=0H_\mathrm{SNLC}^\mathrm{odd}(\psi)=0

and

HSNLodd(ψ)=(1cosπ(1+ψ))(1cos(π(1+ψ)+π))=2cosπψH_\mathrm{SNL}^\mathrm{odd}(\psi)=(1-\cos\pi(1+\psi))-(1-\cos(\pi(1+\psi)+\pi)) = 2\cos\pi\psi,

only the latter has a fixpoint at ψ=1/2\psi=1/2

Stable and unstable fixpoints in the SNL coupling function

\Longrightarrow

  1. SNLC shows no synchronisation for Δf>0\Delta f>0
  2. SNL shows “antiphase” synchronisation. In a network this may lead to phenomena similar to frustration.
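A small numerical check of the two statements above (assuming numpy). It assumes pulse coupling, G(ϕ_i,ϕ_j)=Σ_k δ(ϕ_j−k), so that H(ψ)=Z(ψ), which is the same simplification used in the calculation of H^odd above; the PRC shapes are the SNLC and SNL forms from Sec. 8.

```python
import numpy as np

Z_snlc = lambda p: 1 - np.cos(2 * np.pi * p)          # SNLC shape (Sec. 8)
Z_snl  = lambda p: 1 - np.cos(np.pi * (1 + p))        # SNL shape (Sec. 8)

def H_odd(Z, psi):
    """H(psi) - H(-psi) for pulse coupling, where H(psi) = Z(psi); phases taken mod 1."""
    return Z(psi % 1.0) - Z((-psi) % 1.0)

psi = np.linspace(0, 1, 201)
print(np.max(np.abs(H_odd(Z_snlc, psi))))             # ~ 0: no locking for Delta f > 0
print(psi[np.argmin(np.abs(H_odd(Z_snl, psi)))])      # ~ 0.5: the 'antiphase' fixpoint
```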

10.2 Spike reliability and stimulus entrainment

Not all neuronal populations are coupled. Some sensory neurons like auditory receptors do not have lateral inhibition and just project to the next processing stage. Still, many of these neurons receive the same sensory input. Therefore, one can study two neurons that share a common input, or, equivalently, one neuron that is stimulated in repetitive trials with the same protocol.

Take two neurons ii and jj. Each neuron has its own intrinsic noise level σi\sigma_i, but all share a common stimulus x(t)x(t) and mean firing rate ff,

  1. ϕ̇i=f+Z(ϕi)x(t)+σiξi(t)\dot\phi_i=f+Z(\phi_i)x(t)+\sigma_i\,\xi_i(t).

Remember the neuronal codes. A spike time code was a mapping from a time continuous stimulus x(t)x(t) to a spike train y(t)y(t), i.e., the output is a high dimensional vector of spike times. In the following the stimulus that is common to all neurons ii is assumed to be a small amplitude zero-mean Gaussian process, x(t)=0{\langle}x(t){\rangle}=0, with wide-sense stationary covariance C(tr)=x(t)x(r)C(t-r)={\langle}x(t)x(r){\rangle}. The intrinsic noise has a different realisation for each neuron.

Q:
How reliable is this mapping? How close in spike-train space are two stimuli? How well is an idealised up-stream neuron able to distinguish stimuli, based on the information it receives from the down-stream neurons?

These are decoding questions. They are quantified by looking at spike-train metrics (Rossum 2001). In a non-stationary environment another question may be useful to ask:

Q:
Given neurons do lock to a stimulus11, but that they are in a random state before the stimulus is presented: How long does it take to reliably lock?

This question is important for up-stream neurons, too, since it determines the minimal window of integration that is required.

Neurons12 are perfectly in-phase locked if their phase difference is zero, ψ=ϕiϕj=0\psi=\phi_i-\phi_j=0, and stays that way, ψ̇=0\dot\psi=0. For simplicity look at the σ=0\sigma=0 case. WLOG take ϕj\phi_j as the reference. So, in the present case the phase difference evolves as

  1. ψ̇=(Z(ϕj+ψ)Z(ϕj))x(t)=g(ψ,t)\dot\psi=(Z(\phi_j+\psi)-Z(\phi_j))x(t)=g(\psi,t)

In a homogeneous (time independent) system, information about how fast the locked state is reached and how stable it is is given by the Lyapunov exponent, λ\lambda, of the phase difference.

Def. (Lyapunov exponent):
For a deterministic and autonomous13 system and a small enough initial phase difference, the Lyapunov exponent is the exponential attraction or divergence rate, ψ(t)eλt\psi(t)\propto e^{\lambda t}, which is the solution of the linearised dynamics around the phase fixpoint ψ0\psi_0:

\dot\psi=\underbrace{g'(\psi_0)}_\lambda\psi or \lambda=\frac{\mathrm{d}}{{\mathrm{d}}t}\ln\psi=g'(\psi_0).

Since there is a time-continuous stimulus present one can only define a (time-)averaged Lyapunov exponent

Def. (average Lyapunov exponent):
For a time-dependent system, the averaged Lyapunov exponent is
  1. \bar\lambda={\langle}\frac{\mathrm{d}}{{\mathrm{d}}t}\ln\psi{\rangle}

Assume that the neurons are already similar in their phase dynamics; then the right-hand side of the phase-difference equation, Eq. (69), can be expanded around ϕj\phi_j to obtain

  1. ψ̇=ψZ(ϕj)x(t)\dot\psi=\psi Z'(\phi_j)x(t).

In the absence of intrinsic noise, σ=0\bar\sigma=0, the averaged Lyapunov exponent from Eq (70) is

λ=Z(ϕj)x(t)\bar\lambda={\langle}Z'(\phi_j)x(t){\rangle}.

Note that this is a case for the Novikov-Furutsu-Donsker formula, because Z(ϕ(t))Z'(\phi(t)) is a nonlinear function of a stochastic process, ϕ(t)\phi(t) that depends on the stochastic process x(t)x(t). Therefore,

\bar\lambda={\langle}Z'(\phi_j)x(t){\rangle}=\int_{-\infty}^t{\mathrm{d}}r\,C(t-r)\left\langle\frac{\delta Z'(\phi_j(t))}{\delta x(r)}\right\rangle.

With the chain-rule and the definition in Eq (68) this yields

λ=tdrC(tr)Z(ϕj(t))Z(ϕj(r))\bar\lambda=\int_{-\infty}^t{\mathrm{d}}r\,C(t-r){\langle}Z''(\phi_j(t))Z(\phi_j(r)){\rangle}.

There are different approaches to calculate the remaining average. In an ergodic system the ensemble average can be replaced by temporal averaging, =limT1T0Tdt{\langle}{\rangle}=\lim_{T\to\infty}\frac1T\int_0^T{\mathrm{d}}t, so one gets

λ=limT1T0TdttdrC(tr)Z(ϕj(t))Z(ϕj(r))\bar\lambda=\lim_{T\to\infty}\frac1T\int_0^T{\mathrm{d}}t\int^t_{-\infty}{\mathrm{d}}r\,C(t-r)Z''(\phi_j(t))Z(\phi_j(r)).

Expanding further in x(t)x(t), i.e. to lowest order ϕj(t)=f0t\phi_j(t)={f_0}t, one finds

\bar\lambda=\lim_{T\to\infty}\frac1T\int_0^T{\mathrm{d}}t\int_{-\infty}^t{\mathrm{d}}r\,C(t-r)Z''({f_0}t)Z({f_0}r).

Example (white-noise stimulus)
In the special case of a white noise simulus, C(tr)=ϵ2δ(tr)C(t-r)=\epsilon^2\delta(t-r), one has

\bar\lambda=\lim_{T\to\infty}\frac{\epsilon^2}T\int_0^T{\mathrm{d}}t\,Z''({f_0}t)Z({f_0}t).

Since now we are dealing with periodic functions under the integral this is

λ=f0ϵ20f01dtZ(f0t)Z(f0t)\bar\lambda={f_0}\epsilon^2\int_0^{{f_0}^{-1}}{\mathrm{d}}t\,Z''({f_0}t)Z({f_0}t).

In phase variables this is

λ=ϵ201dϕZ(ϕ)Z(ϕ)=ϵ201dϕ(Z(ϕ))2\bar\lambda=\epsilon^2\int_0^1{\mathrm{d}}\phi\,Z''(\phi)Z(\phi)=-\epsilon^2\int_0^1{\mathrm{d}}\phi\,(Z'(\phi))^2.

Phase locking is thus guaranteed (\bar\lambda\leqslant0), and the locked state is reached faster if the PRC has large derivatives, i.e. higher Fourier components!

Several alternative derivations exist (Teramae and Tanaka 2004,D. S. Goldobin and Pikovsky (2005),Denis S. Goldobin and Pikovsky (2005))
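A minimal simulation sketch of this result (assuming numpy): two identical phase oscillators receive the same white-noise stimulus, and the decay rate of ln|ψ| is compared with λ̄=−ε²∫(Z′)²dϕ for the SNLC-type PRC. The Euler-Maruyama (Itô) discretisation and the one-sided δ-convention can shift the prefactor, so only order-of-magnitude agreement should be expected; all parameter values are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(3)
f0, eps, dt, T = 10.0, 0.2, 1e-4, 20.0
Z  = lambda p: 1.0 - np.cos(2 * np.pi * p)            # SNLC-type PRC (assumption)
dZ = lambda p: 2 * np.pi * np.sin(2 * np.pi * p)

# theory: averaged Lyapunov exponent of the phase difference
grid = np.linspace(0, 1, 1001)
lam_theory = -eps**2 * np.trapz(dZ(grid)**2, grid)

# simulation: two oscillators driven by a common white-noise stimulus, no intrinsic noise
n = int(T / dt)
phi = np.array([0.0, 0.02])                           # small initial phase difference
log_psi = np.empty(n)
for k in range(n):
    dx = eps * np.sqrt(dt) * rng.normal()             # common stimulus increment
    phi = phi + f0 * dt + Z(phi) * dx                 # Euler-Maruyama step
    log_psi[k] = np.log(abs(phi[1] - phi[0]))

lam_sim = np.polyfit(dt * np.arange(1, n + 1), log_psi, 1)[0]   # slope of log|psi|
print(lam_theory, lam_sim)
```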

Note (intrinsic noise)
In the presence of intrinsic noise (σ>0\sigma>0) there is no perfect locking. Nonetheless, the phase difference ψ\psi may converge to a unimodal distribution, peaked around zero. Given a uniform phase density to start with: How long does it take to converge to the stationary density of phase differences?

10.3 Inter-spike statistics

10.3.1 First passage time (no input and constant noise)

In the case σ>0\sigma>0 and x=0x=0 we have

ϕ̇=f0+σξ(t)\dot\phi={f_0}+\sigma\,\xi(t).

The adjoint Fokker-Planck equation

ṗ(ϕ,t)=σ22ϕ2p(ϕ,t)+fϕp(ϕ,t)\dot p(\phi,t)=\frac{\sigma^2}2\partial^2_\phi p(\phi,t)+f\partial_\phi p(\phi,t) s.t. BC p(1,t)=1p(1,t)=1

Laplace transform the equation

sP(ϕ,s)=σ22ϕ2P(ϕ,s)+fϕP(ϕ,s)sP(\phi,s)=\frac{\sigma^2}2\partial^2_\phi P(\phi,s)+f\partial_\phi P(\phi,s)

A solution is

P(ϕ,s)=exp{f0σ2(11+2sσ2/f02)(1ϕ)}P(\phi,s)=\exp\{\frac{{f_0}}{\sigma^2}(1-\sqrt{1+2s\sigma^2/{f_0}^2})(1-\phi)\}

The inverse Laplace transform of P(0,s)P(0,s) is the inverse Gaussian distribution

p(t)=\frac{\exp\{-\frac{(t{f_0}-1)^2}{2\sigma^2t}\}}{\sigma\sqrt{2\pi t^3}}.

Many neurons that do not have slow channels (adaptation, pumps, …) can be fitted with this ISI distribution.
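A minimal sketch comparing simulated first-passage times of ϕ̇=f₀+σξ with the inverse Gaussian density above (assuming numpy); f₀, σ, the time step and the trial number are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
f0, sigma, dt, n_trials = 10.0, 0.5, 1e-4, 5000

# simulate all phases in parallel until each has crossed the threshold phi = 1
phi = np.zeros(n_trials)
isi = np.zeros(n_trials)
active = np.ones(n_trials, dtype=bool)
t = 0.0
while active.any():
    t += dt
    phi[active] += f0 * dt + sigma * np.sqrt(dt) * rng.normal(size=active.sum())
    crossed = active & (phi >= 1.0)
    isi[crossed] = t
    active &= ~crossed

# compare the histogram of passage times with the inverse Gaussian density
tt = np.linspace(1e-3, 3.0 / f0, 400)
p_ig = np.exp(-(tt * f0 - 1) ** 2 / (2 * sigma**2 * tt)) / (sigma * np.sqrt(2 * np.pi * tt**3))
hist, edges = np.histogram(isi, bins=50, density=True)
```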

10.3.2 Phase dependent noise

Ion channel noise is not constant throughout the inter spike interval. In a simple two state channel the voltage dependent noise variance is

σ2(v)=1Nn(v)(1n(v))\sigma^2(v)=\frac1Nn_\infty(v)(1-n_\infty(v)),

where NN is the number of ion channels and nn_\infty is the steady state activation. Hence one may wish to analyse

ϕ̇=f0+σ(ϕ)ξ\dot\phi={f_0}+\sigma(\phi)\xi.

This cannot be solved in general, but the moments of the ISI distribution are simpler.

The following concerns equations for the statistics of waiting times as in Ref. (Gardiner 2004). Let y(t)y(t) be the process in question, denoting a voltage or a phase variable. The process starts at a particular y0y_0, i.e. the initial distribution is

  1. p(y,t0)=δ(yy0)p(y,t_0)=\delta(y-y_0),

and one is interested in the distribution of times for the process to reach y=y1y=y_1.

If the process is governed by a stochastic differential equation, then it was shown that the associated density p(y,t)p(y,t) is propagated by a specific evolution operator

  1. $\dot p(y,t)={{\mathcal F}}(y)p(y,t)$.

This equation was called the Fokker-Planck equation (FPE, see Eq (51)). Denote the solution of a homogeneous FPE with starting distribution concentrated at one value y0y_0 by p(y,t;y0,t0)p(y,t;y_0,t_0) such that p(y,t0;y0,t0)=δ(yy0)p(y,t_0;y_0,t_0)=\delta(y-y_0) and write its formal solution as

  1. $p(y,t;y_0,t_0) = e^{(t-t_0){{\mathcal F}}(y)}\delta(y-y_0)$.

The goal is to find a relation between the ISI distribution of the neuron model and the FP operator. For that assume the process lives in an interval (y1,y2)(y_1,y_2), where y2y_2 could denote the spike threshold and y1y_1 the resting potential to which an IF-like neuron resets, or the two boundaries encapsulating the periodic domain of the phase oscillator interval, e.g. y1=0y_1=0 and y2=1y_2=1. At time t0t_0, the system is supposed to start inside the interval, y1y0y2y_1\leqslant y_0\leqslant y_2. The probability at time t>t0t>t_0 of still being inside the interval (y1,y2)(y_1,y_2), and thus no spike occurring, is (Gardiner 2004)

G(y0,t)=Pr(y1y(t)y2)=y1y2dỹp(ỹ,t;y0,t0)G(y_0,t)=\Pr(y_1\!\leqslant\!y(t)\!\leqslant\!y_2)=\int\limits_{y_1}^{y_2}{\mathrm{d}}\tilde y\;p(\tilde y,t;y_0,t_0),

with the additional condition G(y0,t0)=1G(y_0,t_0)\!=\!1 because we started with y0[y1,y2]y_0\in[y_1,y_2]. The decrease of G(y0,t)G(y_0,t) over time, i.e. the negative time derivative, measures the rate at which probability leaves the interval. It is also called the

Def (first-passage time density)
Thinking of G(y0,t)G(y_0,t) as the fraction of neurons still inside [y1,y2][y_1,y_2], its rate of decrease is the flux out of the interval:
  1. q(t,y_0)=-\frac{\partial G(y_0,t)}{\partial t}.

The goal is to find an evolution equaiton for q(t,y0)q(t,y_0). With the help of the formal solution in Eq. (74), it can be shown that the inner product14 of h(y,t)=G(y,t)h(y,t)=G(y,-t) and p(y,t;y0,t0)p(y,t;y_0,t_0) is constant

h,p=dyh(y,t)p(y,t;y0,t0)=dydỹp(ỹ,t;y,t0)p(y,t;y0,t0){\langle}h,p{\rangle}=\int{\mathrm{d}}y\;h(y,t)p(y,t;y_0,t_0)=\iint{\mathrm{d}}y{\mathrm{d}}\tilde y\,p(\tilde y,-t;y,t_0)p(y,t;y_0,t_0)

$=\iint{\mathrm{d}}y{\mathrm{d}}\tilde y\,e^{-t{{\mathcal F}}(\tilde y)}\delta(\tilde y-y)e^{t{{\mathcal F}}(y)}\delta(y-y_0) =\int{\mathrm{d}}y\,\delta(y-y_0)=1$.

Note that the operator $e^{t{{\mathcal F}}}$ commutes with the identity operator δ(yỹ)\delta(y-\tilde y). Taking the time derivative of this constant and using $\dot p={{\mathcal F}}p$ one obtains

$\partial_t{\langle}h,p{\rangle}= {\langle}\dot h,p{\rangle}+{\langle}h,\dot p{\rangle}={\langle}\dot h,p{\rangle}+{\langle}{{\mathcal F}}^\dagger h,p{\rangle}=0$.

Because pp may change according to its initial conditions, the last expression implies that $\dot h=-{{\mathcal F}}^\dagger h$, or that G(y,t)G(y,t) is a solution to the equation (Gardiner 2004)

$\dot G(y,t)={{\mathcal F}}^\dagger G(y,t)$, s.t. G(y,T0)=1I[y1,y2](y)G(y,T_0)=1\!\mathrm{I}_{[y_1,y_2]}(y).

The adjoint operator ${{\mathcal F}}^\dagger$ is also called the infinitesimal generator of the stochastic process. In addition to the boundary condition above, trivially stating that if we start in the boundary the initial probability of inside is one, one may include reflecting boundary conditions at the lower end yG(y,t)|y=y1=0\partial_yG(y,t)|_{y=y_1}=0 and absorbing boundary conditions at the upper end G(y2,t)=0G(y_2,t)=0.

Because partial derivatives and integrals are both linear operators, the equation for qq directly reads the same

  1. $\dot q(y,t)={{\mathcal F}}^\dagger q(y,t)$,

just the boundary condition should read q(y2,t)=1q(y_2,t)=1.

Since one of the main objectives in this document is to establish links between microscopic noise sources such as channel noise and the macroscopic spike jitter, one may immediately pose the question: How much information about the underlying diffusion process can we extract from first passage time densities like the ISI distribution? Might there be a unique diffusion process generating it? A sobering answer to the second question was given in (???): No, the solution is not unique; there are several possible diffusion processes that may lead to one and the same passage time density.

Yet, not all is lost. If one takes into account constraints from membrane biophysics, then the derived diffusion process is not completely arbitrary. In fact, if the model is derived from first principles, then the free parameters in the model can be related to the ISI statistics.

10.3.3 Moments of the ISI distribution

Instead of attempting to obtain the complete ISI distribution by solving the adjoint Fokker-Planck equation, Eq. (76), one may content oneself with the first two moments or the coefficient of variation, which one uses to quantify spike jitter in experiments. Let us set t0=0t_0=0 and denote the nthn^\text{th} moment of the ISI distribution as (Gardiner 2004)

T_n(y)=\int_0^\infty{\mathrm{d}}\tau\;\tau^n q(\tau,y)=n\int_0^\infty{\mathrm{d}}\tau\;\tau^{n-1}G(y,\tau),

where the fact was used that for the finite interval [y1,y2][y_1,y_2] exit is guaranteed, i.e., G(y0,)=0G(y_0,\infty)=0. One may multiply both sides of Eq. (76) with tnt^n and integrate to obtain a recursive ordinary differential equation for the moments

  1. $nT_{n-1}+{{\mathcal F}}^\dagger T_n=0$, s.t. Tn(y1)=Tn(y2)=0T_n'(y_1)=T_n(y_2)= 0 and T0=1T_0=1.

Here we have imposed reflecting boundary conditions on the lower limit y1y_1 and absorbing boundary conditions on the upper limit y2y_2. These conditions are in agreement with an IF model, which, once reaching the spike threshold, is reset and unable to move backwards through the spike. As discussed in the beginning, they can also be applied as an approximation to the phase oscillator if the noise is weak. This equation is also a direct consequence of the Feynman-Kac formula.

In a later chapter this recursion will be used to calculate ISI moments of conductance-based neurons using a phase reduction. Suppose we have an FP operator ${{\mathcal F}}(\phi)$ for the equivalent phase variable that is accurate to order εk\varepsilon^k in the noise. Then all moments TkT_k up to the kthk^{\textrm{th}} order can be obtained accurately. For example, if one is interested in the ISI variance, the method will require finding a suitable SDE for the phase variable ϕ(t)\phi(t) that gives the FP operator to second order.
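A minimal sketch of how the n=1 member of this recursion could be solved numerically (assuming numpy): the generator F†=(σ²/2)d²/dy²+f₀ d/dy with constant coefficients is discretised by finite differences on [0,1] with the reflecting and absorbing boundary conditions stated above, and F†T₁=−1 is solved as a linear system. The constant-coefficient drift-diffusion model and all parameter values are assumptions for illustration only.

```python
import numpy as np

f0, sigma = 10.0, 0.5
n = 401
y = np.linspace(0.0, 1.0, n)
h = y[1] - y[0]

# discretise F^dagger = (sigma^2/2) d^2/dy^2 + f0 d/dy on the interior nodes
A = np.zeros((n, n))
for i in range(1, n - 1):
    A[i, i - 1] = sigma**2 / (2 * h**2) - f0 / (2 * h)
    A[i, i]     = -sigma**2 / h**2
    A[i, i + 1] = sigma**2 / (2 * h**2) + f0 / (2 * h)
A[0, 0], A[0, 1] = -1.0 / h, 1.0 / h     # reflecting boundary: T1'(y1) = 0
A[-1, -1] = 1.0                          # absorbing boundary:  T1(y2) = 0

b = -np.ones(n)                          # F^dagger T1 = -1 in the interior
b[0] = b[-1] = 0.0                       # boundary rows carry the BC right-hand sides

T1 = np.linalg.solve(A, b)               # mean first-passage time from each start point y
print(T1[0], 1.0 / f0)                   # close to the drift-dominated estimate 1/f0
```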

10.4 Moments of the inter-spike interval distribution

The nthn^\mathrm{th} moment, Tn(ϕ)|ϕ=1T_n(\phi)|_{\phi=1}, of the first passage time density is the solution to (Gardiner 2004)

nTn1+F(ϕ)Tn=0nT_{n-1}+F(\phi)T_{n}=0, s.t. BC: Tn(1)=0T_n(1)=0 and T0=1T_0=1,

where the Fokker-Planck backward operator for the Stratonovich phase SDE above is

F(ϕ)=σ(ϕ)ddϕσ(ϕ)ddϕ+f0ddϕ.F(\phi)=\vec\sigma(\phi)\cdot\frac{\mathrm{d}}{{\mathrm{d}}\phi}\vec\sigma(\phi)\frac{\mathrm{d}}{{\mathrm{d}}\phi}+{f_0}\frac{{\mathrm{d}}}{{\mathrm{d}}\phi}.

Assuming that ϕ:ϵ(ϕ)=f01σ(ϕ)1\forall\phi:\epsilon(\phi)={f_0}^{-1}\vec\sigma(\phi)\ll1, solutions T_n(\phi)=T_{n0}+T_{n1}+T_{n2}+\dots can be sought in a perturbative manner.

10.5 Renewal equation

In a renewal process all inter-spike intervals are independent, as though each is separately drawn from the ISI distribution. But slow kinetic processes in the neuronal dynamics or long-term correlations in the external stimulus could give the spike train negative or positive correlations. A point process with such properties would be called non-renewal.

The ISI distribution alone does not tell us about the correlation between consecutive interspike intervals. Are they independent, negatively or positively correlated? Several types of adaptation currents have time scales spanning orders of magnitude above the spiking period and indeed their contribution to ISI correlations has been analysed. But, for the sake of simplicity, we will ignore the effects on longer time scales and consider a spike train as arising from a renewal process.

In the following we treat the neuron as a threshold device such as an integrate-and-fire neuron or a phase model neuron. We compile here a few known results on renewal processes that we will need in later chapters.

The transition probability p(\theta,t;y_0,t_0) describes the probability that a spike occurring at time t=t_0, when the neuron was in state y_0, is followed by a spike at time t, when the neuron crosses the threshold \theta. Here q_\theta(\tau) is shorthand for the interspike interval distribution from y_0 at t_0=0 to threshold \theta, corresponding to the transition probability p(\theta,\tau;y_0,0). For a stationary renewal process the transition probability after a spike can be decomposed into the probability that the next spike is the first one, plus the probability that an earlier first spike at time \tau is followed by a further threshold-to-threshold transition in the remaining time,

p(\theta,t)=q_\theta(t)+\int_0^t{\mathrm{d}}\tau\,p(\theta,t-\tau)\,q_\theta(\tau),

where the second term is a convolution due to stationarity.

The spike autocorrelation C(\tau) is the probability that, given a spike at t, there is another spike at t+\tau. This is equivalent to the transition probability C(\tau)=p(\theta,\tau;\theta,0) of being back at the spike threshold after a time \tau has elapsed. By recursively splitting the transition probability above into all consecutive possible spiking events one ends up with an infinite sum of convolutions,

C(\tau)=q_\theta(\tau)+(q_\theta*q_\theta)(\tau)+(q_\theta*q_\theta*q_\theta)(\tau)+\dots

The typical approach to isolate the ISI density from this relation is by means of the Laplace transform \tilde f(s)=\int_0^\infty{\mathrm{d}}t\,e^{-st} f(t), with s\in{I\!\!C}, which turns the convolutions into products and the sum into a geometric series,

\tilde C(s)=\frac{\tilde q_\theta(s)}{1-\tilde q_\theta(s)}.

In some cases the result may be transformed back into the time domain, if the Mellin-Bromwich integral exists, that is. The constant c in that inversion formula is to be chosen to the right of the real parts of all the integrand's singularities. In cases where this integral cannot be evaluated explicitly one is stuck with an expression in the Laplace domain, which is not all that bad, as at least individual moments of the time-domain distribution as well as the spike power spectrum may be evaluated. Moments are given by derivatives at the origin, {\langle}\tau^n{\rangle}=(-1)^n\frac{{\mathrm{d}}^n}{{\mathrm{d}}s^n}\tilde q_\theta(s)\big|_{s=0}.

The power spectrum of a stationary renewal spike train is the Fourier transform of the spike-train autocorrelation. First, we can identify C(\tau) again in the infinite series and write the linear Volterra integral equation

C(\tau)=q_\theta(\tau)+\int_0^\tau{\mathrm{d}}s\,q_\theta(s)\,C(\tau-s).

Then the Laplace transform can be applied to this equation to solve for the spike-train spectrum in terms of \tilde q_\theta evaluated on the imaginary axis.

10.5.0.1 The first moment (n=1n=1)

The equation for the first moment is

  1. 1+σ(ϕ)ddϕσ(ϕ)dT1dϕ+f0dT1dϕ=01+\vec\sigma(\phi)\cdot\frac{\mathrm{d}}{{\mathrm{d}}\phi}\vec\sigma(\phi)\frac{{\mathrm{d}}T_1}{{\mathrm{d}}\phi}+{f_0}\frac{{\mathrm{d}}T_1}{{\mathrm{d}}\phi}=0
$\mathcal O(\epsilon^0)$
The zeroth order equation reads 1+{f_0}\frac{{\mathrm{d}}T_{10}}{{\mathrm{d}}\phi}=0 and the absorbing BC yields

T10=f01(1ϕ)T_{10}={f_0}^{-1}(1-\phi).

$\mathcal O(\epsilon^2)$
Collecting all second order terms from Eq. (78) and inserting T10T_{10}, results in σ(ϕ)ddϕσ(ϕ)dT10dϕ+f0dT12dϕ\vec\sigma(\phi)\cdot\frac{\mathrm{d}}{{\mathrm{d}}\phi}\vec\sigma(\phi)\frac{{\mathrm{d}}T_{10}}{{\mathrm{d}}\phi}+{f_0}\frac{{\mathrm{d}}T_{12}}{{\mathrm{d}}\phi} =f0dT12dϕf01σ(ϕ)σ(ϕ)=0={f_0}\frac{{\mathrm{d}}T_{12}}{{\mathrm{d}}\phi}-{f_0}^{-1}\vec\sigma(\phi)\cdot\vec\sigma'(\phi)=0. With the BC this is solved by

T_{12}(\phi)=-{f_0}^{-2}\int_\phi^1{\mathrm{d}}\theta\;\vec\sigma(\theta)\cdot\vec\sigma'(\theta)=\frac1{2{f_0}^2}[\sigma^2(\phi)-\sigma^2(1)].

10.5.0.2 The second moment (n=2n=2)

The second moment has to solve

  1. 2T1+σ(ϕ)ddϕσ(ϕ)dT2dϕ+f0dT2dϕ=02T_1+\vec\sigma(\phi)\frac{{\mathrm{d}}}{{\mathrm{d}}\phi}\vec\sigma(\phi)\frac{{\mathrm{d}}T_2}{{\mathrm{d}}\phi} +{f_0}\frac{{\mathrm{d}}T_2}{{\mathrm{d}}\phi}=0
$\mathcal O(\epsilon^0)$
To zeroth order the equation is 2T10+f0dT20dϕ=02T_{10}+{f_0}\frac{{\mathrm{d}}T_{20}}{{\mathrm{d}}\phi}=0, with solution

T20=f02(1ϕ)2T_{20}={f_0}^{-2}(1-\phi)^2.

$\mathcal O(\epsilon^2)$
Second order equation is 2T12+σ(ϕ)ddϕσ(ϕ)dT20dϕ+f0dT22dϕ=02T_{12}+\vec\sigma(\phi)\cdot\frac{\mathrm{d}}{{\mathrm{d}}\phi}\vec\sigma(\phi)\frac{{\mathrm{d}}T_{20}}{{\mathrm{d}}\phi}+{f_0}\frac{{\mathrm{d}}T_{22}}{{\mathrm{d}}\phi}=0, or 3σ2(ϕ)σ2(1)2σ(ϕ)σ(ϕ)(1ϕ)+f03dT22dϕ=03\sigma^2(\phi)-\sigma^2(1)-2\vec\sigma(\phi)\cdot\vec\sigma'(\phi)(1-\phi)+{f_0}^3\frac{{\mathrm{d}}T_{22}}{{\mathrm{d}}\phi}=0. This is solved by

T_{22}(\phi)={f_0}^{-3}\left[2\int_{\phi}^1{\mathrm{d}}\theta\,\sigma^2(\theta)+(\sigma^2(\phi)-\sigma^2(1))(1-\phi)\right].

The ISI variance is given by T22(0)+T20(0)(T10(0)+T12(0))2T_{22}(0)+T_{20}(0)-(T_{10}(0)+T_{12}(0))^2, evaluated to second order in \epsilon.

10.6 Spike auto-spectrum

To calculate the spectral coherence, Eq. (41), the spike auto-spectrum Pyy(ω)P_{yy}(\omega) is still needed for normalisation. In general this is complicated, but it helps if the linearity assumption can be extended to the spike train itself. Assume

y(t)=y0(t)+drg(r)x(tr)y(t)=y_0(t) + \int{\mathrm{d}}r\,g(r)x(t-r).

Then trial averaging, y|x{\langle}\cdot{\rangle}_{y|x} yields a result consistent with our previous linear response setting

r(t)=r0+drg(r)x(tr)r(t)=r_0 + \int{\mathrm{d}}r\,g(r)x(t-r).

Take the Fourier transform Y(\omega)=Y_0(\omega)+G(\omega)X(\omega) and assume that intrinsic noise and stimulus are uncorrelated

Pyy(ω)=(Y0(ω)+G(ω)X(ω))(Y0*(ω)+G*(ω)X*(ω))P_{yy}(\omega)={\langle}(Y_0(\omega)+G(\omega)X(\omega))(Y_0^*(\omega)+G^*(\omega)X^*(\omega)){\rangle}

=Py0y0(ω)+|G(ω)|2Pxx(ω)=P_{y_0y_0}(\omega)+|G(\omega)|^2P_{xx}(\omega)

10.7 Mutual entrainment

Assume two neurons ii and jj, whose spike trains are y(ϕi(t))=kδ(ϕi(t)k)y(\phi_i(t))=\sum_k\delta(\phi_i(t)-k). The spike dynamics is represented by their I/O-equivalent phase oscillators

  1. \dot\phi_i=f_i+Z(\phi_i)y(\phi_j) and \dot\phi_j=f_j+Z(\phi_j)y(\phi_i)

10.7.1 Spike metric

10.7.2 Time to Spike

11 Appendix

11.1 Novikov-Furutsu-Donsker formula

A relation between Gaussian noise sources and functions of the state variables in stochastic systems is given by the Novikov-Furutsu-Donsker (NFD) formula. It examines the correlation of a stochastic process ξ(t)\xi(t) at a fixed instant in time, tt, and a function ff of x(t)x(t), which is another stochastic process that depends on ξ(t)\xi(t) (???,(???),(???)). One of the advantages of the NFD formula is that it is applicable to systems with multiplicative noise, as they arise in several applications involving phase response curves. The result is the following

  1. ${\langle}f[\xi]\xi(t){\rangle}={{\textstyle\frac12}}\int_{-\infty}^\infty {\langle}\xi_t\xi_{t_1}{\rangle}\left{\langle}\left.\frac{{{\updelta}}f[\eta+\xi]}{{{\updelta}}\eta_{t_1}}\right|_{\eta=0}\right{\rangle}{\mathrm{d}}t_1$.

The script uses the formula several times so a formal and very compact derivation follows (???). In many physical examples only the values in the past t1tt_1\leqslant t, influence the functional ff and the integration range can be adjusted accordingly.

Note that this is related to fluctuation-dissipation relations in statistical physics. The state variable xx is a random process that in turn depends on past values of the random process ξ(t)\xi(t). One may, therefore, treat the function ff as a functional of the path ξt1:t1t\xi_{t_1}:\forall t_1\leqslant t. Assuming that the process ξ(t)\xi(t) has a zero mean function, the first step is to write this functional as a Taylor series15 around the deterministic function η(t)=0\eta(t)=0 (omitting the integration domain, which is understood to run from -\infty to tt)

f[η+ξ]=f[η]|η=0+k=11k!dt1dtkξ(t1)ξ(tk)(δkf[η]δη(t1)δη(tk))|η=0=(expdtξ(t)δδη(t))f[η]|η=0f[\eta+\xi]=f[\eta]|_{\eta=0} + \sum_{k=1}^\infty \frac1{k!} \int\!\cdots\!\int{\mathrm{d}}t_1\cdots{\mathrm{d}}t_k\;\xi(t_1)\cdots\xi(t_k) \left(\frac{{{\updelta}}^kf[\eta]}{{{\updelta}}\eta(t_1)\cdots{{\updelta}}\eta(t_k)}\right)\Big|_{\eta=0}=\left(\exp{\int {\mathrm{d}}t'\,\xi(t')\frac{{{\updelta}}}{{{\updelta}}\eta(t')}}\right)f[\eta]\big|_{\eta=0}.

The last expression is just a formal, compressed way of writing it using the definition of the exponential displacement operator. As f[η]f[\eta] is deterministic it can be pulled out of any averaging over the noise ensemble, e.g.

${\langle}f[\eta+\xi]{\rangle}=\left{\langle}\left(\exp{\int {\mathrm{d}}t'\,\xi(t')\frac{{{\updelta}}}{{{\updelta}}\eta(t')}}\right)\right{\rangle}f[\eta]\big|_{\eta=0}$.

Hence, we can formally write

  1. ${\langle}\xi(t)f[\xi]{\rangle}=\left{\langle}\xi(t)\exp{\int {\mathrm{d}}t'\;\xi(t')\frac{{{\updelta}}}{{{\updelta}}\eta(t')}} \right{\rangle}f[\eta]\big|_{\eta=0}$

$=\frac{\left{\langle}\xi(t)\exp\int {\mathrm{d}}t'\;\xi(t')\frac{{{\updelta}}}{{{\updelta}}\eta(t')}\right{\rangle}} {\left{\langle}\exp\int{\mathrm{d}}t'\;\xi(t')\frac{{{\updelta}}}{{{\updelta}}\eta(t')}\right{\rangle}} {\langle}f[\eta+\xi]{\rangle}\big|_{\eta=0}$.

Next, the infinite dimensional Fourier transform of a stochastic process called the characteristic functional is introduced

$\Phi[\lambda]=\left{\langle}\exp\left({\mathrm{i}}\int{\mathrm{d}}t'\;\lambda(t')\xi(t')\right)\right{\rangle}$.

For the, by assumption, Gaussian process ξt\xi_t it is known to be the exponential of a quadratic form

Φ[λ]=exp(12dt1dt2λ(t1)C(t1,t2)λ(t2))\Phi[\lambda]=\exp\left(-\frac12\int{\mathrm{d}}t_1{\mathrm{d}}t_2\; \lambda(t_1)C(t_1,t_2)\lambda(t_2)\right),

which must be real, Φ[λ]IR\Phi[\lambda]\in{I\!\!R}, because the density is symmetric around η(t)=0\eta(t)=0.

With the help of the following identity

$\frac{{\langle}\xi(t)\exp{\mathrm{i}}\int{\mathrm{d}}t'\xi(t')\lambda(t'){\rangle}} {{\langle}\exp{\mathrm{i}}\int{\mathrm{d}}t'\xi(t')\lambda(t'){\rangle}} =\frac{{{\updelta}}}{{\mathrm{i}}{{\updelta}}\lambda} \ln\left{\langle}\exp\;{\mathrm{i}}\int{\mathrm{d}}t'\xi(t')\lambda(t')\right{\rangle}=\frac{{{\updelta}}}{{\mathrm{i}}{{\updelta}}\lambda}\ln\Phi[\lambda]$,

and a formal substitution δ/δη(t)iλ(t){{\updelta}}/{{\updelta}}\eta(t)\to{\mathrm{i}}\lambda(t) we may simplify the expression above. Back substituting iλ(t)δ/δηt{\mathrm{i}}\lambda(t)\to{{\updelta}}/{{\updelta}}\eta_{t} we obtain Eq. (81).

Bibliography

Atick, Joseph J. 1992. “Could Information Theory Provide an Ecological Theory of Sensory Processing?” Network: Computation in Neural Systems 3 (2): 213–51. http://www.tandfonline.com/doi/abs/10.1088/0954-898X_3_2_009.

Barlow, HB. 1961. “Possible Principles Underlying the Transformations of Sensory Messages.” In Sensory Communication, edited by WA Rosenblith, 217–34. MIT Press.

Brown, Eric, Jeff Moehlis, and Philip Holmes. 2004. “On the Phase Reduction and Response Dynamics of Neural Oscillator Populations.” Neural Computation 16 (4): 673–715. doi:10.1162/089976604322860668.

Chacron, Maurice J., Benjamin Lindner, and André Longtin. 2004. “Noise Shaping by Interval Correlations Increases Information Transfer.” Physical Review Letters 92 (8). doi:10.1103/PhysRevLett.92.080601.

Chicone, Carmen. 2006. Ordinary Differential Equations with Applications. Springer Science & Business Media.

Cover, Thomas M., and Joy A. Thomas. 2012. Elements of Information Theory. John Wiley & Sons.

Ermentrout, Bard. 1996. “Type I Membranes, Phase Resetting Curves, and Synchrony.” Neural Computation 8 (5): 979–1001. http://www.mitpressjournals.org/doi/abs/10.1162/neco.1996.8.5.979.

Ermentrout, G. Bard, Leon Glass, and Bart E. Oldeman. 2012. “The Shape of Phase-Resetting Curves in Oscillators with a Saddle Node on an Invariant Circle Bifurcation.” Neural Computation 24 (12): 3111–25. doi:10.1162/NECO_a_00370.

Ermentrout, G., and N. Kopell. 1986. “Parabolic Bursting in an Excitable System Coupled with a Slow Oscillation.” SIAM Journal on Applied Mathematics 46 (2): 233–53. doi:10.1137/0146017.

Fox, Ronald F., and Yan-nan Lu. 1994. “Emergent Collective Behavior in Large Numbers of Globally Coupled Independently Stochastic Ion Channels.” Physical Review E 49 (4): 3421–31. doi:10.1103/PhysRevE.49.3421.

Gabbiani, Fabrizio, and Christof Koch. 1998. “Principles of Spike Train Analysis.” Methods in Neuronal Modeling 12 (4): 313–60. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.383.7571&rep=rep1&type=pdf.

Gardiner, C. W. 2004. Handbook of Stochastic Methods for Physics, Chemistry, and the Natural Sciences. 3rd ed. Springer Series in Synergetics. Berlin ; New York: Springer-Verlag.

Goldobin, D. S., and A. S. Pikovsky. 2005. “Synchronization of Self-Sustained Oscillators by Common White Noise.” Physica A: Statistical Mechanics and Its Applications, New Horizons in Stochastic ComplexityInternational Workshop on New Horizons in Stochastic Complexity, 351 (1): 126–32. doi:10.1016/j.physa.2004.12.014.

Goldobin, Denis S., and Arkady Pikovsky. 2005. “Synchronization and Desynchronization of Self-Sustained Oscillators by Common Noise.” Physical Review E 71 (4). doi:10.1103/PhysRevE.71.045201.

Goldwyn, Joshua H., and Eric Shea-Brown. 2011. “The What and Where of Adding Channel Noise to the Hodgkin-Huxley Equations.” PLoS Comput Biol 7 (11): e1002247. doi:10.1371/journal.pcbi.1002247.

Goldwyn, Joshua H., Nikita S. Imennov, Michael Famulare, and Eric Shea-Brown. 2011. “Stochastic Differential Equation Models for Ion Channel Noise in Hodgkin-Huxley Neurons.” Physical Review E 83 (4): 041908. doi:10.1103/PhysRevE.83.041908.

Golshani, Leila, and Einollah Pasha. 2010. “Rényi Entropy Rate for Gaussian Processes.” Information Sciences 180 (8): 1486–91. doi:10.1016/j.ins.2009.12.012.

Hodgkin, Alan L., and Andrew F. Huxley. 1952. “A Quantitative Description of Membrane Current and Its Application to Conduction and Excitation in Nerve.” The Journal of Physiology 117 (4): 500. http://www.ncbi.nlm.nih.gov/pmc/articles/pmc1392413/.

Izhikevich, Eugene M. 2007. Dynamical Systems in Neuroscience. MIT Press.

Kielhöfer, Hansjörg. 2011. Bifurcation Theory: An Introduction with Applications to Partial Differential Equations. Springer Science & Business Media.

Kuramoto, Y. 1984. Chemical Oscillations, Waves, and Turbulence. Springer Science & Business Media.

Laughlin, S. 1981. “A Simple Coding Procedure Enhances a Neuron’s Information Capacity.” Zeitschrift Fur Naturforschung. Section C, Biosciences 36 (9-10): 910–12.

Linaro, Daniele, Marco Storace, and Michele Giugliano. 2011. “Accurate and Fast Simulation of Channel Noise in Conductance-Based Model Neurons by Diffusion Approximation.” PLoS Comput Biol 7 (3): e1001102. doi:10.1371/journal.pcbi.1001102.

Lindner, Benjamin, Maurice J. Chacron, and André Longtin. 2005. “Integrate-and-Fire Neurons with Threshold Noise: A Tractable Model of How Interspike Interval Correlations Affect Neuronal Signal Transmission.” Physical Review E 72 (2). doi:10.1103/PhysRevE.72.021911.

Orio, Patricio, and Daniel Soudry. 2012. “Simple, Fast and Accurate Implementation of the Diffusion Approximation Algorithm for Stochastic Ion Channels with Multiple States.” PLoS ONE 7 (5): e36670. doi:10.1371/journal.pone.0036670.

Pezo, Danilo, Daniel Soudry, and Patricio Orio. 2014. “Diffusion Approximation-Based Simulation of Stochastic Ion Channels: Which Method to Use?” Frontiers in Computational Neuroscience 8: 139. doi:10.3389/fncom.2014.00139.

Rossum, M. C. W. van. 2001. “A Novel Spike Distance.” Neural Computation 13 (4): 751–63. doi:10.1162/089976601300014321.

Teramae, Jun-nosuke, and Dan Tanaka. 2004. “Robustness of the Noise-Induced Phase Synchronization in a General Class of Limit Cycle Oscillators.” Physical Review Letters 93 (20): 204103. doi:10.1103/PhysRevLett.93.204103.

Winfree, Arthur T. 2001. The Geometry of Biological Time. Springer Science & Business Media.


  1. Fortunately the jellyfish Aglantha digitale does not care much for dogmas and encodes its swimming patterns in different action potential shapes.

  2. Actually it does, if you bring energetic considerations to bear.

  3. There is also plenty of coding in graded potentials going on, for example in some of Drosophila melanogaster's neurons or your retina. C. elegans seems to manage completely without spikes.

  4. For now this means that it is an invertible matrix; t:R̲1(t)\forall t:\exists{\underline R}^{-1}(t)

  5. Four nucleobases A: Adenine, G: Guanine, C: Cytosine, T: Thymine.

  6. Here, the alphabet is a firing rate f0IR{f_0}\in{I\!\!R}. It might be more reasonable to think about spike counts in a given time window, which is still a countable set.

  7. Remember the determinant is |K̲|=i=1nλi|{\underline K}|=\prod_{i=1}^n\lambda_i. So ln|K̲|=i=1nlnλi\ln|{\underline K}|=\sum_{i=1}^n\ln\lambda_i. In terms of the matrix-logarithm and the trace the determinant can be expressed as $|{\underline K}|=\exp\tr\ln{\underline K}$.

  8. Because the trace is invariant under similarity transforms $\tr{\underline K}=\sum_{i=1}^n\lambda_i$.

  9. Similar to the particle and wave duality in quantum mechanics.

  10. This can be done formally using Chapman-Enskog or adiabatic elimination procedures. The derivation may be included in the future.

  11. Not perfectly though, because there is the intrinsic noise, ξi(t)\xi_i(t).

  12. This may now refer to one and the same neuron presented with the same frozen stimulus time and again or a population of non-interacting very similar neurons, which get the same input.

  13. independent of tt

  14. The inner product is the inner product on a function space f,g=dxf(x)g(x){\langle}f,g{\rangle}=\int{\mathrm{d}}x\,f(x)g(x).

  15. We are expanding not a function nor a vector-valued function but a functional.