Python code for (numerically) exact solutions to the inverse Ising problem. This code writes out the equations for all desired constraints on the correlations, and the output files can be fed into any gradient descent solver. The code outputs both Python- and MATLAB-compatible files. For example, for a system of size 3, we can write out the relevant equations:

n = 3
terms = [zeros((n))==1,triu(ones((n,n))==1,1),get_3idx(n)]
write_eqns(n,0,terms,writeto="matlab")
write_eqns(n,0,terms,writeto="py")

This will produce two MATLAB files "tosolve3.m" and "probs3.m" and one Python file "tosolve3.py". The file "tosolve3.m" can be fed a vector of parameters, where the first \(n(n-1)/2\) correspond to the couplings \(J\) and the last to the single triplet interaction term.
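The output files can then be handed to any root finder or gradient descent routine. A minimal sketch of the plumbing with SciPy's `fsolve`, using a toy stand-in for the generated `tosolve3` function (the real generated file writes out the actual correlation equations; the stand-in below is purely hypothetical):

```python
import numpy as np
from scipy.optimize import fsolve

# Hypothetical stand-in for the generated tosolve3: it should return the
# residuals between the model correlations and the desired constraints.
# For n = 3 with no fields, that's 3 couplings plus one triplet term.
def tosolve3(params):
    J12, J13, J23, K = params
    # A toy linear system just to show the plumbing; the generated file
    # contains the real (nonlinear) correlation equations.
    return [J12 - 0.1, J13 - 0.2, J23 + 0.05, K - 0.01]

soln = fsolve(tosolve3, np.zeros(4))
```

The same vector-in, residuals-out shape works for the MATLAB files with `fsolve` there.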

Python code for simulating and solving for the parameters of a Hawkes process using maximum likelihood. I was a bit peeved that there wasn't much "off-the-shelf" for Python, so I was wary of picking up the few packages that are available. Even more confusingly, the Hawkes demo I found implemented in MATLAB did not agree with another implementation I found by quantLullaby. That being said, this might be a useful reference list.

I haven't vetted all of these. In the limited comparison I did between quantLullaby's implementation and mine, they agree. I've implemented two different kernels so far: the exponential and power law kernels.

Basically, the Hawkes process is a self-exciting point process with an intensity defined as

$$\lambda(t) = \lim_{\delta\rightarrow 0} \frac{1}{\delta} P(N(t+\delta)-N(t)>0) $$
which is to say that the integral of \(\lambda(t)\) is the expected number of events observed. For our case, the intensity takes the form
$$\lambda(t) = \mu + \alpha\sum_{i=1}^n g(t-t_i)$$
where two example kernels are
\[\begin{align}
g(t-t_i) &= \beta e^{-\beta(t-t_i)} \\
g(t-t_i) &= \frac{t_\mathrm{min}}{\beta-1} (t-t_i)^{1-\beta},\qquad \beta>1
\end{align}\]
One typically pulls the \(\beta\) normalization constant in \(g\) into the prefactor \(\alpha\), but I prefer to keep them separate. This way, \(\alpha\) is the expected number of children an event produces over its lifetime.
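Since \(\alpha\) is meant to be the branching ratio, the kernel itself should integrate to one over its support. A quick numerical check for the exponential kernel (the power-law case depends on the normalization convention chosen):

```python
import numpy as np
from scipy.integrate import quad

beta = 5.0
g_exp = lambda t: beta * np.exp(-beta * t)  # exponential kernel
# should integrate to 1, so alpha alone sets the branching ratio
total, err = quad(g_exp, 0, np.inf)
```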

One way of thinking about simulating the Hawkes process is that one event happens out of the blue (exogenously, at rate \(\mu\)), and that begins a cascade of "children" as given by the kernel (hence "self-exciting"). Each of the children spawned by an initial seed produces its own set of children independently of the others. Thus, each event spawns a cascade independently of the others.

To be more concrete, we can see that for the exogenous, baseline process given by \(\mu\),
\[\begin{align}
\lambda_0(t) &= \mu
\end{align}\]
and we know that a process with constant intensity is a simple Poisson process, so the cumulative distribution function (which we need for sampling) is
\[\begin{align}
P_0(t\leq t') &= 1 - e^{-\mu\,t'}
\end{align}\]
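Inverting this CDF gives the familiar inverse-transform sampler for exponential waiting times, \(t' = -\log(1-u)/\mu\) with \(u\) uniform on \([0,1)\); a quick sanity check that the sample mean approaches \(1/\mu\):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = 0.5
u = rng.random(100_000)
# inverse transform of P_0(t <= t') = 1 - exp(-mu t')
waits = -np.log(1 - u) / mu
mean_wait = waits.mean()  # should be close to 1/mu = 2
```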
Similarly, we can deduce that the waiting-time distribution for the exponential-kernel contribution of event \(i\), observed from time \(t_f\), is
\[\begin{align}
P_i(t\leq t') &= 1 - \exp\left( -\alpha \int_{t_f}^{t_f+t'} \beta e^{-\beta(t-t_i)}\,dt \right) \\
&= 1 - \exp\left( -\alpha (e^{-\beta(t_f-t_i)} -e^{-\beta(t_f-t_i+t')}) \right)
\end{align}\]
which we can invert analytically to sample from \(P_i(t\leq t')\). Be careful to remember that this distribution is only defined for \(t_f \geq t_i\). I leave the rest and the power law form as an exercise for the reader :)
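For the exponential kernel the inversion works out to \(t' = -\beta^{-1}\log\left(1 + \log(1-u)/(\alpha A)\right)\) with \(A = e^{-\beta(t_f-t_i)}\) and \(u\) uniform, and the draw is infinite (no child) whenever \(u \geq 1 - e^{-\alpha A}\). A sketch with a hypothetical helper, under the conventions above:

```python
import numpy as np

def sample_child_wait(tf, ti, alpha, beta, rng):
    """Inverse-transform sample of the waiting time t' for event i's
    exponential-kernel contribution, observed from time tf >= ti.
    Returns np.inf when the draw lands in the defective tail,
    i.e. event i spawns no further child."""
    A = np.exp(-beta * (tf - ti))
    u = rng.random()
    arg = 1 + np.log(1 - u) / (alpha * A)
    if arg <= 0:  # u >= 1 - exp(-alpha*A): no child
        return np.inf
    return -np.log(arg) / beta

rng = np.random.default_rng(1)
draws = [sample_child_wait(0.0, 0.0, alpha=0.8, beta=5.0, rng=rng)
         for _ in range(100_000)]
# fraction of finite draws should approach 1 - exp(-0.8) ~ 0.551
frac_children = np.mean(np.isfinite(draws))
```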

A straightforward algorithm here is

- Sample from \(P_0(t\leq t')\) to get the first event at \(t_i\), where \(i=1\).
- Sample from \(P_0\) and all \(P_i\) starting at \(t_i\), and pick the smallest valid time to get the next event at \(t_{i+1}\). Increment \(i\).
- Repeat Step 2 until the stop criterion is met.

One thing to note is that in sampling \(P_i\), the \(\alpha\) parameter may make the distribution defective; that is, there is a nonzero probability that \(t' = \infty\). This just means that an event may not spawn any children (especially at small \(\alpha\)), and we have to throw those samples out.
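Putting the steps above together for the exponential kernel, a minimal sketch (a hypothetical `sim_hawkes_exp_sketch`, not the `sim_hawkes_thin` implementation in the code):

```python
import numpy as np

def sim_hawkes_exp_sketch(T, mu, alpha, beta, rng):
    """Simulate an exponential-kernel Hawkes process on [0, T] by repeatedly
    taking the minimum over the baseline candidate and one candidate per
    past event; defective draws come back as inf and lose to min()."""
    events = []
    t = 0.0
    while True:
        # baseline (exogenous) candidate: memoryless exponential waiting time
        candidates = [t - np.log(1 - rng.random()) / mu]
        # one candidate per past event, via the inverse transform of P_i
        for ti in events:
            A = np.exp(-beta * (t - ti))
            arg = 1 + np.log(1 - rng.random()) / (alpha * A)
            candidates.append(t - np.log(arg) / beta if arg > 0 else np.inf)
        t = min(candidates)
        if t > T:
            return np.array(events)
        events.append(t)

rng = np.random.default_rng(2)
ev = sim_hawkes_exp_sketch(50, mu=0.5, alpha=0.8, beta=5.0, rng=rng)
```

Resampling every candidate at each step is valid here because each component is a Poisson process (homogeneous for the baseline, inhomogeneous for each child process), so the next-event time from the current moment is exactly the \(P_0\) or \(P_i\) derived above.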

For the simulation, I sample from each of these processes independently to generate a time series. For example, if we wish to sample from both kernels for 100 units of time,

mu0, a0, b0 = [.5,.8,5.]
T = 100
expevents = sim_hawkes_thin(T, [mu0,a0,b0], kernel='exp')
powevents = sim_hawkes_thin(T, [mu0,a0,b0], kernel='pow')

It is always good to have some checks, and I'll describe a couple that I implemented and put into the code. One check is to make sure that we didn't make some mistake in the logic of sampling. One such scenario is to set a history and generate children. Given some time interval \(T\), the expected number of children is
\[\begin{align}
\int_0^T \lambda(t)\,dt &= \mu T + \alpha(1-e^{-\beta T})
\end{align}\]
I implement this as

T, mu, a, b = 10, 0.5, 1, 10
test2(T, [mu,a,b], kernel='exp')
Expected number of children:
1.000
Simulated number of children:
1.001 +/- 0.999
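The analytic side of this check can also be confirmed by quadrature, assuming a single seed event at \(t=0\) in the history:

```python
import numpy as np
from scipy.integrate import quad

T, mu, alpha, beta = 10.0, 0.5, 1.0, 10.0
# intensity with one event in the history at t = 0
lam = lambda t: mu + alpha * beta * np.exp(-beta * t)
numeric, _ = quad(lam, 0, T)
analytic = mu * T + alpha * (1 - np.exp(-beta * T))
```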

Well, that's good. A more direct test is to compute the average time until the next event. For this, we can use the nice relation that
\[\begin{align}
\langle t \rangle &= \int_0^\infty (1-P(t\leq t'))\,dt'
\end{align}\]
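For the baseline process this relation can be checked in closed form, since \(1 - P_0(t\leq t') = e^{-\mu t'}\) integrates to \(1/\mu\); numerically:

```python
import numpy as np
from scipy.integrate import quad

mu = 0.5
survival = lambda tp: np.exp(-mu * tp)    # 1 - P_0(t <= t')
mean_wait, _ = quad(survival, 0, np.inf)  # should equal 1/mu = 2
```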
We determine some arbitrary history of events, and see what comes out:

events = array([1,2,3])
test1(events, [mu,a,b], kernel='exp', niters=40)
The average of 40 runs of the simulation returns a waiting time of
0.063 +/- 0.003
The integral returns
0.063 +/- 0.000

It seems that the simulation part works fine for the exponential kernel. The same tests for the power law kernel are below:

T, mu, t0, b, a = 10, 0.1, .01, 1.01, .5
test2(T, [mu,t0,b,a], kernel='pow')
test1(events, [mu,t0,b,a], kernel='pow', niters=50)
Expected number of children:
0.033
Simulated number of children:
0.032 +/- 0.177
The average of 50 runs of the simulation returns a waiting time of
9.778 +/- 0.291
The integral returns
9.540 +/- 0.000

Given the parameters above, the simulations match the analytic checks quite well. As I turn up \(\alpha\), the expected number of children per event, we approach a critical point at \(\alpha=1\), where it becomes quite difficult to get reliable statistics.

It might be worthwhile to note here that I compared the distribution of inter-event times for quantLullaby's, MATLAB's, and my implementations, and I found that at large \(\alpha\) (the critical regime) the MATLAB simulation disagreed with the other two; it gave longer wait times than expected. This is probably cause for concern... I'm not sure what the underlying algorithm is, but this is the regime in which numerical precision and algorithmic approximations become visible.

After generating these samples, we can solve for the parameters using maximum likelihood. The log-likelihood is of the form
\[\begin{align}
L(\theta) = -\int_0^T \lambda(t|\theta)\,dt + \int_0^T \log\lambda(t|\theta)\,dN(t)
\end{align}\]
where \(\theta\) stands for the relevant parameters, e.g. \(\mu, \alpha, \beta\) for the exponential kernel. I've had problems with SciPy's default minimize algorithm, BFGS, but the simplex Nelder-Mead works well:

soln = fmin(lambda params: -loglikeli_exp(expevents, params, T),
            random.rand(3), disp=False)
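For the exponential kernel, the compensator integral has a closed form, \(\int_0^T \lambda\,dt = \mu T + \alpha\sum_i (1 - e^{-\beta(T-t_i)})\), so the log-likelihood can be written out directly. A hypothetical sketch (an \(O(n^2)\) version for clarity, not the repository's implementation):

```python
import numpy as np

def loglikeli_exp_sketch(events, mu, alpha, beta, T):
    """Log-likelihood of an exponential-kernel Hawkes process on [0, T]."""
    events = np.asarray(events, dtype=float)
    # compensator: integral of lambda(t) over [0, T] in closed form
    compensator = mu * T + alpha * np.sum(1 - np.exp(-beta * (T - events)))
    # sum of log-intensities at the event times (O(n^2) double loop)
    logsum = 0.0
    for i, ti in enumerate(events):
        excitation = alpha * beta * np.sum(np.exp(-beta * (ti - events[:i])))
        logsum += np.log(mu + excitation)
    return -compensator + logsum

# with alpha = 0 this reduces to a plain Poisson likelihood:
# -mu*T + log(mu) = -1 + log(0.5) here
ll = loglikeli_exp_sketch([1.0], mu=0.5, alpha=0.0, beta=1.0, T=2.0)
```

The \(\alpha = 0\) special case is a handy unit test, since the Poisson answer is known exactly.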

I ran this many different times for increasing values of \(T\) with the parameter values \(\mu,\alpha,\beta = .5,.8,5\). The results are below for the exponential kernel.

As expected, it converges toward the true values with increasing observation length \(T\). A more difficult check is to simulate on parameters closer to the critical regime. Below, \(\mu,\alpha,\beta = 1.3959,1.0075,1/0.5374\).
