At the intersection of physics and society...

There is an old saying that 'physics is what physicists do.' This doesn't sound very helpful, but it may be getting at an important point. Academic disciplines have a choice to define themselves either by their objects of study or by their style of inquiry. Physics (at its best, I would like to think) is firmly in the second camp. Physicists make it their business to ask certain kinds of questions about Nature, and to seek certain kinds of answers. 'Thinking like a physicist' means something, and we are proud to do it; it is this, above all else, that we try to convey to our students. We are the intellectual heirs of Galileo, taking seriously his evocative claim that the book of Nature is written in the language of mathematics.
- William Bialek, Searching for principles


Lee, Edward D., Daniels, Bryan C., Krakauer, David C. & Flack, Jessica C. "Collective memory in primate conflict implied by temporal scaling collapse." Journal of the Royal Society Interface (2017). Supplementary here.
Sethna, James P., Matthew K. Bierbaum, Karin A. Dahmen, Carl P. Goodrich, Julia R. Greer, Lorien X. Hayden, Jaron P. Kent-Dobias et al. "Deformation of crystals: Connections with statistical physics." arXiv preprint arXiv:1609.05838 (2016).
Lee, Edward D., Chase P. Broedersz, and William Bialek. "Statistical Mechanics of the US Supreme Court." Journal of Statistical Physics, April 10, 2015, 1-27. doi:10.1007/s10955- 015-1253-6. In Wired.

Work in progress

Statistical mechanics of US Congress.
Maximum entropy Python package.


I have worked with a number of wonderful people and friends across disciplines who have all been my mentors. Currently, my primary collaborators at Cornell are Itai Cohen and Giles Hooker and at the Santa Fe Institute Bryan Daniels, Jessica Flack, and David Krakauer.

Back to top

Python code for (numerically) exact solutions to the inverse Ising problem. This code will write out the equations for all desired constraints on correlations and outputted files can be put into any gradient descent solver. This code outputs both Python and MATLAB compatible files. For example, for a system of size 3, we can write the relevant equations
            n = 3
            terms = [zeros((n))==1,triu(ones((n,n))==1,1),get_3idx(n)]
This will produce two MATLAB files "tosolve3.m" and "probs3.m" and one Python file "". The file "tosolve3.m" can be fed a vector of parameters where the first \(n(n-1)/2\) corresponds to the couplings \(J\) and last corresponds to the single triplet interaction term.

Python code for simulating and solving parameters of a Hawkes' process using maximum likelihood. I was a bit peeved that there wasn't much "off-the-shelf" for Python, so I was a bit worried about picking up the few that are available. Even more confusingly, the Hawkes demo I found implemented in MATLAB did not agree with another implemention I found by quantLullaby. That being said, this list might be useful reference list.
I haven't vetted all of these. As far as the limited comparison between quantLullaby's and my implementation that I did, they agree. I've implemented two different kernels so far, the exponential and power law kernels.

Basically, the Hawkes' process is a self-exciting point process with an intensity defined as
$$\lim_{\delta\rightarrow 0} \lambda (t) = \frac{1}{\delta} P(N(t+\delta)-N(t)>0) $$ which is to say that the integral of \(\lambda(t)\) is the number of events observed. For our case, the intensity takes the form $$\lambda(t) = \mu + \alpha\sum_{i=1}^n g(t-t_i)$$ where the two example kernels \[\begin{align} g(t-t_i) &= \beta e^{-\beta(t-t_i)} \\ g(t-t_i) &= \frac{t_\mathrm{min}}{\beta-1} (t-t_i)^{1-\beta},\qquad \beta>1 \end{align}\] typically pulls the \(\beta\) normalization constant in \(g\) into the prefactor \(\alpha\), but I prefer to keep them separate. This way, \(\alpha\) refers to the expected number of children an event can expect to produce in its lifetime.

One way of thinking of simulating the Hawkes' process is that one event happens out of the blue as given by \(\mu\) (or exogenously), and that begins a cascade of "children" as given by the kernel (thus the "self-exciting" process). Each of the children spawned by an initial seed produces their own set of children independently of the others. Thus, each event spawns a cascade independently of the others.

To be more concrete, we can see that for the exogenous, baseline process given by \(\mu\), \[\begin{align} \lambda_0(t) &= \mu \end{align}\] and we know that a process with a fixed probability of occurrence is simple Poisson and so the cumulative distribution function (which we need for sampling) is \[\begin{align} P_0(t\leq t') = 1 - e^{-\mu\,t'} \end{align}\] Similarly, we can deduce that for the exponential kernel for event \(i\) beginning at time \(t_f\) is \[\begin{align} P_i(t\leq t') &= 1 - \exp\left( -\alpha \int_{t_f}^{t_f+t'} \beta e^{-\beta(t-t_i)}\,dt \right) \\ &= 1 - \exp\left( -\alpha (e^{-\beta(t_f-t_i)} -e^{-\beta(t_f-t_i+t')}) \right) \end{align}\] which we can solve analytically for \(P_i(t\leq t')\). Be careful to remember that this function does not exist for \(t\leq t_i\). I leave the rest and the power law form as an exercise for the reader :)

A straightforward algorithm here is
  1. Sample from \(P_0(t\leq t')\) to get the first event at \(t_i\) where \(i=1\).
  2. Sample from \(P_0\) and all \(P_i\) starting at \(t_i\) and pick the smallest valid time to get next event at \(t_{i+1}\). Increment \(i++\).
  3. Repeat Step 1 til stop criterion.
One thing to note is that in sampling \(P_i\), the \(\alpha\) parameter may make the distribution defective, that is, there is a nonzero probability that \(t' = \infty\). This just means that the event may not spawn any children at small \(\alpha\) values, and so we have to throw those out.

For the simulation, I sample from each of these processes independently to generate a time series. For example, if we wish to sample from both kernels for 100 units of time,
    mu0, a0, b0 = [.5,.8,5.]
    T = 100 
    expevents = sim_hawkes_thin(T, [mu0,a0,b0], kernel='exp')
    powevents = sim_hawkes_thin(T, [mu0,a0,b0], kernel='pow')
It is always good to have some checks, and I'll describe a couple that I implemented and put into the code. One check is to make sure that we didn't make some mistake in the logic of sampling. One such scenario is to set a history and generate children. Given some time interval \(T\), the expected number of children is \[\begin{align} \int_0^T \lambda(t)\,dt &= \mu T + \alpha(1-e^{-\beta T}) \end{align}\] I implement this as
    T, mu, a, b = 0.5, 10, 1, 10
    Expected number of children: 
    Simulated number of children:
    1.001 +/- 0.999
Well, that's good. A more direct test is to compute the average time til the next event. For this, we can use the nice relation that \[\begin{align} \langle t \rangle &= \int_0^\infty (1-P(t\leq t'))\,dt' \end{align}\] We determine some arbitrary history of events, and see what comes out:
    events = array([1,2,3])
    test1(events, [mu,a,b], kernel='exp', niters=40)

    The average of 40 runs of the simulation returns a waiting time of 
    0.063 +/- 0.003
    The integral returns 
    0.063 +/- 0.000
It seems that the simulation part works fine for the exponential kernel. The same tests for the power law kernel are below:
    T, mu, t0, b, a = 10, 0.1, .01, 1.01, .5
    test2(T, [mu,t0,b,a], kernel='pow')
    test1(events, [mu,t0,b,a], kernel='pow', niters=40)

    Expected number of children: 
    Simulated number of children:
    0.032 +/- 0.177
    The average of 50 runs of the simulation returns a waiting time of 
    9.778 +/- 0.291
    The integral returns 
    9.540 +/- 0.000
Given the paramters above, the simulations match quite well with the data. As I turn up \(\alpha\), the average number of children generated, we will reach a critical point at \(\alpha=1\), and it becomes quite difficult to get reliable statistics.

It might be worthwhile to note here that I compared the distribution of inter-event times for quantLullaby's, MATLAB's, and my implementations, and I found that with large \(\alpha\), the critical regime, the MATLAB simulation disagreed with the other two; it gave longer wait times than expected. This is probably cause for concern...I'm not sure what the underlying algorithm is, but this is the regime in which numerical precision and algorithmic approximations become visible.

After generating these samples, we can solve for the parameters using maximum likelihood. The log-likelihood is of the form \[\begin{align} L(\theta) = -\int_0^t \lambda(t|\theta)dt +\int_0^t \log\lambda(t|\theta)\,dN(t) \end{align}\] where \(\theta\) stands for the relevant parameters like \(\mu, \alpha,\beta\) for the exponential kernel. I've had problems with SciPy's default
algorithm BGFS, but the simplex Nelder-Mead works well
    soln = fmin(lambda params: 
                 -loglikeli_pow(expevents, params, T), random.rand(3),
                 disp=False )
I ran this many different times for increasing values of \(T\) with the parameter values \(\mu,\alpha,\beta = .5,.8,5\). The results are below for the exponential kernel.
As expected, it converges toward the true values with increasing observation length \(T\). A more difficult check is to simulate on parameters closer to the critical regime. Below, \(\mu,\alpha,\beta = 1.3959,1.0075,1/0.5374\).

Back to top