# Tracy-Widom Law: Fredholm determinants and Airy functions

This is a continuation of the previous post, available here. This section is quite different in flavour: here we present a series of propositions whose proofs are omitted. I may include them later. However, the proofs of the propositions (except Proposition 4) are not needed for our argument.

In the previous post, we introduced Hermite polynomials and wave functions, and observed that the Hermite polynomials are orthogonal polynomials. There is a plethora of results on orthogonal polynomials; we state only the few that we will need in our proof.

Proposition 1: Let $H_n(x)$ be the Hermite polynomial of degree $n$. Let $\psi_n(x)$ be the corresponding wave function. Let $\displaystyle K^{(n)}(x,y)=\sum_{k=0}^{n-1} \psi_k(x)\psi_k(y)$. We have the following results.

(a) $H_n(x+t)=\sum_{k=0}^{n} \binom{n}{k}H_k(x)t^{n-k}$

(b) $\displaystyle \frac1{\sqrt{n}}K^{(n)}(x,y)= \frac{\psi_n(x)\psi_{n-1}(y)-\psi_{n-1}(x)\psi_n(y)}{x-y}$

(c) $\displaystyle K^{(n)}(x,y)=\frac{\psi_n(x)\psi_n'(y)-\psi_n'(x)\psi_n(y)}{x-y}-\frac12\psi_n(x)\psi_n(y)$

(d) $\displaystyle \frac{d}{dx}K^{(n)}(x,x)=-\sqrt{n}\psi_n(x)\psi_{n-1}(x)$

Proof: Not included
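Identity (b) is the Christoffel–Darboux formula, and it is easy to check numerically. The sketch below assumes the probabilists' convention for the Hermite polynomials, with wave functions $\psi_k(x)=H_k(x)e^{-x^2/4}/\sqrt{\sqrt{2\pi}\,k!}$; if the previous post used a different normalization the constants change accordingly.

```python
import numpy as np
from math import factorial

def psi(k, x):
    # Oscillator wave function psi_k(x) = He_k(x) e^{-x^2/4} / sqrt(sqrt(2*pi) * k!),
    # built from numpy's probabilists' Hermite polynomials He_k.
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    He_k = np.polynomial.hermite_e.hermeval(x, coeffs)
    return He_k * np.exp(-x**2 / 4) / np.sqrt(np.sqrt(2 * np.pi) * factorial(k))

def K(n, x, y):
    # Hermite kernel K^{(n)}(x, y) = sum_{k=0}^{n-1} psi_k(x) psi_k(y)
    return sum(psi(k, x) * psi(k, y) for k in range(n))

n, x, y = 6, 0.7, -0.3
lhs = K(n, x, y) / np.sqrt(n)
rhs = (psi(n, x) * psi(n - 1, y) - psi(n - 1, x) * psi(n, y)) / (x - y)
# lhs and rhs agree to machine precision
```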

Let us recall Theorem 3 once again.

Theorem 3: For any measurable subset $A$ of $\mathbb{R}$,

$\displaystyle P\left(\bigcap_{i=1}^n \{\lambda_i \in A\}\right) =1+\sum_{k=1}^{\infty} \frac{(-1)^k}{k!}\int_{A^c}\cdots\int_{A^c} \det_{i,j=1}^k K^{(n)}(x_i,x_j)\prod_{i=1}^k dx_i$

Let us replace $A$ by $A^c$ and rewrite the above result as

$\displaystyle P\left(\bigcap_{i=1}^n \{\lambda_i \not\in A\}\right) =1+\sum_{k=1}^{\infty} \frac{(-1)^k}{k!}\int_{A}\cdots\int_{A} \det_{i,j=1}^k K^{(n)}(x_i,x_j)\prod_{i=1}^k dx_i$

Now is the right time to see what exactly we are trying to show when we prove weak convergence. For that we need some basic facts about the Airy function.

# Airy function

The Airy function is defined by

$\displaystyle Ai(x)=\frac1{2\pi i}\int_{C} e^{s^3/3-xs}\,ds$

where $C$ is the contour of integration in the complex plane consisting of the ray from $e^{-\pi i/3}\infty$ to the origin followed by the ray from the origin to $e^{\pi i/3}\infty$.

It turns out that the Airy function is very well behaved: it is infinitely differentiable and satisfies the Airy differential equation

$\displaystyle \frac{d^2y}{dx^2}-xy=0$
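The contour integral defining $Ai$ can be evaluated numerically. Parametrizing the two rays of $C$ by $s=re^{\pm\pi i/3}$ (so that $s^3/3=-r^3/3$ on both rays) collapses the definition to a single real integral. A minimal numpy sketch, with truncation point $R$ and grid size chosen ad hoc:

```python
import numpy as np

def airy_ai(x, R=12.0, N=200001):
    # Parametrize the two rays of C by s = r * exp(±i*pi/3), r in [0, ∞).
    # The two contributions combine into
    #   Ai(x) = (1/pi) ∫_0^∞ exp(-r^3/3) * Im( w * exp(-x*r*w) ) dr,  w = exp(i*pi/3).
    # Truncate at r = R (the integrand decays like exp(-r^3/3)) and apply
    # the trapezoid rule.
    r = np.linspace(0.0, R, N)
    w = np.exp(1j * np.pi / 3)
    f = np.exp(-r**3 / 3) * np.imag(w * np.exp(-x * r * w))
    dr = r[1] - r[0]
    return dr * (f.sum() - 0.5 * (f[0] + f[-1])) / np.pi

# Known values: Ai(0) = 3^{-2/3}/Gamma(2/3) ≈ 0.3550280539, Ai(1) ≈ 0.1352924163
```

A second-difference quotient of `airy_ai` also reproduces $Ai''(x)\approx x\,Ai(x)$, consistent with the Airy equation above.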

The Airy kernel is defined by $\displaystyle A(x,y)=\frac{Ai(x)Ai'(y)-Ai(y)Ai'(x)}{x-y}$, where the value on the diagonal $x=y$ is determined by continuity.

The Airy function and its first few derivatives decay at least exponentially fast. In fact we have the following proposition.

Proposition 2: Fix $x_0 \in \mathbb{R}$. Then there exists a constant $C>0$ possibly depending on $x_0$ such that

$\displaystyle \max(|Ai(x)|,|Ai'(x)|,|Ai''(x)|)\le Ce^{-x} \ \ \ \ \forall \ x\ge x_0$

Proof: Proof is skipped for the time being.

With the help of the above proposition we can now prove the following result for the Airy kernel.

Proposition 3: Fix $x_0 \in \mathbb{R}$. We have $\displaystyle \sup_{x,y\ge x_0} e^{x+y}|A(x,y)| < \infty$.

Now the question arises: why the Airy function? It turns out that the wave functions behave asymptotically like the Airy function. The precise statement is given in the next proposition.

Proposition 4: The domains of definition of both the Airy function and the wave functions can be extended analytically to the complex plane. Fix $C\ge 1$, let $u\in \mathbb{C}$, and put $t=2\sqrt{n}+\frac{u}{n^{1/6}}$. We have

$\displaystyle \lim_{n\to\infty} \sup_{u \in \mathbb{C},\, |u|\le C} \left|n^{1/12}\psi_n\left(2\sqrt{n}+\frac{u}{n^{1/6}}\right)-Ai(u)\right|=0$

Proof: This is the most important proposition and it holds the crux of the proof. The argument uses the steepest descent method, which involves heavy complex analysis, so unfortunately it is beyond our scope, at least for the time being. I will surely add the proof later once I have developed enough complex analysis.
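Proposition 4 can at least be checked numerically at $u=0$. The sketch below again assumes the probabilists' normalization of the wave functions and evaluates $\psi_n$ via the stable three-term recurrence; the comparison value $Ai(0)=3^{-2/3}/\Gamma(2/3)\approx 0.35503$ is a known constant.

```python
import numpy as np

def psi_n(n, x):
    # Wave functions via the three-term recurrence
    #   psi_{k+1}(x) = ( x*psi_k(x) - sqrt(k)*psi_{k-1}(x) ) / sqrt(k+1),
    # started from psi_{-1} = 0 and psi_0(x) = exp(-x^2/4) / (2*pi)^{1/4}.
    p_prev, p = 0.0, np.exp(-x**2 / 4) / (2 * np.pi) ** 0.25
    for k in range(n):
        p_prev, p = p, (x * p - np.sqrt(k) * p_prev) / np.sqrt(k + 1)
    return p  # psi_n(x)

n, u = 500, 0.0
t = 2 * np.sqrt(n) + u / n ** (1 / 6)
scaled = n ** (1 / 12) * psi_n(n, t)
# scaled should already be close to Ai(0) = 0.35503...
```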

Remark: The Tracy-Widom law has two distinct parts. First we prove vague convergence, that is, the distribution functions $F_n(t)$ converge to some function $F(t)$ for each $t$. At that stage it is not known whether $F(t)$ is a distribution function (hence the convergence is only vague). Then one shows that $F(t)$ actually is a distribution function by finding an equivalent form of $F(t)$ (this gives the weak convergence). Unfortunately, the weak convergence won’t be covered in this series of posts.

Theorem 4: Fix $-\infty < t < t' < \infty$. Then

$\displaystyle \begin{aligned} & \lim_{n\to \infty} P\left[n^{2/3}\left(\frac{\lambda_i}{\sqrt{n}}-2\right) \not\in (t,t'), \ \ i=1,2, \ldots, n\right] \\ & =1+\sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_{t}^{t'}\cdots\int_{t}^{t'} \det_{i,j=1}^k A(x_i,x_j) \prod_{i=1}^k dx_i \end{aligned}$

Theorem 5: (Vague convergence)

$\displaystyle \begin{aligned} & \lim_{n\to\infty} P\left(n^{2/3}\left(\frac{\max \lambda_i}{\sqrt{n}}-2\right)\le t\right)\\ & =1+\sum_{k=1}^{\infty} \frac{(-1)^k}{k!}\int_{t}^{\infty}\cdots\int_t^{\infty} \det_{i,j=1}^k A(x_i,x_j)\prod_{i=1}^k dx_i\\ & =: F_2(t) \end{aligned}$

Theorems 4 and 5 are very closely related: formally putting $t'=\infty$ (which is not permissible) in Theorem 4 yields Theorem 5. (Why? This is just a simple equivalence of events, which we will show in the next post.) We won’t be proving that $F_2$ is a distribution function in this initial series of posts; however, we state the weak convergence result for completeness.

Theorem 6: (Weak convergence)

$\displaystyle F_2(t)=\exp\left(-\int_t^{\infty} (x-t)q(x)^2\,dx\right)$

where $q(x)$ satisfies

$q''(t)=tq(t)+2q(t)^3, \ \ \ q(t) \sim Ai(t) \ \ \mbox{as} \ t \to \infty$

This distribution is known as the Tracy-Widom distribution.
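Theorem 6 can be explored numerically. The following rough sketch (an illustration, not a proof) seeds $q$ with Airy initial data at a large $t_0$, where $q(t)\sim Ai(t)$, integrates the Painlevé II equation backwards with a classical RK4 step, and evaluates $F_2$ by quadrature. The truncation points $t_0$, $t_{\min}$ and the step size are ad hoc choices, and the tail of the integral beyond $t_0$ is neglected.

```python
import numpy as np

# Airy initial data, computed from the contour-integral definition of Ai
w = np.exp(1j * np.pi / 3)
r = np.linspace(0.0, 12.0, 200001)
dr = r[1] - r[0]

def trap(f):
    # trapezoid rule on the fixed r-grid
    return dr * (f.sum() - 0.5 * (f[0] + f[-1]))

def ai(x):
    # Ai(x) = (1/pi) ∫_0^∞ exp(-r^3/3) Im( w exp(-x r w) ) dr
    return trap(np.exp(-r**3 / 3) * np.imag(w * np.exp(-x * r * w))) / np.pi

def aip(x):
    # Ai'(x), obtained by differentiating under the integral sign
    return -trap(r * np.exp(-r**3 / 3) * np.imag(w**2 * np.exp(-x * r * w))) / np.pi

def rhs(t, y):
    # Painleve II: q'' = t q + 2 q^3, as a first-order system y = (q, q')
    q, p = y
    return np.array([p, t * q + 2 * q**3])

t0, tmin, h = 8.0, -2.0, 1e-3
t, y = t0, np.array([ai(t0), aip(t0)])  # q(t0) ≈ Ai(t0), q'(t0) ≈ Ai'(t0)
ts, qs = [t0], [y[0]]
while t > tmin:
    # classical RK4 step, backwards in t
    k1 = rhs(t, y)
    k2 = rhs(t - h / 2, y - h / 2 * k1)
    k3 = rhs(t - h / 2, y - h / 2 * k2)
    k4 = rhs(t - h, y - h * k3)
    y = y - h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    t -= h
    ts.append(t)
    qs.append(y[0])

ts, qs = np.array(ts[::-1]), np.array(qs[::-1])  # ascending in t

def F2(t):
    # F2(t) = exp( -∫_t^∞ (x - t) q(x)^2 dx ); the tail beyond t0 is neglected
    m = ts >= t
    x, q2 = ts[m], qs[m] ** 2
    integral = np.sum((x[1:] - x[:-1]) * ((x[1:] - t) * q2[1:] + (x[:-1] - t) * q2[:-1])) / 2
    return float(np.exp(-integral))
```

As expected for a distribution function, `F2` is increasing, tends to $1$ as $t\to\infty$, and puts most of its mass on moderately negative $t$.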

Observe the similarity between Theorem 3 and Theorem 4. The tool that will help us deduce Theorem 4 from Theorem 3 is the Fredholm determinant.

# Fredholm Determinants

Let $X=\mathbb{R}$. Let $\mathcal{B}_{X}$ be the Borel $\sigma$-algebra on $X$. Let $\nu$ be a complex-valued measure on $(X,\mathcal{B}_X)$ with finite total variation, that is,

$\displaystyle ||\nu||:=\int_{X} |\nu(dx)| < \infty$

In general, $X$ can be any locally compact Polish space, but in most of our applications $X$ will be the real line and $\nu$ a scalar multiple of Lebesgue measure on a bounded interval.

A kernel is a Borel measurable, complex-valued function $K(x,y)$ defined on $X \times X$ such that

$\displaystyle ||K||:=\sup_{(x,y)\in X\times X} |K(x,y)| <\infty$

Exercise 1: Show that the Airy kernel is a kernel in this sense.

Exercise 2: Show that the Hermite kernel is a kernel in this sense.

With these basic notions in place, we are ready to state our first lemma, which will help us define Fredholm determinants.

Lemma 2: Fix $n>0$ and let $F(x,y)$ and $G(x,y)$ be two kernels. Then we have the following two bounds:

$\displaystyle \left|\det_{i,j=1}^n F(x_i,y_j) - \det_{i,j=1}^n G(x_i,y_j)\right| \le n^{1+n/2}||F-G||\max(||F||,||G||)^{n-1}$

$\displaystyle \left|\det_{i,j=1}^n F(x_i,y_j)\right|\le n^{n/2}||F||^n$

Proof: Let us recall the Hadamard inequality.

Hadamard inequality: For any column vectors $v_1, v_2, \ldots, v_n$ of length $n$ with complex entries, we have

$\displaystyle \left|\det [v_1, v_2, \ldots, v_n]\right| \le \prod_{i=1}^n \sqrt{\bar{v_i}^Tv_i} \le n^{n/2}\prod_{i=1}^n |v_i|_{\infty}$

For a proof of the Hadamard inequality one may see here. Note that the second bound of the lemma is an immediate application of the Hadamard inequality, so we will prove only the first bound.
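Both the Hadamard inequality and its sup-norm consequence are easy to sanity-check on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))

det_abs = abs(np.linalg.det(A))
col_l2 = np.prod(np.linalg.norm(A, axis=0))              # product of column 2-norms
col_inf = n ** (n / 2) * np.prod(np.abs(A).max(axis=0))  # n^{n/2} * product of sup-norms

# Hadamard: |det A| <= prod ||v_i||_2 <= n^{n/2} prod ||v_i||_inf
```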

======================================================================

Idea of the proof: Take $n=3$ and write $F(x_i,y_j)$ and $G(x_i,y_j)$ simply as $F_{ij}$ and $G_{ij}$ respectively. Note that the determinant is linear with respect to each row.

$\displaystyle \begin{aligned} & \begin{vmatrix} F_{11} & F_{12} & F_{13}\\ F_{21} & F_{22} & F_{23}\\ F_{31} & F_{32} & F_{33} \end{vmatrix} \\ & = \begin{vmatrix} F_{11}-G_{11}+G_{11} & F_{12}-G_{12}+G_{12} & F_{13}-G_{13}+G_{13} \\ F_{21} & F_{22} & F_{23}\\ F_{31} & F_{32} & F_{33}\end{vmatrix}\\ & = \begin{vmatrix} F_{11}-G_{11} & F_{12}-G_{12} & F_{13}-G_{13}\\ F_{21} & F_{22} & F_{23}\\ F_{31} & F_{32} & F_{33} \end{vmatrix}+\begin{vmatrix} G_{11} & G_{12} & G_{13}\\ F_{21} & F_{22} & F_{23}\\ F_{31} & F_{32} & F_{33} \end{vmatrix} \end{aligned}$

Since you can guess what exactly we are going to subtract and add in each term, we will now drop the subscripts too.

So, vaguely speaking, we obtained that

$\displaystyle \begin{aligned} & \begin{vmatrix} F & F & F\\ F & F & F\\ F & F & F \end{vmatrix} \\ & = \begin{vmatrix} F-G & F-G & F-G \\ F & F & F\\ F & F & F \end{vmatrix}+\begin{vmatrix} G & G & G\\ F & F & F\\ F & F & F \end{vmatrix} \end{aligned}$

We can apply the same trick to the second determinant on the right side of the above equation. Applying the same trick twice we get

$\displaystyle \begin{aligned} & \begin{vmatrix} G & G & G\\ F & F & F\\ F & F & F \end{vmatrix} \\ & = \begin{vmatrix} G & G & G\\ F-G+G & F-G+G & F-G+G\\ F & F & F \end{vmatrix} \\ & = \begin{vmatrix} G & G & G\\ F-G & F-G & F-G\\ F & F & F \end{vmatrix} +\begin{vmatrix} G & G & G\\ G & G & G\\ F & F & F \end{vmatrix} \\ & = \begin{vmatrix} G & G & G\\ F-G & F-G & F-G\\ F & F & F \end{vmatrix} +\begin{vmatrix} G & G & G\\ G & G & G\\ F-G & F-G & F-G \end{vmatrix}+\begin{vmatrix} G & G & G\\ G & G & G\\ G & G & G \end{vmatrix} \end{aligned}$

We rearrange the terms to obtain,

$\displaystyle \begin{aligned} & \begin{vmatrix} F & F & F\\ F & F & F\\ F & F & F \end{vmatrix} - \begin{vmatrix} G & G & G\\ G & G & G\\ G & G & G \end{vmatrix} \\ & = \begin{vmatrix} F-G & F-G & F-G \\ F & F & F\\ F & F & F \end{vmatrix}+\begin{vmatrix} G & G & G\\ F-G & F-G & F-G\\ F & F & F \end{vmatrix} +\begin{vmatrix} G & G & G\\ G & G & G\\ F-G & F-G & F-G \end{vmatrix} \end{aligned}$

Now applying the Hadamard inequality to each matrix on the right-hand side gives the result. We now write the proof formally. Readers may skip the formal proof; we present it here just for the sake of completeness, and to satisfy some curious readers.

======================================================================

Formal proof: Define

$\displaystyle H_i^{(k)}(x,y)= \begin{cases} G(x,y) & \mbox{if} \ \ i<k\\ F(x,y)-G(x,y) & \mbox{if} \ \ i=k\\ F(x,y) & \mbox{if} \ \ i>k \end{cases}$

By linearity of the determinant with respect to rows,

$\displaystyle \det_{i,j=1}^n F(x_i,y_j)-\det_{i,j=1}^n G(x_i,y_j) = \sum_{k=1}^n \det_{i,j=1}^n H_i^{(k)}(x_i,y_j)$

Applying the Hadamard inequality we have

$\displaystyle \left|\det_{i,j=1}^{n} H_i^{(k)}(x_i,y_j)\right| \le n^{n/2}||F-G||\cdot \max(||F||,||G||)^{n-1}$

Summing the above inequality over $k$, we get the desired result.

$\square$
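The telescoping decomposition at the heart of the proof is easy to verify numerically, with arbitrary matrices standing in for the kernel values $F(x_i,y_j)$ and $G(x_i,y_j)$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
F = rng.standard_normal((n, n))
G = rng.standard_normal((n, n))

def H(k):
    # H^{(k)}: rows above k come from G, row k is F - G, rows below k from F
    M = F.copy()
    M[:k] = G[:k]
    M[k] = F[k] - G[k]
    return M

# the k-th determinant telescopes as D_k - D_{k+1}, so the sum collapses
telescoped = sum(np.linalg.det(H(k)) for k in range(n))
target = np.linalg.det(F) - np.linalg.det(G)
```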

Now we are ready to define Fredholm determinants. Fix a kernel $K$. Define $\Delta_0=1$ and, for $n\ge 1$,

$\displaystyle \Delta_n=\Delta_n(K,\nu):=\int\cdots\int \det_{i,j=1}^n K(x_i,x_j)d\nu(x_1)d\nu(x_2)\ldots d\nu(x_n)$

Note that $|\Delta_n| \le n^{n/2}||K||^n||\nu||^n$. Hence $\Delta_n$ is well defined.

We define the Fredholm determinant associated with kernel $K$ as

$\displaystyle \Delta(K) =\Delta(K,\nu) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!}\Delta_n(K,\nu)$

One can check by the root test that the above series converges absolutely: by Stirling's formula, the $n$th root of the $n$th term is of order $\sqrt{n}\,||K||\,||\nu||\,e/n \to 0$.
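To see the definition in action, consider the toy kernel $K(x,y)=x+y$ on $[0,1]$ with Lebesgue measure (a hypothetical example, not one from the main argument). The kernel has rank two, so $\Delta_n=0$ for $n\ge 3$, and a short computation gives $\Delta_1=1$ and $\Delta_2=-\frac16$, hence $\Delta(K)=1-1+\frac12\left(-\frac16\right)=-\frac1{12}$. A quadrature sketch reproduces this:

```python
import numpy as np
from math import factorial
from itertools import product

def fredholm_det(K, nodes, weights, n_max=4):
    # Truncation of Delta(K) = sum_{n>=0} (-1)^n / n! * Delta_n, where each
    # Delta_n is the n-fold integral of det K(x_i, x_j), evaluated here by
    # tensorized Gauss-Legendre quadrature.
    total = 1.0
    for n in range(1, n_max + 1):
        delta_n = 0.0
        for idx in product(range(len(nodes)), repeat=n):
            x = nodes[list(idx)]
            wgt = np.prod(weights[list(idx)])
            delta_n += wgt * np.linalg.det(K(x[:, None], x[None, :]))
        total += (-1) ** n / factorial(n) * delta_n
    return total

# Gauss-Legendre nodes/weights mapped from [-1, 1] to [0, 1]
g, gw = np.polynomial.legendre.leggauss(8)
nodes, weights = (g + 1) / 2, gw / 2

val = fredholm_det(lambda x, y: x + y, nodes, weights)
# val ≈ -1/12, since Delta_n vanishes for n >= 3
```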

We now state the continuity lemma for Fredholm determinants, which will help us prove the convergence result.

Lemma 3: Let $K,L$ be two kernels. Then we have

$\displaystyle |\Delta(K)-\Delta(L)| \le C||K-L||$

where $C$ depends on $||K||, ||L||$ and $||\nu||$.

Proof: Observe that

$\displaystyle \begin{aligned} & |\Delta(K)-\Delta(L)| \\ & \le \sum_{n=0}^{\infty} \frac1{n!}|\Delta_n(K,\nu)-\Delta_n(L,\nu)| \\ & = \sum_{n=1}^{\infty} \frac1{n!}\left|\int\cdots\int \left(\det_{i,j=1}^n K(x_i,x_j)-\det_{i,j=1}^n L(x_i,x_j)\right)d\nu(x_1)\ldots d\nu(x_n)\right| \\ & \le \sum_{n=1}^{\infty} \frac1{n!} \int \cdots \int \left|\det_{i,j=1}^n K(x_i,x_j)-\det_{i,j=1}^n L(x_i,x_j)\right|d\nu(x_1)\ldots d\nu(x_n) \\ & \le \left[\sum_{n=1}^{\infty} \frac1{n!}n^{1+n/2}\max(||K||,||L||)^{n-1}||\nu||^n\right]||K-L||=C||K-L|| \end{aligned}$

The last inequality follows from Lemma 2, and again the root test shows that the constant $C$ is finite.

$\square$

This result is very useful: if $K^{(n)}$ is a sequence of kernels converging uniformly to a fixed kernel $K$, then $\Delta(K^{(n)}) \to \Delta(K)$. In the next post we will see how this yields the required limit in Theorem 4.