Tracy-Widom Law: The starting point

Hi there!

Introduction: Random matrix theory (RMT) deals with matrices whose entries are random variables. It is a fascinating field that has been expanding rapidly in recent years. In this series of posts, I will be proving vague convergence to the celebrated Tracy-Widom law. I will mostly follow the book ‘An Introduction to Random Matrices’ by Anderson et al.

Ensembles: 

Let T,W be independent N(0,1/2) random variables, and define Z=T+iW, a complex random variable. We say that Z follows the standard complex normal distribution and write Z \sim CN(0,1), since E(Z)=0 and E(Z\bar{Z})=1. Let U be an n\times n random matrix with iid CN(0,1) entries and let V be an n\times n random matrix with iid N(0,1) entries. Define

\displaystyle  X=\frac{V+V^T}{\sqrt{2}},\ \ Y=\frac{U+U^*}{\sqrt2}.

Let P_n^{(1)} denote the law of X. The random matrix X is said to belong to the Gaussian orthogonal ensemble [\mathcal{H}_n^{(1)}] (GOE(n)). Observe that X is symmetric, X_{1,1} is an N(0,2) random variable and X_{1,2} is an N(0,1) random variable.

Let P_n^{(2)} denote the law of Y. The random matrix Y is said to belong to the Gaussian unitary ensemble [\mathcal{H}_n^{(2)}] (GUE(n)). Observe that Y is Hermitian, Y_{1,1} is an N(0,1) random variable and Y_{1,2} is a CN(0,1) random variable.
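To make the construction concrete, here is a minimal numerical sketch in Python/numpy (the helper names sample_goe and sample_gue are my own) that builds X and Y as above and empirically checks the entry variances just mentioned.

import numpy as np

rng = np.random.default_rng(0)

def sample_goe(n):
    # V has iid N(0,1) entries; X = (V + V^T) / sqrt(2)
    V = rng.standard_normal((n, n))
    return (V + V.T) / np.sqrt(2)

def sample_gue(n):
    # U has iid CN(0,1) entries: real and imaginary parts are independent N(0,1/2)
    U = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return (U + U.conj().T) / np.sqrt(2)

# Empirical check: GOE has Var(X_11) = 2, Var(X_12) = 1; GUE has Var(Y_11) = 1, E|Y_12|^2 = 1.
m = 20000
X = np.array([sample_goe(2) for _ in range(m)])
Y = np.array([sample_gue(2) for _ in range(m)])
print(np.var(X[:, 0, 0]), np.var(X[:, 0, 1]))                    # approx 2, 1
print(np.var(Y[:, 0, 0].real), np.mean(np.abs(Y[:, 0, 1])**2))   # approx 1, 1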

Why is the law P_n^{(\beta)} special?

Because the density at H \in \mathcal{H}_n^{(\beta)} is proportional to \exp\left(-\mbox{tr}\frac{\beta H^2}{4}\right). Thus for the law P_n^{(1)} the density at H and at OHO^T is the same for any orthogonal matrix O, and for the law P_n^{(2)} the density at H and at UHU^{*} is the same for any unitary matrix U (hence the names GOE and GUE). This also enables us to find the joint distribution of the eigenvalues exactly!

Target: Let \lambda_1^n\le \lambda_2^n \le \cdots \le \lambda_n^n be the eigenvalues of a matrix X from the GOE/GUE. It is known that the empirical distribution of the rescaled eigenvalues L_n:=\frac1n\sum_{i=1}^n \delta_{\lambda_i^n/\sqrt{n}} converges weakly, in probability, to the semicircle distribution \sigma(x)\,dx with density \displaystyle \sigma(x) =\frac1{2\pi}\sqrt{4-x^2}\mathbb{I}_{|x|\le 2}. We also have \displaystyle \frac1{\sqrt{n}}\lambda_n^n \stackrel{a.e.}{\to} 2. What we are interested in is the limiting distribution of this largest eigenvalue. In fact, what we intend to show is that \displaystyle n^{2/3}\left(\frac{\lambda_n^n}{\sqrt{n}}-2\right) converges weakly. In the initial series of posts we will only show the vague convergence of the above expression.
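As a quick numerical sanity check of the two limit statements above (not a proof of anything), the following sketch samples one large GUE matrix, compares the empirical distribution of the rescaled eigenvalues with the semicircle density, and prints the rescaled largest eigenvalue n^{2/3}(\lambda_n^n/\sqrt{n}-2); the size n = 1000 is an arbitrary choice of mine, and the last number is of course a single random draw.

import numpy as np

rng = np.random.default_rng(1)
n = 1000
U = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
Y = (U + U.conj().T) / np.sqrt(2)

# eigvalsh returns the (real) eigenvalues of a Hermitian matrix in increasing order
lam = np.linalg.eigvalsh(Y) / np.sqrt(n)

# compare the empirical distribution L_n with the semicircle density on [-2, 2]
hist, edges = np.histogram(lam, bins=50, range=(-2.5, 2.5), density=True)
mid = (edges[:-1] + edges[1:]) / 2
semicircle = np.where(np.abs(mid) <= 2, np.sqrt(np.maximum(4 - mid**2, 0)) / (2 * np.pi), 0.0)
print("max |histogram - semicircle| :", np.max(np.abs(hist - semicircle)))

# fluctuation of the largest (rescaled) eigenvalue around the edge 2
print("n^(2/3) (lambda_max/sqrt(n) - 2) :", n**(2 / 3) * (lam[-1] - 2))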

Let us adopt the following convention: if we use the notation \lambda_1^n, \ldots, \lambda_n^n for eigenvalues, then they are ordered; if we use \lambda_1, \ldots, \lambda_n, then they are unordered.

Notations: Let A=((a_{ij}))_{i,j=1}^n be a square matrix of order n. The determinant of A will be denoted as

|A|=\det(A)=\det\limits_{i,j=1}^n a_{ij}

Expanding over permutations, we also have the following formula for the determinant (we will refer to this expansion as the Laplace expansion).

\det\limits_{i,j=1}^n a_{i,j}=\sum\limits_{\sigma \in S_n} \epsilon_{\sigma} \prod\limits_{i=1}^{n} a_{i,\sigma(i)}
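Purely as an illustration of this permutation expansion, here is a small Python sketch (the helper det_by_permutations is my own) that evaluates the sum over S_n directly and compares it with numpy's determinant for a random 4x4 matrix.

import numpy as np
from itertools import permutations

def det_by_permutations(A):
    # sum over all permutations sigma of sign(sigma) * prod_i A[i, sigma(i)]
    n = A.shape[0]
    total = 0.0
    for sigma in permutations(range(n)):
        # sign of sigma computed by counting inversions
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
        total += (-1) ** inversions * np.prod([A[i, sigma[i]] for i in range(n)])
    return total

A = np.random.default_rng(2).standard_normal((4, 4))
print(det_by_permutations(A), np.linalg.det(A))   # the two agree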

Theorem 1: Let X\in \mathcal{H}_n^{(\beta)} be a random matrix with law P_n^{(\beta)} (\beta=1,2), and let \lambda_1^n \le \lambda_2^n \le \ldots \le \lambda_n^n denote its ordered eigenvalues. The joint distribution of the ordered eigenvalues has a density with respect to Lebesgue measure, which equals

\displaystyle \hat\rho_{n}^{(\beta)}(x_1,x_2,\ldots,x_n):=n!C_n^{(\beta)}|\Delta(x)|^{\beta}\prod_{i=1}^n e^{-\beta x_i^2/4}\mathbb{I}_{x_1\le x_2\le \ldots \le x_n}

where C_n^{(\beta)} is a normalizing constant and \Delta(x) is the Vandermonde determinant associated with x=(x_1,x_2,\ldots,x_n), which equals \displaystyle \det_{i,j=1}^n x_i^{j-1}=\prod_{1\le i<j\le n} (x_j-x_i).
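The Vandermonde identity \det_{i,j=1}^n x_i^{j-1}=\prod_{1\le i<j\le n}(x_j-x_i) is easy to check numerically; a minimal sketch (with an arbitrary choice of points):

import numpy as np
from itertools import combinations

x = np.array([0.3, -1.2, 2.0, 0.7])
n = len(x)

# det_{i,j} x_i^{j-1}: column j of the Vandermonde matrix holds x_i^(j-1)
V = np.vander(x, increasing=True)
det_form = np.linalg.det(V)

# product over i < j of (x_j - x_i)
prod_form = np.prod([x[j] - x[i] for i, j in combinations(range(n), 2)])

print(det_form, prod_form)   # the two agree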

Proof: The proof will be skipped. Essentially it is an application of the change of variables formula: one chooses a suitable transformation with a smooth inverse and obtains the required density by integrating out the extra variables. The \Delta(x) factor emerges as the Jacobian of the transformation, and the e^{-\beta x_i^2/4} factors come from the matrix density \exp\left(-\mbox{tr}\frac{\beta H^2}{4}\right) after the change of variables. However, quite a bit more mathematics is needed to justify this carefully.

Remark: What we are interested in is the distribution of the unordered eigenvalues, which is obtained by symmetrizing the density \hat\rho_n^{(\beta)}. We say that the joint density of the unordered eigenvalues is given by

\displaystyle \rho_n^{(\beta)}:=C_n^{(\beta)}|\Delta(x)|^{\beta}\prod_{i=1}^n e^{-\beta x_i^2/4}

Technically speaking, this is just a valid symmetrized density, and it should not really be called the joint density of the unordered eigenvalues: integrating out n-1 of the variables gives a marginal density, but that marginal density is the density of which eigenvalue? The best we can say is that if we pick one eigenvalue uniformly at random, it is the density of that randomly chosen eigenvalue.

Hermite Polynomials

From now on, we will concentrate only on the GUE. We now introduce the Hermite polynomials and the associated wave functions.

The n-th Hermite polynomial is defined as H_n(x): =(-1)^ne^{x^2/2}\frac{d^n}{dx^n}e^{-x^2/2}. Hermite polynomials are actually polynomials! In fact H_n(x) is a monic polynomial of degree n. The associated n-th normalized wave function is defined as

\displaystyle \psi_n(x):=\frac{e^{-x^2/4}H_n(x)}{\sqrt{\sqrt{2\pi}n!}}

That weird \sqrt{\sqrt{2\pi}n!} scaling factor is chosen so that \{\psi_n(x)\} forms an orthonormal basis of \mbox{Sp}\{x^k\exp(-x^2/4): k\ge 0\}, that is, we have

\int_{\mathbb{R}}\psi_k(x)\psi_l(x)\,dx=\delta_{kl}
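To see the normalization at work, here is a sketch that evaluates H_n via numpy's probabilists' Hermite polynomials (numpy.polynomial.hermite_e, which matches the monic definition above) and checks the orthonormality of the \psi_n by a crude Riemann sum; the helper psi and the grid are my own choices.

import math
import numpy as np
from numpy.polynomial import hermite_e as He   # probabilists' (monic) Hermite polynomials

def psi(k, x):
    # psi_k(x) = e^{-x^2/4} H_k(x) / sqrt(sqrt(2 pi) k!)
    c = np.zeros(k + 1)
    c[k] = 1.0                                 # coefficient vector selecting He_k
    return np.exp(-x**2 / 4) * He.hermeval(x, c) / np.sqrt(np.sqrt(2 * np.pi) * math.factorial(k))

# numerical check of orthonormality: int psi_k(x) psi_l(x) dx = delta_{kl}
x = np.linspace(-30, 30, 200001)
dx = x[1] - x[0]
G = np.array([[np.sum(psi(k, x) * psi(l, x)) * dx for l in range(5)] for k in range(5)])
print(np.round(G, 6))   # close to the 5x5 identity matrix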

Let us define \displaystyle K^{(n)}(x,y):=\sum_{k=0}^{n-1}\psi_k(x)\psi_k(y) to be the Hermite kernel. With this K^{(n)} in hand, we have a different expression for \rho_n^{(2)}.

Theorem 2: Consider the density \rho_n^{(\beta)}(x_1,x_2,\ldots,x_n) introduced in the remark following Theorem 1. We have

\displaystyle \rho_{n}^{(2)}(x_1,x_2,\ldots,x_n)= \frac1{n!}\det_{k,l=1}^n K^{(n)}(x_k,x_l)

where \displaystyle K^{(n)}(x,y):=\sum_{k=0}^{n-1}\psi_k(x)\psi_k(y).

Proof: We need the following lemma to prove this theorem.

Lemma 1: For any square-integrable functions f_1,f_2,\ldots,f_n and g_1,g_2,\ldots,g_n on the real line, we have

\displaystyle \frac1{n!}\int\cdots\int \det_{i,j=1}^n \left(\sum_{k=1}^n f_k(x_i)g_k(x_j)\right)\prod_{i=1}^n dx_i
\displaystyle =\frac1{n!}\int\cdots\int\det_{i,j=1}^n f_i(x_j)\det_{i,j=1}^n g_i(x_j)\prod_{i=1}^n dx_i=\det_{i,j=1}^n\int f_i(x)g_j(x)\,dx

Proof of Lemma: The first equality follows easily from the fact that |AB|=|A||B|, so it is enough to prove the last equality. By the Laplace expansion we have

\displaystyle  \begin{aligned} & \int\cdots\int\det_{i,j=1}^n f_i(x_j)\det_{i,j=1}^n g_i(x_j)\prod_{i=1}^n dx_i\\ & =\int\cdots\int \sum_{\sigma,\tau\in S_n} \epsilon_{\sigma} \epsilon_{\tau} \prod_{i=1}^n f_{\sigma(i)}(x_i)\prod_{i=1}^n g_{\tau(i)}(x_i)\prod_{i=1}^n dx_i \ \ \ \mbox{(Laplace Expansion)} \\ & = \sum_{\sigma,\tau\in S_n} \epsilon_{\sigma} \epsilon_{\tau} \int\cdots\int  \prod_{i=1}^n f_{\sigma(i)}(x_i)g_{\tau(i)}(x_i)\prod_{i=1}^n dx_i\\ & = \sum_{\sigma,\tau\in S_n} \epsilon_{\sigma} \epsilon_{\tau} \prod_{i=1}^n\int   f_{\sigma(i)}(x_i)g_{\tau(i)}(x_i) dx_i  \ \ \ \mbox{(The expression factors out)}  \\ & = \sum_{\sigma,\tau\in S_n} \epsilon_{\sigma} \epsilon_{\tau} \prod_{i=1}^n\int   f_{\sigma(i)}(x)g_{\tau(i)}(x)\,dx\ \ \ (x_i \mbox{ is a dummy variable})  \\ & = n!\sum_{\tau\in S_n}  \epsilon_{\tau} \prod_{i=1}^n\int   f_{i}(x)g_{\tau(i)}(x) dx \\ & = n!\det_{i,j=1}^n\int f_i(x)g_j(x)\,dx \ \ \ \mbox{(Laplace expansion in reverse direction)}\end{aligned}

In the penultimate line, we reordered each product so that the f_i's appear in order, i.e., we replaced \tau by \tau\circ\sigma^{-1} (note that \epsilon_{\sigma}\epsilon_{\tau}=\epsilon_{\tau\circ\sigma^{-1}}), and then summed over \sigma, which produces the factor n!. Note that this \tau is different from the \tau in the line above.

\square
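Before moving on, Lemma 1 itself is easy to test numerically. The sketch below takes n = 2 with a few arbitrary square-integrable functions, approximates the double integral on the left by a Riemann sum on a grid, and compares it with the 2x2 determinant of one-dimensional integrals on the right; all function choices and grid parameters are mine.

import numpy as np

# two pairs of square-integrable test functions on the real line
fs = [lambda x: np.exp(-x**2), lambda x: x * np.exp(-x**2)]
gs = [lambda x: np.exp(-(x - 0.5)**2), lambda x: (x + 1) * np.exp(-x**2)]

x = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]
F = np.array([f(x) for f in fs])       # F[i, a] = f_i(x_a)
G = np.array([g(x) for g in gs])       # G[i, a] = g_i(x_a)

# right-hand side: det_{i,j} of the matrix ( int f_i(t) g_j(t) dt )
rhs = np.linalg.det((F * dx) @ G.T)

# left-hand side for n = 2: (1/2!) * iint det(f_i(x_j)) det(g_i(x_j)) dx_1 dx_2,
# where det(f_i(x_j)) = f_1(x_1) f_2(x_2) - f_1(x_2) f_2(x_1) on the (x_1, x_2) grid
Df = np.outer(F[0], F[1]) - np.outer(F[1], F[0])
Dg = np.outer(G[0], G[1]) - np.outer(G[1], G[0])
lhs = 0.5 * np.sum(Df * Dg) * dx**2

print(lhs, rhs)   # the two agree up to discretization error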

Proof of Theorem 2: The Vandermonde matrix is given by

\begin{pmatrix} 1 & x_1 & x_1^2 & \ldots & x_1^{n-1}\\ 1 & x_2 & x_2^2 & \ldots & x_2^{n-1}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \ldots & x_n^{n-1} \end{pmatrix}

Note that if we replace the j-th column of the matrix (C_j, say) by C_j plus some linear combination of the first j-1 columns, then the value of the determinant remains the same. So we can replace the j-th column, whose entries are x_i^{j-1}, by the values at the x_i's of any monic polynomial of degree j-1. We will do this with the Hermite polynomials.

So, we have \displaystyle |\Delta(x)|^2=\left(\det_{i,j=1}^n x_i^{j-1} \right)^2 = \left(\det_{i,j=1}^n H_{j-1}(x_i)\right)^2.

Hence \displaystyle \rho_n^{(2)}(x_1,x_2,\ldots,x_n) \propto \left(\det_{i,j=1}^n H_{j-1}(x_i)\right)^2\prod_{i=1}^n e^{-x_i^2/2}=\left(\det_{i,j=1}^n H_{j-1}(x_i)e^{-x_i^2/4}\right)^2.

Again, \displaystyle \left(\det_{i,j=1}^n H_{j-1}(x_i)e^{-x_i^2/4}\right)^2 \propto \left(\det_{i,j=1}^n \psi_{j-1}(x_i)\right)^2=\det_{i,j=1}^n \sum_{k=0}^{n-1}  \psi_k(x_i)\psi_k(x_j)=\det_{i,j=1}^n K^{(n)}(x_i,x_j).

where the penultimate equality follows from the fact that |AB|=|A||B|.

Now, if we plug f_i=g_i=\psi_{i-1} into the lemma, we have

\displaystyle \begin{aligned} & \frac1{n!}\int\cdots\int \det_{i,j=1}^n K^{(n)}(x_i,x_j)\prod_{i=1}^n dx_i\\ & =\frac1{n!}\int\cdots\int \det_{i,j=1}^n \left(\sum_{k=1}^n \psi_{k-1}(x_i)\psi_{k-1}(x_j)\right)\prod_{i=1}^n dx_i \\ & = \det_{i,j=1}^n \int \psi_{i-1}(x)\psi_{j-1}(x)\,dx =\det_{i,j=1}^n \delta_{ij}=1\end{aligned}

Hence \frac1{n!} is the normalizing constant. This completes the proof of Theorem 2.

\square
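One way to convince oneself of Theorem 2 without redoing the algebra is to check, for a small n and a few random points, that |\Delta(x)|^2\prod_i e^{-x_i^2/2} and \det K^{(n)}(x_i,x_j) differ only by a constant factor (which must then be n!\,C_n^{(2)}). A sketch, with my own helper names and n = 4:

import math
import numpy as np
from itertools import combinations
from numpy.polynomial import hermite_e as He

def psi(k, x):
    c = np.zeros(k + 1); c[k] = 1.0
    return np.exp(-x**2 / 4) * He.hermeval(x, c) / np.sqrt(np.sqrt(2 * np.pi) * math.factorial(k))

def unnormalized_density(x):
    # |Delta(x)|^2 * prod_i exp(-x_i^2 / 2), the beta = 2 density of Theorem 1 up to constants
    vdm = np.prod([x[j] - x[i] for i, j in combinations(range(len(x)), 2)])
    return vdm**2 * np.exp(-0.5 * np.sum(x**2))

def det_kernel_density(x):
    # det_{k,l} K^{(n)}(x_k, x_l) with K^{(n)}(s,t) = sum_{m<n} psi_m(s) psi_m(t)
    n = len(x)
    K = sum(np.outer(psi(m, x), psi(m, x)) for m in range(n))
    return np.linalg.det(K)

rng = np.random.default_rng(3)
for _ in range(3):
    x = rng.standard_normal(4)
    # the ratio is the same constant for every x, namely 4! * C_4^{(2)}
    print(det_kernel_density(x) / unnormalized_density(x))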

Theorem 3: For any measurable subset A of \mathbb{R},

\displaystyle P\left(\bigcap_{i=1}^n \{\lambda_i \in A\}\right) =1+\sum_{k=1}^{\infty} \frac{(-1)^k}{k!}\int_{A^c}\cdots\int_{A^c} \det_{i,j=1}^k K^{(n)}(x_i,x_j)\prod_{i=1}^k dx_i

where the \lambda_i's are the unordered eigenvalues of a matrix X from the GUE. [Note that the event does not depend on whether the eigenvalues are taken ordered or unordered.]

Proof: By Theorem 2, we have

\displaystyle \begin{aligned} P_n^{(2)}\left(\bigcap_{i=1}^n \{\lambda_i \in A\}\right) & = \int_A\cdots \int_A \rho_n^{(2)}(x_1,x_2,\ldots,x_n)\prod_{i=1}^n\,dx_i \\ & = \frac1{n!}\int_A\cdots\int_A \det_{i,j=1}^n K^{(n)}(x_i,x_j) \prod_{i=1}^n dx_i\\ & = \frac1{n!}\int_A\cdots\int_A \det_{i,j=1}^n \left(\sum_{k=0}^{n-1} \psi_k(x_i)\psi_k(x_j)\right)\prod_{i=1}^n dx_i\\ & = \det_{i,j=0}^{n-1} \int_A \psi_i(x)\psi_j(x)\,dx\end{aligned}

The last equality above follows from Lemma 1, applied with all the integrals taken over A instead of \mathbb{R} (the proof goes through verbatim).

======================================================================
Let us now pause for a moment and digress to discuss a formula involving determinants. Consider a square matrix A=((a_{ij}))_{i,j=1}^n and suppose we want to write \det(I-A) as a polynomial in the a_{ij}'s, organized by increasing degree. Indeed, \det(I - A) is an inhomogeneous polynomial in the a_{ij}. The term of degree 0 comes from the degree-0 term in the product of the diagonal elements of I-A, so it is 1.

The term of degree 1 is obtained by taking the 1 from n-1 of the diagonal entries and multiplying by the variable a_{ii} in the remaining diagonal entry (which enters I-A with a minus sign), in all possible ways (there are \binom{n}{1} = n of them); this gives rise to

\displaystyle (-1)^1 \sum\limits_{1\leq v_1\leq n} a_{v_1 v_1} 

The term of degree 2 is obtained by taking n-2 diagonal 1's multiplied by the 2 \times 2 principal minor of -A on the remaining two rows/columns; there are \binom{n}{2} ways to do that, as many as there are 2 \times 2 minors centered on the diagonal; this gives rise to

\displaystyle (-1)^2 \sum\limits_{1\leq v_1<v_2\leq n} \det_{i,j=1}^2 a_{v_i v_j}

In general, the term of degree k is obtained by taking n-k diagonal 1's multiplied by the complementary k\times k principal minor of -A centered on the diagonal. This gives rise to

\displaystyle (-1)^k \sum\limits_{1\leq v_1<v_2<\cdots<v_k\leq n} \det_{i,j=1}^k a_{v_i v_j}

So we have the following formula

\displaystyle \det(I-A)=1+\sum_{k=1}^{n} (-1)^k\sum\limits_{1\leq v_1<v_2<\cdots<v_k\leq n}\det_{i,j=1}^k a_{v_i v_j}
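This expansion of \det(I-A) into sums of principal minors is easy to verify directly; a small sketch for a random 5x5 matrix:

import numpy as np
from itertools import combinations

rng = np.random.default_rng(4)
n = 5
A = rng.standard_normal((n, n))

# right-hand side: 1 + sum_k (-1)^k * (sum over all k x k principal minors of A)
rhs = 1.0
for k in range(1, n + 1):
    for idx in combinations(range(n), k):
        sub = A[np.ix_(idx, idx)]          # principal submatrix on rows/columns idx
        rhs += (-1) ** k * np.linalg.det(sub)

print(np.linalg.det(np.eye(n) - A), rhs)   # the two agree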

======================================================================

Coming back to the proof of the theorem, observe that, using the above formula, we have

\displaystyle \begin{aligned} & \det_{i,j=0}^{n-1} \int_{A} \psi_{i}(x)\psi_j(x)\,dx \\ & = \det_{i,j=0}^{n-1} \left(\delta_{ij}-\int_{A^c} \psi_i(x)\psi_j(x)\,dx\right)\\ & = 1+\sum_{k=1}^{n} (-1)^k \sum\limits_{0\leq v_1 <v_2<\cdots<v_k\leq n-1} \det_{i,j=1}^k \int_{A^c}\psi_{v_i}(x)\psi_{v_j}(x)\,dx\\ & = 1+\sum_{k=1}^n (-1)^k \sum\limits_{0\leq v_1<v_2<\cdots<v_k\leq n-1} \frac1{k!}\int_{A^c}\cdots\int_{A^c} \left(\det_{i,j=1}^k \psi_{v_i}(x_j)\right)^2\prod_{j=1}^k dx_j \end{aligned}

where the last equality follows again from Lemma 1.

\displaystyle \begin{aligned} & 1+\sum_{k=1}^n (-1)^k \sum\limits_{0\leq v_1<v_2<\cdots<v_k\leq n-1} \frac1{k!}\int_{A^c}\cdots\int_{A^c} \left(\det_{i,j=1}^k \psi_{v_i}(x_j)\right)^2\prod_{j=1}^k dx_j \\ & = 1+\sum_{k=1}^n \frac{(-1)^k}{k!} \int_{A^c}\cdots\int_{A^c} \sum\limits_{0\leq v_1<v_2<\cdots<v_k\leq n-1} \left(\det_{i,j=1}^k \psi_{v_i}(x_j)\right)^2\prod_{j=1}^k dx_j \end{aligned}

Note that ((K^{(n)}(x_i,x_j)))_{i,j=1}^k=((\psi_{j-1}(x_i)))_{\stackrel{i=1,\ldots,k}{j=1,\ldots,n}}\,((\psi_{i-1}(x_j)))_{\stackrel{i=1,\ldots,n}{j=1,\ldots,k}}. The expression inside the integral above is just the Cauchy-Binet expansion of \displaystyle \det_{i,j=1}^k K^{(n)}(x_i,x_j). Hence we have

\displaystyle \begin{aligned}  & 1+\sum_{k=1}^n \frac{(-1)^k}{k!} \int_{A^c}\cdots\int_{A^c} \sum\limits_{0\leq v_1<v_2<\cdots<v_k\leq n-1} \left(\det_{i,j=1}^k \psi_{v_i}(x_j)\right)^2\prod_{j=1}^k dx_j \\ & = 1+\sum_{k=1}^n \frac{(-1)^k}{k!} \int_{A^c}\cdots\int_{A^c} \det_{i,j=1}^k K^{(n)}(x_i,x_j)\prod_{j=1}^k dx_j \\ & = 1+\sum_{k=1}^{\infty} \frac{(-1)^k}{k!} \int_{A^c}\cdots\int_{A^c} \det_{i,j=1}^k K^{(n)}(x_i,x_j)\prod_{j=1}^k dx_j\end{aligned}

where the sum can be taken to infinity because the matrix ((K^{(n)}(x_i,x_j)))_{i,j=1}^k is of rank at most n. Hence the determinant is zero for k>n. This completes the proof.

\square
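For a small n, Theorem 3 can also be checked numerically. The sketch below takes n = 3 and A = (-\infty, t], evaluates the equivalent Gram-determinant form \det_{i,j=0}^{n-1}\left(\delta_{ij}-\int_{A^c}\psi_i\psi_j\,dx\right) obtained in the proof (which equals the series by the minor expansion above), and compares it with a Monte Carlo estimate of P(\lambda_{\max}\le t) over GUE samples; the values of n, t and the grid are arbitrary choices of mine.

import math
import numpy as np
from numpy.polynomial import hermite_e as He

def psi(k, x):
    c = np.zeros(k + 1); c[k] = 1.0
    return np.exp(-x**2 / 4) * He.hermeval(x, c) / np.sqrt(np.sqrt(2 * np.pi) * math.factorial(k))

n, t = 3, 2.0                     # A = (-infty, t], so the event is {lambda_max <= t}

# Gram-determinant form: det_{i,j=0}^{n-1} ( delta_ij - int_t^infty psi_i psi_j dx )
x = np.linspace(t, t + 30, 100001)
dx = x[1] - x[0]
P = np.array([psi(k, x) for k in range(n)])
M = np.eye(n) - (P * dx) @ P.T
print("determinant formula :", np.linalg.det(M))

# Monte Carlo estimate of P(lambda_max <= t) from GUE samples
rng = np.random.default_rng(5)
trials, hits = 20000, 0
for _ in range(trials):
    U = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    Y = (U + U.conj().T) / np.sqrt(2)
    hits += int(np.max(np.linalg.eigvalsh(Y)) <= t)
print("Monte Carlo estimate:", hits / trials)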

Why is this nasty expression in Theorem 3 useful at all? It turns out that the right-hand side of Theorem 3 is very closely related to the Fredholm determinant of K^{(n)}, which has been studied extensively. We will look into some of the key properties of Fredholm determinants in the next post.
