Theorem 2. Given a connected graph G = (N,E), partition its nodes into N- and N+ using the spectral bisection algorithm. Then N- is connected. If no v(2)(n) = 0, N+ is also connected.
To prove this theorem, we need several other standard results from linear algebra, some of which we state without proof.
Definition. The spectral radius rho(A) of a matrix A is the largest absolute value of any eigenvalue:
rho(A) = max_i | lambda_i (A) |
Definition. A nonnegative matrix A is a matrix all of whose entries are nonnegative. This is written A >= 0. A positive matrix A is a matrix all of whose entries are positive. We also refer to nonnegative and positive vectors, with similar notation.
Definition. The graph G(A) of an n-by-n matrix A is a graph with n nodes, and an edge e=(i,j) if and only if A(i,j) != 0.
Lemma 1. Let A be an n-by-n nonnegative matrix, and suppose G(A) is connected. Then sum_{m=0}^{n-1} A^m is a positive matrix.
Proof of Lemma 1. The (i,j) entry of A^m is a sum of many terms of the form
A(i,k(1)) * A(k(1),k(2)) * A(k(2),k(3)) *...* A(k(m-2),k(m-1)) * A(k(m-1),j)
where the sum is over all n^(m-1) combinations 1 <= k(q) <= n, 1 <= q <= m-1.
Each such term is nonnegative since A is nonnegative.
Consider m=2. Then A(i,k)*A(k,j) will be
positive if there is a path from i to j in G(A) of length 2,
namely a path through k. Similarly, (A^m)(i,j) will be positive if there
is a path of length m connecting i and j. If G(A) is connected, then there
is a path of length at most n-1 connecting every pair of distinct nodes,
and the m=0 term A^0 = I makes the diagonal entries positive. The statement
of the lemma follows. QED
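Lemma 1 is easy to check numerically on a small case. The sketch below is only an illustration, not part of the proof: the helper names matmul and power_sum and the two 4-node test graphs are chosen here for the example, and nodes are numbered from 0 rather than 1. It forms I + A + ... + A^(n-1) for a connected path graph and for the same graph with its middle edge removed.

```python
def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def power_sum(A):
    """Return I + A + A^2 + ... + A^(n-1) for an n-by-n matrix A."""
    n = len(A)
    power = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # A^0
    total = [row[:] for row in power]
    for _ in range(n - 1):
        power = matmul(power, A)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


# Adjacency matrix of the path graph 0 - 1 - 2 - 3 (nonnegative, connected).
path = [[0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0]]

# Same nodes with the middle edge removed, so the graph is disconnected.
split = [[0, 1, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 0]]

path_sum = power_sum(path)
split_sum = power_sum(split)
path_positive = all(x > 0 for row in path_sum for x in row)
split_positive = all(x > 0 for row in split_sum for x in row)
print(path_positive, split_positive)
```

In the connected case every entry is positive; in the disconnected case the entries linking the two halves stay zero, which shows that the connectivity hypothesis cannot be dropped.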
Definition. A symmetric matrix with all nonnegative eigenvalues is called positive semidefinite. If the eigenvalues are all positive, it is called positive definite.
Lemma 2. If A is n-by-n and symmetric with eigenvalues lambda(1) <= ... <= lambda(n), then
lambda(1) = min_{v != 0} v'*A*v / v'*v
lambda(n) = max_{v != 0} v'*A*v / v'*v
Proof of Lemma 2. It follows simply from the eigendecomposition A = Q*Lambda*Q', where Q is an orthogonal matrix whose columns are eigenvectors, and Lambda = diag(lambda(1),...,lambda(n)), using the substitution y = Q'*v:
v'*A*v / v'*v = v'*Q*Lambda*Q'*v / v'*Q*Q'*v = y'*Lambda*y / y'*y
= sum_{i=1}^n lambda(i)*y(i)^2 / sum_{i=1}^n y(i)^2
Details are left to the reader.
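As a sanity check of Lemma 2, the following sketch samples the quotient v'*A*v / v'*v for a 2-by-2 symmetric matrix whose eigenvalues (1 and 3) are known by hand. The matrix, the helper name rayleigh, and the sample count are choices made for this illustration only.

```python
import random


def rayleigh(A, v):
    """Return v'*A*v / v'*v for a square matrix A and a nonzero vector v."""
    n = len(v)
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(v[i] * Av[i] for i in range(n)) / sum(x * x for x in v)


A = [[2.0, 1.0],
     [1.0, 2.0]]   # symmetric, with eigenvalues 1 and 3

random.seed(0)
samples = []
for _ in range(1000):
    v = [random.uniform(-1, 1), random.uniform(-1, 1)]
    if v[0] == 0.0 and v[1] == 0.0:
        continue   # the quotient is undefined at v = 0
    samples.append(rayleigh(A, v))

q_low = rayleigh(A, [1.0, -1.0])   # eigenvector for lambda(1) = 1
q_high = rayleigh(A, [1.0, 1.0])   # eigenvector for lambda(n) = 3
print(min(samples), max(samples), q_low, q_high)
```

Every sampled quotient lies between the two eigenvalues, and the bounds are attained exactly at the corresponding eigenvectors.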
Cauchy Interlace Theorem (R. Horn and C. Johnson, "Matrix Analysis", 1988). Let A be an n-by-n symmetric matrix with eigenvalues lambda(1) <= ... <= lambda(n). Let B = A(1:n-1,1:n-1), the leading (n-1)-by-(n-1) submatrix of A. Let the eigenvalues of B be mu(1) <= ... <= mu(n-1). Then for all i, lambda(i) <= mu(i) <= lambda(i+1). Applying this result recursively, we can show that if C = A(i:j, i:j) for any i and j, and the eigenvalues of C are xi(1) <= ... <= xi(j-i+1), then A has at least k eigenvalues <= xi(k). In particular lambda(1) <= xi(1).
Corollary to the Cauchy Interlace Theorem. Let the symmetric matrix A be positive (semi)definite. Then any submatrix C=A(i:j,i:j) is also positive (semi)definite.
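The interlacing inequalities can be seen on a small example. The sketch below uses the 3-by-3 Laplacian of a path graph, whose eigenvalues 0, 1, 3 are supplied by hand and confirmed as roots of det(A - t*I), together with its leading 2-by-2 submatrix. The matrix and the helper names det3 and sym2_eigs are assumptions made for this illustration.

```python
import math


def det3(M):
    """Determinant of a 3-by-3 matrix by cofactor expansion along row 0."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def sym2_eigs(a, b, d):
    """Eigenvalues, in ascending order, of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return [mean - radius, mean + radius]


A = [[1.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]
lam = [0.0, 1.0, 3.0]   # claimed eigenvalues of A, checked just below

# Each claimed eigenvalue t should make det(A - t*I) vanish.
residuals = [
    det3([[A[i][j] - (t if i == j else 0.0) for j in range(3)]
          for i in range(3)])
    for t in lam
]

# Eigenvalues mu(1) <= mu(2) of the leading 2-by-2 submatrix B = A(1:2,1:2).
mu = sym2_eigs(A[0][0], A[0][1], A[1][1])

interlaced = all(lam[i] <= mu[i] <= lam[i + 1] for i in range(2))
print(residuals, mu, interlaced)
```

Here mu(1) and mu(2) are (3 - sqrt(5))/2 and (3 + sqrt(5))/2, which fall between consecutive eigenvalues of A as the theorem requires.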
Lemma 3. If A is symmetric and positive (semi)definite, so is X'*A*X for any nonsingular matrix X.
Proof of Lemma 3. From Lemma 2, the smallest eigenvalue of X'*A*X is
min_{v != 0} v'*X'*A*X*v / v'*v
= min_{v != 0} ( v'*X'*A*X*v / v'*X'*X*v ) * ( v'*X'*X*v / v'*v )
>= min_{v != 0} ( v'*X'*A*X*v / v'*X'*X*v ) * min _{v != 0 } ( v'*X'*X*v / v'*v )
= min_{Xv != 0} ( v'*X'*A*X*v / v'*X'*X*v ) * min _{v != 0 } ( v'*X'*X*v / v'*v )
= lambda(1)(A) * lambda(1)(X'*X)
Since v'*X'*X*v = (X*v)'*(X*v) is a sum of squares, it is nonnegative.
Thus lambda(1)(X'*X) >= 0.
Since X is nonsingular, so is X'*X, so it can't have a zero eigenvalue.
Thus lambda(1)(X'*X) > 0. The result follows. QED
Lemma 4. If A is a symmetric matrix with rho(A) < 1, then I-A is invertible and
inv(I-A) = sum_{i=0}^{infinity} A^i
Proof of Lemma 4. Since the eigenvalues of A are strictly between -1 and 1, the eigenvalues of I-A are strictly between 0 and 2, so I-A is positive definite and so nonsingular. Writing the eigendecomposition A = Q*Lambda*Q', we see that A^i = Q*Lambda^i*Q', so the entries of A^i go to zero geometrically, like rho(A)^i or faster. Thus sum_{i=0}^{infinity} A^i converges. Since
(I-A) * sum_{i=0}^m A^i = I - A^{m+1}
it is easy to see that S(m) = sum_{i=0}^m A^i converges to inv(I-A),
since (I-A)*S(m) converges to I. QED
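The convergence of the partial sums S(m) can be observed directly. The sketch below uses a 2-by-2 symmetric matrix with rho(A) of roughly 0.36 and compares S(m) with inv(I-A) written out in closed form. The matrix and the helper names matmul and neumann_partial_sum are chosen for this illustration only.

```python
def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neumann_partial_sum(A, m):
    """Return S(m) = sum_{i=0}^m A^i."""
    n = len(A)
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    total = [row[:] for row in power]
    for _ in range(m):
        power = matmul(power, A)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total


A = [[0.2, 0.1],
     [0.1, 0.3]]   # symmetric; eigenvalues about 0.138 and 0.362

# inv(I - A) for this 2-by-2 case: det(I - A) = 0.8*0.7 - 0.1*0.1 = 0.55.
exact = [[0.7 / 0.55, 0.1 / 0.55],
         [0.1 / 0.55, 0.8 / 0.55]]

errors = []
for m in (1, 5, 20, 60):
    S = neumann_partial_sum(A, m)
    errors.append(max(abs(S[i][j] - exact[i][j])
                      for i in range(2) for j in range(2)))
print(errors)
```

The error shrinks geometrically with m, consistent with the entries of A^i decaying like rho(A)^i.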
Partial proof of Theorem 2. (M. Fiedler, "A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory", Czech. Math. J. 25:619--637, 1975.) We consider the special (but generic) case where v(2) is unique (modulo multiplication by a scalar) and v(2) has only nonzero entries. We will use proof by contradiction: Assume that N+ is not connected, and in fact consists of k connected components. Suppose for illustration that k=2 (the general case is no harder). Then we can renumber the rows and columns of A so that
             n1    n2    n3
          [ A11    0    A13 ] n1              [ v1 ] n1
    A  =  [  0    A22   A23 ] n2  ,  v(2)  =  [ v2 ] n2
          [ A13'  A23'  A33 ] n3              [ v3 ] n3
where v1 > 0, v2 > 0 and v3 < 0. The two zero blocks in A occur
because there are no edges connecting the first n1 nodes (the first
connected component of N+) and the following n2 nodes (the second
connected component of N+). Then A*v(2) = lambda(2)*v(2) implies
A11*v1 + A13*v3 = lambda(2)*v1 (1)
Note that A13 <= 0, and v3 < 0, so each term in the product A13*v3 is nonnegative and thus A13*v3 >= 0. In fact A13*v3 is nonzero, since otherwise A13 would have to be zero, and so the first n1 nodes alone would form a connected component of G, contradicting our assumption that G is connected.
By the Corollary above, A11 is positive semidefinite since A is. Now let eps be any positive number. Then adding eps*v1 to both sides of (1) yields
(eps*I + A11)*v1 + A13*v3 = (eps+lambda(2))*v1 (2)
The eigenvalues of eps*I + A11 are all at least eps, so eps*I + A11 is
positive definite.
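Write eps*I + A11 = D - N, where D is diagonal, and N >= 0 is zero on the
diagonal (-N holds all the offdiagonal entries of eps*I + A11, which are
nonpositive because they are offdiagonal entries of A). The diagonal entries
of D are positive, since each is eps plus a diagonal entry of the positive
semidefinite matrix A11, so D^(1/2) below is well defined and nonsingular.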
Now
eps*I + A11 = D - N
= Dh * ( I - inv(Dh)*N*inv(Dh) ) * Dh
where Dh = D^(1/2) = diag(sqrt(D(1,1)),...,sqrt(D(n1,n1)))
= Dh * (I-M) * Dh
where M = inv(Dh)*N*inv(Dh)
By Lemma 3, I-M is positive definite since D-N is positive definite
and Dh is nonsingular.
Since the eigenvalues of I-M are 1 minus the eigenvalues of M,
the eigenvalues of M must be less than 1. All the
eigenvalues of M must also be greater than -1, because by Lemma 2,
lambda(1)(M) = min_{v != 0} v'*M*v / v'*v
>= min_{v != 0} -|v|'*M*|v| / v'*v
since M >= 0
= -max_{v != 0} |v|'*M*|v| / v'*v
>= -max_{v != 0} v'*M*v / v'*v
= -lambda(n1)(M)
> -1
Thus | lambda(j)(M) | <= rho(M) < 1 for all j. By Lemma 4,
Y = inv(eps*I + A11) = inv(Dh) * inv(I-M) * inv(Dh)
= inv(Dh) * ( sum_{i=0}^infinity M^i ) * inv(Dh)
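is nonnegative, since M and every M^i are nonnegative. Moreover G(M) is connected, since it has the same edges as the first connected component of N+, so by Lemma 1 the partial sum sum_{i=0}^{n1-1} M^i is already positive. Since inv(Dh) is diagonal with positive diagonal entries, Y is positive.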
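The construction of Y can be traced on a small block. The sketch below takes an illustrative 2-by-2 matrix A11 with nonpositive offdiagonal entries (the leading block of a path-graph Laplacian, an assumption made for this example) and eps = 0.1, forms the splitting D - N and the matrix M, sums the series, and checks that the result is the inverse of eps*I + A11 and is entrywise positive.

```python
import math


def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


eps = 0.1
A11 = [[1.0, -1.0],
       [-1.0, 2.0]]   # illustrative block with nonpositive offdiagonal entries
n = len(A11)

B = [[A11[i][j] + (eps if i == j else 0.0) for j in range(n)]
     for i in range(n)]   # B = eps*I + A11

# Split B = D - N and form M = inv(Dh) * N * inv(Dh), with Dh = D^(1/2).
d = [B[i][i] for i in range(n)]
N = [[0.0 if i == j else -B[i][j] for j in range(n)] for i in range(n)]
M = [[N[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)] for i in range(n)]

# Sum the series sum_i M^i; 200 terms is ample here since rho(M) is about 0.66.
power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
series = [row[:] for row in power]
for _ in range(200):
    power = matmul(power, M)
    series = [[series[i][j] + power[i][j] for j in range(n)]
              for i in range(n)]

# Y = inv(Dh) * (sum of the series) * inv(Dh).
Y = [[series[i][j] / math.sqrt(d[i] * d[j]) for j in range(n)]
     for i in range(n)]

# Y should invert B and be entrywise positive.
residual = matmul(B, Y)
max_error = max(abs(residual[i][j] - (1.0 if i == j else 0.0))
                for i in range(n) for j in range(n))
all_positive = all(Y[i][j] > 0 for i in range(n) for j in range(n))
print(Y, max_error, all_positive)
```

For this block the exact inverse is [[2.1, 1], [1, 1.1]] / 1.31, and the series reproduces it to rounding error.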
Multiplying equation (2) by Y yields
v1 + Y*A13*v3 = Y*(eps+lambda(2))*v1
Multiplying by v1' yields
v1'*v1 + v1'*Y*A13*v3 = (eps+lambda(2)) * v1'*Y*v1
so by Lemma 2
(eps+lambda(2)) * lambda(n1)(Y)
= max_{v != 0} (eps+lambda(2)) * v'*Y*v / v'*v
>= (eps+lambda(2))* v1'*Y*v1 / v1'*v1
= (v1'*v1 + v1'*Y*A13*v3) / v1'*v1
= 1 + v1'*Y*A13*v3 / v1'*v1
As stated above, A13*v3 >= 0 and is nonzero. Since Y>0, Y*A13*v3 > 0,
and so v1'*Y*A13*v3 > 0. Thus
(eps+lambda(2)) * lambda(n1)(Y) > 1
Since the eigenvalues of Y are positive and the reciprocals of the eigenvalues of eps*I + A11, we get
(eps+lambda(2)) / lambda(1)(eps*I + A11) > 1
Since lambda(1)(eps*I + A11) = eps + lambda(1)(A11), we can rearrange to get
lambda(1)(A11) < lambda(2)
The same logic applies to A22, so lambda(1)(A22) < lambda(2). Thus, the leading (n1+n2)-by-(n1+n2) submatrix of A,
[ A11 0 ]
[ 0 A22 ]
has two eigenvalues less than lambda(2). By the Cauchy Interlace Theorem,
this means A has two eigenvalues less than lambda(2). But this contradicts
the fact that lambda(2) is the second smallest eigenvalue of A. This
contradiction proves the theorem. QED
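The statement of Theorem 2 can be observed on a small graph. The sketch below is only an illustration under stated assumptions. It takes A to be the graph Laplacian (degrees on the diagonal, -1 for each edge), takes N- and N+ to be the nodes where v(2) is negative and positive respectively, and numbers nodes from 0. The Fiedler vector is approximated by power iteration on c*I - A with the constant vector projected out, a technique swapped in here to avoid an external eigensolver; it assumes lambda(2) is a simple eigenvalue. The test graph (two triangles joined by one edge) and all function names are hypothetical.

```python
import math
from collections import deque


def laplacian(n, edges):
    """Graph Laplacian: degree on the diagonal, -1 for every edge."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def fiedler_vector(L, iterations=500):
    """Approximate v(2) by power iteration on c*I - L, deflating the constant vector."""
    n = len(L)
    c = 2.0 * max(L[i][i] for i in range(n))   # c >= lambda(n) by Gershgorin
    v = [float(i) for i in range(n)]           # arbitrary non-constant start
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]              # remove the lambda(1) = 0 component
        w = [c * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def is_connected(nodes, edges):
    """Breadth-first search on the subgraph induced by the given nodes."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {u: [] for u in nodes}
    for i, j in edges:
        if i in nodes and j in nodes:
            adj[i].append(j)
            adj[j].append(i)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == nodes


# Two triangles {0, 1, 2} and {3, 4, 5} joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
L = laplacian(6, edges)
v2 = fiedler_vector(L)
n_minus = [i for i in range(6) if v2[i] < 0]
n_plus = [i for i in range(6) if v2[i] > 0]
print(n_minus, is_connected(n_minus, edges))
print(n_plus, is_connected(n_plus, edges))
```

The sign pattern of v(2) separates the two triangles, and both parts are connected, as the theorem predicts when v(2) has no zero entries.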