Real Quadratic Forms

Definition

A homogeneous polynomial of degree two in $n$ variables $x_1,\, x_2,\, \cdots,\, x_n$ with coefficients in some number field,

$$f(x_1,\, x_2,\, \cdots,\, x_n) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j \qquad (a_{ij} = a_{ji})$$

is called a quadratic form. If all $a_{ij} \in \mathbb{R}$, it is called a real quadratic form; if all $a_{ij} \in \mathbb{C}$, it is called a complex quadratic form.

Separating the square terms from the cross terms:

$$\begin{aligned} f(x_1,\, x_2,\, \cdots,\, x_n) &= \sum_{i=1}^n \sum_{j=1}^n a_{ij} x_i x_j \\ &= \sum_{i=1}^n a_{ii} x_i^2 + \sum_{i < j}(a_{ij} + a_{ji}) x_i x_j \\ &= \sum_{i=1}^n a_{ii} \underbrace{x_i^2}_{\text{square term}} + 2 \sum_{i < j} a_{ij} \underbrace{x_i x_j}_{\text{cross term}} \end{aligned}$$

Matrix representation of the quadratic form $f$:

$$\begin{aligned} f(x_1, x_2, \cdots, x_n) &= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}\\ &= \bm{x}^\intercal \bm{A} \bm{x} \end{aligned}$$

where $\bm{A}$ is a symmetric matrix, called the matrix of the quadratic form $f$. The rank of $\bm{A}$ is called the rank of $f$, that is, $\rank(f) \coloneqq \rank(\bm{A})$.
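As a small illustration of how the polynomial and its symmetric matrix correspond, the sketch below builds $\bm{A}$ from polynomial coefficients by splitting each cross-term coefficient evenly between $a_{ij}$ and $a_{ji}$, then evaluates $\bm{x}^\intercal \bm{A} \bm{x}$. The function names and the dictionary input format are assumptions made for this example, not part of the notes.

```python
from fractions import Fraction


def quadratic_form_matrix(n, coeffs):
    """Build the symmetric matrix A of a quadratic form in n variables.

    `coeffs` (hypothetical input format) maps 0-based index pairs (i, j)
    to the coefficient of x_i * x_j in the expanded polynomial.
    """
    A = [[Fraction(0)] * n for _ in range(n)]
    for (i, j), c in coeffs.items():
        if i == j:
            A[i][i] += Fraction(c)        # square term: a_ii
        else:
            A[i][j] += Fraction(c) / 2    # cross term is split evenly
            A[j][i] += Fraction(c) / 2    # so that a_ij = a_ji
    return A


def evaluate(A, x):
    """Return x^T A x."""
    n = len(A)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# f(x1, x2) = x1^2 + 4*x1*x2 + 3*x2^2
A = quadratic_form_matrix(2, {(0, 0): 1, (0, 1): 4, (1, 1): 3})
```

Here the cross-term coefficient 4 becomes $a_{12} = a_{21} = 2$, so `A` is the matrix with rows (1, 2) and (2, 3).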

Standard Form of a Real Quadratic Form

To simplify a quadratic form (keep only the square terms and eliminate the cross terms), we change variables:

$$\left\lbrace\begin{aligned} x_1 &= c_{11} y_1 + c_{12} y_2 + \cdots + c_{1n} y_n \\ x_2 &= c_{21} y_1 + c_{22} y_2 + \cdots + c_{2n} y_n \\ &\kern{0.25em}\vdots \\ x_n &= c_{n1} y_1 + c_{n2} y_2 + \cdots + c_{nn} y_n \end{aligned}\right.$$

Let $\bm{P} = [c_{ij}]_n$ be an invertible matrix of order $n$, and consider the nondegenerate linear transformation (also called a nonsingular linear transformation, meaning that $\bm{P}$ is invertible, so that $\bm{y} = \bm{P}^{-1} \bm{x}$)

$$\bm{x} = \bm{P} \bm{y}$$

where $\bm{y} = \begin{bmatrix} y_1 & y_2 & \cdots & y_n \end{bmatrix}^\intercal$ is the vector of new variables. Then

$$\begin{aligned} f(\bm{x}) = \bm{x}^\intercal \bm{A} \bm{x} &\xlongequal{\bm{x} = \bm{P} \bm{y}} (\bm{P} \bm{y})^\intercal \bm{A} (\bm{P} \bm{y}) \\ &= \bm{y}^\intercal (\bm{P}^\intercal \bm{A} \bm{P}) \bm{y} \\ &= \bm{y}^\intercal \bm{B} \bm{y} \end{aligned}$$

Let $\bm{A},\, \bm{B}$ be square matrices of the same order. If there exists an invertible matrix $\bm{P}$ such that

$$\bm{B} = \bm{P}^\intercal \bm{A} \bm{P}$$

then $\bm{A}$ and $\bm{B}$ are said to be congruent (or $\bm{A}$ is congruent to $\bm{B}$); $\bm{B}$ is called a congruent matrix of $\bm{A}$, and $\bm{P}$ is the congruence transformation matrix from $\bm{A}$ to $\bm{B}$. Clearly $\bm{B}$ is also congruent to $\bm{A}$, because

$$\bm{A} = \left(\bm{P}^{-1}\right)^\intercal \bm{B} \bm{P}^{-1}$$

Matrix relations covered so far:
  1. Equivalent matrices: $\bm{B} = \bm{P} \bm{A} \bm{Q}$ ($\bm{A},\, \bm{B}$ are $m \times n$ matrices; $\bm{P},\, \bm{Q}$ are invertible matrices of order $m$ and $n$ respectively)
  2. Similar matrices: $\bm{B} = \bm{P}^{-1} \bm{A} \bm{P}$ ($\bm{A},\, \bm{B}$ are square matrices of order $n$; $\bm{P}$ is an invertible matrix of order $n$)
  3. Congruent matrices: $\bm{B} = \bm{P}^\intercal \bm{A} \bm{P}$ ($\bm{A},\, \bm{B}$ are square matrices of order $n$; $\bm{P}$ is an invertible matrix of order $n$)

A quadratic form that contains only square terms is said to be in standard form, that is,

$$f(x_1,\, x_2,\, \cdots,\, x_n) = \sum_{i=1}^n d_{ii} x_i^2$$

Let $f(x_1,\, x_2,\, \cdots,\, x_n) = \bm{x}^\intercal \bm{A} \bm{x}$ be a real quadratic form. Then there exists a nondegenerate linear transformation $\bm{x} = \bm{P} \bm{y}$ that brings $f$ to standard form.

This is equivalent to showing that some invertible matrix carries the real symmetric matrix $\bm{A}$ to a real diagonal matrix by congruence.

Since $\bm{A}$ is a real symmetric matrix, there exists an orthogonal matrix $\bm{P}$ such that

$$\begin{aligned} \bm{P}^\intercal \bm{A} \bm{P} &= \bm{P}^{-1} \bm{A} \bm{P} \\ &= \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \end{aligned}$$

Taking the orthogonal transformation $\bm{x} = \bm{P} \bm{y}$ then gives

$$\begin{aligned} f(\bm{x}) &= \bm{x}^\intercal \bm{A} \bm{x} \\ &= (\bm{P} \bm{y})^\intercal \bm{A} (\bm{P} \bm{y}) \\ &= \bm{y}^\intercal (\bm{P}^\intercal \bm{A} \bm{P}) \bm{y} \\ &= \bm{y}^\intercal \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \bm{y} \\ &= \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 \end{aligned}$$

The standard form is clearly not unique: for example, rescaling a variable $y_i$ changes its coefficient.

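Methods for Reducing to Standard Form

Orthogonal transformation method (eigenvalue method)

  1. Find the eigenvalues $\lambda_1,\, \lambda_2,\, \cdots,\, \lambda_n$ of the matrix $\bm{A}$ of the real quadratic form, together with corresponding eigenvectors $\bm{\alpha}_1,\, \bm{\alpha}_2,\, \cdots,\, \bm{\alpha}_n$. (If the eigenvectors belonging to one eigenvalue are not orthogonal, orthogonalize them with the Gram-Schmidt process; a nonzero linear combination of eigenvectors for the same eigenvalue is still an eigenvector for it. Eigenvectors belonging to different eigenvalues are already orthogonal, as proved in Note 5.) Finally normalize every eigenvector to unit length. [1]

  2. Take the orthogonal matrix $\bm{P} = \begin{bmatrix} \bm{\alpha}_1 & \bm{\alpha}_2 & \cdots & \bm{\alpha}_n \end{bmatrix}$; then

    $$\bm{P}^\intercal \bm{A} \bm{P} = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix}$$

  3. Apply the orthogonal transformation $\bm{x} = \bm{P} \bm{y}$.


  [1] In practice, a $3 \times 3$ matrix may have only two distinct eigenvalues. After computing the eigenvectors of the repeated eigenvalue, take their cross product to obtain the third orthogonal vector directly, instead of going back to solve another linear system. This uses the fact that eigenvectors of a real symmetric matrix belonging to different eigenvalues are orthogonal.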
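As a worked example of the three steps (the form below is chosen here purely for illustration), take $f(x_1, x_2) = 2x_1^2 + 2x_1 x_2 + 2x_2^2$:

$$\bm{A} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix},\qquad \left\lvert \lambda \bm{E} - \bm{A} \right\rvert = (\lambda - 2)^2 - 1 = (\lambda - 3)(\lambda - 1)$$

so $\lambda_1 = 3,\, \lambda_2 = 1$, with unit eigenvectors

$$\bm{\alpha}_1 = \dfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix},\qquad \bm{\alpha}_2 = \dfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}$$

Taking $\bm{P} = \begin{bmatrix} \bm{\alpha}_1 & \bm{\alpha}_2 \end{bmatrix}$ and $\bm{x} = \bm{P} \bm{y}$ gives

$$\bm{P}^\intercal \bm{A} \bm{P} = \begin{bmatrix} 3 & \\ & 1 \end{bmatrix},\qquad f = 3 y_1^2 + y_2^2$$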

Completing the square

  1. If the quadratic form contains a square term, say $a_{11} x_1^2$ with $a_{11} \ne 0$, collect all terms containing $x_1$ and complete the square, that is, let

$$\left\lbrace\begin{aligned} y_1 &= x_1 + \dfrac{a_{12}}{a_{11}} x_2 + \cdots + \dfrac{a_{1n}}{a_{11}} x_n \\ y_2 &= x_2 \\ &\kern{0.25em}\vdots \\ y_n &= x_n \end{aligned}\right.$$

Then

$$f = a_{11} y_1^2 + f_1(y_2,\, y_3,\, \cdots,\, y_n)$$

where $f_1$ is a quadratic form in $y_2,\, \cdots,\, y_n$ only.

  2. If the quadratic form contains no square term, pick a nonzero cross term, say $2a_{12} x_1 x_2$, and let

$$\left\lbrace\begin{aligned} x_1 &= y_1 + y_2\\ x_2 &= y_1 - y_2 \\ x_3 &= y_3 \\ &\kern{0.25em}\vdots \\ x_n &= y_n \end{aligned}\right.$$

Then

$$f = 2a_{12} y_1^2 - 2a_{12} y_2^2 + F(y_1, y_2, y_3,\, y_4,\, \cdots,\, y_n)$$

where $F$ collects the remaining cross terms, so step 1 now applies.

  3. Repeat the steps above until the form is in standard form.
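Two short worked examples of the steps above, with forms chosen here for illustration. With a square term present, $f = x_1^2 + 4x_1 x_2 + 3x_2^2$ has $a_{11} = 1,\, a_{12} = 2$, so

$$\begin{aligned} f &= (x_1 + 2x_2)^2 - 4x_2^2 + 3x_2^2 \\ &= (x_1 + 2x_2)^2 - x_2^2 \\ &= y_1^2 - y_2^2 \qquad (y_1 = x_1 + 2x_2,\; y_2 = x_2) \end{aligned}$$

With no square term, $f = 2x_1 x_2$ and the substitution $x_1 = y_1 + y_2,\, x_2 = y_1 - y_2$ give

$$f = 2(y_1 + y_2)(y_1 - y_2) = 2y_1^2 - 2y_2^2$$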

Congruence transformation method

  1. From the matrix $\bm{A}$ of the quadratic form, build the $2n \times n$ matrix $\bm{B} = \begin{bmatrix} \bm{A} \\ \bm{E} \end{bmatrix}$.
  2. Apply an elementary column operation to $\bm{B}$, then apply the elementary row operation of the same type to the upper block only, that is, $\begin{bmatrix} \bm{P}_1^\intercal & \\ & \bm{E} \end{bmatrix} \begin{bmatrix} \bm{A} \\ \bm{E} \end{bmatrix} \bm{P}_1 = \begin{bmatrix} \bm{P}_1^\intercal \bm{A} \bm{P}_1 \\ \bm{P}_1 \end{bmatrix}$. Repeat until the matrix has the form $\begin{bmatrix} \bm{\Lambda} \\ \bm{P} \end{bmatrix}$, where $\bm{\Lambda}$ is a diagonal matrix. (This works because $\bm{P}^\intercal \bm{A} \bm{P} = \bm{\Lambda}$ with $\bm{P} = \bm{P}_1 \bm{P}_2 \cdots \bm{P}_{k}$.)
  3. Apply the linear transformation $\bm{x} = \bm{P} \bm{y}$; the original quadratic form becomes the standard form $\bm{y}^\intercal \bm{\Lambda} \bm{y}$.
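The procedure above can be sketched in code. The following is a minimal illustration, assuming exact rational arithmetic via `fractions.Fraction`; the function name `congruence_diagonalize` and the pivoting choices (swap in a later nonzero diagonal entry, otherwise add a column to create one) are assumptions of this sketch rather than something prescribed by the notes.

```python
from fractions import Fraction


def congruence_diagonalize(A):
    """Return (D, P) with P invertible and P^T A P = D diagonal.

    A is a symmetric matrix given as a list of rows. Every elementary
    column operation is applied to both blocks of [A; E]; the matching
    row operation is applied to the upper block only.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] for row in A]                      # upper block
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # lower block

    def add_col(src, dst, c):
        # column dst += c * column src (both blocks) ...
        for i in range(n):
            M[i][dst] += c * M[i][src]
            P[i][dst] += c * P[i][src]
        # ... then row dst += c * row src (upper block only)
        for j in range(n):
            M[dst][j] += c * M[src][j]

    def swap(a, b):
        for i in range(n):
            M[i][a], M[i][b] = M[i][b], M[i][a]
            P[i][a], P[i][b] = P[i][b], P[i][a]
        M[a], M[b] = M[b], M[a]

    for k in range(n):
        if M[k][k] == 0:
            # Prefer a later nonzero diagonal entry as the pivot.
            j = next((j for j in range(k + 1, n) if M[j][j] != 0), None)
            if j is not None:
                swap(k, j)
            else:
                # No square term left: use a cross term to create one.
                j = next((j for j in range(k + 1, n) if M[k][j] != 0), None)
                if j is None:
                    continue  # row and column k are already zero
                add_col(j, k, Fraction(1))
        for j in range(k + 1, n):
            if M[k][j] != 0:
                add_col(k, j, -M[k][j] / M[k][k])
    return M, P
```

The returned `P` is the product of all the elementary column operations, so substituting $\bm{x} = \bm{P} \bm{y}$ yields the standard form whose coefficients are the diagonal of `D`.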

Canonical Form of a Quadratic Form

If a nondegenerate real linear transformation brings the real quadratic form $f(x_1, x_2, \cdots, x_n)$ to a quadratic form of the shape

$$z_1^2 + \cdots + z_p^2 - z_{p+1}^2 - \cdots - z_r^2\qquad(r \le n)$$

then this is called the real canonical form of $f(x_1, x_2, \cdots, x_n)$. Here $r$ is the rank of $f$, $p$ is the positive index of inertia of $f$, and $r - p$ is the negative index of inertia of $f$.

If a nondegenerate complex linear transformation brings the complex quadratic form $f(x_1, x_2, \cdots, x_n)$ to a quadratic form of the shape

$$z_1^2 + \cdots + z_r^2\qquad(r \le n)$$

then this is called the complex canonical form of $f(x_1, x_2, \cdots, x_n)$, and $r$ is the rank of $f$.

Law of Inertia

Every real quadratic form $f(x_1, x_2, \cdots, x_n) = \bm{x}^\intercal \bm{A} \bm{x}$ can be brought, by a suitable nondegenerate linear transformation, to the real canonical form

$$z_1^2 + \cdots + z_p^2 - z_{p+1}^2 - \cdots - z_r^2\qquad(r \le n)$$

where $r = \rank(\bm{A})$, and the real canonical form is unique.


Proof:

Existence: there exists an invertible matrix $\bm{P}$ such that $\bm{P}^\intercal \bm{A} \bm{P} = \begin{bmatrix} d_1 & & & \\ & d_2 & & \\ & & \ddots & \\ & & & d_n \end{bmatrix}$, where $d_i \in \R$. Reordering the variables if necessary, we may assume

$$d_1 > 0,\, \cdots,\, d_p > 0;\; d_{p + 1} < 0,\, \cdots,\, d_r < 0;\; d_{r + 1} = 0,\, \cdots,\, d_n = 0$$

so that

$$\begin{aligned} f \xlongequal{\bm{x} = \bm{P}\bm{y}} &\left( d_1 y_1^2 + \cdots + d_p y_p^2 \right) +\\ &\left( d_{p+1} y_{p+1}^2 + \cdots + d_r y_r^2 \right) +\\ &\left( d_{r+1} y_{r+1}^2 + \cdots + d_n y_n^2 \right) \end{aligned}$$

Take the linear transformation $\bm{y} = \bm{Q} \bm{z}$ determined by

$$\left\lbrace\begin{aligned} z_1 &= \sqrt{d_1} y_1\\ &\kern{0.25em}\vdots\\ z_p &= \sqrt{d_p} y_p\\ z_{p+1} &= \sqrt{-d_{p+1}} y_{p+1}\\ &\kern{0.25em}\vdots\\ z_r &= \sqrt{-d_r} y_r\\ z_{r+1} &= y_{r+1}\\ &\kern{0.25em}\vdots\\ z_n &= y_n \end{aligned}\right.$$

Here $\bm{Q}$ is a diagonal matrix with nonzero diagonal entries, hence invertible. It follows that

$$f \xlongequal{\bm{x} = \bm{P} \bm{y},\, \bm{y} = \bm{Q} \bm{z}} z_1^2 + \cdots + z_p^2 - z_{p+1}^2 - \cdots - z_r^2$$
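A short worked example of this rescaling step (the coefficients are chosen here for illustration): in three variables, let the standard form be $f = 4y_1^2 - 9y_2^2$, so that $d_1 = 4,\, d_2 = -9,\, d_3 = 0$. Then

$$\left\lbrace\begin{aligned} z_1 &= \sqrt{4}\, y_1 = 2y_1 \\ z_2 &= \sqrt{9}\, y_2 = 3y_2 \\ z_3 &= y_3 \end{aligned}\right. \qquad\implies\qquad f = z_1^2 - z_2^2$$

so the rank is $r = 2$, the positive index of inertia is $p = 1$, and the negative index of inertia is $r - p = 1$.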

Uniqueness: it suffices to show that $p$ is uniquely determined.

Suppose two nondegenerate linear transformations $\bm{x} = \bm{P}_1 \bm{z}$ and $\bm{x} = \bm{P}_2 \bm{w}$ give

$$\begin{aligned} f &\xlongequal{\bm{x} = \bm{P}_1 \bm{z}} z_1^2 + \cdots + z_p^2 - z_{p+1}^2 - \cdots - z_r^2\\ f &\xlongequal{\bm{x} = \bm{P}_2 \bm{w}} w_1^2 + \cdots + w_q^2 - w_{q+1}^2 - \cdots - w_r^2 \end{aligned}$$

We must show that $p = q$. Argue by contradiction and suppose $p > q$.

Since $\bm{x} = \bm{P}_1 \bm{z} = \bm{P}_2 \bm{w}$, we have $\bm{w} = \bm{P}_2^{-1} \bm{P}_1 \bm{z}$, and whenever $\bm{w} = (\bm{P}_2^{-1}\bm{P}_1)\bm{z}$ we have

$$\begin{aligned} & z_1^2 + \cdots + z_p^2 - z_{p+1}^2 - \cdots - z_r^2 = \\ & w_1^2 + \cdots + w_q^2 - w_{q+1}^2 - \cdots - w_r^2 \end{aligned} \tag{1}$$

Let $\bm{C} = \bm{P}_2^{-1} \bm{P}_1$, so that $\bm{w} = \bm{C} \bm{z}$, that is,

$$\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}$$

Partition $\bm{C}$ and $\bm{z}$ into blocks (superscripts denote block sizes):

$$\bm{C} = \begin{bmatrix} \bm{C}_1^{q \times p} & \bm{C}_2^{q \times (n - p)} \\ \bm{C}_3^{(n - q) \times p} & \bm{C}_4^{(n - q) \times (n - p)} \end{bmatrix},\quad \bm{z} = \begin{bmatrix} \bm{z}_1^{p} \\ \bm{z}_2^{n - p} \end{bmatrix}$$

Consider the homogeneous system

$$\left\lbrace\begin{aligned} c_{11} z_1 + c_{12} z_2 + \cdots + c_{1p} z_p &= 0\\ c_{21} z_1 + c_{22} z_2 + \cdots + c_{2p} z_p &= 0\\ &\kern{0.25em}\vdots\\ c_{q1} z_1 + c_{q2} z_2 + \cdots + c_{qp} z_p &= 0 \end{aligned}\right.$$

that is, $\bm{C}_1 \bm{z}_1 = \bm{\theta}$.

Since $q < p$, there are fewer equations than unknowns, so the system has a nonzero solution $(z_1, z_2, \cdots, z_p) = (t_1, t_2, \cdots, t_p)$.

Take $\bm{z} = (t_1, t_2, \cdots, t_p, 0, \cdots, 0)^\intercal$; then

$$\begin{aligned} \bm{w} &= \bm{C}\bm{z}\\ &= \begin{bmatrix} \bm{C}_1 & \bm{C}_2 \\ \bm{C}_3 & \bm{C}_4 \end{bmatrix} \begin{bmatrix} \bm{z}_1^{p}\\ \bm{\theta}^{n - p} \end{bmatrix}\\ &= \begin{bmatrix} \bm{C}_1 \bm{z}_1\\ \bm{C}_3 \bm{z}_1 \end{bmatrix}\\ &= \begin{bmatrix} \bm{\theta}^{q}\\ \bm{C}_3 \bm{z}_1 \end{bmatrix}\\ &= (\overbrace{0, \cdots, 0}^{q}, s_{q+1}, \cdots, s_n)^\intercal \end{aligned}$$

Substituting into $(1)$, and using $z_{p+1} = \cdots = z_r = 0$ and $w_1 = \cdots = w_q = 0$, gives

$$t_1^2 + \cdots + t_p^2 = - s_{q+1}^2 - \cdots - s_r^2$$

The left side satisfies $t_1^2 + \cdots + t_p^2 > 0$ while the right side satisfies $- s_{q+1}^2 - \cdots - s_r^2 \le 0$, a contradiction.

Hence $p \le q$. By the same argument $q \le p$, so $p = q$.

If the real symmetric matrices $\bm{A},\, \bm{B}$ are congruent, then the quadratic forms $\bm{x}^\intercal\bm{A}\bm{x}$ and $\bm{x}^\intercal\bm{B}\bm{x}$ have the same rank, positive index of inertia and negative index of inertia.


Proof:

By the law of inertia there exists an invertible matrix $\bm{P}$ such that

$$\bm{P}^\intercal \bm{A} \bm{P} = \begin{bmatrix} 1 & & & & & & & & \\ & \ddots & & & & & & & \\ & & 1 & & & & & & \\ & & & -1 & & & & & \\ & & & & \ddots & & & & \\ & & & & & -1 & & & \\ & & & & & & 0 & & \\ & & & & & & & \ddots & \\ & & & & & & & & 0 \end{bmatrix}$$

Since $\bm{A}$ and $\bm{B}$ are congruent, there also exists an invertible matrix $\bm{Q}$ such that $\bm{B} = \bm{Q}^\intercal \bm{A} \bm{Q}$, that is, $\bm{A} = (\bm{Q}^{-1})^\intercal \bm{B} \bm{Q}^{-1}$. Hence

$$\begin{aligned} \bm{P}^\intercal \bm{A} \bm{P} &= \bm{P}^\intercal (\bm{Q}^{-1})^\intercal \bm{B} \bm{Q}^{-1} \bm{P}\\ &= (\bm{Q}^{-1} \bm{P})^\intercal \bm{B} (\bm{Q}^{-1} \bm{P})\\ &= \begin{bmatrix} 1 & & & & & & & & \\ & \ddots & & & & & & & \\ & & 1 & & & & & & \\ & & & -1 & & & & & \\ & & & & \ddots & & & & \\ & & & & & -1 & & & \\ & & & & & & 0 & & \\ & & & & & & & \ddots & \\ & & & & & & & & 0 \end{bmatrix} \end{aligned}$$

Let $\bm{R} = \bm{Q}^{-1} \bm{P}$; then $\bm{R}$ is invertible and

$$\bm{R}^\intercal \bm{B} \bm{R} = \begin{bmatrix} 1 & & & & & & & & \\ & \ddots & & & & & & & \\ & & 1 & & & & & & \\ & & & -1 & & & & & \\ & & & & \ddots & & & & \\ & & & & & -1 & & & \\ & & & & & & 0 & & \\ & & & & & & & \ddots & \\ & & & & & & & & 0 \end{bmatrix}$$

Thus $\bm{x}^\intercal \bm{A} \bm{x}$ and $\bm{x}^\intercal \bm{B} \bm{x}$ have the same canonical form, and therefore the same rank, positive index of inertia and negative index of inertia.

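Positive Definite Quadratic Forms

Let $f(x_1, x_2, \cdots, x_n) = \bm{x}^\intercal \bm{A} \bm{x}$ be a real quadratic form. If for every real vector $\bm{x} \ne \bm{\theta}$

$$f(\bm{x}) = \bm{x}^\intercal \bm{A} \bm{x} > 0$$

then $f(\bm{x})$ is called a positive definite quadratic form and $\bm{A}$ a positive definite matrix.

Negative definite and positive semidefinite quadratic forms are defined analogously.

Under this definition, a positive definite matrix is always a real symmetric matrix.

For example, $f(x_1, x_2, x_3) = x_1^2 + x_2^2$ is not a positive definite quadratic form, because $f(0, 0, a) = 0$ for every $a \in \R\backslash\left\lbrace 0 \right\rbrace$ while $(0, 0, a) \ne \bm{\theta}$.

This form is called a positive semidefinite quadratic form, and its matrix a positive semidefinite matrix, because for every real vector $\bm{x} \ne \bm{\theta}$

$$f(\bm{x}) = \bm{x}^\intercal \bm{A} \bm{x} \ge 0$$

Negative semidefinite quadratic forms and negative semidefinite matrices are defined in the same way.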

Let $\bm{A}$ be a positive definite matrix of order $n$ and $\bm{P}$ an invertible matrix of order $n$. Then $\bm{P}^\intercal \bm{A} \bm{P}$ is also a positive definite matrix.

That is, congruence transformations preserve positive definiteness.


Proof:

Since $\bm{A}$ is positive definite, for every $\bm{x} \ne \bm{\theta}$

$$\bm{x}^\intercal \bm{A} \bm{x} > 0$$

We must show that for every $\bm{y} \ne \bm{\theta}$

$$\bm{y}^\intercal (\bm{P}^\intercal \bm{A} \bm{P}) \bm{y} > 0$$

Let $\bm{x} = \bm{P} \bm{y}$. Because $\bm{P}$ is invertible and $\bm{y} \ne \bm{\theta}$, we have $\bm{x} \ne \bm{\theta}$, and so

$$\begin{aligned} \bm{y}^\intercal (\bm{P}^\intercal \bm{A} \bm{P}) \bm{y} &= (\bm{P} \bm{y})^\intercal \bm{A} (\bm{P} \bm{y})\\ &= \bm{x}^\intercal \bm{A} \bm{x}\\ &> 0 \end{aligned}$$

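Leading Principal Minors

Let $\bm{A} = \left[ a_{ij} \right]_{n \times n}$. The determinant

$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1k} \\ a_{21} & a_{22} & \cdots & a_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{k1} & a_{k 2} & \cdots & a_{kk} \end{vmatrix}$$

is called the leading principal minor of order $k$ of $\bm{A}$. Clearly $\bm{A} \in M_{n}(\R)$ has $n$ leading principal minors.

Let $\bm{A}$ be a real symmetric matrix of order $n$. Then the following statements are equivalent:

  1. $\bm{A}$ is positive definite
  2. All eigenvalues of $\bm{A}$ are positive
  3. The positive index of inertia of $\bm{A}$ is $n$
  4. All leading principal minors of $\bm{A}$ are positive
  5. $\bm{A}$ is congruent to the identity matrix $\bm{E}$ (there exists an invertible matrix $\bm{P}$ such that $\bm{A} = \bm{P}^\intercal \bm{P}$)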
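Condition 4 is easy to check mechanically. The sketch below is one possible way to do it, assuming exact rational arithmetic and a determinant computed by Gaussian elimination; the function names are chosen for this example and the input is assumed to be symmetric already.

```python
from fractions import Fraction


def det(M):
    """Determinant by exact Gaussian elimination with row swaps."""
    M = [[Fraction(v) for v in row] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for k in range(n):
        pivot = next((i for i in range(k, n) if M[i][k] != 0), None)
        if pivot is None:
            return Fraction(0)            # a zero column: singular matrix
        if pivot != k:
            M[k], M[pivot] = M[pivot], M[k]
            sign = -sign                  # a row swap flips the sign
        result *= M[k][k]
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= factor * M[k][j]
    return sign * result


def leading_principal_minors(A):
    """Determinants of the upper-left k x k blocks, k = 1, ..., n."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]


def is_positive_definite(A):
    """Condition 4: every leading principal minor is positive."""
    return all(m > 0 for m in leading_principal_minors(A))
```

For the order-3 matrix with 2 on the diagonal and -1 on the two adjacent diagonals, the leading principal minors are 2, 3, 4, so the check succeeds; for the matrix with rows (1, 2) and (2, 1) they are 1 and -3, so it fails.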

Proof (only some of the implications are proved):

  1. $\implies$ 4.:

Let $\bm{A}$ be positive definite and take $\bm{x} = (x_1, \cdots, x_{k}, 0, \cdots, 0)^\intercal$, where $x_1, \cdots, x_k$ are not all zero. Then $\bm{x} \ne \bm{\theta}$, so $\bm{x}^\intercal \bm{A} \bm{x} > 0$, that is,

$$\begin{aligned} \begin{bmatrix} x_1 & \cdots & x_k \end{bmatrix} \begin{bmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{k1} & \cdots & a_{kk} \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} &> 0\\ \begin{bmatrix} x_1 & \cdots & x_k \end{bmatrix} \bm{A}_k \begin{bmatrix} x_1 \\ \vdots \\ x_k \end{bmatrix} &> 0 \end{aligned}$$

So $\bm{A}_{k} \in M_{k}(\R)$ is symmetric and positive definite. Its eigenvalues are then all positive (by 1. $\implies$ 2.), so $|\bm{A}_{k}| > 0$. Since $k$ is arbitrary, all leading principal minors of $\bm{A}$ are positive.

  4. $\implies$ 1.:

Let $\bm{A} \in M_n(\R)$ be a real symmetric matrix whose leading principal minors are all positive.

Use induction on $n$. The case $n = 1$ is immediate.

Assume the claim holds for $n = m - 1$. For $n = m$, write

$$\bm{A} = \begin{bmatrix} \bm{B} & \bm{\beta}\\ \bm{\beta}^\intercal & a_{mm} \end{bmatrix}$$

where $\bm{B} \in M_{m-1}(\R)$ is symmetric, $\bm{\beta} = \begin{bmatrix} a_{1m} \\ \vdots \\ a_{m-1, m} \end{bmatrix} \in \R^{m-1}$ and $a_{mm} \in \R$. The leading principal minors of $\bm{B}$ are the first $m - 1$ leading principal minors of $\bm{A}$, so $\bm{B}$ is positive definite by the induction hypothesis; in particular $\bm{B}$ is invertible.

Construct $\bm{P} = \begin{bmatrix} \bm{E} & -\bm{B}^{-1} \bm{\beta} \\ \bm{\theta}^\intercal & 1 \end{bmatrix}$. Using $(\bm{B}^{-1})^\intercal = \bm{B}^{-1}$, we get

$$\begin{aligned} \bm{P}^\intercal \bm{A} \bm{P} &= \begin{bmatrix} \bm{E} & (\bm{\theta}^\intercal)^\intercal \\ \left(-\bm{B}^{-1} \bm{\beta}\right)^\intercal & 1 \end{bmatrix} \begin{bmatrix} \bm{B} & \bm{\beta}\\ \bm{\beta}^\intercal & a_{mm} \end{bmatrix} \begin{bmatrix} \bm{E} & -\bm{B}^{-1} \bm{\beta} \\ \bm{\theta}^\intercal & 1 \end{bmatrix}\\ &= \begin{bmatrix} \bm{E} & \bm{\theta} \\ - \bm{\beta}^\intercal \bm{B}^{-1} & 1 \end{bmatrix} \begin{bmatrix} \bm{B} & \bm{\beta}\\ \bm{\beta}^\intercal & a_{mm} \end{bmatrix} \begin{bmatrix} \bm{E} & -\bm{B}^{-1} \bm{\beta} \\ \bm{\theta}^\intercal & 1 \end{bmatrix}\\ &= \begin{bmatrix} \bm{B} & \bm{\beta}\\ \bm{\theta}^\intercal & a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta} \end{bmatrix} \begin{bmatrix} \bm{E} & -\bm{B}^{-1} \bm{\beta} \\ \bm{\theta}^\intercal & 1 \end{bmatrix}\\ &= \begin{bmatrix} \bm{B} & \bm{\theta}\\ \bm{\theta}^\intercal & a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta} \end{bmatrix} \end{aligned}$$

Since $\left\lvert \bm{A} \right\rvert > 0$,

$$\begin{aligned} \left\lvert \bm{P}^\intercal \bm{A} \bm{P} \right\rvert &= \left\lvert \bm{P}^\intercal \right\rvert \left\lvert \bm{A} \right\rvert \left\lvert \bm{P} \right\rvert\\ &= \left\lvert \bm{P} \right\rvert^2 \left\lvert \bm{A} \right\rvert\\ &> 0 \end{aligned}$$

On the other hand, the block-diagonal form gives $\left\lvert \bm{P}^\intercal \bm{A} \bm{P} \right\rvert = \left\lvert \bm{B} \right\rvert\left(a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta}\right)$, so $\left\lvert \bm{B} \right\rvert\left(a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta}\right) > 0$. Since $\left\lvert \bm{B} \right\rvert > 0$, it follows that $a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta} > 0$.

Because $\bm{B}$ is positive definite, there exists an invertible matrix $\bm{Q} \in M_{m - 1}(\R)$ such that $\bm{Q}^\intercal \bm{B} \bm{Q} = \bm{E}_{m - 1}$. Let

$$\bm{C} = \begin{bmatrix} \bm{Q} & \bm{\theta}\\ \bm{\theta}^\intercal & 1 \end{bmatrix}$$

Then

$$\begin{aligned} \bm{C}^\intercal \left( \bm{P}^\intercal \bm{A} \bm{P} \right) \bm{C} &= \begin{bmatrix} \bm{Q}^\intercal & \bm{\theta}\\ \bm{\theta}^\intercal & 1 \end{bmatrix} \begin{bmatrix} \bm{B} & \bm{\theta}\\ \bm{\theta}^\intercal & a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta} \end{bmatrix} \begin{bmatrix} \bm{Q} & \bm{\theta}\\ \bm{\theta}^\intercal & 1 \end{bmatrix}\\ &= \begin{bmatrix} \bm{Q}^\intercal \bm{B} \bm{Q} & \bm{\theta} \\ \bm{\theta}^\intercal & a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta} \end{bmatrix}\\ &= \begin{bmatrix} \bm{E}_{m - 1} & \bm{\theta} \\ \bm{\theta}^\intercal & a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta} \end{bmatrix} \end{aligned}$$

Let $\bm{D} = \bm{P} \bm{C}$. Since $\bm{C}^\intercal \left( \bm{P}^\intercal \bm{A} \bm{P} \right) \bm{C} = \left( \bm{P} \bm{C} \right)^\intercal \bm{A} \left( \bm{P} \bm{C} \right)$, we have $\bm{D}^\intercal \bm{A} \bm{D} = \begin{bmatrix} \bm{E}_{m - 1} & \bm{\theta} \\ \bm{\theta}^\intercal & a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta} \end{bmatrix}$, a diagonal matrix whose diagonal entries are all positive because $a_{mm} - \bm{\beta}^\intercal \bm{B}^{-1} \bm{\beta} > 0$. This matrix is positive definite, $\bm{D}$ is invertible, and congruence preserves positive definiteness, so $\bm{A}$ is positive definite.

  1. $\implies$ 5.:

An orthogonal transformation gives $f \xlongequal{\bm{x} = \bm{Q} \bm{y}} \lambda_1 y_1^2 + \cdots + \lambda_n y_n^2$, where the $\lambda_i > 0$ are the eigenvalues of $\bm{A}$.

That is, there is an orthogonal matrix $\bm{Q}$ such that

$$\begin{aligned} \bm{A} &= \bm{Q} \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} \bm{Q}^\intercal\\ &= \bm{Q} \begin{bmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{bmatrix} \bm{Q}^\intercal \\ &= \left( \bm{Q} \begin{bmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{bmatrix} \right) \left( \bm{Q} \begin{bmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{bmatrix} \right)^\intercal \\ &= \bm{P}^\intercal \bm{P} \end{aligned}$$

where $\bm{P} = \left( \bm{Q} \begin{bmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{bmatrix} \right)^\intercal$, which is invertible because $\bm{Q}$ is invertible and every $\sqrt{\lambda_i} > 0$.
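A factorization $\bm{A} = \bm{P}^\intercal \bm{P}$ can also be computed without eigenvalues. The sketch below uses the Cholesky factorization, which is a different technique from the eigenvalue construction in the proof above: it produces an upper-triangular $\bm{P}$. The function name is chosen for this example, the input is assumed to be real symmetric, and floating-point rounding means borderline matrices may be misclassified.

```python
import math


def cholesky_factor(A):
    """Return an upper-triangular P with A = P^T P.

    Raises ValueError when a pivot is not positive, which for a real
    symmetric A means that A is not positive definite.
    """
    n = len(A)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # (P^T P)[i][j] = sum over k <= i of P[k][i] * P[k][j]
            s = A[i][j] - sum(P[k][i] * P[k][j] for k in range(i))
            if i == j:
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                P[i][i] = math.sqrt(s)
            else:
                P[i][j] = s / P[i][i]
    return P
```

For the matrix with rows (4, 2) and (2, 3) this gives the rows (2, 1) and (0, √2); multiplying $\bm{P}^\intercal \bm{P}$ back recovers the original entries up to rounding.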

Corollaries

The determinant of a positive definite matrix is positive.

Every diagonal entry of a positive definite matrix $\bm{A}$ satisfies $a_{ii} > 0$.


Proof:

There exists an invertible matrix $\bm{P} = \begin{bmatrix} \bm{\alpha}_1 & \cdots & \bm{\alpha}_n \end{bmatrix}$ such that

$$\begin{aligned} \bm{A} &= \bm{P}^\intercal \bm{P}\\ &= \begin{bmatrix} \bm{\alpha}_1^\intercal \\ \vdots \\ \bm{\alpha}_n^\intercal \end{bmatrix} \begin{bmatrix} \bm{\alpha}_1 & \cdots & \bm{\alpha}_n \end{bmatrix}\\ &= \begin{bmatrix} \bm{\alpha}_1^\intercal \bm{\alpha}_1 & \cdots & \bm{\alpha}_1^\intercal \bm{\alpha}_n \\ \vdots & \ddots & \vdots \\ \bm{\alpha}_n^\intercal \bm{\alpha}_1 & \cdots & \bm{\alpha}_n^\intercal \bm{\alpha}_n \end{bmatrix} \end{aligned}$$

Hence $\left\lvert \bm{A} \right\rvert = \left\lvert \bm{P} \right\rvert^2 > 0$ and $a_{ii} = \bm{\alpha}_i^\intercal \bm{\alpha}_i = \| \bm{\alpha}_i \|^2 > 0$. (No column of an invertible matrix can be the zero vector.)

Principal Minors

Let $\bm{A} = \left[ a_{ij} \right]_{n \times n}$ and choose indices $i_1 < i_2 < \cdots < i_{k}$. The determinant

$$\begin{vmatrix} a_{i_1 i_1} & a_{i_1 i_2} & \cdots & a_{i_1 i_k} \\ a_{i_2 i_1} & a_{i_2 i_2} & \cdots & a_{i_2 i_k} \\ \vdots & \vdots & \ddots & \vdots \\ a_{i_k i_1} & a_{i_k i_2} & \cdots & a_{i_k i_k} \end{vmatrix}$$

is called a principal minor of order $k$ of $\bm{A}$.

All principal minors of order $k$ of a positive definite matrix $\bm{A}$ are positive, for every $k$.


Proof:

Let $\bm{A}$ be positive definite and take $\bm{x} = (x_1, x_2, \cdots, x_n)^\intercal$, where $x_{i_1},\, x_{i_2},\, \cdots,\, x_{i_k}$ are not all zero and the remaining $n - k$ components are zero. Then $\bm{x} \ne \bm{\theta}$, so $\bm{x}^\intercal \bm{A} \bm{x} > 0$, that is,

$$\begin{aligned} \begin{bmatrix} x_{i_1} & \cdots & x_{i_k} \end{bmatrix} \begin{bmatrix} a_{i_1 i_1} & \cdots & a_{i_1 i_k} \\ \vdots & \ddots & \vdots \\ a_{i_k i_1} & \cdots & a_{i_k i_k} \end{bmatrix} \begin{bmatrix} x_{i_1} \\ \vdots \\ x_{i_k} \end{bmatrix} &> 0\\ \begin{bmatrix} x_{i_1} & \cdots & x_{i_k} \end{bmatrix} \bm{B} \begin{bmatrix} x_{i_1} \\ \vdots \\ x_{i_k} \end{bmatrix} &> 0 \end{aligned}$$

So $\bm{B} \in M_k(\R)$ is symmetric and positive definite, hence $|\bm{B}| > 0$; that is, this principal minor is positive.

Since $i_1,\, i_2,\, \cdots,\, i_k$ were arbitrary, all principal minors of order $k$ of $\bm{A}$ are positive.

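It was stated above that a positive definite matrix is congruent to the identity matrix, that is, there exists an invertible matrix $\bm{P}$ such that $\bm{A} = \bm{P}^\intercal \bm{P}$.

A stronger statement holds: $\bm{A}$ is positive definite if and only if there exists a positive definite matrix $\bm{B}$ such that $\bm{A} = \bm{B}^\intercal \bm{B}$, and hence $\bm{A} = \bm{B}^2$.


Proof:

$\impliedby$

$\bm{A} = \bm{B}^2 \implies \bm{A} = \bm{B}^\intercal \bm{B}$, where $\bm{B}$ is positive definite (so $\bm{B}^\intercal = \bm{B}$). For every $\bm{x} \ne \bm{\theta}$ we have $\bm{B} \bm{x} \ne \bm{\theta}$ because $\bm{B}$ is invertible, so

$\bm{x}^\intercal \bm{A} \bm{x} = \bm{x}^\intercal \bm{B}^\intercal \bm{B} \bm{x} = \left( \bm{B} \bm{x} \right)^\intercal \bm{B} \bm{x} = \| \bm{B} \bm{x} \|^2 > 0$, and therefore $\bm{A}$ is positive definite.

$\implies$

Let $\bm{A}$ be positive definite. There exists an orthogonal matrix $\bm{P}$ ($\bm{P} \bm{P}^\intercal = \bm{E}$) such that

$$\begin{aligned} \bm{A} &= \bm{P}^\intercal \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} \bm{P}\\ &= \bm{P}^\intercal \begin{bmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1} & & \\ & \ddots & \\ & & \sqrt{\lambda_n} \end{bmatrix} \bm{P}\\ &= \bm{P}^\intercal \bm{\Lambda} \bm{\Lambda} \bm{P}\\ &= \bm{P}^\intercal \bm{\Lambda} \bm{P} \bm{P}^\intercal \bm{\Lambda} \bm{P}\\ &= \left( \bm{P}^\intercal \bm{\Lambda} \bm{P} \right)^\intercal \left( \bm{P}^\intercal \bm{\Lambda} \bm{P} \right) \end{aligned}$$

where the $\lambda_i > 0$ are the eigenvalues of $\bm{A}$ and $\bm{\Lambda}$ denotes the diagonal matrix with entries $\sqrt{\lambda_1},\, \cdots,\, \sqrt{\lambda_n}$.

Let $\bm{B} = \bm{P}^\intercal \bm{\Lambda} \bm{P}$. Then $\bm{B}$ is symmetric with eigenvalues $\sqrt{\lambda_i} > 0$, hence positive definite, and $\bm{A} = \bm{B}^\intercal \bm{B} = \bm{B}^2$.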
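As a worked example of this construction (the matrix is chosen here for illustration), take $\bm{A} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$, with eigenvalues $\lambda_1 = 3,\, \lambda_2 = 1$ and orthogonal matrix $\bm{P} = \dfrac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$. Then

$$\bm{B} = \bm{P}^\intercal \begin{bmatrix} \sqrt{3} & \\ & 1 \end{bmatrix} \bm{P} = \dfrac{1}{2} \begin{bmatrix} \sqrt{3} + 1 & \sqrt{3} - 1 \\ \sqrt{3} - 1 & \sqrt{3} + 1 \end{bmatrix}$$

and indeed

$$\bm{B}^2 = \dfrac{1}{4} \begin{bmatrix} (\sqrt{3} + 1)^2 + (\sqrt{3} - 1)^2 & 2(\sqrt{3} + 1)(\sqrt{3} - 1) \\ 2(\sqrt{3} + 1)(\sqrt{3} - 1) & (\sqrt{3} + 1)^2 + (\sqrt{3} - 1)^2 \end{bmatrix} = \dfrac{1}{4} \begin{bmatrix} 8 & 4 \\ 4 & 8 \end{bmatrix} = \bm{A}$$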

Let $\bm{A}$ be a real symmetric matrix. Then $\bm{A} + t \bm{E}$ is positive definite for every sufficiently large real number $t$.


Proof:

Since $\bm{A}$ is a real symmetric matrix, there exists an orthogonal matrix $\bm{P}$ such that

$$\bm{A} = \bm{P}^\intercal \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} \bm{P}$$

Then

$$\begin{aligned} \bm{A} + t \bm{E} &= \bm{P}^\intercal \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix} \bm{P} + t \bm{P}^\intercal \bm{P}\\ &= \bm{P}^\intercal\bm{\Lambda}\bm{P} + \bm{P}^\intercal (t \bm{E}) \bm{P}\\ &= \bm{P}^\intercal \left( \bm{\Lambda} + t \bm{E} \right) \bm{P}\\ &= \bm{P}^\intercal \begin{bmatrix} \lambda_1 + t & & \\ & \ddots & \\ & & \lambda_n + t \end{bmatrix} \bm{P} \end{aligned}$$

When $t > \max\limits_{i = 1,\, \cdots,\, n} \left\{ \left\lvert \lambda_i \right\rvert \right\}$, every eigenvalue $\lambda_i + t$ of $\bm{A} + t \bm{E}$ is positive, so $\bm{A} + t \bm{E}$ is positive definite.
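For a concrete instance (chosen here for illustration), take $\bm{A} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$, whose eigenvalues are $\pm 1$, so $\max \left\lvert \lambda_i \right\rvert = 1$. Then

$$\bm{A} + t \bm{E} = \begin{bmatrix} t & 1 \\ 1 & t \end{bmatrix},\qquad \text{eigenvalues } t + 1 \text{ and } t - 1$$

Both eigenvalues are positive exactly when $t > 1$. Equivalently, the leading principal minors $t$ and $t^2 - 1$ are both positive exactly when $t > 1$.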

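If $f$ is neither a positive semidefinite nor a negative semidefinite quadratic form, then $f$ is called an indefinite quadratic form and $\bm{A}$ an indefinite matrix.

$\bm{A}$ is negative definite $\iff$ $-\bm{A}$ is positive definite.

Note that the determinant of a negative definite matrix is not necessarily negative. In fact, applying the leading principal minor criterion to $-\bm{A}$ shows that a negative definite matrix satisfies

$$(-1)^r \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1r} \\ a_{21} & a_{22} & \cdots & a_{2r} \\ \vdots & \vdots & \ddots & \vdots \\ a_{r1} & a_{r 2} & \cdots & a_{rr} \end{vmatrix} > 0 \qquad (r = 1,\, 2,\, \cdots,\, n)$$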
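For example (a matrix chosen here for illustration), $\bm{A} = \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix}$ is negative definite, since $-\bm{A} = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$ has leading principal minors $2$ and $3$. Its own leading principal minors alternate in sign:

$$(-1)^1 \begin{vmatrix} -2 \end{vmatrix} = 2 > 0,\qquad (-1)^2 \begin{vmatrix} -2 & 1 \\ 1 & -2 \end{vmatrix} = 3 > 0$$

and in particular $\left\lvert \bm{A} \right\rvert = 3 > 0$.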

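For a real symmetric matrix $\bm{A}$ with quadratic form $f = \bm{x}^\intercal \bm{A} \bm{x}$, the following statements are equivalent:

  1. $\bm{A}$ is positive semidefinite
  2. The positive index of inertia of $f$ equals $\rank(\bm{A})$
  3. All eigenvalues of $\bm{A}$ are nonnegative
  4. There exists a real matrix $\bm{B}$ such that $\bm{A} = \bm{B}^\intercal \bm{B}$
  5. All principal minors of $\bm{A}$ are nonnegative (all of them, not only the leading principal minors!) [1]

  [1] Nonnegative leading principal minors alone do not imply positive semidefiniteness; for example $\begin{bmatrix} 0 & 0 \\ 0 & -1 \end{bmatrix}$.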
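Condition 5 and the counterexample in the footnote can be checked with a short sketch. It enumerates every principal minor with `itertools.combinations` and uses a cofactor-expansion determinant, which is only practical for small matrices; the function names are chosen for this example and the input is assumed to be symmetric.

```python
from itertools import combinations


def det(M):
    """Determinant by cofactor expansion along the first row."""
    if not M:
        return 1  # determinant of the empty matrix
    return sum(
        (-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
        for j in range(len(M))
    )


def principal_minors(A):
    """Yield (index tuple, minor) for every choice i_1 < ... < i_k."""
    n = len(A)
    for k in range(1, n + 1):
        for idx in combinations(range(n), k):
            yield idx, det([[A[i][j] for j in idx] for i in idx])


def leading_principal_minors(A):
    """Minors of the upper-left k x k blocks only."""
    return [det([row[:k] for row in A[:k]]) for k in range(1, len(A) + 1)]


def is_positive_semidefinite(A):
    """Condition 5: every principal minor is nonnegative."""
    return all(m >= 0 for _, m in principal_minors(A))


# The footnote's counterexample (0-based indices in the code).
B = [[0, 0], [0, -1]]
```

For `B`, both leading principal minors are 0, yet the order-1 principal minor taken from the second row and column is -1, so the full check correctly rejects it.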