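Related articles
Introduction to singular value decomposition
Applications of singular value decomposition with MATLAB implementations (image compression, spectral clustering, latent semantic analysis, principal component analysis, background removal, etc.)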
Geometric perspective
In the singular value decomposition, $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices: the product of $\mathbf{U}$ and $\mathbf{U}^\text{H}$ is the identity matrix, and so is the product of $\mathbf{V}$ and $\mathbf{V}^\text{H}$. From the vector-space point of view, the columns of $\boldsymbol{U}=(\boldsymbol{u}_{1},\boldsymbol{u}_{2},\ldots,\boldsymbol{u}_{n})$ form an orthonormal basis of $\mathbf{C}^{n}$, and the columns of $\boldsymbol{V}=(\boldsymbol{v}_{1},\boldsymbol{v}_{2},\ldots,\boldsymbol{v}_{D})$ form an orthonormal basis of $\mathbf{C}^{D}$.
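The unitary property can be checked numerically. The following is a minimal MATLAB sketch, assuming a randomly generated complex 3-by-2 matrix `M` (a hypothetical test matrix, not taken from this article) and the built-in `svd`:

```matlab
% Hypothetical test matrix (n = 3, D = 2); any complex matrix would do
rng(0);
M = randn(3, 2) + 1i * randn(3, 2);

% Full SVD: U is 3-by-3, V is 2-by-2
[U, S, V] = svd(M);

% For complex matrices the apostrophe is the conjugate transpose (U^H)
err_U = norm(U * U' - eye(3));   % distance of U*U^H from the identity
err_V = norm(V * V' - eye(2));   % distance of V*V^H from the identity
fprintf('||U*U^H - I|| = %.2e\n', err_U);
fprintf('||V*V^H - I|| = %.2e\n', err_V);
```

Both distances should be at the level of floating-point round-off, which is what "unitary" means in practice for computed factors.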
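Consider the shapes of the three matrices on the right-hand side of equation $(1.4)$. Geometrically, the unitary matrices $\mathbf{U}$ and $\mathbf{V}$ act as rotations (possibly combined with a reflection), while the diagonal matrix acts as a scaling. The eigendecomposition follows the sequence "rotate ($\mathbf{Q}^\text{T}$) -> scale ($\mathbf{\Lambda}$) -> rotate ($\mathbf{Q}$)", whereas the singular value decomposition follows "rotate ($\mathbf{V}^\text{H}$) -> scale ($\mathbf{S}$) -> rotate ($\mathbf{U}$)". One clear difference is that the rotation by $\boldsymbol{V}^\text{H}$ takes place in $\mathbf{C}^{D}$, whereas the rotation by $\boldsymbol{U}$ takes place in $\mathbf{C}^{n}$.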
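The point about the two rotations living in different spaces is easiest to see with a non-square matrix. The sketch below assumes a hypothetical real 3-by-2 matrix `B` (so $n=3$, $D=2$, and $\mathbf{V}^\text{H}$ reduces to $\mathbf{V}^\text{T}$) and applies the three factors one at a time:

```matlab
% Hypothetical 3-by-2 real matrix: maps a 2-D input space to a 3-D output space
B = [1 0;
     0 2;
     1 1];
[U, S, V] = svd(B);      % U is 3-by-3, S is 3-by-2, V is 2-by-2

x  = [1; 2];             % a vector in the input space (D = 2)
y1 = V' * x;             % rotation inside the D-dimensional space
y2 = S * y1;             % scaling; S also carries the vector into n dimensions
y3 = U * y2;             % rotation inside the n-dimensional space

fprintf('length of y1: %d, length of y2: %d\n', numel(y1), numel(y2));
fprintf('norm(y3 - B*x) = %.2e\n', norm(y3 - B * x));
```

The intermediate vector `y1` still has $D$ components, while `y2` and `y3` have $n$ components, and the composed result agrees with `B * x`.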
The following example illustrates this step-by-step geometric transformation. Take the matrix $\mathbf{A}$ and the two unit vectors $\mathbf{e_1}$ and $\mathbf{e_2}$:
$$A=\begin{bmatrix}1.625&0.6495\\0.6495&0.875\end{bmatrix}, \quad e_1=\begin{bmatrix}1\\0\end{bmatrix},\quad e_2=\begin{bmatrix}0\\1\end{bmatrix}$$
The singular value decomposition of $\mathbf{A}$ is:
$$A=USV^\mathrm{T}=\begin{bmatrix}0.866&-0.5\\0.5&0.866\end{bmatrix}\begin{bmatrix}2&0\\0&0.5\end{bmatrix}\begin{bmatrix}0.866&0.5\\-0.5&0.866\end{bmatrix}$$
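The rounded entries $0.866$ and $0.5$ appear to be $\cos 30^\circ$ and $\sin 30^\circ$. A minimal MATLAB sketch, assuming that reading (so that $\mathbf{U}=\mathbf{V}$ is a rotation by 30 degrees), rebuilds $\mathbf{A}$ from the factors and compares the singular values with the built-in `svd`; the variable names are illustrative:

```matlab
% Assumption: U = V = rotation by 30 degrees, S = diag(2, 0.5)
R = [cosd(30), -sind(30);
     sind(30),  cosd(30)];
S = diag([2, 0.5]);

A_rebuilt = R * S * R';              % U * S * V^T with U = V = R
A = [1.6250, 0.6495;
     0.6495, 0.8750];

% The entries of A are rounded to four decimals, so compare with a tolerance
max_diff = max(abs(A_rebuilt(:) - A(:)));
fprintf('max |A_rebuilt - A| = %.1e\n', max_diff);

% Singular values from the built-in routine (returned in descending order)
sigma = svd(A);
fprintf('singular values: %.4f, %.4f\n', sigma(1), sigma(2));
```

Because $\mathbf{A}$ is symmetric with positive eigenvalues, its SVD coincides with its eigendecomposition here, which is why the same rotation can serve as both $\mathbf{U}$ and $\mathbf{V}$.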
The two unit vectors $\mathbf{e_1}$ and $\mathbf{e_2}$ are first rotated by $\mathbf{V}^\text{T}$, which gives:
$$\begin{gathered}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1=\begin{bmatrix}0.866&0.5\\-0.5&0.866\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix}=\begin{bmatrix}0.866\\-0.5\end{bmatrix}\\\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2=\begin{bmatrix}0.866&0.5\\-0.5&0.866\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}0.5\\0.866\end{bmatrix}\end{gathered} \quad(2.1)$$
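As a check that this step is a pure rotation, the lengths of the two vectors and their inner product can be computed from the rounded values in $(2.1)$. Both lengths stay at 1 (up to rounding) and the vectors remain orthogonal:

$$\begin{aligned}
\lVert\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1\rVert &= \sqrt{0.866^2+(-0.5)^2}\approx 1,\\
\lVert\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2\rVert &= \sqrt{0.5^2+0.866^2}\approx 1,\\
(\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1)^\mathrm{T}(\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2) &= 0.866\times 0.5+(-0.5)\times 0.866=0.
\end{aligned}$$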
Building on equation $(2.1)$, scaling by the diagonal matrix $\mathbf{S}$ gives:
$$\begin{gathered}\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1=\begin{bmatrix}2&0\\0&0.5\end{bmatrix}\begin{bmatrix}0.866\\-0.5\end{bmatrix}=\begin{bmatrix}1.732\\-0.25\end{bmatrix}\\\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2=\begin{bmatrix}2&0\\0&0.5\end{bmatrix}\begin{bmatrix}0.5\\0.866\end{bmatrix}=\begin{bmatrix}1\\0.433\end{bmatrix}\end{gathered} \quad(2.2)$$
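Unlike the rotation, the scaling changes both the lengths and the angle between the two vectors. Using the rounded values in $(2.2)$:

$$\begin{aligned}
\lVert\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1\rVert &= \sqrt{1.732^2+(-0.25)^2}\approx 1.75,\\
\lVert\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2\rVert &= \sqrt{1^2+0.433^2}\approx 1.09,\\
(\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1)^\mathrm{T}(\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2) &= 1.732\times 1+(-0.25)\times 0.433\approx 1.624\neq 0.
\end{aligned}$$

Both lengths lie between the smallest and largest singular values, $0.5$ and $2$, and the two vectors are no longer orthogonal.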
After the rotation ($\mathbf{V}^\text{T}$) and the scaling ($\mathbf{S}$), a final rotation by $\mathbf{U}$ gives:
$$\boldsymbol{A}\boldsymbol{e}_1 = \boldsymbol{U}\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1=\begin{bmatrix}0.866&-0.5\\0.5&0.866\end{bmatrix}\begin{bmatrix}1.732\\-0.25\end{bmatrix}=\begin{bmatrix}1.625\\0.6495\end{bmatrix}$$
$$\boldsymbol{A}\boldsymbol{e}_2 = \boldsymbol{U}\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2=\begin{bmatrix}0.866&-0.5\\0.5&0.866\end{bmatrix}\begin{bmatrix}1\\0.433\end{bmatrix}=\begin{bmatrix}0.6495\\0.875\end{bmatrix}$$
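The three steps together send the unit circle to an ellipse whose semi-axes point along the columns of $\mathbf{U}$ and have lengths equal to the singular values. A short MATLAB sketch of that claim, using the same $\mathbf{A}$ and illustrative variable names:

```matlab
A = [1.6250, 0.6495;
     0.6495, 0.8750];
[U, S, V] = svd(A);

% Each right singular vector is mapped onto the matching left singular
% vector, stretched by the corresponding singular value: A*v_k = s_k*u_k
res1 = norm(A * V(:,1) - S(1,1) * U(:,1));
res2 = norm(A * V(:,2) - S(2,2) * U(:,2));

% Lengths of A*x for points x sampled on the unit circle
theta = linspace(0, 2*pi, 1000);
X = [cos(theta); sin(theta)];        % one point per column
len = sqrt(sum((A * X).^2, 1));      % length of each transformed point

fprintf('residuals: %.1e, %.1e\n', res1, res2);
fprintf('longest: %.4f, shortest: %.4f\n', max(len), min(len));
```

The longest and shortest transformed vectors should be close to the two singular values, with small deviations because the circle is only sampled at finitely many points and $\mathbf{A}$ is given to four decimals.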
This process is shown in Figure 2-1:
MATLAB implementation of the figure above:
%% ========== 1. Plot the unit circle and the two unit vectors ==========
% Sample theta at 100 evenly spaced points from 0 to 2*pi
theta = linspace(0, 2*pi, 100);
% x and y coordinates of the unit circle
circle_x1 = cos(theta);
circle_x2 = sin(theta);
% The two unit vectors, stored as the rows of X_vec
X_vec = [1, 0;
         0, 1];
% Plot the original circle and the two vectors
figure;
plot(circle_x1, circle_x2, 'k--', 'LineWidth', 0.5); hold on;
% Draw the two vectors ('AutoScale','off' keeps the arrows at their true length)
quiver(0, 0, X_vec(1,1), X_vec(1,2), 'AutoScale', 'off', 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, X_vec(2,1), X_vec(2,2), 'AutoScale', 'off', 'Color', [1, 0, 0], 'MaxHeadSize', 1);
% Axis settings
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('Original', 'Interpreter', 'latex');
grid on;
hold off;
%% ========== 2. Plot the circle and vectors after the linear map A ==========
% Define the matrix A
A = [1.6250, 0.6495;
     0.6495, 0.8750];
% Transform the circle; points are stored as rows, so multiply by A' on the right
X_circle = [circle_x1; circle_x2]';
transformed_circle = X_circle * A';
% Transform the two vectors
transformed_vec = X_vec * A';
% Plot the transformed circle and vectors
figure;
plot(transformed_circle(:,1), transformed_circle(:,2), 'k--', 'LineWidth', 0.5); hold on;
% Draw the two transformed vectors at their true length
quiver(0, 0, transformed_vec(1,1), transformed_vec(1,2), 'AutoScale', 'off', 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, transformed_vec(2,1), transformed_vec(2,2), 'AutoScale', 'off', 'Color', [1, 0, 0], 'MaxHeadSize', 1);
% Axis settings
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('$A$', 'Interpreter', 'latex');
grid on;
hold off;
%% ========== 3. Decompose with SVD and plot each step ==========
% Compute the SVD
[U, S, V] = svd(A);
% Adjust signs: singular vectors are only determined up to sign, so flip each
% (u_k, v_k) pair whose diagonal entry in U is negative. The product U*S*V'
% is unchanged, and the factors then match the values quoted in the text.
for k = 1:2
    if U(k,k) < 0
        U(:,k) = -U(:,k);
        V(:,k) = -V(:,k);
    end
end
% Print the SVD factors
disp('=== U ===');
disp(U);
disp('=== S ===');
disp(S);
disp('=== V ===');
disp(V);
% Plot the circle and vectors after applying V^T
% (points are rows, so x' * V is the transpose of V' * x)
figure;
transformed_circle_v = X_circle * V;
transformed_vec_v = X_vec * V;
plot(transformed_circle_v(:,1), transformed_circle_v(:,2), 'k--', 'LineWidth', 0.5); hold on;
quiver(0, 0, transformed_vec_v(1,1), transformed_vec_v(1,2), 'AutoScale', 'off', 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, transformed_vec_v(2,1), transformed_vec_v(2,2), 'AutoScale', 'off', 'Color', [1, 0, 0], 'MaxHeadSize', 1);
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('$V^T$', 'Interpreter', 'latex');
grid on;
hold off;
% Plot the circle and vectors after applying S*V^T
figure;
transformed_circle_sv = X_circle * V * S;
transformed_vec_sv = X_vec * V * S;
plot(transformed_circle_sv(:,1), transformed_circle_sv(:,2), 'k--', 'LineWidth', 0.5); hold on;
quiver(0, 0, transformed_vec_sv(1,1), transformed_vec_sv(1,2), 'AutoScale', 'off', 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, transformed_vec_sv(2,1), transformed_vec_sv(2,2), 'AutoScale', 'off', 'Color', [1, 0, 0], 'MaxHeadSize', 1);
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('$SV^T$', 'Interpreter', 'latex');
grid on;
hold off;
% Plot the circle and vectors after applying U*S*V^T
figure;
transformed_circle_usv = X_circle * V * S * U';
transformed_vec_usv = X_vec * V * S * U';
plot(transformed_circle_usv(:,1), transformed_circle_usv(:,2), 'k--', 'LineWidth', 0.5); hold on;
quiver(0, 0, transformed_vec_usv(1,1), transformed_vec_usv(1,2), 'AutoScale', 'off', 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, transformed_vec_usv(2,1), transformed_vec_usv(2,2), 'AutoScale', 'off', 'Color', [1, 0, 0], 'MaxHeadSize', 1);
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('$USV^T$', 'Interpreter', 'latex');
grid on;
hold off;
%% ========== 4. Step-by-step computation from e1 and e2 to the final result ==========
% Define the unit vectors e1 and e2 (column vectors)
e1 = [1; 0];
e2 = [0; 1];
% Apply the three factors one at a time: rotate (V'), scale (S), rotate (U)
VT_e1 = V' * e1;
VT_e2 = V' * e2;
S_VT_e1 = S * VT_e1;
S_VT_e2 = S * VT_e2;
U_S_VT_e1 = U * S_VT_e1;
U_S_VT_e2 = U * S_VT_e2;
% Print the results
disp('=== VT_e1 ===');
disp(VT_e1);
disp('=== VT_e2 ===');
disp(VT_e2);
disp('=== S_VT_e1 ===');
disp(S_VT_e1);
disp('=== S_VT_e2 ===');
disp(S_VT_e2);
disp('=== U_S_VT_e1 ===');
disp(U_S_VT_e1);
disp('=== U_S_VT_e2 ===');
disp(U_S_VT_e2);
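The plotting sections store points as rows and multiply on the right (`X_circle * V * S * U'`), while the step-by-step section uses column vectors and multiplies on the left (`U * S * V' * e1`). The following sketch, with hypothetical sample points `P_rows`, checks that the two conventions give the same result:

```matlab
A = [1.6250, 0.6495;
     0.6495, 0.8750];
[U, S, V] = svd(A);

% Three sample points stored as rows (row convention, as in the plots)
P_rows = [1, 0;
          0, 1;
          1, 1];

% Row convention: right-multiply by V, then S, then U'
Q_rows = P_rows * V * S * U';

% Column convention: left-multiply by U*S*V', one point per column
Q_cols = U * S * V' * P_rows';

% The two results are transposes of each other
gap = norm(Q_rows - Q_cols');
fprintf('difference between conventions: %.2e\n', gap);
```

This works because $(\boldsymbol{U}\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{x})^\mathrm{T}=\boldsymbol{x}^\mathrm{T}\boldsymbol{V}\boldsymbol{S}^\mathrm{T}\boldsymbol{U}^\mathrm{T}$ and $\boldsymbol{S}$ is diagonal and square here, so $\boldsymbol{S}^\mathrm{T}=\boldsymbol{S}$.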
References
[1] Visualize-ML. 2024. Visualize-ML/Book4_Power-of-Matrix. GitHub repository. https://github.com/Visualize-ML/Book4_Power-of-Matrix