
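A Geometric View of Singular Value Decomposition (MATLAB Implementation)

Related articles

Introduction to Singular Value Decomposition
Applications of Singular Value Decomposition and Their MATLAB Implementation (Image Compression, Spectral Clustering, Latent Semantic Analysis, Principal Component Analysis, Background Removal, etc.)

Geometric view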

In the singular value decomposition, $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices: the product of $\mathbf{U}$ and $\mathbf{U}^\text{H}$ is the identity matrix, and so is the product of $\mathbf{V}$ and $\mathbf{V}^\text{H}$. Viewed as vectors in space, the columns of $\boldsymbol{U}=(\boldsymbol{u}_{1},\boldsymbol{u}_{2},\ldots,\boldsymbol{u}_{n})$ form an orthonormal basis of $\mathbf{C}^{n}$, and the columns of $\boldsymbol{V}=(\boldsymbol{v}_{1},\boldsymbol{v}_{2},\ldots,\boldsymbol{v}_{D})$ form an orthonormal basis of $\mathbf{C}^{D}$.
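The unitarity of both factors can be checked numerically. The following is a minimal sketch, not part of the original example: the complex matrix `A` is an arbitrary illustrative choice, and the names `errU`, `errV` and `errA` are introduced here only for the check.

```matlab
% Minimal sketch: check numerically that the SVD factors are unitary.
% The matrix below is an arbitrary illustrative example (an assumption).
A = [1+2i, 0;
     3,    1i;
     2-1i, 4];                      % complex n-by-D matrix with n = 3, D = 2
[n, D] = size(A);
[U, S, V] = svd(A);                 % U: n-by-n, S: n-by-D, V: D-by-D

% In MATLAB, ' is the conjugate (Hermitian) transpose.
errU = norm(U'*U - eye(n));         % deviation of U^H U from the identity
errV = norm(V'*V - eye(D));         % deviation of V^H V from the identity
errA = norm(U*S*V' - A);            % reconstruction error of A = U S V^H

fprintf('||U^H U - I|| = %.1e\n', errU);
fprintf('||V^H V - I|| = %.1e\n', errV);
fprintf('||U S V^H - A|| = %.1e\n', errA);
```

All three values should be at the level of floating-point round-off. Note that `U` is 3-by-3 while `V` is 2-by-2, so the two bases live in spaces of different dimension.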
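Given the shapes of the three matrices on the right-hand side of equation $(1.4)$, the geometric interpretation is that the unitary matrices $\mathbf{U}$ and $\mathbf{V}$ act as rotations (in general, a rotation possibly combined with a reflection), while the diagonal matrix acts as a scaling. The eigendecomposition of a symmetric matrix performs "rotate ($\mathbf{Q}^\text{T}$) -> scale ($\boldsymbol{\Lambda}$) -> rotate ($\mathbf{Q}$)", whereas the singular value decomposition performs "rotate ($\mathbf{V}^\text{H}$) -> scale ($\mathbf{S}$) -> rotate ($\mathbf{U}$)". One clear difference is that the rotation by $\mathbf{V}^\text{H}$ takes place in $\mathbf{C}^{D}$, while the rotation by $\mathbf{U}$ takes place in $\mathbf{C}^{n}$.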
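To make the difference in spaces explicit, the two chains of maps can be written out with their dimensions. This is a sketch under the assumption that equation $(1.4)$ is the factorization $\mathbf{A}=\mathbf{U}\mathbf{S}\mathbf{V}^\text{H}$ with $\mathbf{A}\in\mathbf{C}^{n\times D}$, which is what the basis dimensions above suggest; the input vector $\boldsymbol{x}$ is a symbol introduced here for illustration.

$$
\begin{aligned}
\text{SVD:}\quad & \boldsymbol{x}\in\mathbf{C}^{D}
 \;\xrightarrow{\;\mathbf{V}^\text{H}\;}\; \mathbf{C}^{D}
 \;\xrightarrow{\;\mathbf{S}\ (n\times D)\;}\; \mathbf{C}^{n}
 \;\xrightarrow{\;\mathbf{U}\;}\; \mathbf{C}^{n} \\
\text{Eigendecomposition:}\quad & \boldsymbol{x}\in\mathbf{R}^{n}
 \;\xrightarrow{\;\mathbf{Q}^\text{T}\;}\; \mathbf{R}^{n}
 \;\xrightarrow{\;\boldsymbol{\Lambda}\;}\; \mathbf{R}^{n}
 \;\xrightarrow{\;\mathbf{Q}\;}\; \mathbf{R}^{n}
\end{aligned}
$$

In the SVD chain only the scaling step changes the dimension, from $D$ to $n$. In the eigendecomposition chain every step stays in the same space, and the second rotation is the inverse of the first.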
The following example illustrates this step-by-step geometric transformation. Consider the matrix $\mathbf{A}$ and the two unit vectors $\mathbf{e}_1$ and $\mathbf{e}_2$:
$$
A=\begin{bmatrix}1.625&0.6495\\0.6495&0.875\end{bmatrix}, \quad e_1=\begin{bmatrix}1\\0\end{bmatrix},\quad e_2=\begin{bmatrix}0\\1\end{bmatrix}
$$
The singular value decomposition of $\mathbf{A}$ is:
$$
A=USV^\mathrm{T}=\begin{bmatrix}0.866&-0.5\\0.5&0.866\end{bmatrix}\begin{bmatrix}2&0\\0&0.5\end{bmatrix}\begin{bmatrix}0.866&0.5\\-0.5&0.866\end{bmatrix}
$$
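The entries 0.866 and 0.5 are, to three decimals, $\cos 30^\circ$ and $\sin 30^\circ$, so both rotation factors in this example are a 30-degree rotation. The sketch below rebuilds $\mathbf{A}$ from that observation and compares it with the rounded matrix given above; the names `R`, `A_rebuilt` and `err` are introduced here for illustration.

```matlab
% Minimal sketch: rebuild A from a 30-degree rotation and the scaling diag(2, 0.5).
theta = pi/6;                               % 30 degrees: cos = 0.866..., sin = 0.5
R = [cos(theta), -sin(theta);
     sin(theta),  cos(theta)];              % plays the role of both U and V here
S = diag([2, 0.5]);                         % singular values on the diagonal

A_rebuilt = R * S * R';                     % U * S * V^T with U = V = R
A = [1.625, 0.6495;
     0.6495, 0.875];                        % matrix from the text (rounded entries)

err = max(abs(A_rebuilt(:) - A(:)));        % largest entrywise difference
disp(A_rebuilt);
fprintf('max entrywise difference = %.1e\n', err);
```

The difference is small but not zero, because the off-diagonal entry 0.6495 in the text is a rounded value. The third factor in the product is the transpose of `R`, which is why its off-diagonal signs are the opposite of those in the first factor.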
The two unit vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ are first rotated by $\mathbf{V}^\text{T}$, which gives:
$$
\begin{gathered}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1=\begin{bmatrix}0.866&0.5\\-0.5&0.866\end{bmatrix}\begin{bmatrix}1\\0\end{bmatrix}=\begin{bmatrix}0.866\\-0.5\end{bmatrix}\\\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2=\begin{bmatrix}0.866&0.5\\-0.5&0.866\end{bmatrix}\begin{bmatrix}0\\1\end{bmatrix}=\begin{bmatrix}0.5\\0.866\end{bmatrix}\end{gathered} \quad(2.1)
$$
Starting from equation $(2.1)$, scaling by the diagonal matrix $\mathbf{S}$ gives:
$$
\begin{gathered}\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_1=\begin{bmatrix}2&0\\0&0.5\end{bmatrix}\begin{bmatrix}0.866\\-0.5\end{bmatrix}=\begin{bmatrix}1.732\\-0.25\end{bmatrix}\\\boldsymbol{S}\boldsymbol{V}^\mathrm{T}\boldsymbol{e}_2=\begin{bmatrix}2&0\\0&0.5\end{bmatrix}\begin{bmatrix}0.5\\0.866\end{bmatrix}=\begin{bmatrix}1\\0.433\end{bmatrix}\end{gathered} \quad(2.2)
$$
Finally, after the rotation ($\mathbf{V}^\text{T}$) and the scaling ($\mathbf{S}$), rotating by $\mathbf{U}$ gives:
$$
\boldsymbol{A}\boldsymbol{e}_1 = \boldsymbol{USV}^\mathrm{T}\boldsymbol{e}_1=\begin{bmatrix}0.866&-0.5\\0.5&0.866\end{bmatrix}\begin{bmatrix}1.732\\-0.25\end{bmatrix}=\begin{bmatrix}1.625\\0.6495\end{bmatrix}
$$
$$
\boldsymbol{A}\boldsymbol{e}_2 = \boldsymbol{USV}^\mathrm{T}\boldsymbol{e}_2=\begin{bmatrix}0.866&-0.5\\0.5&0.866\end{bmatrix}\begin{bmatrix}1\\0.433\end{bmatrix}=\begin{bmatrix}0.6495\\0.875\end{bmatrix}
$$
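The process above is illustrated in Figure 2-1:
Figure 2-1. The geometric view of singular value decomposition
The MATLAB code below reproduces Figure 2-1: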
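One consequence of the rotate, scale, rotate picture is that the unit circle is mapped to an ellipse whose semi-axis lengths are the singular values. The sketch below checks this numerically for the example matrix by sampling the circle; the sampling density and the variable names are choices made here for illustration.

```matlab
% Minimal sketch: the longest and shortest images of unit vectors under A
% have lengths equal to the largest and smallest singular values.
A = [1.625, 0.6495;
     0.6495, 0.875];

theta = linspace(0, 2*pi, 3601);            % dense sampling of the unit circle
X = [cos(theta); sin(theta)];               % each column is a unit vector
lens = sqrt(sum((A*X).^2, 1));              % Euclidean length of each image A*x

longest  = max(lens);                       % semi-major axis of the ellipse
shortest = min(lens);                       % semi-minor axis of the ellipse
s = svd(A);                                 % singular values, in descending order

fprintf('longest  = %.4f, s(1) = %.4f\n', longest,  s(1));
fprintf('shortest = %.4f, s(2) = %.4f\n', shortest, s(2));
```

Both pairs should agree to several decimals, with values close to 2 and 0.5, the diagonal entries of $\mathbf{S}$ in the example.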

%%  ==========1. Plot the unit circle and the two unit vectors==========
% theta: 100 equally spaced points from 0 to 2*pi
theta = linspace(0, 2*pi, 100);

% x and y coordinates of the circle
circle_x1 = cos(theta);
circle_x2 = sin(theta);

% Define the two unit vectors (one vector per row)
X_vec = [1, 0;
         0, 1];

% Plot the original circle and the two vectors
figure;
plot(circle_x1, circle_x2, 'k--', 'LineWidth', 0.5); hold on;

% Plot the two vectors
quiver(0, 0, X_vec(1,1), X_vec(1,2), 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, X_vec(2,1), X_vec(2,2), 'Color', [1, 0, 0], 'MaxHeadSize', 1);

% Configure the axes
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('Original', 'Interpreter', 'latex');
grid on;
hold off;

%% ==========2. Plot the circle and vectors after the linear transformation==========
% Define matrix A
A = [1.6250, 0.6495;
     0.6495, 0.8750];

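% Coordinates of the circle after the linear transformation
% (points are stored as rows, so x' * A' equals (A * x)')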
X_circle = [circle_x1; circle_x2]';
transformed_circle = X_circle * A';

% Vectors after the linear transformation
transformed_vec = X_vec * A';

% Plot the transformed circle and vectors
figure;
plot(transformed_circle(:,1), transformed_circle(:,2), 'k--', 'LineWidth', 0.5); hold on;

% Plot the two transformed vectors
quiver(0, 0, transformed_vec(1,1), transformed_vec(1,2), 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, transformed_vec(2,1), transformed_vec(2,2), 'Color', [1, 0, 0], 'MaxHeadSize', 1);

% Configure the axes
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('$A$', 'Interpreter', 'latex');
grid on;
hold off;

%% ==========3. Decompose with SVD and plot each step==========
% Compute the SVD
[U, S, V] = svd(A);

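% Adjust signs: flip the first column of V and of U together, which leaves
% U*S*V' unchanged (this assumes svd returned those columns negated
% relative to the factors shown in the text)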
V(:,1) = -V(:,1);
U(:,1) = -U(:,1);

% Print the SVD factors
disp('=== U ===');
disp(U);
disp('=== S ===');
disp(S);
disp('=== V ===');
disp(V);

% Plot the circle and vectors after applying V^T
% (points are stored as rows, so x' * V equals (V' * x)')
figure;
transformed_circle_v = X_circle * V;
transformed_vec_v = X_vec *  V;
plot(transformed_circle_v(:,1), transformed_circle_v(:,2), 'k--', 'LineWidth', 0.5); hold on;
quiver(0, 0, transformed_vec_v(1,1), transformed_vec_v(1,2), 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, transformed_vec_v(2,1), transformed_vec_v(2,2), 'Color', [1, 0, 0], 'MaxHeadSize', 1);
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('$V^T$', 'Interpreter', 'latex');
grid on;
hold off;

% Plot the circle and vectors after applying S*V^T
figure;
transformed_circle_sv = X_circle * V * S;
transformed_vec_sv = X_vec *  V * S;
plot(transformed_circle_sv(:,1), transformed_circle_sv(:,2), 'k--', 'LineWidth', 0.5); hold on;
quiver(0, 0, transformed_vec_sv(1,1), transformed_vec_sv(1,2), 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, transformed_vec_sv(2,1), transformed_vec_sv(2,2), 'Color', [1, 0, 0], 'MaxHeadSize', 1);
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('$SV^T$', 'Interpreter', 'latex');
grid on;
hold off;

% Plot the circle and vectors after applying U*S*V^T
figure;
transformed_circle_usv = X_circle * V * S * U';
transformed_vec_usv = X_vec *  V * S * U';
plot(transformed_circle_usv(:,1), transformed_circle_usv(:,2), 'k--', 'LineWidth', 0.5); hold on;
quiver(0, 0, transformed_vec_usv(1,1), transformed_vec_usv(1,2), 'Color', [0, 0.4392, 0.7529], 'MaxHeadSize', 1);
quiver(0, 0, transformed_vec_usv(2,1), transformed_vec_usv(2,2), 'Color', [1, 0, 0], 'MaxHeadSize', 1);
ax = gca;
ax.XColor = 'k';
ax.YColor = 'k';
ax.XLim = [-2.5, 2.5];
ax.YLim = [-2.5, 2.5];
axis equal;
xlabel('$x_1$', 'Interpreter', 'latex');
ylabel('$x_2$', 'Interpreter', 'latex');
title('$USV^T$', 'Interpreter', 'latex');
grid on;
hold off;

%% ==========4. Step-by-step computation from e1 and e2 to the final result==========
% Define the unit vectors e1 and e2
e1 = [1; 0];
e2 = [0; 1];

% Step-by-step computation: rotate by V^T, scale by S, rotate by U
VT_e1 = V' * e1;
VT_e2 = V' * e2;

S_VT_e1 = S * VT_e1;
S_VT_e2 = S * VT_e2;

U_S_VT_e1 = U * S_VT_e1;
U_S_VT_e2 = U * S_VT_e2;

% Print the results
disp('=== VT_e1 ===');
disp(VT_e1);
disp('=== VT_e2 ===');
disp(VT_e2);
disp('=== S_VT_e1 ===');
disp(S_VT_e1);
disp('=== S_VT_e2 ===');
disp(S_VT_e2);
disp('=== U_S_VT_e1 ===');
disp(U_S_VT_e1);
disp('=== U_S_VT_e2 ===');
disp(U_S_VT_e2);
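
The listing flips the signs of the first columns of `U` and `V` unconditionally, which gives the factors shown in the text only if `svd` happened to return those columns negated. A possible alternative, sketched below, is to fix the signs with an explicit rule. The rule used here (make each diagonal entry of `U` nonnegative) is an assumption chosen because it reproduces the factors of this particular example; it is not a standard convention and it is not from the original listing.

```matlab
% Minimal sketch: normalize SVD signs with an explicit (hypothetical) rule.
A = [1.625, 0.6495;
     0.6495, 0.875];
[U, S, V] = svd(A);

for k = 1:size(U, 2)
    if U(k, k) < 0                  % assumed rule: diagonal entries of U >= 0
        U(:, k) = -U(:, k);         % flip the k-th left singular vector
        V(:, k) = -V(:, k);         % flip its partner so U*S*V' is unchanged
    end
end

disp(U);                            % close to [0.866 -0.5; 0.5 0.866]
disp(V);                            % equal to U up to round-off for this A
fprintf('reconstruction error = %.1e\n', norm(U*S*V' - A));
```

Flipping a column of `U` together with the matching column of `V` never changes the product, so any such rule yields a valid decomposition; the rule only selects one representative among them.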

References

[1] Visualize-ML. 2024. Visualize-ML/Book4_Power-of-Matrix. GitHub repository. https://github.com/Visualize-ML/Book4_Power-of-Matrix
