To overcome the unimodality of a single Gaussian, a Gaussian mixture introduces several Gaussian components and fits the data with their weighted average.
Geometric view
- $$p(x)=\sum\limits_{k=1}^K\alpha_k\mathcal{N}(\mu_k,\Sigma_k),\quad\sum\limits_{k=1}^K\alpha_k=1$$
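A minimal numeric sketch of this weighted-average view, assuming a hypothetical two-component 1-D mixture (the weights, means, and standard deviations below are made up):

```python
# Evaluate p(x) = sum_k alpha_k * N(x | mu_k, sigma_k^2) for a toy 1-D mixture.
import numpy as np
from scipy.stats import norm

alphas = np.array([0.3, 0.7])    # mixing weights, sum to 1
mus    = np.array([-2.0, 3.0])   # component means
sigmas = np.array([1.0, 0.5])    # component standard deviations

def gmm_pdf(x):
    """Weighted average of the K Gaussian densities at x."""
    return sum(a * norm.pdf(x, loc=m, scale=s)
               for a, m, s in zip(alphas, mus, sigmas))

print(gmm_pdf(0.0))  # mixture density at x = 0
```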
Generative model view
- A discrete latent variable $$z$$ selects which Gaussian component the sample comes from.
- $$p(x)=\sum\limits_zp(x,z)=\sum\limits_{k=1}^Kp(x,z=k)=\sum\limits_{k=1}^Kp(z=k)p(x|z=k)$$
- which gives $$p(x)=\sum\limits_{k=1}^Kp_k\mathcal{N}(x|\mu_k,\Sigma_k)$$
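The generative view can be sketched as ancestral sampling; the parameters below are hypothetical:

```python
# Draw z from a categorical distribution over the K components,
# then draw x from the selected Gaussian N(mu_z, Sigma_z).
import numpy as np

rng = np.random.default_rng(0)
p_z    = np.array([0.3, 0.7])                        # p(z = k)
mus    = [np.array([0.0, 0.0]), np.array([4.0, 4.0])]
Sigmas = [np.eye(2), 0.5 * np.eye(2)]

def sample_one():
    k = rng.choice(len(p_z), p=p_z)                  # z ~ Categorical(p_z)
    x = rng.multivariate_normal(mus[k], Sigmas[k])   # x | z=k ~ N(mu_k, Sigma_k)
    return k, x

z, x = sample_one()
```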
Solving with [[MLE]]
- Try to solve for the $$p(x)$$ above by maximum likelihood.
- $$\theta_{MLE}=\mathop{argmax}\limits_\theta\log p(X)=\mathop{argmax}\limits_\theta\sum\limits_{i=1}^N\log p(x_i)=\mathop{argmax}\limits_\theta\sum\limits_{i=1}^N\log\sum\limits_{k=1}^Kp_k\mathcal{N}(x_i|\mu_k,\Sigma_k)$$
- Because a summation sits inside the log, no closed-form solution can be obtained.
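One way to see this (a standard computation, not part of the original note): setting the derivative of the log-likelihood with respect to $$\mu_k$$ to zero gives

$$\frac{\partial}{\partial\mu_k}\sum\limits_{i=1}^N\log\sum\limits_{j=1}^Kp_j\mathcal{N}(x_i|\mu_j,\Sigma_j)=\sum\limits_{i=1}^N\frac{p_k\mathcal{N}(x_i|\mu_k,\Sigma_k)}{\sum\limits_{j=1}^Kp_j\mathcal{N}(x_i|\mu_j,\Sigma_j)}\Sigma_k^{-1}(x_i-\mu_k)=0$$

The weight in front of $$\Sigma_k^{-1}(x_i-\mu_k)$$ depends on every $$p_j,\mu_j,\Sigma_j$$, so the stationarity conditions are coupled and admit no closed-form solution.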
Solving with [[EM]]
- EM objective: $$\theta^{(t+1)}=\mathop{argmax}\limits_\theta\int_Z\log[p(X,Z|\theta)]p(Z|X,\theta^{(t)})dZ=\mathop{argmax}\limits_\theta\mathbb{E}_{Z|X,\theta^{(t)}}[\log p(X,Z|\theta)]$$
- E-Step
- $$Q(\theta,\theta^t)=\sum\limits_z\left[\log\prod\limits_{i=1}^Np(x_i,z_i|\theta)\right]\prod\limits_{i=1}^Np(z_i|x_i,\theta^t)$$
+ Expanding the sum produced by the log of the product, the first ($$i=1$$) term is:
+ $$\sum\limits_z\log p(x_1,z_1|\theta)\prod\limits_{i=1}^Np(z_i|x_i,\theta^t)=\sum\limits_z\log p(x_1,z_1|\theta)p(z_1|x_1,\theta^t)\prod\limits_{i=2}^Np(z_i|x_i,\theta^t)\\
=\sum\limits_{z_1}\log p(x_1,z_1|\theta)p(z_1|x_1,\theta^t)\sum\limits_{z_2,\cdots,z_N}\prod\limits_{i=2}^Np(z_i|x_i,\theta^t)\\
=\sum\limits_{z_1}\log p(x_1,z_1|\theta)p(z_1|x_1,\theta^t)$$
+ The trailing factor does not involve $$z_1$$, and summing it over $$z_2,\cdots,z_N$$ gives 1. Applying the same argument to every term:
+ $$Q(\theta,\theta^t)=\sum\limits_{i=1}^N\sum\limits_{z_i}\log p(x_i,z_i|\theta)p(z_i|x_i,\theta^t)$$
+ $$p(x,z|\theta)=p(z|\theta)p(x|z,\theta)=p_z\mathcal{N}(x|\mu_z,\Sigma_z)$$
+ $$p(z|x,\theta^t)=\frac{p(x,z|\theta^t)}{p(x|\theta^t)}=\frac{p_z^t\mathcal{N}(x|\mu_z^t,\Sigma_z^t)}{\sum\limits_kp_k^t\mathcal{N}(x|\mu_k^t,\Sigma_k^t)}$$
+ $$Q=\sum\limits_{i=1}^N\sum\limits_{z_i}\log p_{z_i}\mathcal{N}(x_i|\mu_{z_i},\Sigma_{z_i})\frac{p_{z_i}^t\mathcal{N}(x_i|\mu_{z_i}^t,\Sigma_{z_i}^t)}{\sum\limits_kp_k^t\mathcal{N}(x_i|\mu_k^t,\Sigma_k^t)}$$
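A sketch of this E-step as code, assuming a hypothetical data matrix `X` of shape (N, D) and current-iterate parameters `p`, `mus`, `Sigmas`:

```python
# E-step sketch: responsibilities gamma[i, k] = p(z_i = k | x_i, theta^t).
import numpy as np
from scipy.stats import multivariate_normal

def e_step(X, p, mus, Sigmas):
    N, K = X.shape[0], len(p)
    gamma = np.zeros((N, K))
    for k in range(K):
        # numerator: p_k^t * N(x_i | mu_k^t, Sigma_k^t) for every data point
        gamma[:, k] = p[k] * multivariate_normal.pdf(X, mean=mus[k], cov=Sigmas[k])
    gamma /= gamma.sum(axis=1, keepdims=True)  # normalize over k (the denominator above)
    return gamma
```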
- M-Step
- $$Q=\sum\limits_{k=1}^K\sum\limits_{i=1}^N[\log p_k+\log\mathcal{N}(x_i|\mu_k,\Sigma_k)]p(z_i=k|x_i,\theta^t)$$
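To see how this follows from the previous expression for $$Q$$: the inner sum over $$z_i$$ runs over the $$K$$ component labels, so writing $$z_i=k$$, splitting the log, and swapping the order of summation gives

$$Q=\sum\limits_{i=1}^N\sum\limits_{k=1}^K\log\left[p_k\mathcal{N}(x_i|\mu_k,\Sigma_k)\right]p(z_i=k|x_i,\theta^t)=\sum\limits_{k=1}^K\sum\limits_{i=1}^N[\log p_k+\log\mathcal{N}(x_i|\mu_k,\Sigma_k)]p(z_i=k|x_i,\theta^t)$$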
- $$p_k^{t+1}=\mathop{argmax}\limits_{p_k}\sum\limits_{k=1}^K\sum\limits_{i=1}^N[\log p_k+\log\mathcal{N}(x_i|\mu_k,\Sigma_k)]p(z_i=k|x_i,\theta^t)\quad s.t.\ \sum\limits_{k=1}^Kp_k=1$$
- Simplifying (only the $$\log p_k$$ term depends on $$p_k$$): $$p_k^{t+1}=\mathop{argmax}\limits_{p_k}\sum\limits_{k=1}^K\sum\limits_{i=1}^N\log p_k\,p(z_i=k|x_i,\theta^t)\quad s.t.\ \sum\limits_{k=1}^Kp_k=1$$
- [[Lagrange Multiplier]] $$L(p_k,\lambda)=\sum\limits_{k=1}^K\sum\limits_{i=1}^N\log p_k\,p(z_i=k|x_i,\theta^t)-\lambda(1-\sum\limits_{k=1}^Kp_k)$$
- $$\frac{\partial L}{\partial p_k}=\sum\limits_{i=1}^N\frac{1}{p_k}p(z_i=k|x_i,\theta^t)+\lambda=0\Rightarrow\sum\limits_{i=1}^Np(z_i=k|x_i,\theta^t)+\lambda p_k=0$$
+ Summing this over $$k$$ and using $$\sum\limits_{k=1}^Kp_k=1$$ and $$\sum\limits_{k=1}^Kp(z_i=k|x_i,\theta^t)=1$$ gives $$N+\lambda=0\Rightarrow\lambda=-N$$, so
+ $$p_k^{t+1}=\frac{1}{N}\sum\limits_{i=1}^Np(z_i=k|x_i,\theta^t)$$
+ $$\mu_k,\Sigma_k$$ are unconstrained parameters, so they are obtained by direct differentiation.
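For reference, setting those derivatives to zero yields the standard closed-form updates (writing $$\gamma_{ik}=p(z_i=k|x_i,\theta^t)$$; this is the textbook result, not derived in this note):

$$\mu_k^{t+1}=\frac{\sum\limits_{i=1}^N\gamma_{ik}\,x_i}{\sum\limits_{i=1}^N\gamma_{ik}},\qquad\Sigma_k^{t+1}=\frac{\sum\limits_{i=1}^N\gamma_{ik}\,(x_i-\mu_k^{t+1})(x_i-\mu_k^{t+1})^T}{\sum\limits_{i=1}^N\gamma_{ik}}$$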
Comparing MLE and EM for solving the GMM problem
- $$\theta_{MLE}=\mathop{argmax}\limits_\theta\log p(X)=\mathop{argmax}\limits_\theta\sum\limits_{i=1}^N\log p(x_i)=\mathop{argmax}\limits_\theta\sum\limits_{i=1}^N\log\sum\limits_{k=1}^Kp_k\mathcal{N}(x_i|\mu_k,\Sigma_k)$$
- $$Q=\sum\limits_{k=1}^K\sum\limits_{i=1}^N[\log p_k+\log\mathcal{N}(x_i|\mu_k,\Sigma_k)]p(z_i=k|x_i,\theta^t)$$
- Maximum likelihood has to work with $$P(X)$$ directly, while EM works with $$P(X,Z)$$. Since $$P(X)=\sum_ZP(X,Z)$$, each EM iteration only handles $$P(X,Z)$$, which avoids the summation inside the log.
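Putting the two steps together, a minimal self-contained EM sketch for a GMM (the initialization, the `1e-6` regularization term, and the fixed iteration count are simplistic choices, not from the note):

```python
# Minimal EM loop for a GMM: alternate the E-step (responsibilities) and the
# M-step (closed-form updates of p_k, mu_k, Sigma_k derived above).
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, K, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    N, D = X.shape
    p = np.full(K, 1.0 / K)                     # mixing weights p_k
    mus = X[rng.choice(N, K, replace=False)]    # initialize means at random data points
    Sigmas = np.array([np.cov(X.T) + 1e-6 * np.eye(D) for _ in range(K)])
    for _ in range(n_iter):
        # E-step: gamma[i, k] = p(z_i = k | x_i, theta^t)
        gamma = np.column_stack([
            p[k] * multivariate_normal.pdf(X, mean=mus[k], cov=Sigmas[k])
            for k in range(K)
        ])
        gamma /= gamma.sum(axis=1, keepdims=True)
        # M-step: maximize Q(theta, theta^t)
        Nk = gamma.sum(axis=0)                  # effective number of points per component
        p = Nk / N                              # p_k^{t+1} = (1/N) sum_i gamma_ik
        mus = (gamma.T @ X) / Nk[:, None]       # mu_k^{t+1}
        for k in range(K):
            diff = X - mus[k]
            Sigmas[k] = (gamma[:, k, None] * diff).T @ diff / Nk[k] + 1e-6 * np.eye(D)
    return p, mus, Sigmas
```

Usage would look like `p_hat, mu_hat, Sigma_hat = em_gmm(X, K=3)` on an (N, D) data matrix `X`.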
The precise meaning of the E and M steps still needs to be written up!