PRML/2.24

$$\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta}]=\mathbb{E}_{\mathcal{D}}\left[\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]+\operatorname{var}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]$$
Expand the left-hand side using the definition of variance:

  • $\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta}]=\mathbb{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2\right]-\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta}]$

We therefore need to show that

$$\mathbb{E}_{\mathcal{D}}\left[\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]+\operatorname{var}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]=\mathbb{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2\right]-\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta}]$$

  • First term on the left-hand side

    • $$\begin{aligned} \mathbb{E}_{\mathcal{D}}\left[\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right] & =\int \operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\, p(\mathcal{D})\, \mathrm{d} \mathcal{D} \\ & =\int\left(\mathbb{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2 \mid \mathcal{D}\right]-\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid \mathcal{D}]\right) p(\mathcal{D})\, \mathrm{d} \mathcal{D} \\ & =\int \mathbb{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2 \mid \mathcal{D}\right] p(\mathcal{D})\, \mathrm{d} \mathcal{D}-\int \mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid \mathcal{D}]\, p(\mathcal{D})\, \mathrm{d} \mathcal{D} \\ & =\iint \boldsymbol{\theta}^2\, p(\boldsymbol{\theta} \mid \mathcal{D})\, p(\mathcal{D})\, \mathrm{d} \boldsymbol{\theta}\, \mathrm{d} \mathcal{D}-\mathbb{E}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid \mathcal{D}]\right] \\ & =\int \boldsymbol{\theta}^2\, p(\boldsymbol{\theta})\, \mathrm{d} \boldsymbol{\theta}-\mathbb{E}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid \mathcal{D}]\right] \\ & =\mathbb{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2\right]-\mathbb{E}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid \mathcal{D}]\right]\end{aligned}$$
      The fifth line marginalizes over the data: $p(\boldsymbol{\theta})=\int p(\boldsymbol{\theta} \mid \mathcal{D})\, p(\mathcal{D})\, \mathrm{d} \mathcal{D}$.
  • Second term on the left-hand side

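    • $$\begin{aligned} \operatorname{var}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right] & =\mathbb{E}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid \mathcal{D}]\right]-\mathbb{E}_{\mathcal{D}}^2\left[\mathbb{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right] \\ & =\mathbb{E}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid \mathcal{D}]\right]-\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta}]\end{aligned}$$
      The second line uses the law of total expectation, $\mathbb{E}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]=\mathbb{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta}]$.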
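  • Adding the two terms, the $\mathbb{E}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid \mathcal{D}]\right]$ contributions cancel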

    • $$\mathbb{E}_{\mathcal{D}}\left[\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]+\operatorname{var}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]=\mathbb{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2\right]-\mathbb{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta}]$$
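
The identity can also be checked numerically on a concrete model. The sketch below is not part of the original solution: it assumes a Beta(a, b) prior on θ with integer a and b, and data consisting of n Bernoulli trials summarized by the success count m, so that p(D) is a beta-binomial distribution and the posterior is Beta(a + m, b + n − m). All function names are made up for illustration. Exact rational arithmetic is used so that the two sides can be compared with `==` rather than a tolerance.

```python
from fractions import Fraction
from math import comb


def rising(x, k):
    """Rising factorial x (x+1) ... (x+k-1) for integer x (equals 1 when k == 0)."""
    out = Fraction(1)
    for i in range(k):
        out *= x + i
    return out


def beta_binomial_pmf(m, n, a, b):
    """p(D): probability of m successes in n trials with theta ~ Beta(a, b) integrated out."""
    return comb(n, m) * rising(a, m) * rising(b, n - m) / rising(a + b, n)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return Fraction(a, a + b)


def beta_var(a, b):
    """Variance of a Beta(a, b) distribution."""
    return Fraction(a * b, (a + b) ** 2 * (a + b + 1))


def total_variance_terms(a, b, n):
    """Return (prior variance, E_D[posterior variance], var_D[posterior mean])."""
    weights = [beta_binomial_pmf(m, n, a, b) for m in range(n + 1)]
    post_means = [beta_mean(a + m, b + n - m) for m in range(n + 1)]
    post_vars = [beta_var(a + m, b + n - m) for m in range(n + 1)]

    # First term: posterior variance averaged over the data distribution.
    expected_post_var = sum(w * v for w, v in zip(weights, post_vars))

    # Second term: variance of the posterior mean over the data distribution.
    mean_of_means = sum(w * mu for w, mu in zip(weights, post_means))
    var_of_means = sum(w * (mu - mean_of_means) ** 2 for w, mu in zip(weights, post_means))

    return beta_var(a, b), expected_post_var, var_of_means


if __name__ == "__main__":
    prior_var, exp_post_var, var_post_mean = total_variance_terms(a=2, b=3, n=5)
    print("prior variance        :", prior_var)
    print("E_D[var(theta | D)]   :", exp_post_var)
    print("var_D[E(theta | D)]   :", var_post_mean)
    print("identity holds exactly:", prior_var == exp_post_var + var_post_mean)
```

Since both terms on the right are non-negative, the check also illustrates the usual reading of the result: on average over data sets, the posterior variance is no larger than the prior variance.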
Author

Ryen Xiang

Published

2024-10-05

Updated

2024-10-05
