PRML/2.24
$\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta}]=\mathbb{E}_{\mathcal{D}}\left[\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]+\operatorname{var}_{\mathcal{D}}\left[\mathbb{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid \mathcal{D}]\right]$
Expand the left-hand side using the definition of variance
- $\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta}]=\mathrm{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2\right]-\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta}]$
It remains to show (writing $D$ for the data set $\mathcal{D}$) that $\mathrm{E}_D\left[\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right]+\operatorname{var}_D\left[\mathrm{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right]=\mathrm{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2\right]-\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta}]$
First term on the left
- $\begin{aligned} \mathrm{E}_D\left[\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right] & =\int \operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\, p(D)\, \mathrm{d} D \\ & =\int\left(\mathrm{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2 \mid D\right]-\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid D]\right) p(D)\, \mathrm{d} D \\ & =\int \mathrm{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2 \mid D\right] p(D)\, \mathrm{d} D-\int \mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid D]\, p(D)\, \mathrm{d} D \\ & =\iint \boldsymbol{\theta}^2 p(\boldsymbol{\theta} \mid D)\, \mathrm{d} \boldsymbol{\theta}\, p(D)\, \mathrm{d} D-\mathrm{E}_D\left[\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid D]\right] \\ & =\int \boldsymbol{\theta}^2 p(\boldsymbol{\theta})\, \mathrm{d} \boldsymbol{\theta}-\mathrm{E}_D\left[\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid D]\right] \\ & =\mathrm{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2\right]-\mathrm{E}_D\left[\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid D]\right]\end{aligned}$
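The step from the double integral to the single integral over $\boldsymbol{\theta}$ relies on the product rule $p(\boldsymbol{\theta} \mid D)\, p(D)=p(\boldsymbol{\theta}, D)$ followed by marginalizing out $D$. Spelled out, assuming the order of integration can be exchanged:
- $\begin{aligned} \iint \boldsymbol{\theta}^2 p(\boldsymbol{\theta} \mid D)\, p(D)\, \mathrm{d} \boldsymbol{\theta}\, \mathrm{d} D & =\int \boldsymbol{\theta}^2\left(\int p(\boldsymbol{\theta}, D)\, \mathrm{d} D\right) \mathrm{d} \boldsymbol{\theta} \\ & =\int \boldsymbol{\theta}^2 p(\boldsymbol{\theta})\, \mathrm{d} \boldsymbol{\theta}\end{aligned}$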
Second term on the left
- $\begin{aligned} \operatorname{var}_D\left[\mathrm{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right] & =\mathrm{E}_D\left[\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid D]\right]-\mathrm{E}_D^2\left[\mathrm{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right] \\ & =\mathrm{E}_D\left[\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta} \mid D]\right]-\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta}]\end{aligned}$
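The last equality replaces $\mathrm{E}_D\left[\mathrm{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right]$ with $\mathrm{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta}]$ before squaring. This is the law of total expectation, which can be written out as follows:
- $\begin{aligned} \mathrm{E}_D\left[\mathrm{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right] & =\int\left(\int \boldsymbol{\theta}\, p(\boldsymbol{\theta} \mid D)\, \mathrm{d} \boldsymbol{\theta}\right) p(D)\, \mathrm{d} D \\ & =\iint \boldsymbol{\theta}\, p(\boldsymbol{\theta}, D)\, \mathrm{d} D\, \mathrm{d} \boldsymbol{\theta} \\ & =\int \boldsymbol{\theta}\, p(\boldsymbol{\theta})\, \mathrm{d} \boldsymbol{\theta}=\mathrm{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta}]\end{aligned}$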
Adding the two terms
- $\mathrm{E}_D\left[\operatorname{var}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right]+\operatorname{var}_D\left[\mathrm{E}_{\boldsymbol{\theta}}[\boldsymbol{\theta} \mid D]\right]=\mathrm{E}_{\boldsymbol{\theta}}\left[\boldsymbol{\theta}^2\right]-\mathrm{E}_{\boldsymbol{\theta}}^2[\boldsymbol{\theta}]$
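As a sanity check, the identity can be verified numerically on a small discrete example. The sketch below is only an illustration: the joint table `joint`, the support `theta_vals`, and the helper `moment` are hypothetical choices made here, with sums standing in for the integrals in the proof.

```python
import math

# Hypothetical joint distribution p(theta, D): joint[i][j] = p(theta_vals[i], D = j).
theta_vals = [0.0, 1.0, 3.0]
joint = [
    [0.10, 0.20],
    [0.25, 0.05],
    [0.15, 0.25],
]

n_theta, n_D = len(theta_vals), len(joint[0])
p_theta = [sum(row) for row in joint]                                  # marginal p(theta)
p_D = [sum(joint[i][j] for i in range(n_theta)) for j in range(n_D)]   # marginal p(D)


def moment(values, probs, k):
    """k-th raw moment of a discrete distribution."""
    return sum(v**k * p for v, p in zip(values, probs))


# Left-hand side: var[theta] = E[theta^2] - E[theta]^2
var_theta = moment(theta_vals, p_theta, 2) - moment(theta_vals, p_theta, 1) ** 2

# Conditional mean and variance of theta for each value of D
cond_mean, cond_var = [], []
for j in range(n_D):
    post = [joint[i][j] / p_D[j] for i in range(n_theta)]  # p(theta | D = j)
    m = moment(theta_vals, post, 1)
    cond_mean.append(m)
    cond_var.append(moment(theta_vals, post, 2) - m**2)

# Right-hand side: E_D[var[theta | D]] + var_D[E[theta | D]]
expected_cond_var = moment(cond_var, p_D, 1)
var_cond_mean = moment(cond_mean, p_D, 2) - moment(cond_mean, p_D, 1) ** 2

print(f"var[theta]          = {var_theta:.6f}")
print(f"E_D[var] + var_D[E] = {expected_cond_var + var_cond_mean:.6f}")
```

The two printed values should agree up to floating-point rounding. Since `var_cond_mean` is non-negative, the run also illustrates that the posterior variance is, on average over $D$, no larger than the prior variance.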