model uncertainty implementation notes

  • Model uncertainty can be obtained from dropout NN models.

Our approximate predictive distribution is
$$q(y^* | x^*) = \int p(y^*| x^*, w) q(w) dw$$
where $w = \{ W_i \}_{i=1}^L$ is the set of random variables for a model with $L$ layers.

Sample $T$ sets of Bernoulli-distributed random masks, which gives us $T$ sets of masked weights $\{ W_1^t, \dots, W_L^t \}_{t=1}^T$. Then
$$E_{q(y^*|x^*)} (y^*) \approx \frac{1}{T} \sum_{t=1}^T \hat{y}^*(x^*, W_1^t,...,W_L^t)$$

This is equivalent to performing $T$ stochastic forward passes through the network and averaging the results.
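A minimal numpy sketch of these $T$ stochastic forward passes, assuming a hypothetical one-hidden-layer regression network (the weights `W1`, `W2` and keep-probability `p_keep` are illustrative, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-hidden-layer regression net; weights are illustrative.
W1 = rng.normal(size=(3, 16), scale=0.3)
W2 = rng.normal(size=(16, 1), scale=0.3)
p_keep = 0.9  # probability of keeping a hidden unit (1 - dropout rate)

def stochastic_forward(x, rng):
    """One forward pass with a fresh Bernoulli dropout mask on the hidden layer."""
    h = np.maximum(x @ W1, 0.0)                      # ReLU hidden layer
    mask = rng.binomial(1, p_keep, size=h.shape)     # Bernoulli mask, resampled each pass
    return (h * mask) @ W2

x_star = rng.normal(size=(1, 3))
T = 100
preds = np.stack([stochastic_forward(x_star, rng) for _ in range(T)])
mc_mean = preds.mean(axis=0)  # MC estimate of E[y* | x*]
```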

As for the variance,
$$\mathrm{Var}_{q(y^*|x^*)} (y^*) \approx \tau^{-1} I_D + \frac{1}{T} \sum_{t=1}^T \hat{y}^*(x^*, W_1^t,...,W_L^t)^\top \hat{y}^*(x^*, W_1^t,...,W_L^t) - E_{q(y^*|x^*)}(y^*)^\top E_{q(y^*|x^*)}(y^*)$$

This is equivalent to the sample variance of $T$ stochastic forward passes through the NN plus the inverse model precision.
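A sketch of this variance estimate, with the stochastic forward-pass outputs faked as noise around a point estimate for illustration (the values of `T` and `tau` are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Assume `preds` holds the outputs of T stochastic forward passes at one input;
# here they are faked as noise around a point estimate for illustration.
T = 100
preds = 0.5 + 0.1 * rng.normal(size=T)
tau = 4.5  # model precision

predictive_mean = preds.mean()
# Predictive variance = sample variance of the passes + inverse model precision.
predictive_var = preds.var() + 1.0 / tau
```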

Model precision can be found from the following identity:
$$\tau = \frac{p l^2}{2 N \lambda}$$
where $p$ is the probability of keeping a unit, $l$ is the prior length-scale, $N$ is the number of training points, and $\lambda$ is the weight-decay coefficient.
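The identity is a one-liner; the argument values below are illustrative, not from the notes:

```python
def model_precision(p_keep, length_scale, n_train, weight_decay):
    """tau = p * l^2 / (2 * N * lambda)"""
    return (p_keep * length_scale ** 2) / (2 * n_train * weight_decay)

# Illustrative values: keep-prob 0.9, length-scale 1.0, N = 1000, lambda = 1e-4.
tau = model_precision(0.9, 1.0, 1000, 1e-4)  # -> 4.5
```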

We can estimate our predictive log-likelihood by Monte Carlo integration of the predictive probability of $y$, which is an estimate of how well the model fits the mean and uncertainty.

For regression, this is given by
$$\log p(y^*|x^*, X, Y) \approx \operatorname{logsumexp}_t \left( - \frac{1}{2} \tau ||y^* - \hat{y}_t^*||^2 \right) - \log T - \frac{1}{2} \log 2\pi + \frac{1}{2} \log \tau$$
where $\hat{y}_t^*$ is the output of the $t$-th stochastic forward pass.
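A sketch of this Monte Carlo log-likelihood estimate for 1-D regression, with logsumexp written out in a numerically stable form (the function name and test values are assumptions):

```python
import numpy as np

def predictive_log_likelihood(y_true, preds, tau):
    """MC estimate of log p(y*|x*) for 1-D regression:
    logsumexp_t(-tau/2 * (y - yhat_t)^2) - log T - 0.5*log(2*pi) + 0.5*log(tau)
    """
    T = preds.shape[0]
    a = -0.5 * tau * (y_true - preds) ** 2
    # Numerically stable logsumexp over the T passes.
    lse = np.log(np.sum(np.exp(a - a.max()))) + a.max()
    return lse - np.log(T) - 0.5 * np.log(2 * np.pi) + 0.5 * np.log(tau)
```

As a sanity check: if all $T$ passes hit $y^*$ exactly and $\tau = 1$, the estimate reduces to $-\frac{1}{2}\log 2\pi$.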

Procedure

Given point $x$:

  • Drop units at test time
  • Repeat $T$ times
  • Compute the mean and sample variance of the $T$ outputs (adding $\tau^{-1}$ for the predictive variance)
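The whole procedure can be tied together in one function; the network weights and hyperparameter values below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single-hidden-layer net; weights and keep-probability are assumptions.
W1 = rng.normal(size=(3, 16), scale=0.3)
W2 = rng.normal(size=(16, 1), scale=0.3)
P_KEEP = 0.9

def mc_dropout_predict(x, T, tau, rng):
    """Drop units at test time, repeat T times, return mean and predictive variance."""
    preds = []
    for _ in range(T):
        h = np.maximum(x @ W1, 0.0)
        mask = rng.binomial(1, P_KEEP, size=h.shape)
        preds.append((h * mask) @ W2)
    preds = np.stack(preds)
    mean = preds.mean(axis=0)
    var = preds.var(axis=0) + 1.0 / tau  # sample variance + inverse model precision
    return mean, var

mean, var = mc_dropout_predict(rng.normal(size=(1, 3)), T=200, tau=4.5, rng=rng)
```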