Precise Bayesian Neural Networks
Carlos Stein Brito
Published: 2025/6/24
Abstract
Despite their long history, Bayesian neural networks (BNNs) and variational training remain underused in practice: standard Gaussian posteriors misalign with network geometry, KL terms can be brittle in high dimensions, and implementations often add complexity without reliably improving uncertainty estimates. We revisit the problem through the lens of normalization. Because normalization layers neutralize the influence of weight magnitude, we model uncertainty \emph{only in weight directions} using a von Mises-Fisher posterior on the unit sphere. High-dimensional geometry then yields a single, interpretable scalar per layer, the effective post-normalization noise $\sigma_{\mathrm{eff}}$, which (i) corresponds to simple additive Gaussian noise in the forward pass and (ii) admits a compact, dimension-aware KL in closed form; this dimension awareness is critical for stable optimization in high dimensions. We derive accurate, closed-form approximations linking the concentration $\kappa$ to activation variance and to $\sigma_{\mathrm{eff}}$ across regimes, producing a lightweight, implementation-ready variational unit that fits modern normalized architectures and improves calibration without sacrificing accuracy. In short, by aligning the variational posterior with the network's intrinsic geometry, BNNs can be simultaneously principled, practical, and precise.
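To make the forward-pass picture concrete, below is a minimal PyTorch sketch of a variational unit in the spirit of the abstract, not the paper's reference implementation: the module name, the per-layer `log_sigma_eff` parameterization, and the placement of the noise directly on the pre-activations are illustrative assumptions, and the paper's closed-form KL and the $\kappa$-to-$\sigma_{\mathrm{eff}}$ mappings are not reproduced here.

```python
# Minimal illustrative sketch (assumptions noted above), not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DirectionalVariationalLinear(nn.Module):
    """Linear layer whose uncertainty lives only in weight directions.

    Weight rows are renormalized to the unit sphere in every forward pass,
    and directional uncertainty is realized as additive Gaussian noise on the
    pre-activations, controlled by a single per-layer scale sigma_eff.
    """

    def __init__(self, in_features: int, out_features: int, init_sigma: float = 0.05):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) / in_features ** 0.5)
        # One interpretable scalar per layer: log of the effective noise scale.
        self.log_sigma_eff = nn.Parameter(torch.tensor(float(init_sigma)).log())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only weight directions matter once normalization layers follow,
        # so project each weight row onto the unit sphere.
        w_dir = F.normalize(self.weight, dim=1)
        out = F.linear(x, w_dir)
        if self.training:
            # Directional uncertainty enters as simple additive Gaussian noise,
            # scaled by sigma_eff (the assumed stand-in for the paper's relation).
            out = out + self.log_sigma_eff.exp() * torch.randn_like(out)
        return out


if __name__ == "__main__":
    layer = DirectionalVariationalLinear(256, 128)
    x = torch.randn(32, 256)
    print(layer(x).shape)  # torch.Size([32, 128])
```

Because the weight rows are renormalized on every forward pass, only their directions affect the output, matching a posterior supported on the unit sphere; a dimension-aware KL term, as described in the abstract, would be added to the training loss but is omitted from this sketch.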