
Transport Maps for Accelerated Bayesian Computation

See more on DSpace@MIT

Abstract

Bayesian inference provides a probabilistic framework for combining prior knowledge with mathematical models and observational data. Characterizing a Bayesian posterior probability distribution can be a computationally challenging undertaking, however, particularly when evaluations of the posterior density are expensive and when the posterior has complex non-Gaussian structure. This thesis addresses these challenges by developing new approaches for both exact and approximate posterior sampling. In particular, we make use of deterministic couplings between random variables (i.e., transport maps) to accelerate posterior exploration. Transport maps are deterministic transformations between (probability) measures. We introduce new algorithms that exploit these transformations as a fundamental tool for Bayesian inference. At the core of our approach is an efficient method for constructing transport maps using only samples of a target distribution, via the solution of a convex optimization problem. We first demonstrate the computational efficiency and accuracy of this method, exploring various parameterizations of the transport map, on target distributions of low-to-moderate dimension. Then we introduce an approach that composes sparsely parameterized transport maps with rotations of the parameter space, and demonstrate successful scaling to much higher-dimensional target distributions. With these building blocks in place, we introduce three new posterior sampling algorithms. First is an adaptive Markov chain Monte Carlo (MCMC) algorithm that uses a transport map to define an efficient proposal mechanism. We prove that this algorithm is ergodic for the exact target distribution and demonstrate it on a range of parameter inference problems, showing multiple order-of-magnitude speedups over current state-of-the-art MCMC techniques, as measured by the number of effectively independent samples produced per model evaluation and per unit of wall-clock time.
Second, we introduce an algorithm for inference in large-scale inverse problems with multiscale structure. Multiscale structure is expressed as a conditional independence relationship that is naturally induced by many multiscale methods for the solution of partial differential equations, such as the multiscale finite element method (MsFEM). Our algorithm exploits the offline construction of transport maps that represent the joint distribution of coarse and fine-scale parameters. We evaluate the accuracy of our approach via comparison to single-scale MCMC on a 100-dimensional problem, then demonstrate the algorithm on an inverse problem from flow in porous media that has over 10^5 spatially distributed parameters. Our last algorithm uses offline computation to construct a transport map representation of the joint data-parameter distribution that allows for efficient conditioning on data. The resulting algorithm has two key attributes: first, it can be viewed as a “likelihood-free” approximate Bayesian computation (ABC) approach, in that it only requires samples, rather than evaluations, of the likelihood function. Second, it is designed for approximate inference in near-real-time. We evaluate the efficiency and accuracy of the method, with a demonstration on a nonlinear parameter inference problem where excellent posterior approximations can be obtained in two orders of magnitude less online time than a standard MCMC sampler.
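
To make the idea of building a transport map from samples by convex optimization more concrete, here is a minimal one-dimensional sketch. It is not the thesis's implementation. It assumes a standard Gaussian reference, a linear map T(x) = a + b*x with b > 0, and one common sample-based objective: the average of 0.5*T(x)^2 - log T'(x), which is convex in (a, b). The function name `fit_linear_map` and the use of a damped Newton iteration are illustrative choices, not taken from the source.

```python
import math
import random


def fit_linear_map(samples, iters=50, tol=1e-10):
    """Fit T(x) = a + b*x (b > 0) that pushes the samples toward N(0, 1).

    Hypothetical helper for illustration: minimizes the sample average of
    0.5*T(x)^2 - log T'(x) with Newton's method plus step halving.
    """
    n = len(samples)
    m = sum(samples) / n                      # first sample moment
    s2 = sum(x * x for x in samples) / n      # second sample moment

    def objective(a, b):
        # mean(0.5*(a + b*x)^2) - log(b), expanded using the two moments
        return 0.5 * (a * a + 2.0 * a * b * m + b * b * s2) - math.log(b)

    a, b = 0.0, 1.0                           # start from the identity map
    for _ in range(iters):
        # Gradient of the objective with respect to (a, b)
        ga = a + b * m
        gb = a * m + b * s2 - 1.0 / b
        if abs(ga) + abs(gb) < tol:
            break
        # Hessian entries; positive definite whenever b > 0 and the
        # samples are not all identical
        h11, h12, h22 = 1.0, m, s2 + 1.0 / (b * b)
        det = h11 * h22 - h12 * h12
        # Newton direction d = -H^{-1} g
        da = -(h22 * ga - h12 * gb) / det
        db = -(h11 * gb - h12 * ga) / det
        # Halve the step until b stays positive and the objective does not rise
        t = 1.0
        while t > 1e-12 and (
            b + t * db <= 0.0
            or objective(a + t * da, b + t * db) > objective(a, b)
        ):
            t *= 0.5
        if t <= 1e-12:
            break                             # no acceptable step; stop here
        a += t * da
        b += t * db
    return a, b


if __name__ == "__main__":
    random.seed(0)
    xs = [random.gauss(3.0, 2.0) for _ in range(5000)]
    a, b = fit_linear_map(xs)
    print("a =", a, "b =", b)
```

For this linear parameterization the minimizer can also be written in closed form (b is the reciprocal of the sample standard deviation and a = -b times the sample mean), which gives a convenient check on the optimizer. Richer parameterizations, such as the polynomial or sparse maps the abstract alludes to, would replace the two coefficients with a larger coefficient vector while keeping the same kind of objective.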