Judge: Bayesian bagging to generate uncertainty intervals: a catcher framing story
From SABR member Jonathan Judge at Baseball Prospectus on March 7, 2018:
This post confronts a familiar problem: the need for speedier estimation of Bayesian inference. Markov Chain Monte Carlo (MCMC) provides superior accuracy with reliable uncertainty estimates, but the process can be too time-consuming for some applications. Oft-recommended alternatives can be unacceptably inaccurate (posterior simulation) or fail to converge (variational inference). To help fill this gap, I propose a Bayesian block bootstrap aggregator to estimate uncertainty intervals around (transformed) group-level parameters, as computed by the (g)lmer function of the lme4 package in the R programming environment.
The procedure estimates the spread of the posterior for group members by sampling with replacement from a Dirichlet distribution in blocks of taken pitches from entire plate appearances. The model is re-run on each set of resampled data, and the modeled (and transformed) intercepts are then aggregated to create an approximate posterior for all modeled groups. In a fraction of the time required for full MCMC, this Bayesian ‘bagging’ procedure can reasonably estimate the uncertainty around the most likely contributions of catchers in framing borderline baseball pitches. The similarity of this process to frequentist bootstrapping should make it approachable for practitioners unfamiliar with Bayesian estimation.
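As a rough illustration of the resampling loop described above, here is a minimal Python sketch. It is not the author's code. The article refits a (g)lmer model in R on each resampled data set, whereas this sketch swaps in a simple shrunken called-strike rate as a stand-in for the model. The function names, the `prior_strength` value, the flat Dirichlet(1, ..., 1) prior, and the toy data are all assumptions made for the example.

```python
import random
from collections import defaultdict


def dirichlet_weights(n, rng):
    """Draw one flat Dirichlet(1, ..., 1) vector by normalizing Gamma(1, 1) draws."""
    draws = [rng.gammavariate(1.0, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def shrunken_rates(pitches, prior_strength=20.0):
    """Stand-in for the mixed model (assumption): each catcher's called-strike
    rate, pulled toward the overall rate as a crude form of partial pooling."""
    overall = sum(y for _, y in pitches) / len(pitches)
    strikes, counts = defaultdict(float), defaultdict(int)
    for catcher, y in pitches:
        strikes[catcher] += y
        counts[catcher] += 1
    return {
        c: (strikes[c] + prior_strength * overall) / (counts[c] + prior_strength)
        for c in counts
    }


def bayesian_block_bag(blocks, n_reps=200, seed=1):
    """blocks maps a plate-appearance id to its list of (catcher, called_strike).

    Each replicate draws Dirichlet probabilities over plate appearances,
    resamples whole plate appearances with replacement using those
    probabilities, re-estimates every catcher, and stores the estimates.
    """
    rng = random.Random(seed)
    ids = list(blocks)
    samples = defaultdict(list)
    for _ in range(n_reps):
        probs = dirichlet_weights(len(ids), rng)
        chosen = rng.choices(ids, weights=probs, k=len(ids))
        resampled = [pitch for pa in chosen for pitch in blocks[pa]]
        for catcher, estimate in shrunken_rates(resampled).items():
            samples[catcher].append(estimate)
    return samples


def interval(values, lo=0.025, hi=0.975):
    """Approximate percentile interval from the aggregated estimates."""
    s = sorted(values)
    return s[int(lo * (len(s) - 1))], s[int(hi * (len(s) - 1))]


# Toy data (hypothetical): 300 plate appearances, 1 to 3 taken pitches each.
data_rng = random.Random(0)
true_rate = {"A": 0.55, "B": 0.45}
blocks = {}
for pa in range(300):
    catcher = "A" if pa % 2 == 0 else "B"
    n_pitches = data_rng.randint(1, 3)
    blocks[pa] = [
        (catcher, 1 if data_rng.random() < true_rate[catcher] else 0)
        for _ in range(n_pitches)
    ]

samples = bayesian_block_bag(blocks)
intervals = {c: interval(v) for c, v in samples.items()}
```

Resampling whole plate appearances, rather than individual pitches, keeps pitches from the same plate appearance together in every replicate. In the article's setting, the call to `shrunken_rates` would be replaced by refitting the mixed model and extracting the transformed group intercepts.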
This page was last updated March 7, 2018 at 2:35 pm MST.