I'm drawn to questions of how much useful information can be extracted from difficult data. This puts me in and around the intersection of a few research areas including Bayesian analysis, partial identification, causal inference, measurement error, and evidence synthesis. Many of the motivations for, and applications of, my work arise in epidemiology, public health, and biostatistics.
Using the laws of probability to summarize knowledge of what you don't observe given what you do observe - it's such a general and fundamental strategy. What could possibly go wrong? Well, ... nothing! You have a decision-theoretic guarantee that the resulting inference strategy has best-possible average performance. The nuance is that the averaging here is with respect to a user-specified distribution over the unobserved stuff, a.k.a. the prior distribution. So we discuss: specification of a prior distribution - blessing or curse?
I have a long-standing interest in identification issues in Bayesian analysis. I posit that in many observational data settings, realistic modelling of the uncertainties at hand will yield only a partially identified model (hereafter abbreviated PID). In a PID model, even complete knowledge of the distributional law of the observables rules out only some values of the target parameter. So an infinite data sample might reveal only an interval of possible values for the target. Consequently, the large-sample limit of the posterior distribution on the target will be a non-degenerate distribution, and it behooves us to determine whether this distribution is usefully narrow or uselessly wide. In practical terms, it seems identification should be viewed more as an issue of extent (e.g., how wide is the limiting posterior) than as a no=bad, yes=good issue.
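To make the idea of a limiting posterior concrete, here is a minimal sketch built on a toy setup that is my own assumption for illustration, not a model taken from any particular application. A binary trait with true prevalence p is recorded with perfect specificity but unknown sensitivity s, known only to lie in [s_lo, 1]. Infinite data reveal q = p * s exactly, so p is confined to the interval [q, min(1, q / s_lo)]. Under independent uniform priors on p and s (again an assumption), a change of variables gives a limiting posterior density for p proportional to 1/p on that interval. The function names below are hypothetical.

```python
import math


def identification_interval(q, s_lo):
    """Values of the prevalence p compatible with a known q = p * s, s in [s_lo, 1]."""
    if not (0.0 < q <= 1.0 and 0.0 < s_lo <= 1.0):
        raise ValueError("need 0 < q <= 1 and 0 < s_lo <= 1")
    return q, min(1.0, q / s_lo)


def limiting_posterior_quantile(u, q, s_lo):
    """Quantile of the limiting posterior of p, whose density is proportional to 1/p.

    Assumes independent Uniform(0, 1) and Uniform(s_lo, 1) priors on p and s.
    """
    a, b = identification_interval(q, s_lo)
    return a * (b / a) ** u


def limiting_posterior_mean(q, s_lo):
    """Mean of the limiting posterior of p under the same assumed priors."""
    a, b = identification_interval(q, s_lo)
    if b == a:  # sensitivity known exactly: the model is fully identified
        return a
    return (b - a) / math.log(b / a)


if __name__ == "__main__":
    q, s_lo = 0.2, 0.8
    print("identification interval:", identification_interval(q, s_lo))
    print("limiting posterior mean:", limiting_posterior_mean(q, s_lo))
    print("limiting 95% interval:",
          (limiting_posterior_quantile(0.025, q, s_lo),
           limiting_posterior_quantile(0.975, q, s_lo)))
```

With q = 0.2 and s_lo = 0.8, the identification interval is [0.2, 0.25]. No amount of data shrinks it further, but the limiting posterior within it is not flat, and its width is the quantity that tells us whether inference is useful.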
An emergent research theme lies in the domain of causal inference. This field has evolved in a decidedly non-Bayesian manner, particularly via an emphasis on methods based on inverse-probability weighting (propensity scores), as such techniques generally cannot be arrived at in a Bayesian manner. Against this, however, we must remember that Bayesian procedures are optimal in a decision-theoretic sense. And there is a dearth of literature looking at how sub-optimal non-Bayesian causal inference methods might be in this sense. There seems to be general acknowledgement that (i) Bayesian strategies excel in the management of complex uncertainties, and (ii) causal inference problems are indeed characterized by the involvement of complex uncertainties. Yet the potential for Bayesian methods in these problems remains largely untapped.
I have long-standing interests in using Bayesian methods to adjust for the fact that the data actually in hand may not be the data one wishes were in hand. In particular, I've worked on problems of adjusting for the reality that an explanatory variable may be poorly measured. Interesting questions arise here concerning how much must be known about the nature and extent of mismeasurement, in order to effect a useful adjustment.
I have fledgling interests in what might be described as "evidence synthesis," again largely from a Bayesian perspective. I've recently encountered several public health applications where a single data source does not inform the target of inferential interest, whereas a combination of data sources does. Questions about information flow in such settings are tantalizing. In a related vein, I have come to appreciate that network meta-analysis, in addition to being a "killer app" for Bayes, presents very interesting questions on how information flows.