Propensity score (PS)-based models are everywhere these days. These methods are useful for controlling for measured confounders in observational data and for reducing dimensionality in large datasets, but analysts must use good judgment when applying and interpreting PS analyses. This is the topic of my recent methods article in ISPOR’s *Value and Outcomes Spotlight*.

I became interested in PS methods during my Master’s thesis work on statin drug use and heart structure and function, which has just been published in *Pharmacoepidemiology and Drug Safety*. To estimate long-term associations between these two variables, I used the Multi-Ethnic Study of Atherosclerosis (MESA), an observational cohort of approximately 6000 individuals with rich covariates, subclinical measures of cardiovascular disease, and clinical outcomes over 10+ years of follow-up. We initially used traditional multivariable linear regression to estimate the association between statin initiation and progression of left ventricular mass over time, but found that PS methods allowed for better control of measured confounding. After generating a PS for the probability of starting a statin, we matched initiators to non-initiators and estimated the average treatment effect in the treated. Both the traditional regressions and the PS-matching procedures showed a small, dose-dependent protective effect of statins against left ventricular structural dysfunction. This very modest association contrasts with findings from much smaller, short-term studies.
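To make the matching step concrete, here is a minimal Python sketch of 1:1 nearest-neighbor matching on the propensity score, with replacement, followed by an average treatment effect in the treated. It is not the code used in the analysis: the function name `att_nearest_neighbor` and the toy numbers are made up for illustration, and the propensity scores are assumed to have been estimated already.

```python
def att_nearest_neighbor(ps, treated, outcome):
    """ATT from 1:1 nearest-neighbor matching on the propensity score.

    ps, treated and outcome are parallel lists, one entry per participant.
    Matching is with replacement, so a control can be reused.
    (Hypothetical helper for illustration only.)
    """
    treated_idx = [i for i, t in enumerate(treated) if t == 1]
    control_idx = [i for i, t in enumerate(treated) if t == 0]
    if not treated_idx or not control_idx:
        raise ValueError("need at least one treated and one control unit")

    diffs = []
    for i in treated_idx:
        # Closest control on the propensity score for this treated unit
        j = min(control_idx, key=lambda c: abs(ps[c] - ps[i]))
        diffs.append(outcome[i] - outcome[j])

    # Average of the treated-minus-matched-control differences
    return sum(diffs) / len(diffs)


# Made-up toy data: three initiators, four non-initiators
ps = [0.80, 0.60, 0.30, 0.75, 0.55, 0.35, 0.10]
treated = [1, 1, 1, 0, 0, 0, 0]
outcome = [10.0, 12.0, 9.0, 11.0, 13.0, 10.0, 8.0]

att = att_nearest_neighbor(ps, treated, outcome)
print(att)
```

In this toy example the control with a PS of 0.10 is never used, which mirrors how unmatched non-initiators drop out of a matched dataset.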

I did my original analyses using Stata, where there are a few packages for PS including *psmatch2* and *teffects*. My analysis used *psmatch2*, which is generally considered inferior to *teffects* because it does not provide proper standard errors. I got around this limitation, however, by bootstrapping confidence intervals, which were all conservative compared with *teffects* confidence intervals.
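The bootstrapping idea can be sketched in a few lines of Python. This is a generic percentile bootstrap, not the Stata code from the analysis: `bootstrap_ci` and `mean_difference` are hypothetical names, the data are made up, and the plain difference in means stands in for the matched estimate. In a matching analysis, I would assume the whole matching procedure gets re-run inside each resample.

```python
import random


def mean_difference(rows):
    """Mean outcome of treated minus mean outcome of untreated.

    rows is a list of (treated, outcome) pairs. This is a stand-in for
    whatever estimator is being bootstrapped (e.g. a matched ATT).
    """
    treated = [y for d, y in rows if d == 1]
    control = [y for d, y in rows if d == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def bootstrap_ci(rows, estimator, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for estimator(rows)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    estimates = []
    while len(estimates) < n_boot:
        # Resample participants with replacement, same size as the original
        sample = [rng.choice(rows) for _ in rows]
        try:
            estimates.append(estimator(sample))
        except ZeroDivisionError:
            continue  # resample had no treated or no untreated rows; redraw
    estimates.sort()
    k = int(round(alpha / 2 * n_boot))  # estimates cut from each tail
    return estimates[k], estimates[n_boot - 1 - k]


# Made-up toy data: (treated, outcome)
rows = [(1, 10.0), (1, 12.0), (1, 9.0), (1, 11.0), (1, 13.0),
        (0, 8.0), (0, 7.0), (0, 9.0), (0, 10.0), (0, 6.0)]

point = mean_difference(rows)
lo, hi = bootstrap_ci(rows, mean_difference)
print(f"estimate = {point:.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Passing the estimator in as a function keeps the resampling loop separate from the estimate itself, so the same loop could wrap a matching routine instead of a difference in means.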

Recently, I gathered the gumption to redo some of the aforementioned analysis in R. Coding in R is a newly acquired skill of mine, and I wanted to harness some of R’s functionality to build nicer figures. I found this R tutorial from Simon Ejdemyr on propensity score methods in R to be particularly useful.

After rebuilding my propensity scores with a logistic model that included approximately 30 covariates and 2389 participant observations, I first wanted to check the region of common support. The region of common support is the overlap between the PS distributions of the exposed and the unexposed, which indicates how comparable the two groups are. Sometimes, even after fitting the model with every available variable, PS overlap is poor enough that matching can’t be done. But I was able to get acceptable overlap on values of PS for statin initiators and non-initiators (see Figure 1). Using the R package *MatchIt* to do nearest neighbor matching with replacement, my matched dataset was reduced to 1670 observations, with every statin initiator matched.

I also checked covariate balance conditional on PS in the statin initiator and non-initiator groups. Examples are in Figure 2. In these plots, the LOWESS smoother is effectively calculating a mean of the covariate at each value of the propensity score. I expect the means for statin initiators and non-initiators to be similar, so the smooths should be close. At the ends of the age distribution, I see some separation, which is likely to be normal tail behavior. Formal statistical tests can also be used to test covariate balance in the newly matched groups.
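Both checks described above can be approximated numerically. The Python sketch below is illustrative only and is not what *MatchIt* does internally: `common_support` and `standardized_mean_difference` are hypothetical helper names, the numbers are made up, and the standardized mean difference is a swapped-in balance diagnostic rather than the LOWESS plots shown in Figure 2.

```python
from math import sqrt
from statistics import mean, variance


def common_support(ps_treated, ps_control):
    """Range of propensity scores covered by both groups, or None if no overlap."""
    lo = max(min(ps_treated), min(ps_control))
    hi = min(max(ps_treated), max(ps_control))
    return (lo, hi) if lo <= hi else None


def standardized_mean_difference(x_treated, x_control):
    """Difference in covariate means divided by the pooled standard deviation."""
    pooled_sd = sqrt((variance(x_treated) + variance(x_control)) / 2)
    return (mean(x_treated) - mean(x_control)) / pooled_sd


# Made-up propensity scores for initiators and non-initiators
ps_treated = [0.35, 0.50, 0.65, 0.80]
ps_control = [0.10, 0.30, 0.45, 0.70]
support = common_support(ps_treated, ps_control)
print("common support:", support)

# Made-up ages in the two groups
age_treated = [61.0, 63.0, 65.0, 67.0]
age_control = [60.0, 62.0, 64.0, 66.0]
smd = standardized_mean_difference(age_treated, age_control)
print("standardized mean difference for age:", round(smd, 3))
```

A value close to zero suggests the covariate is similarly distributed in the two groups; what counts as close enough is a judgment call for the analyst.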

Please see my website for additional info about my work.