This package implements the estimators proposed in Barkley et al. (2017), “Causal Inference from Observational Studies with Clustered Interference,” for estimating the causal effects of different treatment policies in the presence of partial or clustered interference. The package is available on CRAN and has a companion website.
In causal inference, interference occurs when one individual’s treatment may affect another individual’s outcome. Most applications assume that there is no interference at all. In some applications, such as infectious disease research, this assumption must be relaxed.
One relaxation of the “no interference” assumption is to assume that individuals can be partitioned into distinct clusters (e.g., households or classrooms) such that there may be interference within clusters, but not between clusters. Historically, this assumption has been referred to as partial interference, after Sobel (2006).
Barkley et al. (2017) introduce the term clustered interference to refer to this same assumption. The phrase describes the underlying assumption directly and may make clearer that interference is presumed to be restricted to clusters.
Barkley et al. (2017) propose new causal estimands for defining treatment effects in observational studies when there may be interference, or spillover effects, between units in the same cluster. The manuscript also introduces IPTW estimators for those estimands, which are implemented in ‘clusteredinterference’.
A version of this manuscript is available on arXiv at 1711.04834:
Barkley, B. G., Hudgens, M. G., Clemens, J. D., Ali, M., and Emch, M. E. (2017). Causal inference from observational studies with clustered interference. arXiv preprint arXiv:1711.04834. URL https://arxiv.org/abs/1711.04834.
This package is now on CRAN!
install.packages("clusteredinterference")
Or install the development version from the GitHub repo:
# devtools::install_github("BarkleyBG/clusteredinterference")
library(clusteredinterference)
Estimation is carried out with one function:
set.seed(1113)
causal_fx <- policyFX(
  data = toy_data,
  formula = Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID,
  alphas = c(.15, .25),
  k_samps = 1
)
The estimates of the causal estimands are printed in a tidy data frame:
causal_fx
#> ------------- causal estimates --------------
#> estimand estimate se LCI UCI
#> mu(0.15) 0.6985 0.0893 0.5234 0.8736
#> mu(0.25) 0.6664 0.0702 0.5287 0.8041
#> mu0(0.15) 0.7157 0.0917 0.5360 0.8954
#> mu0(0.25) 0.6869 0.0775 0.5350 0.8388
#> mu1(0.15) 0.1619 0.0429 0.0779 0.2460
#> mu1(0.25) 0.2440 0.0536 0.1389 0.3491
#> OE(0.25,0.15) -0.0321 0.0275 -0.0861 0.0219
#> OE(0.15,0.25) 0.0321 0.0275 -0.0219 0.0861
#> ... and 4 more rows ...
#> ---------------------------------------------
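The printed rows can be sanity-checked with base R alone. The sketch below copies three rows from the output above and checks two things that the rounded numbers appear consistent with: that the intervals are 95% normal-approximation (Wald) intervals, and that OE(0.25,0.15) equals mu(0.25) minus mu(0.15). Both readings are assumptions inferred from the printed values, not statements about the package's internals, and the names `est`, `max_gap`, and `oe_from_mu` are illustrative only.

```r
# Values copied from the printed output above (rounded to 4 decimals).
est <- data.frame(
  estimand = c("mu(0.15)", "mu(0.25)", "OE(0.25,0.15)"),
  estimate = c(0.6985, 0.6664, -0.0321),
  se       = c(0.0893, 0.0702, 0.0275),
  LCI      = c(0.5234, 0.5287, -0.0861),
  UCI      = c(0.8736, 0.8041, 0.0219)
)

# Assumption: 95% Wald intervals of the form estimate +/- z * se.
z <- qnorm(0.975)
wald_lci <- est$estimate - z * est$se
wald_uci <- est$estimate + z * est$se

# Largest absolute gap between the recomputed and the printed limits;
# small gaps are expected because the inputs are rounded.
max_gap <- max(abs(c(wald_lci - est$LCI, wald_uci - est$UCI)))

# Assumption: the overall effect is a contrast of the two mu estimates.
oe_from_mu <- est$estimate[2] - est$estimate[1]

print(data.frame(estimand = est$estimand, wald_lci, wald_uci))
print(max_gap)
print(oe_from_mu)
```

Because the inputs are rounded to four decimals, the recomputed limits can only be expected to agree with the printed ones up to rounding error.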
Use summary() for a little more information:
summary(causal_fx)
#> ------------- causal estimates --------------
#> estimand estimate se LCI UCI
#> mu(0.15) 0.6985 0.0893 0.5234 0.8736
#> mu(0.25) 0.6664 0.0702 0.5287 0.8041
#> mu0(0.15) 0.7157 0.0917 0.5360 0.8954
#> mu0(0.25) 0.6869 0.0775 0.5350 0.8388
#> mu1(0.15) 0.1619 0.0429 0.0779 0.2460
#> mu1(0.25) 0.2440 0.0536 0.1389 0.3491
#> OE(0.25,0.15) -0.0321 0.0275 -0.0861 0.0219
#> OE(0.15,0.25) 0.0321 0.0275 -0.0219 0.0861
#>
#> ... and 4 more rows ...
#>
#> -------------- treatment model -------------
#> Generalized linear mixed model fit by maximum likelihood (Adaptive
#> Gauss-Hermite Quadrature, nAGQ = 2) [glmerMod]
#> Family: binomial ( logit )
#> Formula: Treatment ~ Age + Distance + (1 | Cluster_ID)
#> Data: data
#> AIC BIC logLik deviance df.resid
#> 137.0345 147.3743 -64.5172 129.0345 94
#> Random effects:
#> Groups Name Std.Dev.
#> Cluster_ID (Intercept) 1.18
#> Number of obs: 98, groups: Cluster_ID, 30
#> Fixed Effects:
#> (Intercept) Age Distance
#> -1.44609 -0.00851 0.26097
#>
#> ------------- propensity scores -------------
#> 1 2 3 4 5 6 7 8 9 10
#> 0.105 0.162 0.086 0.102 0.167 0.045 0.244 0.0934 0.0765 0.197
#> 11 12 13 14 15 16 17 18 19 20
#> 0.0653 0.281 0.104 0.365 0.0867 0.198 0.207 0.106 0.0847 0.134
#> 21 22 23 24 25 26 27 28 29 30
#> 0.103 0.111 0.105 0.302 0.0434 0.0943 0.0443 0.0512 0.13 0.263
#> ---------------------------------------------
Note that Treatment ~ Age + Distance + (1 | Cluster_ID) in the middle of the formula argument is sent to lme4::glmer() to specify the form of the (logit-link binomial) treatment model.
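The layout of this multi-part formula can be explored with base R alone. The sketch below is only an illustration of how R parses such a formula, not the package's internal code. The roles given to the outer parts (outcome and treatment on the left, cluster identifier on the far right) are inferred from the variable names in the example, and all object names are hypothetical.

```r
# Formulas are unevaluated, so no data are needed to take one apart.
f <- Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID

# `~` binds more loosely than `|`, so the formula has two top-level sides.
lhs <- f[[2]]  # Outcome | Treatment
rhs <- f[[3]]  # Age + Distance + (1 | Cluster_ID) | Cluster_ID

# Each side is itself a call to `|` with two operands.
outcome_var   <- lhs[[2]]  # Outcome
treatment_var <- lhs[[3]]  # Treatment
model_terms   <- rhs[[2]]  # Age + Distance + (1 | Cluster_ID)
cluster_var   <- rhs[[3]]  # Cluster_ID

# Reassemble the "middle" of the formula: the treatment model.
treatment_formula <- eval(call("~", treatment_var, model_terms))
print(treatment_formula)
```

The parenthesized random-effect term (1 | Cluster_ID) stays intact because the parentheses keep its inner `|` from being treated as a top-level separator.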
The policyFX() output list includes an element, formula, for the Formula object:
causal_fx$formula
#> Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID
The output list also includes an element, model, which is the fitted glmerMod S4 model object. Here we can see that the middle of formula was passed into the glmer() logit-link binomial mixed model:
causal_fx$model@call
#> lme4::glmer(formula = Treatment ~ Age + Distance + (1 | Cluster_ID),
#> data = data, family = stats::binomial, nAGQ = nAGQ)
The fitted model estimates three fixed effects (an intercept, a term for Age, and a term for Distance) and one random effect (for Cluster_ID).
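One possible way to inspect these estimates is through lme4's standard accessors, since model is a glmerMod object. The sketch below refits the example from above and is guarded so that it does nothing when the packages are not installed. Using lme4::fixef() and lme4::VarCorr() here is a suggestion, not necessarily the approach the package authors use, and `fixed_effects` is an illustrative name.

```r
# Guarded so the script is a no-op when the packages are not installed.
if (requireNamespace("clusteredinterference", quietly = TRUE) &&
    requireNamespace("lme4", quietly = TRUE)) {
  library(clusteredinterference)

  # Same call as in the example above.
  set.seed(1113)
  causal_fx <- policyFX(
    data = toy_data,
    formula = Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID,
    alphas = c(.15, .25),
    k_samps = 1
  )

  # Fixed-effect coefficients of the treatment model.
  fixed_effects <- lme4::fixef(causal_fx$model)
  print(fixed_effects)

  # Estimated variance components for the cluster-level random intercept.
  print(lme4::VarCorr(causal_fx$model))
}
```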
The vignette provides more information on the formal arguments:
vignette("estimate-policyFX")
A changelog is found in the NEWS.md file. Version history is also tracked by the release tags for this GitHub repo.
Related software: the inferference package (related estimators) and the geex package (estimating equations).
Thanks are due for inferference, geex, and for comments and suggestions that were helpful in the creation of ‘clusteredinterference’.