Client says: “With cell culture assays, we are trying to experimentally break down these interactions in controlled environments. In this example (see plot below), what we really want to test is: Did the stim change the pg/mL compared to control, and did the drug change the response to the stim. We give the drug alone to experimentally tease apart that interaction. In this example, I would say that the drug by itself did not affect control cells, but it inhibited the response to the stim.” (emphasis mine)
For example:
StatLab associate says: My initial thoughts are, for question 1 (Did the stim change the pg/mL compared to control, and did the drug change the response to the stim) to run a two-way ANOVA with an interaction to see the main effects of the drug and stim and if the stim changes the effect of the drug. In clients’s example, he would likely see that “Drug” is not significant, “Stim” is significant, and “Drug + Stim” is not significant. (emphasis mine)
We can simulate data based on the figure to help us think about this.
drug <- factor(c(0, 0, 0,
1, 1, 1,
0, 0, 0,
1, 1, 1))
stim <- factor(c(0, 0, 0,
0, 0, 0,
1, 1, 1,
1, 1, 1))
set.seed(10)
pgmL <- rnorm(n = 12, mean = 10 + 0.2*(drug == "1") +
50*(stim == "1") -
40*(drug == "1")*(stim == "1"), sd = 2)
d <- data.frame(pgmL, drug, stim)
# calculate means
d2 <- aggregate(pgmL ~ drug + stim, data = d, FUN = mean)
Recreate plot without error bars, which appear to be SD bars not SE.
library(ggplot2)
ggplot() +
aes(x = interaction(drug, stim), y = pgmL) +
geom_col(data = d2, alpha = 1/5, color = "black") +
geom_point(data = d, position = position_jitter(width = 0.3, height = 0)) +
scale_x_discrete(labels = c("ctrl", "drug", "stim", "drug+stim"))
In this case I think an interaction plot would be more suitable. When drug = 1, the effect of stim is greatly reduced.
interaction.plot(x.factor = d$stim, trace.factor = d$drug, response = d$pgmL)
We can run a 2-way ANOVA as follows.
m <- lm(pgmL ~ drug * stim, data = d)
If we look at the model coefficients, it does seem the effect of drug is not significant relative to the control. However, this assumes stim = 0. The interaction is large and significant, so it may be the drug has an effect when stim = 1. That’s what we saw in the interaction plot. Likewise, the significant stim coefficient suggests that stim does change pg/mL relative to control (when drug = 0).
summary(m)
##
## Call:
## lm(formula = pgmL ~ drug * stim, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.7181 -1.1546 0.4882 0.8076 1.4049
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 8.9754 0.7689 11.673 2.64e-06 ***
## drug1 1.2813 1.0874 1.178 0.272
## stim1 48.8923 1.0874 44.964 6.61e-11 ***
## drug1:stim1 -37.8817 1.5378 -24.634 7.88e-09 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.332 on 8 degrees of freedom
## Multiple R-squared: 0.997, Adjusted R-squared: 0.9959
## F-statistic: 883.9 on 3 and 8 DF, p-value: 2.011e-10
If we look at the ANOVA table for this analysis, we see both main effects and the interaction are significant. This may seem contradictory to the model summary output. However, it simply shows that both variables and their interaction explain the variability in the outcome.
anova(m)
## Analysis of Variance Table
##
## Response: pgmL
## Df Sum Sq Mean Sq F value Pr(>F)
## drug 1 935.57 935.57 527.52 1.370e-08 ***
## stim 1 2691.27 2691.27 1517.47 2.073e-10 ***
## drug:stim 1 1076.27 1076.27 606.85 7.878e-09 ***
## Residuals 8 14.19 1.77
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We can do posthoc comparisons to see if the drug changes the response to the stim. This means looking at the mean response of stim conditional on drug. We see below that the effect of stim changes dramatically depending on drug. The effect of stim drops from 48.9 to 11.
library(emmeans)
emmeans(m, specs = ~ stim | drug) |>
pairs(reverse = TRUE)
## drug = 0:
## contrast estimate SE df t.ratio p.value
## stim1 - stim0 48.9 1.09 8 44.964 <0.0001
##
## drug = 1:
## contrast estimate SE df t.ratio p.value
## stim1 - stim0 11.0 1.09 8 10.126 <0.0001
We can also look at the difference of those differences as follows. This shows the difference is about -37.9 with a 95% CI of [-41.4,-34.4].
emmeans(m, specs = ~ stim | drug) |>
pairs(reverse = TRUE) |>
pairs(simple = "drug", reverse = TRUE) |>
confint()
## contrast = stim1 - stim0:
## contrast1 estimate SE df lower.CL upper.CL
## drug1 - drug0 -37.9 1.54 8 -41.4 -34.3
##
## Confidence level used: 0.95
We can visualize this model with an effect plot. This shows standard error bars. We can clearly see the effect of stim greatly reduced when drug = 1.
library(ggeffects)
ggpredict(m, terms = c("stim", "drug")) |> plot()
Client says: “This example is the same as the previous one, but we’ve just added a second drug. Again, the drug B alone is more of an experimental control, but here we do see that it reduces the signal compared to the control. Meanwhile, Drug B doesn’t impact whatever the stim is doing. Also an important note, we would NOT be looking at what does Drug A and Drug B do together, or what Drug A and Drug B do to Stim.” (emphasis mine)
StatLab associate says: For question 2, Client doesn’t have values for all interactions (2x2x2) and is not interested in what Drug A and Drug B do together, or what Drug A and Drug B do to Stim. It seems to me that these could be treated as different treatments, and a one-way ANOVA could be used? He would need to do post-hoc testing to look at differences between treatments. Could he also fit a model where:
Y ~ Drug A + Drug B + Stim + Drug A:Stim + Drug B:stim
Would this better model the interactions instead of thinking of them as “treatment groups”?
We can simulate data based on the figure to help us think about this.
drugA <- factor(
c(0, 0, 0,
1, 1, 1,
0, 0, 0,
0, 0, 0,
1, 1, 1,
0, 0, 0))
drugB <- factor(
c(0, 0, 0,
0, 0, 0,
1, 1, 1,
0, 0, 0,
0, 0, 0,
1, 1, 1))
stim <- factor(
c(0, 0, 0,
0, 0, 0,
0, 0, 0,
1, 1, 1,
1, 1, 1,
1, 1, 1))
set.seed(2)
pgmL <- rnorm(n = 18, mean = 10 +
0.2*(drugA == "1") +
-4*(drugB == "1") +
50*(stim == "1") -
40*(drugA == "1")*(stim == "1") -
0.8*(drugB == "1")*(stim == "1"), sd = 2)
d <- data.frame(pgmL, drugA, drugB, stim)
# calculate means
d2 <- aggregate(pgmL ~ drugA + drugB + stim, data = d, FUN = mean)
Recreate plot without error bars, which appear to be SD bars not SE.
library(ggplot2)
ggplot() +
aes(x = interaction(drugA, drugB, stim), y = pgmL) +
geom_col(data = d2, alpha = 1/5, color = "black") +
geom_point(data = d, position = position_jitter(width = 0.3, height = 0)) +
scale_x_discrete(labels = c("ctrl", "drugA", "drugB", "stim",
"drugA+stim", "drugB+stim"))
Since he doesn’t want all two way interactions, we can also visualize this using two separate interaction plots. In the first plot, it’s easy to see how the effect of drug A reduces the effect of stim.
interaction.plot(x.factor = d$stim, trace.factor = d$drugA, response = d$pgmL,
main = "Drug A and Stim interaction")
In this plot, there appears to be a mild interaction, but this may be due to sampling variability.
interaction.plot(x.factor = d$stim, trace.factor = d$drugB, response = d$pgmL,
main = "Drug B and Stim interaction")
I think the StatLab associate’s proposed model is the cleanest way to analyze this question:
Y ~ Drug A + Drug B + Stim + Drug A:Stim + Drug B:stim
m2 <- lm(pgmL ~ drugA + drugB + stim + drugA:stim + drugB:stim, data = d)
The model summary tells us the drug B and stim interaction is small and uncertain.
summary(m2)
##
## Call:
## lm(formula = pgmL ~ drugA + drugB + stim + drugA:stim + drugB:stim,
## data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.6910 -1.4360 -0.1096 1.0930 3.3312
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 10.584 1.373 7.707 5.50e-06 ***
## drugA1 -1.103 1.942 -0.568 0.581
## drugB1 -2.949 1.942 -1.518 0.155
## stim1 50.257 1.942 25.878 6.77e-12 ***
## drugA1:stim1 -39.305 2.747 -14.311 6.65e-09 ***
## drugB1:stim1 -3.623 2.747 -1.319 0.212
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.379 on 12 degrees of freedom
## Multiple R-squared: 0.9922, Adjusted R-squared: 0.989
## F-statistic: 305.8 on 5 and 12 DF, p-value: 3.24e-12
The same basic result can be seen in the ANOVA table.
anova(m2)
## Analysis of Variance Table
##
## Response: pgmL
## Df Sum Sq Mean Sq F value Pr(>F)
## drugA 1 1350.5 1350.5 238.7178 2.769e-09 ***
## drugB 1 68.0 68.0 12.0152 0.004663 **
## stim 1 5815.0 5815.0 1027.8397 5.355e-13 ***
## drugA:stim 1 1405.7 1405.7 248.4732 2.201e-09 ***
## drugB:stim 1 9.8 9.8 1.7399 0.211765
## Residuals 12 67.9 5.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Since he’s interested in what Drug A and Drug B do to Stim, we can use this model for some post-hoc comparisons using emmeans. We should do it separately for drug A and drug B. First let’s look at the effect of stim conditional on drug A. We see the effect of stim is reduced quite a bit when drug A is administered.
# Hold drug B at 0 since drug B is not used with drug A
emmeans(m2, specs = ~ stim | drugA, at = list(drugB = "0")) |>
pairs(reverse = TRUE)
## drugA = 0:
## contrast estimate SE df t.ratio p.value
## stim1 - stim0 50.3 1.94 12 25.878 <0.0001
##
## drugA = 1:
## contrast estimate SE df t.ratio p.value
## stim1 - stim0 11.0 1.94 12 5.639 0.0001
Take the difference in those differences. This reduction is estimated to be in the range of [-45.3, -33.3] with 95% confidence.
emmeans(m2, specs = ~ stim | drugA, at = list(drugB = "0")) |>
pairs(reverse = TRUE) |>
pairs(simple = "drugA", reverse = TRUE) |>
confint()
## contrast = stim1 - stim0:
## contrast1 estimate SE df lower.CL upper.CL
## drugA1 - drugA0 -39.3 2.75 12 -45.3 -33.3
##
## Confidence level used: 0.95
We can carry out the same analysis for drug B. The effect of stim is about the same regardless of whether drug B is administered. There appears to be a slight reduction, but is it reliably in one direction?
# Hold drug A at 0 since drug A is not used with drug B
emmeans(m2, specs = ~ stim | drugB, at = list(drugA = "0")) |>
pairs(reverse = TRUE)
## drugB = 0:
## contrast estimate SE df t.ratio p.value
## stim1 - stim0 50.3 1.94 12 25.878 <0.0001
##
## drugB = 1:
## contrast estimate SE df t.ratio p.value
## stim1 - stim0 46.6 1.94 12 24.012 <0.0001
The slight reduction of -3.62 is uncertain with a 95% CI of [-9.61, 2.36]. We’re not sure what kind of effect drug B has on stim, if any.
emmeans(m2, specs = ~ stim | drugB, at = list(drugA = "0")) |>
pairs(reverse = TRUE) |>
pairs(simple = "drugB", reverse = TRUE) |>
confint()
## contrast = stim1 - stim0:
## contrast1 estimate SE df lower.CL upper.CL
## drugB1 - drugB0 -3.62 2.75 12 -9.61 2.36
##
## Confidence level used: 0.95
As before, we can visualize these results with effect plots. We see the effect of stim is greatly reduced when drug A is administered.
ggpredict(m2, terms = c("stim", "drugA"), condition = c(drugB = "0")) |>
plot()
On the other hand, the effect of stim hardly changes in the presence of drug B.
ggpredict(m2, terms = c("stim", "drugB"), condition = c(drugA = "0")) |>
plot()
Let’s try this as a one-way ANOVA. If we fit a model without the intercept we simply get group means.
d$grp <- interaction(d$drugA, d$drugB, d$stim, drop = TRUE)
levels(d$grp) <- c("ctrl", "drugA", "drugB", "stim", "drugA+stim", "drugB+stim")
m3 <- lm(pgmL ~ -1 + grp, data = d)
summary(m3)
##
## Call:
## lm(formula = pgmL ~ -1 + grp, data = d)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.6910 -1.4360 -0.1096 1.0930 3.3312
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## grpctrl 10.584 1.373 7.707 5.50e-06 ***
## grpdrugA 9.481 1.373 6.904 1.64e-05 ***
## grpdrugB 7.635 1.373 5.560 0.000124 ***
## grpstim 60.840 1.373 44.304 1.14e-14 ***
## grpdrugA+stim 20.433 1.373 14.879 4.26e-09 ***
## grpdrugB+stim 54.269 1.373 39.518 4.45e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.379 on 12 degrees of freedom
## Multiple R-squared: 0.9969, Adjusted R-squared: 0.9954
## F-statistic: 647.3 on 6 and 12 DF, p-value: 2.378e-14
The ANOVA table is much less informative.
anova(m3)
## Analysis of Variance Table
##
## Response: pgmL
## Df Sum Sq Mean Sq F value Pr(>F)
## grp 6 21973.2 3662.2 647.32 2.378e-14 ***
## Residuals 12 67.9 5.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We can get comparisons using emmeans but it’s a bit tricky.
The first comparison (c1) is grpstim - grpctrl (ie, effect of stim when drug A = 0) and the second comparison (c2) is grpdrugA+stim - grpdrugA (ie, effect of stim when drug A = 1)
emmeans(m3, specs = ~ grp) |> contrast(list(c1 = c(-1, 0, 0, 1, 0, 0),
c2 = c(0, -1, 0, 0, 1, 0)))
## contrast estimate SE df t.ratio p.value
## c1 50.3 1.94 12 25.878 <0.0001
## c2 11.0 1.94 12 5.639 0.0001
Taking the difference of those differences gives us the same result as before: 95% CI of [-45.3, -33.3]
emmeans(m3, specs = ~ grp) |> contrast(list(c1 = c(-1, 0, 0, 1, 0, 0),
c2 = c(0, -1, 0, 0, 1, 0))) |>
pairs(reverse = TRUE) |>
confint()
## contrast estimate SE df lower.CL upper.CL
## c2 - c1 -39.3 2.75 12 -45.3 -33.3
##
## Confidence level used: 0.95
I think the model with interactions is easier to work with.