Week 4 figures - Lectures 7 and 8

effect size
power analysis
confidence intervals
Author
Affiliation
Lecture date

April 21, 2025

Modified

April 28, 2025

0. set up

Code
# cleaning
library(tidyverse)

# visualization
theme_set(theme_classic() +
            theme(panel.grid = element_blank(),
                  axis.text = element_text(size = 18),
                  axis.title = element_text(size = 18),
                  text = element_text(family = "Lato")))
library(patchwork)

# cohen's d
library(effectsize)

# power
library(pwr)

1. Math

a. Cohen’s d

\[ Cohen's \; d = \frac{\bar{y_A} - \bar{y_B}}{\sqrt{(s^2_A + s^2_B)/2}} \]

b. Cohen’s d with separated SD

\[ Cohen's \; d = \frac{\bar{y_A} - \bar{y_B}}{\sqrt{\frac{(n_A - 1)\times s^2_A + (n_B - 1)\times s^2_B}{n_A + n_B - 2}}} \]

b. confidence interval for two-sample t-test

\[ CI = (\bar{y_A} - \bar{y_B}) \pm t \times \sqrt{\frac{(n_A - 1)s_A^2 + (n_B - 1)s_B^2}{n_A+n_B-2}} \times \sqrt{\frac{1}{n_A}+\frac{1}{n_B}} \]

c. test statistic for paired t-test

\[ t_s = \frac{\bar{y}_d - \mu_0}{s_d - \sqrt{n}} \]

d. standard error for two-sample t-test

If variances are not equal:

\[ SE_{\bar{y_A} - \bar{y_B}} = \sqrt{\frac{s_A^2}{n_A}+\frac{s_B^2}{n_B}} \]

2. Interpret this output

You have two raised beds in which you’re growing tomatoes. One bed is in the sun, but the other is in shade. You want to know if the weight of the tomatoes is different between beds. You measure 33 tomatoes from each bed.

Code
tomatoes <- cbind(sunny = rnorm(n = 33, mean = 150, sd = 20),
                  shaded = rnorm(n = 33, mean = 130, sd = 10)) %>% 
  as_tibble() %>% 
  pivot_longer(cols = 1:2, names_to = "sun_level", values_to = "weight_g")

ggplot(data = tomatoes,
       aes(x = sun_level,
           y = weight_g)) +
  geom_jitter(width = 0.1)

Code
var.test(weight_g ~ sun_level,
         data = tomatoes)

    F test to compare two variances

data:  weight_g by sun_level
F = 0.24646, num df = 32, denom df = 32, p-value = 0.0001527
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.1217225 0.4990146
sample estimates:
ratio of variances 
         0.2464575 
Code
t.test(weight_g ~ sun_level,
       data = tomatoes,
       var.equal = FALSE)

    Welch Two Sample t-test

data:  weight_g by sun_level
t = -6.6939, df = 46.87, p-value = 2.412e-08
alternative hypothesis: true difference in means between group shaded and group sunny is not equal to 0
95 percent confidence interval:
 -33.27431 -17.89509
sample estimates:
mean in group shaded  mean in group sunny 
            127.4624             153.0471 

3. Power analysis

Code
# higher power
pwr.t.test(n = NULL, d = 0.7, sig.level = 0.05, power = 0.95)

     Two-sample t test power calculation 

              n = 54.01938
              d = 0.7
      sig.level = 0.05
          power = 0.95
    alternative = two.sided

NOTE: n is number in *each* group
Code
# lower power
pwr.t.test(n = NULL, d = 0.7, sig.level = 0.05, power = 0.80)

     Two-sample t test power calculation 

              n = 33.02457
              d = 0.7
      sig.level = 0.05
          power = 0.8
    alternative = two.sided

NOTE: n is number in *each* group

4. Write about this result

You have two worm compost bins: one in which you throw citrus peels, and the other in which you don’t. You’re curious to see if the citrus worms are bigger than the non-citrus worms. You measure 34 worms from each bin and find this result:

Code
worms <- cbind(citrus = rnorm(n = 34, mean = 140, sd = 20),
               non_citrus = rnorm(n = 34, mean = 160, sd = 15)) %>% 
  as_tibble() %>% 
  pivot_longer(cols = 1:2, names_to = "compost_bin", values_to = "weight_g")

ggplot(data = worms,
       aes(x = compost_bin,
           y = weight_g)) +
  geom_boxplot()

Code
var.test(weight_g ~ compost_bin,
         data = worms)

    F test to compare two variances

data:  weight_g by compost_bin
F = 1.005, num df = 33, denom df = 33, p-value = 0.9886
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.5019252 2.0123262
sample estimates:
ratio of variances 
          1.005006 
Code
t.test(weight_g ~ compost_bin,
       data = worms,
       var.equal = FALSE)

    Welch Two Sample t-test

data:  weight_g by compost_bin
t = -7.2003, df = 66, p-value = 7.126e-10
alternative hypothesis: true difference in means between group citrus and group non_citrus is not equal to 0
95 percent confidence interval:
 -32.57036 -18.42879
sample estimates:
    mean in group citrus mean in group non_citrus 
                135.7626                 161.2622 

5. Effect size examples

a. large sample size, small difference

\[ \bar{y_a} - \bar{y_b} \]

Code
set.seed(1)
data <- cbind(a = rnorm(n = 100, mean = 10, sd = 2), 
               b = rnorm(n = 100, mean = 11, sd = 2)) %>% 
  as_tibble() %>% 
  pivot_longer(cols = 1:2, names_to = "group", values_to = "value") %>% 
  group_by(group)

set.seed(1)
large <- data %>% 
  slice_sample(n = 40)

ggplot(data = large,
       aes(x = group,
           y = value)) +
  stat_summary(geom = "pointrange",
               fun.data = mean_se) +
  scale_y_continuous(limits = c(5, 16)) 

Code
t.test(value ~ group,
       data = large,
       var.equal = TRUE)

    Two Sample t-test

data:  value by group
t = -2.9666, df = 78, p-value = 0.003996
alternative hypothesis: true difference in means between group a and group b is not equal to 0
95 percent confidence interval:
 -2.1892413 -0.4308888
sample estimates:
mean in group a mean in group b 
       9.819199       11.129264 
Code
set.seed(1)
small <- data %>% 
  slice_sample(n = 20)

ggplot(data = small,
       aes(x = group,
           y = value)) +
  stat_summary(geom = "pointrange",
               fun.data = mean_se) +
  scale_y_continuous(limits = c(5, 16)) 

Code
t.test(value ~ group,
       data = small,
       var.equal = TRUE)

    Two Sample t-test

data:  value by group
t = -1.5737, df = 38, p-value = 0.1238
alternative hypothesis: true difference in means between group a and group b is not equal to 0
95 percent confidence interval:
 -1.9408171  0.2431029
sample estimates:
mean in group a mean in group b 
       10.23925        11.08811 
Code
set.seed(1)

small <- cbind(a = MASS::mvrnorm(n = 10, mu = 16.5, Sigma = 2, empirical = TRUE),
               b = MASS::mvrnorm(n = 10, mu = 17, Sigma = 2, empirical = TRUE)) %>% 
  as.data.frame() %>% 
  rename("a" = "V1", "b" = "V2") %>% 
  pivot_longer(cols = 1:2, names_to = "group", values_to = "value")
# small <- cbind(a = rnorm(n = 10, mean = 14, sd = 3), 
#                b = rnorm(n = 10, mean = 17, sd = 3)) %>% 
#   as_tibble() %>% 
#   pivot_longer(cols = 1:2, names_to = "group", values_to = "value")
ggplot(data = small,
       aes(x = group,
           y = value)) +
  geom_point(size = 3,
             shape = 21,
             aes(color = group),
             position = position_jitter(
               width = 0.15,
               height = 0,
               seed = 10
             )) +
  stat_summary(geom = "point",
               fun = mean,
               color = "red",
               size = 4) +
  # scale_y_continuous(limits = c(5, 16),
  #                    breaks = c(6, 9, 12, 15)) +
  scale_color_manual(values = c("darkblue", "darkgreen")) +
  theme(legend.position = "none")

Code
ggplot(data = small,
       aes(x = value)) +
  geom_density(aes(fill = group),
               alpha = 0.4)

Code
t.test(value ~ group,
       data = small,
       var.equal = TRUE)

    Two Sample t-test

data:  value by group
t = -0.79057, df = 18, p-value = 0.4395
alternative hypothesis: true difference in means between group a and group b is not equal to 0
95 percent confidence interval:
 -1.8287398  0.8287398
sample estimates:
mean in group a mean in group b 
           16.5            17.0 
Code
cohens_d(value ~ group,
         data = small,
         pooled_sd = FALSE)
Cohen's d |        95% CI
-------------------------
-0.35     | [-1.23, 0.54]

- Estimated using un-pooled SD.
Code
set.seed(1)
large <- cbind(a = MASS::mvrnorm(n = 100, mu = 16.5, Sigma = 2, empirical = TRUE),
               b = MASS::mvrnorm(n = 100, mu = 17, Sigma = 2, empirical = TRUE)) %>% 
  as.data.frame() %>% 
  rename("a" = "V1", "b" = "V2") %>% 
  pivot_longer(cols = 1:2, names_to = "group", values_to = "value")
ggplot(data = large,
       aes(x = group,
           y = value)) +
  geom_point(size = 3,
             shape = 21,
             aes(color = group),
             position = position_jitter(
               width = 0.15,
               height = 0,
               seed = 10
             )) +
  stat_summary(geom = "point",
               fun = mean,
               color = "red",
               size = 4) +

  # scale_y_continuous(limits = c(5, 16),
  #                    breaks = c(6, 9, 12, 15)) +
  scale_color_manual(values = c("darkblue", "darkgreen")) +
  theme(legend.position = "none")

Code
ggplot(data = large,
       aes(x = value)) +
  geom_histogram(aes(fill = group),
               alpha = 0.4)

Code
t.test(value ~ group,
       data = large,
       var.equal = TRUE)

    Two Sample t-test

data:  value by group
t = -2.5, df = 198, p-value = 0.01323
alternative hypothesis: true difference in means between group a and group b is not equal to 0
95 percent confidence interval:
 -0.8944035 -0.1055965
sample estimates:
mean in group a mean in group b 
           16.5            17.0 
Code
cohens_d(value ~ group,
         data = large,
         pooled_sd = FALSE)
Cohen's d |         95% CI
--------------------------
-0.35     | [-0.63, -0.07]

- Estimated using un-pooled SD.

b. needlegrass example

Code
set.seed(1)
needlegrass <- cbind(ungrazed = rnorm(n = 35, mean = 82, sd = 6), 
                     grazed = rnorm(n = 35, mean = 74, sd = 5)) %>% 
  as_tibble() %>% 
  pivot_longer(cols = 1:2, names_to = "plot_type", values_to = "height_cm")

# boxplot
ggplot(data = needlegrass,
       aes(x = plot_type,
           y = height_cm,
           color = plot_type)) +
  geom_boxplot(outliers = FALSE,
               linewidth = 1) +
  geom_point(position = position_jitter(
    width = 0.1, 
    height = 0.1,
    seed = 10
  ),
  alpha = 0.4,
  size = 1) +
    scale_color_manual(values = c("darkgreen", "cornflowerblue")) 

Code
# plot without all the adjustments
ggplot(data = needlegrass,
       aes(x = plot_type,
           y = height_cm,
           color = plot_type)) +
  geom_point(position = position_jitter(width = 0.1, height = 0, seed = 10),
             alpha = 0.2) +
  stat_summary(geom = "pointrange",
               fun.data = mean_cl_normal) 

Code
# "finalized" plot
ggplot(data = needlegrass,
       aes(x = plot_type,
           y = height_cm,
           color = plot_type)) +
  geom_point(position = position_jitter(width = 0.1, height = 0, seed = 10),
             alpha = 0.2) +
  scale_color_manual(values = c("darkgreen", "cornflowerblue")) +
  stat_summary(geom = "pointrange",
               fun.data = mean_cl_normal,
               size = 1,
               linewidth = 1) +
  labs(x = "Plot type",
       y = "Height (cm)") +
  theme(legend.position = "none")

Doing a t-test:

Code
var.test(height_cm ~ plot_type,
       data = needlegrass)

    F test to compare two variances

data:  height_cm by plot_type
F = 0.72641, num df = 34, denom df = 34, p-value = 0.3559
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.366669 1.439115
sample estimates:
ratio of variances 
         0.7264149 
Code
t.test(height_cm ~ plot_type,
       data = needlegrass)

    Welch Two Sample t-test

data:  height_cm by plot_type
t = -5.9471, df = 66.334, p-value = 1.127e-07
alternative hypothesis: true difference in means between group grazed and group ungrazed is not equal to 0
95 percent confidence interval:
 -9.720197 -4.834401
sample estimates:
  mean in group grazed mean in group ungrazed 
              75.18323               82.46053 
Code
vect <- dt(x = seq(from = -10, to = 10, by = 0.5), df = 66.334) %>% 
  enframe()

ggplot(data = vect,
       aes(x = name,
           y = value)) +
  geom_line() +
  geom_vline(xintercept = -5.9471)

Cohen’s d:

Code
# pooled SD
cohens_d(height_cm ~ plot_type,
       data = needlegrass)
Cohen's d |         95% CI
--------------------------
-1.42     | [-1.94, -0.89]

- Estimated using pooled SD.
Code
# unpooled SD
cohens_d(height_cm ~ plot_type,
       data = needlegrass, 
       pooled_sd = FALSE)
Cohen's d |         95% CI
--------------------------
-1.42     | [-1.94, -0.89]

- Estimated using un-pooled SD.
Code
# by hand
needlegrass_sum <- needlegrass %>% 
  group_by(plot_type) %>% 
  reframe(
    mean = mean(height_cm),
    var = var(height_cm),
    n = length(height_cm)
  )

ya <- pluck(needlegrass_sum, 2, 1)
yb <- pluck(needlegrass_sum, 2, 2)
vara <- pluck(needlegrass_sum, 3, 1)
varb <- pluck(needlegrass_sum, 3, 2)
na <- pluck(needlegrass_sum, 4, 1)
nb <- pluck(needlegrass_sum, 4, 2)

# by hand

(ya - yb)/sqrt((vara + varb)/2)
[1] -1.421636
Code
(ya - yb)/sqrt(
  ((na - 1)*vara + (nb - 1)*varb)/(na + nb - 2)
)
[1] -1.421636

6. good and bad results statements

Code
managed <- rnorm(n = 33, mean = 5, sd = 1) %>% 
  enframe() %>% 
  mutate(treatment = "managed")
nonintervention <- rnorm(n = 30, mean = 7, sd = 1) %>% 
  enframe() %>% 
  mutate(treatment = "non-intervention") 
pools <- rbind(managed, nonintervention) %>% 
  select(treatment, value) %>% 
  rename(temp = value)
var.test(temp ~ treatment,
         data = pools)

    F test to compare two variances

data:  temp by treatment
F = 1.287, num df = 32, denom df = 29, p-value = 0.4953
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.6197933 2.6377877
sample estimates:
ratio of variances 
          1.286962 
Code
t.test(temp ~ treatment,
       data = pools)

    Welch Two Sample t-test

data:  temp by treatment
t = -11.206, df = 60.948, p-value < 2.2e-16
alternative hypothesis: true difference in means between group managed and group non-intervention is not equal to 0
95 percent confidence interval:
 -2.558536 -1.783706
sample estimates:
         mean in group managed mean in group non-intervention 
                      4.952439                       7.123560 
Code
cohens_d(temp ~ treatment,
       data = pools)
Cohen's d |         95% CI
--------------------------
-2.81     | [-3.51, -2.10]

- Estimated using pooled SD.

Statement: Our data suggest a difference in water temperature between managed (n = 33) and non-intervention (i.e. control, n = 30) vernal pools, with a strong (Cohen’s d = 2.19) effect of management.
Temperatures in managed pools were different from those in non-intervention pools (two-tailed two-sample t-test, t(60.9) = -8.7, p < 0.001, ⍺ = 0.05); on average, managed pools were 5.3 °C, while control pools were 7.1 °C.

7. QQ/histograms

Code
set.seed(666)
data1 <- rnorm(n = 30, mean = 15, sd = 2) |> 
  enframe()

ggplot(data = data1,
       aes(x = value)) +
  geom_histogram(bins = 6,
                 fill = "cornflowerblue",
                 color = "black") +
  scale_y_continuous(expand = c(0, 0))

Code
ggplot(data = data1,
       aes(sample = value)) +
  geom_qq_line(color = "black") +
  geom_qq(color = "cornflowerblue",
          size = 4)

Code
set.seed(666)
data2 <- rgamma(n = 30, shape = 0.5) |> 
  enframe()

ggplot(data = data2,
       aes(x = value)) +
  geom_histogram(bins = 6,
                 fill = "orange",
                 color = "black") +
  scale_y_continuous(expand = c(0, 0))

Code
ggplot(data = data2,
       aes(sample = value)) +
  geom_qq_line(color = "black") +
  geom_qq(color = "orange",
          size = 4)

Citation

BibTeX citation:
@online{bui2025,
  author = {Bui, An},
  title = {Week 4 Figures - {Lectures} 7 and 8},
  date = {2025-04-21},
  url = {https://spring-2025.envs-193ds.com/lecture/lecture_week-04.html},
  langid = {en}
}
For attribution, please cite this work as:
Bui, An. 2025. “Week 4 Figures - Lectures 7 and 8.” April 21, 2025. https://spring-2025.envs-193ds.com/lecture/lecture_week-04.html.