---
title: "07 Simulations. Crash Course in Statistics (Summer 2025)"
subtitle: "Neuroscience Center Zurich, University of Zurich"
author: "Zofia Baranczuk"
date: "2025-08-25"
output: pdf_document
---


```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```



## 1. Multiple testing and type I error.
Run a simulation to show why multiple testing can lead to rejecting a true null hypothesis, and how the Bonferroni correction controls this.


```{r multiple_testing}
# Idea: run multiple tests when H0 is true and check how often we get
# at least one significant result.

set.seed(1)
reps <- 1000
N <- 100  # number of tests per repetition
n <- 50   # sample size
sigs <- numeric(reps)
sigs_bonferroni <- numeric(reps)
for (j in 1:reps) {
  p_vals <- numeric(N)  # p-values for all N tests
  for (i in 1:N) {
    x <- rnorm(n)
    y <- rnorm(n)
    tt <- t.test(x, y, var.equal = TRUE, conf.level = 0.95)
    p_vals[i] <- tt$p.value
  }
  sigs[j] <- any(p_vals < 0.05)
  sigs_bonferroni[j] <- any(p_vals < 0.05 / N)
}
mean(sigs)            # probability of at least one false positive
mean(sigs_bonferroni) # with the Bonferroni correction

```

## 2. Bonferroni correction, example 2 (07_Worksheet). 
The Bonferroni correction is too conservative when the tests are not independent. Run a simulation showing this. Discussion.
```{r multiple_testing2}



```
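One possible sketch (not the worksheet solution; the shared-component construction and the `rho_share` weight are assumptions chosen for illustration): run N one-sample t-tests on variables that all share a common random component, so the test statistics are positively correlated. The family-wise error rate of the uncorrected procedure is then much lower than the independent-test prediction, and Bonferroni pushes it far below 0.05.

```{r multiple_testing2_sketch}
# Sketch: N correlated one-sample t-tests, H0 (mean = 0) true for all.
# Each variable mixes a shared component z with independent noise,
# so the tests are positively dependent.
set.seed(1)
reps <- 1000
N <- 20          # number of tests per repetition
n <- 50          # sample size
rho_share <- 0.9 # weight of the shared component (assumed value)
sigs <- numeric(reps)
sigs_bonferroni <- numeric(reps)
for (j in 1:reps) {
  z <- rnorm(n)  # component shared by all N variables
  p_vals <- numeric(N)
  for (i in 1:N) {
    x <- rho_share * z + sqrt(1 - rho_share^2) * rnorm(n)  # marginal N(0, 1)
    p_vals[i] <- t.test(x)$p.value
  }
  sigs[j] <- any(p_vals < 0.05)
  sigs_bonferroni[j] <- any(p_vals < 0.05 / N)
}
mean(sigs)            # well below 1 - 0.95^N, because the tests overlap
mean(sigs_bonferroni) # typically far below 0.05: Bonferroni is conservative
```

Because the tests largely repeat the same evidence, correcting as if they were N independent tests over-penalizes each one.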


## 3. Distribution of p-value.
Run a simulation to show the distribution of p-values when the null hypothesis is true and the assumptions of the test are met.

```{r p_value1}
set.seed(1)
N <- 10000
pvals <- numeric(N)
for (i in 1:N){
  tt <- t.test(rnorm(10), rnorm(10))  # H0 true: both samples from N(0, 1)
  pvals[i] <- tt$p.value
}
hist(pvals)
mean(pvals < 0.05)
```


## 4. Distribution of p-value when the assumptions are not met. 
Run a simulation to show the distribution of p-values when the null hypothesis is true but the assumptions of the test are not met.

```{r p_value2}

set.seed(1)
N <- 10000
pvals <- numeric(N)
for (i in 1:N){
  tt <- t.test(rexp(10), rexp(10))  # skewed data, small n: normality violated
  pvals[i] <- tt$p.value
}
hist(pvals)
mean(pvals < 0.05)

```


## 5. Power of a t-test. (07_Worksheet)
Run a simulation to approximate the power of a t-test. Check it for different sample sizes and effect sizes. Discussion.  
```{r}


```
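A minimal sketch (not the worksheet solution; the grid of sample sizes and effect sizes is an assumed choice): approximate the power as the fraction of simulated experiments in which the t-test rejects H0 when the true mean difference is `delta`, and compare with the analytic value from `power.t.test`.

```{r power_sketch}
# Simulated power of a two-sample t-test: fraction of reps with p < 0.05
# when the second group truly has mean `delta`.
set.seed(1)
sim_power <- function(n, delta, reps = 1000) {
  rejections <- replicate(
    reps,
    t.test(rnorm(n), rnorm(n, mean = delta))$p.value < 0.05
  )
  mean(rejections)
}
for (n in c(10, 30, 100)) {
  for (delta in c(0.2, 0.5, 0.8)) {
    cat("n =", n, " delta =", delta,
        " simulated power =", sim_power(n, delta),
        " analytic =", round(power.t.test(n = n, delta = delta)$power, 3), "\n")
  }
}
```

Power grows with both the sample size and the effect size; the Monte Carlo estimates should track the analytic values up to simulation noise.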


## 6. Central Limit Theorem

By the central limit theorem, the distribution of the sample mean is approximately normal for large samples, even when the data themselves (here, exponential) are not.
```{r}
sample_means <- replicate(10000, mean(rexp(200)))

x <- seq(min(sample_means), max(sample_means), length.out = 1000)
hist(sample_means, breaks = 30, main = "CLT: Sampling distribution of the mean",
     freq = FALSE)
lines(x, dnorm(x, mean = mean(sample_means), sd = sd(sample_means)), col = "red", lwd = 2)
```
