---
title: "06 Hypothesis Testing. Crash Course in Statistics (Summer 2025)"
subtitle: "Neuroscience Center Zurich, University of Zurich"
author: "Zofia Baranczuk"
date: "2025-08-25"
output: pdf_document
---


```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## 1. T-test for the sickle-cell disease data set
The following hemoglobin levels of blood samples from patients with Hb SS and HbSb sickle cell disease are given (source: Hüsler, J. and Zimmermann, H. (2010). Statistische Prinzipien für medizinische Projekte. Huber,
5 edition.):
For the cases listed below, specify the null and alternative hypothesis. Then use R to perform the tests
and give a careful interpretation.
a) µHbSb = 10 (alpha = 5%, two-sided)
```{r}
HbSS <- c( 7.2, 7.7, 8, 8.1, 8.3, 8.4, 8.4, 8.5, 8.6, 8.7, 9.1,
9.1, 9.1, 9.8, 10.1, 10.3)
HbSb <- c( 8.1, 9.2, 10, 10.4, 10.6, 10.9, 11.1, 11.9, 12.0, 12.1)
t.test(HbSb, mu = 10) # run the t-test. Default: two-sided, 95% CI
qqnorm(HbSb) # testing normality of the data
```

We are testing H0 : µHbSb = 10 versus H1 : µHbSb != 10. Since the p-value is 0.1552, we fail to reject the
null hypothesis at a .05 significance level. Therefore, we do not have enough evidence to say that the mean
HbSb is different from 10. Note: This is not equivalent to accepting the null. In hypothesis testing, there
are only two decisions: 1. fail to reject the null; 2. reject the null.

b) µHbSb = µHbSS (alpha = 1%, two-sided)
```{r}
t.test(HbSb, HbSS, var.equal = T, conf = 0.99) 
qqnorm(HbSS)

```

We are testing H0 : µHbSb = µHbSS versus H1 : µHbSb != µHbSS. Since the p-value is less than alpha = .001, we
reject the null hypothesis. Therefore, there is enough evidence to suggest that the mean of HbSb is different
from the mean of HbSS.
c) What changes, if one-sided tests are performed instead
```{r}
t.test(HbSb, mu = 10, alternative = "greater")
```
Since the ’more extreme values’ (from the p-value definition) are in only one direction (greater values), our
p-value is halved. This means that we could reject H0 when the two-sided counterpart would fail to reject.
Regardless of this, a good statistician will fix the hypothesis beforehand collecting the data and based on
clear reasoning/evidence to, for example, consider a single direction in the hypothesis, and not afterward
due to a potential convenience.


## 2. A comparison of t-test (independent and paired), Wilcoxon test (independent and paired) and the sign test.
```{r}
set.seed(1)
a <- c(rnorm(7,10,1),100)
b <- c(rnorm(8,20,1))

print(a)
print(b)

t.test(a,b)
wilcox.test(a,b)

t.test(a,b, paired = TRUE)
wilcox.test(a,b, paired = TRUE)

binom.test(sum(a < b), length(a), p=0.5)

# To discuss power of the test and the assumptions. Why do the results differ so much between different tests?  

```

## 3. Water drinking preference.
Pachel and Neilson (2010) conducted a pilot study to assess if house cats prefer water if provided still or flowing.
Their experiment consisted of providing nine cats either form of water over 4 days. The data provided in mL
consumption over the 22h hour period is provided in the file cat.csv in the Data folder. Some measurement had to be excluded
due to detectable water spillage. Assess with a suitable statistical approach if the cats prefer flowing water over
still water.
```{r}
library(here)
library(readr)
par(mfrow = c(1, 2))
cats <- read.csv(here("Data", "cat.csv"))
head(cats)
plot(density(cats$still), main = "still")
plot(density(cats$flow), main = "flow")

wilcox.test(cats[, 2], cats[, 3], paired = T, exact = T)


```

The estimation of the densities for both water sources looks symmetrically distributed, but the sample
size is relatively low (n = 9), therefore a Wilcoxon signed-rank test would be advisable over a paired t-test. The
null hypothesis states that the median of water consumption sources (in mL) for cats is the same, and we keep a
two-sided alternative hypothesis. Empirical evidence points out that we cannot reject the null, suggesting that
cats do not have a preference for water consumption sources.

## 4. (Extra) Writing your own function. Permutation test.
Download the water transfer.csv data from the Data folder and read it into R. The dataset describes
tritiated water diffusion across human chorioamnion and is taken from Nonparametric Statistical Methods book
(M. Hollander and D. A. Wolfe), Table 4.1, page 110. The pd values for age "At term" and "12-26 Weeks" are
denoted with yA and yB, respectively. We will statistically test if the water diffusion is different at the two time
points. That means we test whether there is a shift in the distribution of the second group compared to the first.

Under the null hypothesis, we are allowed to permute the observations (all y-values) while keeping the group assignments fix. Keeping this in mind, we will now manually construct a permutation test to detect a potential shift. Write an R function perm test() that implements a two-sample permutation test and returns the p-value.
For the permutation test, see: https://www.jwilber.me/permutationtest/

First, let's visualize the data.
```{r}

library(ggplot2)

mydata <- read.csv(here("Data", "water_transfer.csv"), stringsAsFactors = FALSE)

# Ensure factor order (so "12–26 Weeks" appears first)
mydata$age <- factor(mydata$age, levels = c("12-26 Weeks", "At term"))


ggplot(mydata, aes(x = age, y = pd)) +
  geom_violin(trim = FALSE, width = 0.5, draw_quantiles = 0.5) +  
  geom_jitter(width = 0.08, height = 0, alpha = 0.6, size = 2) +
  # Red "x" at the sample medians
  stat_summary(fun = median, geom = "point", shape = 4, size = 4, stroke = 1.2, color = "red") +
  labs(
    x = NULL,
    y = "pd",
    title = "Water transfer by gestational age") +
 
  theme_minimal(base_size = 12)
```


```{r}

str(mydata) # Structure of the dataset


perm_test <- function(x, y) { # defining a new function
R <- 1000
tobs <- median(x) - median(y) # Store the ’original’ sample mean
all.data <- c(x, y) # Store the data without the group labels
tsim <- array(0, R) # Pre-allocation
for (i in 1:R) {
index <- sample(1:length(all.data), length(x), replace = F) # random permutation
medianxA <- median(all.data[index]) # Sample mean of group A
medianxB <- median(all.data[-index]) # Sample mean of group B
tsim[i] <- medianxA - medianxB # Difference for the current iteration
}
hist(tsim)
abline(v=tobs)
return(sum(abs(tsim) >= abs(tobs))/R) # Sample p-value
}

#We use our function:
yA <- mydata[mydata$age == "12-26 Weeks", 1]
yB <- mydata[mydata$age == "At term", 1]
set.seed(14)
perm_test(yA, yB)


```