---
title: "01 Intro. Crash Course in Statistics (Summer 2025)"
subtitle: "Neuroscience Center Zurich, University of Zurich"
author: "Dr. Zofia Baranczuk"
date: "2025-08-25"
output: pdf_document
---


```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)  
```

## 0. R Console vs. R Markdown

- Use the **console** for quick, temporary commands and checking results.
- Use an **R Markdown file (.Rmd)** to write code and text together for reproducible analysis.
- In `.Rmd`, combine:
  - **Text chunks** (like this explanation)
  - **Code chunks** (enclosed in ```{r} ... ```)
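
As a minimal sketch of what a code chunk does (the vector `x` is a made-up example, not part of the course data), the chunk below is run when the document is knitted, and both the code and its output appear in the rendered file:

```{r}
# Hypothetical example values, for illustration only
x <- c(2, 4, 6)

# The result is printed below the code in the rendered document
mean(x)
```

Typing the same two lines into the console gives the same result, but nothing is saved alongside the explanation.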


## 1. Loading and Exploring the Dataset
Load the data on depression rates by country for the years 2019-2021 from the following link:

<https://user.math.uzh.ch/baranczuk/znz/Data/depression-rates-by-country-2025.csv>

Then have a look at the data.

```{r}
depression <-  
  read.csv("https://user.math.uzh.ch/baranczuk/znz/Data/depression-rates-by-country-2025.csv")

# View(depression)  # uncomment to browse the data in a spreadsheet-style viewer
head(depression)    # the first few rows

str(depression)     # column names, types, and a preview of the values
```
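
Beyond `head()` and `str()`, a few more base R functions are useful for a first look at a data frame. The sketch below uses a small made-up data frame `toy` (invented values; only the column names mirror the real file) so that it runs without an internet connection:

```{r}
# Made-up data for illustration; these are NOT real depression figures
toy <- data.frame(
  country         = c("A", "B", "C", "D"),
  Percentage_2021 = c(3.2, 4.5, NA, 5.1)
)

dim(toy)             # number of rows and columns
names(toy)           # column names
summary(toy)         # per-column summaries, including the NA count
colSums(is.na(toy))  # number of missing values in each column
```

Checking for missing values early is worthwhile, because functions such as `max()` return `NA` when the input contains one, unless `na.rm = TRUE` is set.
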
## 2. Working with Data Frames
Use the data frame to answer the following questions:

- Which country had the highest depression rate in 2021?
- Which 10 countries had the lowest depression rates in 2021?
- How many cases were there in the country with the highest rate?
- In which countries is the prevalence of depression below 4% of the population?
- What are the depression rates and the number of cases in Switzerland?

```{r data_frames}
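# Country with the highest depression rate in 2021
ind_max <- which.max(depression$RatePer100k_2021)
depression$country[ind_max]
# Alternative: compare against the maximum (returns all ties)
m <- max(depression$RatePer100k_2021)
ind_max <- which(depression$RatePer100k_2021 == m)
depression$country[ind_max]

cat("Max depression rate =", m, "\n")
cat("Number of cases in the country with the highest depression rate:",
    depression$Cases_2021[ind_max], "\n")

# The 10 countries with the lowest depression rates in 2021
inds_lowest10 <- order(depression$RatePer100k_2021)[1:10]
depression$country[inds_lowest10]

# Countries where less than 4% of the population has depression
inds_low <- which(depression$Percentage_2021 < 4)
depression$country[inds_low]

# All columns for Switzerland
depression[depression$country == "Switzerland", ]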
```
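
`which.min()` and `which.max()` each return a single row index. For questions about several rows at once, such as the lowest few rates, one option is to sort the row indices with `order()`. A minimal sketch on a made-up data frame `toy` (invented values, column names borrowed from the real data), taking the two lowest rows instead of ten:

```{r}
# Made-up data for illustration; these are NOT real depression figures
toy <- data.frame(
  country          = c("A", "B", "C", "D"),
  RatePer100k_2021 = c(4200, 3100, 5600, 3900)
)

# order() returns the row indices that sort the column in increasing order
inds_sorted <- order(toy$RatePer100k_2021)

# Countries with the two lowest rates
toy$country[head(inds_sorted, 2)]

# Whole data frame sorted from highest to lowest rate
toy[order(toy$RatePer100k_2021, decreasing = TRUE), ]
```

Using `head(inds_sorted, n)` rather than `inds_sorted[1:n]` avoids `NA` indices when the data frame has fewer than `n` rows.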

## 3. Accessing Elements of a Data Frame
```{r}
depression[1, ]  # the first row
depression[, 1]  # the first column (returned as a vector)

depression[1:5, c(2, 4, 5)]  # the first 5 rows, columns 2, 4, and 5
```
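
Columns can also be selected by name rather than by position, which tends to be easier to read and keeps working if the column order changes. A small sketch on a made-up data frame `toy` (invented values):

```{r}
# Made-up data for illustration; these are NOT real depression figures
toy <- data.frame(
  country         = c("A", "B", "C"),
  Percentage_2019 = c(3.9, 4.8, 5.2),
  Percentage_2021 = c(4.1, 4.6, 5.5)
)

toy$Percentage_2021                         # one column as a vector
toy[["Percentage_2021"]]                    # the same, using a string
toy[, c("country", "Percentage_2021")]      # several columns by name
toy[toy$Percentage_2021 > 4.5, "country"]   # rows by condition, one column
toy[, "country", drop = FALSE]              # keep the result a data frame
```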

## 4. Simple plots
Let's look at three simple plots of the depression percentages: a histogram for 2021, boxplots for each year, and a scatter plot of 2019 against 2021.
```{r simple_plots}
hist(depression$Percentage_2021, breaks = 10, col = "deepskyblue")

# One box per year; the default x-axis is suppressed and labelled by hand
boxplot(depression$Percentage_2021, depression$Percentage_2020,
        depression$Percentage_2019,
        col = c("aquamarine", "bisque2", "cadetblue"),
        xaxt = "n",
        main = "Prevalence (%) of Depression by Year")
axis(side = 1, at = c(1, 2, 3), labels = c("2021", "2020", "2019"))

# Formula interface: y ~ x, so 2019 is on the y-axis and 2021 on the x-axis
plot(Percentage_2019 ~ Percentage_2021,
     data = depression,
     col = "orchid",
     pch = 16,
     main = "Depression Percentages: 2019 vs 2021")
```
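
To make a scatter plot of two years easier to read, one option is to add the diagonal line y = x: points on the line did not change between the two years, and with 2019 on the y-axis, points below the line increased. A minimal sketch on made-up values (the data frame `toy` and the axis labels are invented for illustration):

```{r}
# Made-up data for illustration; these are NOT real depression figures
toy <- data.frame(
  Percentage_2019 = c(3.9, 4.8, 5.2, 4.4),
  Percentage_2021 = c(4.1, 4.6, 5.5, 4.4)
)

plot(Percentage_2019 ~ Percentage_2021,
     data = toy,
     pch = 16,
     xlab = "Prevalence in 2021 (%)",
     ylab = "Prevalence in 2019 (%)")

# Diagonal y = x: intercept 0, slope 1
abline(a = 0, b = 1, lty = 2)

# Change per row in percentage points (positive = increase since 2019)
change <- toy$Percentage_2021 - toy$Percentage_2019
round(change, 1)
```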
