# Lecture 5/6

## Cookbook of commonly used statistical tests

• Comparing two groups
• Comparing more than two groups

### Introduction to Biostatistics

By: Peter Kamerman (view at painblogR)

A quick recap on:
p-values & hypothesis testing

### Definition of a p-value

“The probability of observing a result as great as (or greater than) the one you observed, if the null hypothesis is true.”

If the data are unlikely under the null hypothesis (small p-value), then either we observed a low-probability event, or the null hypothesis is not true.

…only one of these can be correct.
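To make the definition concrete, here is a minimal sketch in R. The example is made up and is not part of the lecture data: a coin is tossed 20 times and lands heads 16 times, and the null hypothesis is that the coin is fair. The one-sided p-value is the probability of seeing 16 or more heads if that null hypothesis is true.

```r
# Hypothetical example (not from the lecture): 16 heads in 20 tosses
n_tosses <- 20
observed_heads <- 16

# Null hypothesis: the coin is fair (probability of heads = 0.5).
# P(X >= 16) is the sum of the binomial probabilities for 16, 17, ..., 20 heads
p_value <- sum(dbinom(observed_heads:n_tosses, size = n_tosses, prob = 0.5))
p_value

# The built-in exact binomial test gives the same one-sided p-value
binom.test(observed_heads, n_tosses, p = 0.5, alternative = "greater")$p.value
```

The p-value here is about 0.006, so a result this extreme would be rare if the coin were fair.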

### Hypothesis testing

Jerzy Neyman and Egon Pearson:

• Their approach works by setting a threshold $$(\alpha)$$ that the p-value must fall below for the null hypothesis to be rejected.

• You state a null hypothesis and an alternative hypothesis and use the threshold p-value as a decision rule.

• The p-value threshold is chosen to control false-positive inference (usually set at $$\alpha$$ = 0.05).

• You have to abide by the statistical test's 'decision' if you wish to protect against false-positive errors.
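The link between the threshold and false-positive control can be illustrated with a small simulation. This is a sketch under assumed settings (2000 simulated experiments, 10 observations per group, a standard normal population, an arbitrary seed), none of which come from the lecture. Both groups are drawn from the same population, so every rejection is a false positive.

```r
set.seed(1)            # arbitrary seed so the sketch is reproducible
alpha <- 0.05          # decision threshold
n_experiments <- 2000  # assumed number of simulated experiments
n_per_group <- 10      # assumed sample size per group

# Both groups come from the same population, so the null hypothesis is true
p_values <- replicate(n_experiments, {
  group_a <- rnorm(n_per_group, mean = 0, sd = 1)
  group_b <- rnorm(n_per_group, mean = 0, sd = 1)
  t.test(group_a, group_b, var.equal = TRUE)$p.value
})

# Proportion of experiments in which the null hypothesis is wrongly rejected
false_positive_rate <- mean(p_values < alpha)
false_positive_rate
```

If the decision rule is followed every time, the long-run proportion of false positives should sit close to the chosen threshold.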

### Parametric tests

Experimental groups may differ for two reasons:

1. Real effect of intervention

2. Random variation between samples drawn from the same population

### You must decide whether:

[1] is large enough relative to [2] to conclude
that a treatment had an effect.

### Parametric tests

Calculate the ratio of variances

1. between-group variance $$(\sigma^2_{bet})$$

2. within-group variance $$(\sigma^2_{with})$$

If samples are from the same population,
the variances will be similar, and…

$$\frac{\sigma^2_{bet}}{\sigma^2_{with}} \rightarrow 1$$

Degrees of freedom (df) determine the critical value that the ratio (the test statistic) must exceed for the null hypothesis to be rejected.
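The ratio can be computed by hand and checked against R's built-in analysis of variance. The sketch below uses simulated data (three hypothetical groups of 12 drawn from one population; the group labels, sample size, mean, and SD are all assumptions for illustration). The variable names `ms_between` and `ms_within` stand in for the estimates of $$\sigma^2_{bet}$$ and $$\sigma^2_{with}$$.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# Simulated data: three groups drawn from the SAME population
k <- 3
n_per_group <- 12
group <- factor(rep(c("A", "B", "C"), each = n_per_group))
value <- rnorm(k * n_per_group, mean = 50, sd = 10)

grand_mean  <- mean(value)
group_means <- tapply(value, group, mean)
group_sizes <- tapply(value, group, length)

# Degrees of freedom
df_between <- k - 1
df_within  <- length(value) - k

# Between-group variance: spread of the group means around the grand mean
ms_between <- sum(group_sizes * (group_means - grand_mean)^2) / df_between

# Within-group variance: spread of each value around its own group mean
ms_within <- sum((value - ave(value, group))^2) / df_within

# Test statistic and the critical value it must exceed at alpha = 0.05
f_ratio    <- ms_between / ms_within
f_critical <- qf(1 - 0.05, df1 = df_between, df2 = df_within)
c(F = f_ratio, critical = f_critical)

# Cross-check against the built-in one-way ANOVA
anova(lm(value ~ group))
```

Because the groups share a population, the ratio is expected to be near 1 on average, though any single simulated sample can land above or below it.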

### Assumptions for parametric tests

• The distribution of the data in the population is Gaussian

• Equal variance across groups
(the basis on which the test statistic is calculated)

• The errors are independent
(the 'error' refers to the difference between each value and its group mean)

• Data are unmatched (for unpaired data) / matching is effective (for repeated measures data)
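Some of these assumptions can be checked formally. The sketch below applies two commonly used checks to R's built-in `sleep` dataset: the Shapiro-Wilk test for normality and the F test for equal variances. The choice of these particular tests is an assumption for illustration; the lecture does not name a specific method.

```r
data(sleep)

# Normality within each group
# (Shapiro-Wilk test; null hypothesis: the data are Gaussian)
shapiro_p <- tapply(sleep$extra, sleep$group,
                    function(x) shapiro.test(x)$p.value)
shapiro_p

# Equal variance across the two groups
# (F test; null hypothesis: the ratio of the variances is 1)
variance_check <- var.test(extra ~ group, data = sleep)
variance_check$p.value
```

A small p-value from either check suggests the corresponding assumption may not hold. Note that the `ID` column in `sleep` shows the same 10 subjects in both groups, which matters for the matched versus unmatched assumption.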

### Student's t-test

First have a look at the data

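```r
data(sleep)   # load the built-in sleep dataset
head(sleep)   # show the first six rows
#>   extra group ID
#> 1   0.7     1  1
#> 2  -1.6     1  2
#> 3  -0.2     1  3
#> 4  -1.2     1  4
#> 5  -0.1     1  5
#> 6   3.4     1  6
```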
6   3.4     1  6


```r
# Increase in hours of sleep, by drug group
boxplot(extra ~ group, data = sleep)
```


### Student's t-test

Run t-test

# When you