Frequentist Inference II – NHST II

This post summarizes the most common significance tests.

Designing NHST

  1. Specify H_{ 0 } and H_{ A }
  2. Choose a test statistic whose null distribution (and, ideally, alternative distribution) is known
  3. Choose a significance level \alpha and specify the rejection region, deciding whether it is one-sided or two-sided
  4. Compute the power of the test using the alternative distribution(s) (see the sketch below)
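
As an illustration of step 4, here is a minimal sketch in Python (NumPy/SciPy) of the power of a one-sided z-test. The design values (\mu_{ 0 }=0 , \mu_{ A }=1 , \sigma=2 , n=16 , \alpha=0.05 ) are made up purely for illustration.

  import numpy as np
  from scipy.stats import norm

  # Hypothetical design: H0: mu = 0 vs HA: mu = 1 (one-sided, right tail),
  # known sigma = 2, sample size n = 16, significance level alpha = 0.05.
  mu0, muA, sigma, n, alpha = 0.0, 1.0, 2.0, 16, 0.05

  # Rejection region for the z statistic: z >= z_crit.
  z_crit = norm.ppf(1 - alpha)

  # Under HA the z statistic is N((muA - mu0) / (sigma / sqrt(n)), 1),
  # so the power is the probability of landing in the rejection region.
  shift = (muA - mu0) / (sigma / np.sqrt(n))
  power = norm.sf(z_crit - shift)
  print(f"critical value: {z_crit:.3f}, power: {power:.3f}")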

Running an NHST

  1. Collect data and compute the test statistic X
  2. Check whether X falls in the rejection region (equivalently, whether p < \alpha ); a minimal sketch follows below
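
Continuing the hypothetical design above, a minimal sketch of these two steps; the data are simulated here rather than collected, so the numbers are purely illustrative.

  import numpy as np
  from scipy.stats import norm

  rng = np.random.default_rng(0)
  mu0, sigma, n, alpha = 0.0, 2.0, 16, 0.05

  # Step 1: "collect" data (simulated here) and compute the test statistic.
  x = rng.normal(loc=1.0, scale=sigma, size=n)   # true mean 1.0, unknown to the tester
  z = (x.mean() - mu0) / (sigma / np.sqrt(n))

  # Step 2: check the rejection region, here via the one-sided p-value.
  p = norm.sf(z)
  print(f"z = {z:.3f}, p = {p:.4f}, reject H0: {p < alpha}")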

Common significance tests

z-test

  • Use: Compare the sample mean to a hypothesized mean
  • Data: x_{ 1 },…,x_{ n }
  • Assumptions: x_{ i } \sim N(\mu, \sigma^{ 2 }) where the mean is unknown but the variance is known
  • H_{ 0 } : \mu=\mu_{ 0 } for a specified value \mu_{ 0 } .
  • H_{ A } : \mu\neq\mu_{ 0 } \;or\; \mu < \mu_{ 0 } \;or\; \mu>\mu_{ 0 }
  • test statistic: z=\frac{ \overline{ x }-\mu_{ 0 } }{ \frac{ \sigma }{ \sqrt{ n } } }
  • null distribution: f(z|H_{ 0 }) is the pdf of Z~N(0,1)
  • p-value: the p-value is computed as usual from the null distribution (see the sketch below)
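
As far as I know SciPy has no built-in z-test, so here is a minimal sketch that follows the formulas above directly; the data, \mu_{ 0 } and \sigma are made up for illustration.

  import numpy as np
  from scipy.stats import norm

  def z_test(x, mu0, sigma):
      """Two-sided z-test for the mean with known sigma."""
      z = (np.mean(x) - mu0) / (sigma / np.sqrt(len(x)))
      p = 2 * norm.sf(abs(z))   # two-sided p-value under Z ~ N(0, 1)
      return z, p

  # Made-up data with (assumed known) sigma = 1.5, hypothesized mean mu0 = 10.
  x = [10.2, 11.4, 9.8, 12.1, 10.9, 11.7, 9.5, 10.8]
  z, p = z_test(x, mu0=10.0, sigma=1.5)
  print(f"z = {z:.3f}, two-sided p = {p:.4f}")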

One-Sample t-Test for the mean

  • Use: Compare the sample mean to a hypothesized mean
  • Data: x_{ 1 },...,x_{ n }
  • Assumptions: x_{ i }\sim N(\mu,\sigma^{ 2 }) where both the mean and the variance are unknown.
  • H_{ 0 } : \mu=\mu_{ 0 } for a specified value \mu_{ 0 }
  • H_{ A } : \mu\neq\mu_{ 0 } \;or\; \mu < \mu_{ 0 } \;or\; \mu>\mu_{ 0 }
  • test statistic: t=\frac{ \overline{ x }-\mu_{ 0 } }{ \frac{ s }{ \sqrt{ n } } } where s^{ 2 }=\frac{ 1 }{ n-1 }\sum_{ i=1 }^{ n }{ (x_{ i }-\overline{ x })^{ 2 } }
  • null distribution: f(t|H_{ 0 }) is the pdf of T~t(n-1)
  • p-value: the p-value is computed as usual from the null distribution (see the sketch below)
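
A minimal sketch using a made-up sample and scipy.stats.ttest_1samp (two-sided by default), together with the manual computation from the formulas above; the two should agree.

  import numpy as np
  from scipy import stats

  # Made-up sample; test H0: mu = 10 against the two-sided alternative.
  x = np.array([10.2, 11.4, 9.8, 12.1, 10.9, 11.7, 9.5, 10.8])
  mu0 = 10.0

  # Manual computation following the formulas above.
  s = x.std(ddof=1)                          # sample standard deviation
  t = (x.mean() - mu0) / (s / np.sqrt(len(x)))
  p = 2 * stats.t.sf(abs(t), df=len(x) - 1)  # two-sided p-value, t(n-1)

  # SciPy's built-in one-sample t-test.
  t_scipy, p_scipy = stats.ttest_1samp(x, popmean=mu0)
  print(f"manual: t = {t:.3f}, p = {p:.4f}")
  print(f"scipy : t = {t_scipy:.3f}, p = {p_scipy:.4f}")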

Two-Sample t-Test for comparing means (assuming equal variance)

  • Use: Compare the sample means of two groups
  • Data: x_{ 1 },...,x_{ n } and y_{ 1 },...,y_{ m }
  • Assumptions: x_{ i }\sim N(\mu_{ x },\sigma^{ 2 }) and y_{ j }\sim N(\mu_{ y },\sigma^{ 2 }) . The means and the variance are unknown, but the variance is assumed to be the same for both groups.
  • H_{ 0 } : \mu_{ x }=\mu_{ y }
  • H_{ A } : \mu_{ x }\neq\mu_{ y } \;or\; \mu_{ x } < \mu_{ y } \;or\; \mu_{ x }>\mu_{ y }
  • test statistic: t=\frac{ \overline{ x }-\overline{ y } }{ s_{ p } } where s_{ p }^{ 2 }=\frac{ (n-1)s_{ x }^{ 2 }+(m-1)s_{ y }^{ 2 } }{ n+m-2 }(\frac{ 1 }{ n }+\frac{ 1 }{ m })
  • null distribution: f(t|H_{ 0 }) is the pdf of T~t(n+m-2)
  • p-value: the p-value is computed as usual from the null distribution (see the sketch below)
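
A minimal sketch with two made-up groups, computing the pooled statistic defined above and cross-checking against scipy.stats.ttest_ind with equal_var=True (two-sided).

  import numpy as np
  from scipy import stats

  # Two made-up groups; test H0: mu_x = mu_y assuming equal variances.
  x = np.array([5.1, 4.8, 6.0, 5.5, 5.9, 4.7])
  y = np.array([4.2, 4.9, 4.4, 5.0, 4.1, 4.6, 4.8])
  n, m = len(x), len(y)

  # Manual computation with the pooled statistic defined above.
  sp2 = ((n - 1) * x.var(ddof=1) + (m - 1) * y.var(ddof=1)) / (n + m - 2) * (1 / n + 1 / m)
  t = (x.mean() - y.mean()) / np.sqrt(sp2)
  p = 2 * stats.t.sf(abs(t), df=n + m - 2)   # two-sided p-value, t(n+m-2)

  # SciPy's two-sample t-test with pooled (equal) variances.
  t_scipy, p_scipy = stats.ttest_ind(x, y, equal_var=True)
  print(f"manual: t = {t:.3f}, p = {p:.4f}")
  print(f"scipy : t = {t_scipy:.3f}, p = {p_scipy:.4f}")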

One-way ANOVA (F-test for equal means)

  • Use: Compare the means of n groups each with m data points
  • Data:
    • x_{ 1,1 },...,x_{ 1,m }
    • x_{ 2,1 },...,x_{ 2,m }
    • ...
    • x_{ n,1 },...,x_{ n,m }
  • Assumptions: the data in group i follows a normal distribution N(\mu_{ i },\sigma^{ 2 }) , where the means may differ between groups but the variance is the same for all groups.
  • H_{ 0 } : \mu_{ 1 }=\mu_{ 2 }=...=\mu_{ n } all means are the same.
  • H_{ A } : Not all means are the same.
  • test statistic: w=\frac{ MS_{ B } }{ MS_{ w } } where:
    • \overline{ x }_{ i } is the mean of group i; \overline{ x }_{ i }=\frac{ \sum_{ j=1 }^{ m }{ x_{ i,j } } }{ m }
    • \overline{ x } is the grand mean, i.e. the mean of the group means; \overline{ x }=\frac{ \sum_{ i=1 }^{ n }{ \overline{ x }_{ i } } }{ n }
    • s_{ i }^{ 2 } is the sample variance of group i
    • MS_{ B } is the variance between the group means; MS_{ B } = \frac{ m }{ n-1 }\sum_{ i=1 }^{ n }{ (\overline{ x }_{ i }-\overline{ x })^{ 2 } }
    • MS_{ w } is the average of the group sample variances (the within-group variance); MS_{ w }=\frac{ \sum_{ i=1 }^{ n }{ s_{ i }^{ 2 } } }{ n }
  • null distribution: The null distribution f(w|H_{ 0 }) is the pdf of W~F(n-1, n(m-1)) with n-1 and n(m-1) degrees of freedom.
  • p-value: the p-value is computed as usual from the null distribution (see the sketch below)
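
A minimal sketch with three made-up groups of m = 5 points each, following the definitions above and cross-checking against scipy.stats.f_oneway.

  import numpy as np
  from scipy import stats

  # Three made-up groups (rows), each with m = 5 observations.
  groups = np.array([[6.1, 5.8, 6.4, 6.0, 5.9],
                     [5.2, 5.6, 5.1, 5.4, 5.5],
                     [6.0, 6.3, 5.9, 6.2, 6.1]])
  n, m = groups.shape

  # Manual computation following the definitions above.
  group_means = groups.mean(axis=1)
  grand_mean = group_means.mean()
  ms_b = m / (n - 1) * np.sum((group_means - grand_mean) ** 2)
  ms_w = groups.var(axis=1, ddof=1).mean()
  w = ms_b / ms_w
  p = stats.f.sf(w, dfn=n - 1, dfd=n * (m - 1))

  # SciPy's built-in one-way ANOVA.
  w_scipy, p_scipy = stats.f_oneway(*groups)
  print(f"manual: F = {w:.3f}, p = {p:.4f}")
  print(f"scipy : F = {w_scipy:.3f}, p = {p_scipy:.4f}")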

Chi-Square test for goodness of fit

  • Use: Test if discrete data fits a finite probability mass function.
  • Data: for each possible outcome \omega_{ i } an observed count O_{ i } ; the corresponding expected count under H_{ 0 } is E_{ i }
  • Assumptions: none
  • H_{ 0 } : The data was drawn from a specific discrete distribution
  • H_{ A } : The data was drawn from a different distribution
  • test statistic:
    • Likelihood ratio statistic: G=2\sum{ O_{ i }ln(\frac{ O_{ i } }{ E_{ i } }) }
    • Pearson’s Chi-Square statistic: X^{ 2 }=\sum{ \frac{ (O_{ i }-E_{ i })^{ 2 } }{ E_{ i } } }
  • null distribution: f(G|H_{ 0 }) and f(X^{ 2 }|H_{ 0 }) are approximately the pdf of Y\sim\chi^{ 2 }(df) , where the degrees of freedom are the number of possible outcomes minus one, minus the number of parameters estimated from the data (e.g. an estimated mean)
  • p-value: the p-value is computed as usual from the null distribution (see the sketch below)
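
A minimal sketch with a made-up example (120 rolls of a die, H_{ 0 } : the die is fair), using scipy.stats.chisquare for Pearson's statistic and scipy.stats.power_divergence with lambda_="log-likelihood" for the likelihood ratio statistic G; no parameters are estimated from the data, so df = 6 - 1 = 5.

  import numpy as np
  from scipy import stats

  # Made-up counts for 120 rolls of a die; H0: all six faces are equally likely.
  observed = np.array([25, 18, 20, 22, 16, 19])
  expected = np.full(6, observed.sum() / 6)   # expected counts E_i under H0

  # Pearson's chi-square statistic; no parameters are estimated from the data,
  # so the default df = (number of outcomes) - 1 = 5 is used.
  x2, p_x2 = stats.chisquare(observed, f_exp=expected)

  # Likelihood ratio statistic G (the log-likelihood variant of power_divergence).
  g, p_g = stats.power_divergence(observed, f_exp=expected, lambda_="log-likelihood")

  print(f"X^2 = {x2:.3f}, p = {p_x2:.4f}")
  print(f"G   = {g:.3f}, p = {p_g:.4f}")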

 
