Shared Flashcard Set

Details

Title

HG_II_Lecture_11

Description

Statistical tests

Total Cards

Subject

Biology

Level

Graduate

Created

07/15/2014

Term

Which tests assume a distribution?

Definition

Parametric tests (t-test, ANOVA) assume a distribution (usually that errors are normally distributed)
Non-parametric tests (Mann-Whitney U, Wilcoxon rank test) analyse ranks, i.e. assume no particular distribution
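
To illustrate what "analyse ranks" means, here is a minimal sketch of the Mann-Whitney U statistic computed from ranks alone. The helper names and the sample values are made up for illustration; only the U statistic is computed, not a p-value.

```python
def rank_values(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) computed from the pooled ranks of both samples."""
    ranks = rank_values(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Hypothetical data: group_b contains an extreme outlier.
group_a = [1.1, 2.3, 2.9]
group_b = [3.5, 4.0, 100.0]
u_with_outlier = mann_whitney_u(group_a, group_b)
# Shrinking the outlier leaves the ranks, and therefore U, unchanged.
u_without_outlier = mann_whitney_u(group_a, [3.5, 4.0, 5.0])
```

Because only the ordering enters the calculation, the outlier has no more influence than any other largest value, which is why no distributional shape has to be assumed.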

Term

What does the t-test compare? What is the null hypothesis?

Definition

• Compares means of samples
• Null: means are the same

Term

What is the difference between two-sided and one-sided T-tests?

Definition

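Two-sided: tests whether the means differ in either direction. One-sided: tests whether one mean is specifically higher (or specifically lower) than the other.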

Term

What does the one-sample t-test test?

Definition

The one-sample t-test compares the mean of a single sample with a fixed reference value.
Uncertainty is only in one sample.
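
A minimal sketch of the one-sample t statistic, assuming the sample mean is compared against a fixed reference value `mu0`. The function name and the data are hypothetical; turning t into a p-value would additionally need the t distribution, which the Python standard library does not provide.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)  # uncertainty of this one sample
    return (mean - mu0) / sem, n - 1


# Hypothetical measurements tested against a reference value of 5.0.
values = [5.1, 4.9, 5.6, 5.3, 5.2]
t_stat, df = one_sample_t(values, 5.0)
```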

Term

What does the two-sample t-test test?

Definition

Two-sample: compares the means of two independent samples; uncertainty is in both samples

Term

What are the two variants of the two-sample t-test?

Definition

– Variant 1: both samples have the same variance (Student's t-test)
– Variant 2: the two samples have different variances (Welch's t-test)
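
The two variants can be sketched side by side. The equal-variance version (commonly called Student's t-test) pools the variances; the unequal-variance version (commonly called Welch's t-test) keeps them separate and uses the Welch-Satterthwaite approximation for the degrees of freedom. Function names and data below are illustrative assumptions.

```python
import math
import statistics


def student_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def welch_t(a, b):
    """Unequal-variance two-sample t statistic with Welch-Satterthwaite df."""
    na, nb = len(a), len(b)
    se2_a = statistics.variance(a) / na  # squared standard error of mean a
    se2_b = statistics.variance(b) / nb  # squared standard error of mean b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical samples with visibly different spread.
sample_a = [8.1, 8.6, 9.0, 9.4, 8.9]
sample_b = [7.4, 7.9, 7.6, 8.0, 7.6]
t_student, df_student = student_t(sample_a, sample_b)
t_welch, df_welch = welch_t(sample_a, sample_b)
```

With equal sample sizes the two t statistics coincide, and the variants differ only in the degrees of freedom: Welch's df is never larger than the pooled df, which makes the test more conservative when the variances differ.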

Term

What does the paired-sample t-test test?

Definition

You use the paired t-test when there is one measurement variable and two nominal variables.
The most common design is that one nominal variable represents different individuals, while the other is "before" and "after" some treatment.
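
As a sketch of the before/after design: the paired t-test reduces to a one-sample t-test on the per-individual differences, tested against zero. The function name and measurements are hypothetical.

```python
import math
import statistics


def paired_t(before, after):
    """Return (t statistic, df) for H0: mean within-individual difference == 0."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    sem = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / sem, n - 1


# Hypothetical measurements of the same five individuals.
before = [5.0, 6.2, 4.8, 7.1, 5.5]
after = [5.6, 6.9, 5.1, 7.9, 6.0]
t_paired, df_paired = paired_t(before, after)
```

Pairing removes the variation between individuals, so a consistent shift can be detected even when the individuals differ a lot from one another.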

Term

What is SD?

Definition

Standard deviation. The SD quantifies scatter — how much the values vary from one another.

Term

What is the SEM?

Definition

Standard error of the mean. The SEM quantifies how precisely you know the true mean of the population. It takes into account both the value of the SD and the sample size: SEM = SD / √n.
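
A small sketch of how SD and SEM relate, using the standard formula SEM = SD / sqrt(n). The values and the helper name are made up for illustration.

```python
import math
import statistics


def sem_from_sd(sd, n):
    """Standard error of the mean for a sample of size n with standard deviation sd."""
    return sd / math.sqrt(n)


values = [4.0, 6.0, 8.0, 10.0, 12.0]
sd = statistics.stdev(values)          # scatter of the individual values
sem = sem_from_sd(sd, len(values))     # precision of the estimated mean

# With the same SD, four times as many values halves the SEM.
sem_with_20_values = sem_from_sd(sd, 20)
```

The SD does not shrink systematically as more data are collected, whereas the SEM does.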

Term

95% confidence interval (CI) of the mean = ?

Definition

mean ± 1.96 × SEM for a normal distribution
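
The factor 1.96 is the 97.5th percentile of the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the standard library; the mean and SEM values are illustrative assumptions.

```python
from statistics import NormalDist


def normal_ci(mean, sem, level=0.95):
    """Confidence interval of the mean under the normal approximation."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for level=0.95
    return mean - z * sem, mean + z * sem


z95 = NormalDist().inv_cdf(0.975)
low, high = normal_ci(8.8, 0.25)
```

For small samples the t distribution gives a somewhat wider interval than this normal approximation.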

Term

Which types of plots should we use for non-parametric tests and why?

Definition

Box plots, because they show the median and quartiles and therefore imply no distribution

Term

What is familywise error rate?

Definition

FWER is the probability of making one or more false discoveries, or type I errors, among all the hypotheses when performing multiple hypotheses tests.
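
A quick sketch of how fast the FWER grows without correction. It assumes the tests are independent and that all null hypotheses are true, in which case FWER = 1 - (1 - α)^n; the function name is made up.

```python
def fwer_independent(alpha, n_tests):
    """Probability of at least one type I error among n independent true-null tests."""
    return 1 - (1 - alpha) ** n_tests


fwer_1 = fwer_independent(0.05, 1)
fwer_20 = fwer_independent(0.05, 20)
fwer_100 = fwer_independent(0.05, 100)
```

Under these assumptions, 20 uncorrected tests at α = 0.05 already give a chance of roughly 64% of at least one false discovery.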

Term

Bonferroni correction

Definition

If the significance level for the whole family of tests should be (at most) α, the Bonferroni correction tests each individual hypothesis at a significance level of α/n, where n is the number of tests.
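
A minimal sketch of the correction described above, where n is the number of tests. The function name and p-values are hypothetical.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that passes the per-test threshold alpha / n."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Four hypothetical tests: the per-test threshold is 0.05 / 4 = 0.0125.
p_values = [0.001, 0.012, 0.03, 0.2]
flags = bonferroni_significant(p_values)
```

Note that 0.03 would be significant as a single test at α = 0.05 but no longer passes after correction.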

Term

Problem of the Bonferroni correction

Definition

• Problem: loss of power, especially if there are many true positives and many tests, but few replicates
• Solution: control the false discovery rate (FDR), i.e. the fraction of false positives among all discoveries, instead of the FWER

Term

What is false discovery rate?

Definition

Fraction of false positives among all discoveries
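
A rough back-of-the-envelope sketch of why the FDR can be much larger than the per-test α. All numbers (number of tests, number of true effects, power) are hypothetical, and the ratio of expected counts is used as an approximation of the FDR.

```python
def expected_fdr(n_tests, n_true_effects, alpha, power):
    """Approximate FDR as expected false positives / expected discoveries."""
    n_null = n_tests - n_true_effects
    expected_false_pos = n_null * alpha          # true nulls called significant
    expected_true_pos = n_true_effects * power   # real effects detected
    return expected_false_pos / (expected_false_pos + expected_true_pos)


# 1000 genes tested, 100 truly changed, alpha = 0.05, power = 0.8.
fdr = expected_fdr(1000, 100, 0.05, 0.8)
```

In this scenario about 45 of roughly 125 discoveries are expected to be false, so about a third of the "significant" genes would be false positives despite α = 0.05.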

Term

Storey q-value

Definition

The q-value of an individual hypothesis test is the minimum FDR at which the test may be called significant.

Term

What are the methods of gene clustering?

Definition

Starting from pairwise distances between genes:
• Hierarchical clustering
• K-means clustering
• Self-organizing maps
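
The pairwise-distance step can be sketched as follows. Euclidean distance is an assumption here (correlation-based distances are also common for expression data), and the gene names and profiles are made up. Merging the closest pair would be the first step of an agglomerative hierarchical clustering.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def pairwise_distances(profiles):
    """Map each unordered pair of gene names to the distance of their profiles."""
    return {
        (a, b): euclidean(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Hypothetical expression profiles across three conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [1.0, 2.0, 4.0],
    "geneC": [6.0, 5.0, 1.0],
}
distances = pairwise_distances(profiles)
closest_pair = min(distances, key=distances.get)
```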

Term

Benjamini-Hochberg procedure

Definition

Rank the individual P-values from smallest to largest. Then compare each P-value to (i/m)Q, where i is its rank, m is the total number of tests and Q is the chosen false discovery rate. The largest P-value that satisfies P < (i/m)Q is significant, and all P-values smaller than it are also significant.
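
The procedure can be sketched in a few lines. The function name and p-values are hypothetical. The comparison is written with `<=`, as in the usual statement of the procedure; this differs from a strict `<` only when a p-value exactly equals its threshold.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a significance flag per p-value, in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            largest_rank = rank  # remember the largest rank that passes
    significant = [False] * m
    # Everything ranked at or below the largest passing rank is significant.
    for idx in order[:largest_rank]:
        significant[idx] = True
    return significant


# Thresholds for m = 4, Q = 0.05 are 0.0125, 0.025, 0.0375 and 0.05.
p_values = [0.2, 0.02, 0.03, 0.021]
flags = benjamini_hochberg(p_values)
```

Here 0.02 has rank 1 and fails its own threshold of 0.0125, but it is still called significant because the larger p-value 0.03 passes its threshold at rank 3.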

Term

PINGO: You have measured gene expression levels of a gene in blood cells of 10 humans, each before and after eating. Which type of test would be most appropriate?

Definition

A paired sample t-test, comparing the difference before and after eating for all 10 humans

Term

PINGO: In a paper it is written that Gene X is 2.3-fold differentially expressed between patients (mean=8.8, sd=0.56, N=5) and controls (mean=7.7, sd=0.37, N=5), which is significant (t-test, p=0.005). Which statements are correct?

Definition

Mean expression levels are given in log2 space
The standard error of the mean for patients is 0.25 (0.56 / √5 ≈ 0.25)
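
The arithmetic behind both statements can be checked directly from the numbers in the question. This is only a sketch: the variable names are made up, and the fold change is computed from the difference of the reported means, which need not reproduce the paper's 2.3 exactly.

```python
import math

mean_patients, sd_patients, n_patients = 8.8, 0.56, 5
mean_controls, sd_controls, n_controls = 7.7, 0.37, 5

# SEM = SD / sqrt(N)
sem_patients = sd_patients / math.sqrt(n_patients)

# If the means are log2 values, their difference is a log2 fold change.
log2_difference = mean_patients - mean_controls
fold_if_log2 = 2 ** log2_difference

# If the means were on the linear scale, the fold change would be their ratio.
fold_if_linear = mean_patients / mean_controls
```

Reading the means as log2 values gives a fold change of about 2.1, in the neighbourhood of the reported 2.3, whereas reading them as linear values gives only about 1.14, which is why the log2 interpretation is the plausible one.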

Term

PINGO: Which statements regarding the False Discovery Rate (FDR) are true?

Definition

+The FDR is used to correct for multiple testing
+The Benjamini-Hochberg procedure is a common way to calculate the FDR
+Storey q-value is a common way to calculate the FDR
