Shared Flashcard Set

Details

Title

Statistics I: Term Test 1

Description

University of Guelph STAT*2040

Total Cards

118

Subject

Mathematics

Level

Undergraduate 2

Created

02/05/2015

Term

Absolute value of deviation

Definition

A measure of variability. The magnitude of deviation. The distance from the mean.

Term

Addition rule

Definition

P(A∪B) = P(A) + P(B) - P(A∩B)
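
As a quick check of the addition rule, the sketch below counts outcomes for one roll of a fair die. The die and the events A and B are illustrative assumptions, not part of the original card. The overlap P(A∩B) is subtracted so that outcomes in both events are not counted twice.

```python
from fractions import Fraction

# Sample space: one roll of a fair six-sided die (illustrative assumption).
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# Direct count of the union versus the addition rule.
p_union_direct = prob(A | B)
p_union_rule = prob(A) + prob(B) - prob(A & B)
print(p_union_direct, p_union_rule)
```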

Term

Bar graph

Definition

aka Bar chart

Illustrates the distribution of categorical variables, which are usually placed on the x-axis, and percent relative frequency on the y-axis. Includes Pareto diagrams.

Term

Bayes' theorem

Definition

P(A|B) = [P(B|A)*P(A)] / [P(B)] = [P(A∩B)] / [P(B)]
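
A minimal sketch of Bayes' theorem in the form P(A|B) = P(B|A)*P(A) / P(B). All numbers are hypothetical, chosen only for illustration; P(B) is obtained from the law of total probability.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical values: A is "has a condition", B is "tests positive".
p_a = 0.01               # P(A)
p_b_given_a = 0.95       # P(B|A)
p_b_given_not_a = 0.05   # P(B|A^C)

# Law of total probability: P(B) = P(B|A)P(A) + P(B|A^C)P(A^C).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

p_a_given_b = bayes(p_b_given_a, p_a, p_b)
print(round(p_a_given_b, 4))
```

Even with a fairly accurate test, P(A|B) stays small here because A itself is rare.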

Term

Bimodal distribution

Definition

A distribution with two peaks.

Term

Binomial distribution

Definition

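The Bernoulli distribution is the special case with a single trial (n = 1).

The number of successes in n independent trials of a probability experiment with two possible outcomes, success and failure, and a constant probability of success p.

You have to use the nCr function on your calculator, symbolized by C.

P(X = x) = (n C x)p^x(1 - p)^(n - x)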
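
The binomial formula can be sketched in Python, with `math.comb` standing in for the calculator's nCr function. The coin-toss numbers are an illustrative assumption.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Probability of exactly 2 heads in 4 tosses of a fair coin.
p_two_heads = binomial_pmf(2, 4, 0.5)
print(p_two_heads)

# The probabilities over all possible x add up to 1.
total = sum(binomial_pmf(x, 4, 0.5) for x in range(5))
```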

Term

Boxplot

Definition

An illustration of the five-number summary. The top of the box is the third quartile and the bottom the first quartile. A line through it is the median. Whiskers extend to the maximum and minimum values that are not outliers. Outliers are plotted separately. Useful for comparing groups.

Term

Categorical variable

Definition

aka Qualitative variable

A variable that falls into one of two or more distinct categories. May be displayed using a bar graph, Pareto diagram, or pie chart.

Term

Chebyshev's Inequality

Definition

aka Chebyshev's theorem

The proportion of observations that lie within k standard deviations of the mean must be at least 1 - (1/k²), for any k greater than 1.
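
A short sketch of the bound, compared against a small made-up data set. The data values are hypothetical, and the population standard deviation is used for the comparison.

```python
from statistics import mean, pstdev

def chebyshev_bound(k):
    """Minimum proportion of observations within k standard deviations of the mean."""
    return 1 - 1 / k**2

data = [2, 4, 4, 6, 9, 5, 5, 3, 7, 25]   # hypothetical data with one extreme value
m, sd = mean(data), pstdev(data)

k = 2
within = sum(1 for x in data if abs(x - m) <= k * sd) / len(data)
print(chebyshev_bound(k), within)
```

The observed proportion is allowed to be larger than the bound, never smaller.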

Term

Chi-square test

Definition

A hypothesis test, commonly used with categorical data, that quantifies how strong the evidence is against a null hypothesis.

Term

Classes

Definition

aka Bins

Ranges of a quantitative variable that data is sorted into when making a frequency table. An appropriate number and width of bins must be selected, with the same width for each class.

Term

Cluster sampling

Definition

Random selection of groups within a population, such as towns or households. Within every selected cluster, every individual is surveyed. Cuts down costs of sampling.

Term

Combination

Definition

A selection of items where the order of selection doesn't matter.

x is the number of items in the combination

n is the number of items x is selected from

Cⁿ_x = [n!] / [x!(n - x)!]
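
The factorial formula can be checked against Python's built-in `math.comb`. The values n = 5 and x = 2 are an illustrative assumption.

```python
from math import comb, factorial

def combinations_count(n, x):
    """Number of ways to choose x items from n when order does not matter."""
    return factorial(n) // (factorial(x) * factorial(n - x))

# Choosing 2 items from 5.
n_ways = combinations_count(5, 2)
print(n_ways, comb(5, 2))
```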

Term

Complement

Definition

^C

An event has not occurred.

P(A^C) = 1 - P(A)

Term

Conceptually infinite population

Definition

A population that is so large or so nebulous that it is practically impossible to list every member.
Example: the mosquitoes of Southern Ontario.

Term

Conditional probability

Definition

|

The probability that an event occurs given that another event has already occurred.

P(A|B) is the probability of A, given that B has already occurred.

Term

Confounded variables

Definition

Variables whose effects are impossible to separate. Cannot study either one without the other being a lurking variable.

Term

Continuous

Definition

A sliding continuum of values. It can be divided into an infinite number of fractions. May be bound to a certain range.

Example: time, weight, distance.

Term

Control group

Definition

A group in an experiment that is exposed to all the same environmental factors except one: the variable being studied.

Term

Covariance

Definition

A measure of the linear relationship between x and y.

Term

Cumulative frequency

Definition

The number of data points in a class, plus all the data points in lower classes.

Term

Degrees of freedom

Definition

The number of independent pieces of information used to estimate a quantity.

Term

Descriptive statistics

Definition

Plots and numerical summaries used to describe a data set.

Term

Deviation

Definition

A measure of variability. The value minus the mean. The sum of all deviations will always equal zero.

Term

Discrete

Definition

Having a countable number of possible values. May be infinite or bound to a certain range. Example: money. Can go up to infinity, but the smallest fraction it can be divided into is cents.

Term

Distribution

Definition

How often variables take on certain values. Includes symmetric, skewed, unimodal, bimodal, and multimodal.

Term

Dot plots

Definition

A method of illustrating data points. Every data point is individually plotted.

Term

Empirical rule

Definition

About 68% of observations lie within 1 standard deviation of the mean, about 95% within 2 standard deviations, and almost all (about 99.7%) within 3 standard deviations. Applies to roughly bell-shaped data; does not apply to extremely skewed data.

Term

Event

Definition

Represented by a capital letter. A group of outcomes in the sample space.

Term

Expected value (μ)

Definition

The theoretical mean of a random variable. Not to be confused with the most likely value. The average if an experiment was done an infinite number of times.

μ = E(x) = Σ x p(x)
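
As a sketch of the formula, the expected value of one roll of a fair die can be computed by weighting each value by its probability. The die is an illustrative assumption.

```python
from fractions import Fraction

# p(x) for a fair six-sided die: every face has probability 1/6.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expected_value(pmf):
    """Sum of x * p(x) over all possible values x."""
    return sum(x * p for x, p in pmf.items())

mu = expected_value(die_pmf)
print(mu)
```

The result, 7/2, is not a value the die can actually show, which is why the expected value should not be confused with the most likely value.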

Term

Experiment

Definition

Researchers impose conditions for the explanatory variable on the subjects, rather than using pre-existing groups. Well-designed, randomized experiments with a control group can show causal relationships if differences are significant.

Term

Explanatory variable

Definition

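The variable that is thought to explain changes in the response variable. In an experiment it is controlled by the researchers. In an experiment or observational study, individuals are categorized into groups according to its value.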

Term

Exponential distribution

Definition

Distribution skewed strongly to the right.

Term

Finite population

Definition

A population which is small enough for every member to be listed.

Example: U of G students.

Term

First quartile

Definition

aka 25th percentile

The bottom line of the box in a boxplot. Included in the five-number summary.

Term

Five-number summary

Definition

The minimum, the first quartile, the median, the third quartile, and the maximum. Illustrated with a boxplot.

Term

Frequency

Definition

The number of observations occurring in a category.

Term

Frequency table

Definition

A table showing the frequency of categories in data. Used for making bar graphs and histograms. With histograms, data is sorted into classes.

Term

Geometric distribution

Definition

The number of trials needed to get the first success in a series of trials. The trials must be independent, each with two possible outcomes (success and failure) and a constant probability of success. Modelled by the probability mass function.

Term

Geometric mean

Definition

A measure of central tendency. The nth root of the product of observations.

(Πx_i)^(1/n)
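
A minimal sketch of the formula, compared with the standard library's `statistics.geometric_mean`. The two observations are hypothetical.

```python
from math import prod
from statistics import geometric_mean

def geo_mean(values):
    """The nth root of the product of n observations."""
    return prod(values) ** (1 / len(values))

observations = [2, 8]            # hypothetical data
gm = geo_mean(observations)      # square root of 16
gm_builtin = geometric_mean(observations)
print(gm, gm_builtin)
```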

Term

Harmonic mean

Definition

A measure of central tendency. The reciprocal of the mean of the reciprocals of all observations.

n / [∑(1 / x_i)]
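
The harmonic mean is often used for averaging rates. The sketch below applies the formula to two hypothetical speeds driven over the same distance and compares the result with `statistics.harmonic_mean`.

```python
from statistics import harmonic_mean

def harm_mean(values):
    """n divided by the sum of the reciprocals of the observations."""
    return len(values) / sum(1 / x for x in values)

speeds = [40, 60]                # hypothetical speeds in km/h
hm = harm_mean(speeds)
hm_builtin = harmonic_mean(speeds)
print(hm, hm_builtin)
```

The harmonic mean here is 48, lower than the ordinary mean of 50.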

Term

Histogram

Definition

An illustration of the distribution of a quantitative variable. Made using a frequency table.

Term

Hypergeometric distribution

Definition

Like the binomial distribution, but the trials are not independent; the probability of outcomes is dependent on the results of previous trials, as when sampling without replacement.

You need to use the nCr function on a calculator, symbolized by C.

X is the number of successes in the sample

a is the number of successes in the population

n is the sample size

N is the population size

P(X = x) = [(a C x)*((N - a) C (n - x))] / [N C n]
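
A sketch of the hypergeometric formula in its standard form, where a is taken to be the number of successes in the population and the first term is (a C x). The card-drawing example is an illustrative assumption.

```python
from math import comb

def hypergeom_pmf(x, a, n, N):
    """P(X = x): x successes in a sample of n, drawn without replacement
    from a population of N that contains a successes."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)

# Probability of exactly 1 ace in a 5-card hand from a 52-card deck (4 aces).
p_one_ace = hypergeom_pmf(1, a=4, n=5, N=52)
print(round(p_one_ace, 4))

# Probabilities over every possible number of aces (0 to 4) add up to 1.
total = sum(hypergeom_pmf(x, a=4, n=5, N=52) for x in range(5))
```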

Term

Independent

Definition

The occurrence of one event has no effect on the probability of another event, and vice versa.

All three must be true or all three false:

1. P(A∩B) = P(A)*P(B)

2. P(A|B) = P(A)

3. P(B|A) = P(B)
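
The three conditions can be checked by counting outcomes. The sketch assumes two fair dice, with A as "the first die shows 6" and B as "the second die shows 6"; these events are illustrative, not from the original card.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
space = set(product(range(1, 7), repeat=2))
A = {o for o in space if o[0] == 6}   # first die shows 6
B = {o for o in space if o[1] == 6}   # second die shows 6

def prob(event):
    return Fraction(len(event), len(space))

def cond_prob(event, given):
    """P(event | given), found by counting within the given event."""
    return Fraction(len(event & given), len(given))

check_1 = prob(A & B) == prob(A) * prob(B)
check_2 = cond_prob(A, B) == prob(A)
check_3 = cond_prob(B, A) == prob(B)
print(check_1, check_2, check_3)
```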

Term

Individual

Definition

aka Unit

aka Case

Objects on which measurements are taken.

Term

Inferential statistics

Definition

Using sample data to draw conclusions about a population, such as investigating the relationship between variables.

Term

Interquartile range (IQR)

Definition

A descriptive measure of variability. The difference between the third and first quartile. Not sensitive to extreme values.

IQR = Q₃ - Q₁

Term

Intersection

Definition

∩

One event and another event have occurred together in the same sample point.

Term

Law of Large Numbers

Definition

As the number of observations sampled grows infinitely large, the sample mean approaches the expected value and the sample variance approaches the theoretical variance.

Term

Linear transformations

Definition

Conversions that are linear, such as the conversion between Celsius and Fahrenheit.

Term

Lurking variables

Definition

Variables that contribute to correlations, but are not included in the study. Researchers may be completely unaware of them. More likely in observational studies than in experiments.

Term

Maximum value

Definition

The largest value in a dataset. The end of the top whisker of a boxplot, unless it is an outlier.

Term

Mean (x bar)

Definition

aka Average

The most popular measure of central tendency. Uses more information, but is more sensitive to extreme values in the data. This sensitivity can make the mean misleading.

x bar = [Σx_i] / n

Term

Mean absolute deviation (MAD)

Definition

The average absolute value of deviation. A reasonable measure of variability, but hard to work with.

MAD = [Σ|x_i - x bar|] / n
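
A small worked sketch of the MAD formula on made-up data. It also shows why absolute values are needed: the plain deviations always sum to zero.

```python
data = [2, 4, 4, 6, 9]                 # hypothetical observations
n = len(data)
x_bar = sum(data) / n                  # sample mean

deviations = [x - x_bar for x in data]
sum_of_deviations = sum(deviations)    # always zero, so it cannot measure spread

mad = sum(abs(d) for d in deviations) / n
print(x_bar, sum_of_deviations, mad)
```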

Term

Median

Definition

aka Second quartile

aka 50th percentile

A measure of central tendency. The line in a boxplot separating the box. The middle point, if all data points were ordered in ascending order. If n is even, the median is the average of the two middle values. Not as sensitive to extreme values as the mean. Good for data that is right-skewed, such as property value or salary.

Term

Midrange

Definition

A measure of central tendency. The midpoint between the minimum and maximum values.

Term

Minimum value

Definition

The smallest value in a dataset. The end of the bottom whisker of a boxplot, unless it is an outlier.

Term

Mode

Definition

A measure of central tendency. The most frequently occurring observation.

Term

Multimodal distribution

Definition

Distribution with multiple peaks.

Term

Multiplication rule

Definition

P(A∩B) = P(A)*P(B|A) = P(B)*P(A|B)
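
As an illustration of the multiplication rule, the sketch below finds the probability of drawing two aces in a row from a standard deck without replacement. The card example is an assumption made for illustration.

```python
from fractions import Fraction

# A: the first card is an ace. B: the second card is an ace.
p_a = Fraction(4, 52)            # 4 aces among 52 cards
p_b_given_a = Fraction(3, 51)    # 3 aces left among 51 cards once A has occurred

p_a_and_b = p_a * p_b_given_a    # P(A∩B) = P(A) * P(B|A)
print(p_a_and_b)
```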

Term

Multivariate hypergeometric distribution

Definition

Hypergeometric distribution where there are more than two classifications of outcomes.

Term

Mutually exclusive

Definition

Events where there is no outcome in the sample space that satisfies both.

P(A∩B) = 0

Term

Negatively skewed

Definition

Skewed to the left. Higher on the right.

Term

Normal distribution

Definition

A perfectly symmetrical, bell-shaped distribution. Rarely matched exactly by real data.

Term

Observational study

Definition

Researchers observe and measure variables, but do not impose any conditions on the subjects. The groups of explanatory variables are pre-existing.

Done if the experiments are impossible (time, money, ethical reasons). Doesn't provide strong evidence for causal relationships; there may be lurking variables.

Term

Outliers

Definition

Extreme values that fall outside the overall pattern of distribution. Fall outside the range of boxplot whiskers. Plotted individually in a boxplot.

Term

P value

Definition

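A measure of the strength of evidence. The probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true.
If the p-value is less than 0.05 then the result is commonly considered significant.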

Term

Parameter

Definition

A numerical characteristic of a population.

Term

Pareto diagram

Definition

A bar graph where the categories are sorted by percent frequency from largest to smallest.

Term

Percent relative cumulative frequency

Definition

The cumulative frequency expressed as a percent of all data points. The last class should have a percent relative cumulative frequency of 100%.

Term

Percent relative frequency

Definition

The relative frequency expressed as a percent.

Term

Percentile

Definition

The value of the variable that has p% of the ordered data values at or below this value.

Term

Permutation

Definition

An ordering of a set of items.

x is the number of things being ordered

n is the number of things x is selected from

Pⁿ_x = [n!] / [(n - x)!]
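
The factorial formula can be checked against Python's built-in `math.perm`. The values n = 5 and x = 2 are an illustrative assumption.

```python
from math import factorial, perm

def permutations_count(n, x):
    """Number of ordered arrangements of x items selected from n."""
    return factorial(n) // factorial(n - x)

# Ordered selections of 2 items from 5.
n_orderings = permutations_count(5, 2)
print(n_orderings, perm(5, 2))
```

Each unordered pair can be arranged in 2! = 2 ways, so this is twice the number of combinations of 2 items from 5.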

Term

Pie chart

Definition

Illustrates the percent relative frequencies of categorical variables as slice-shaped areas on a circle.

Term

Poisson distribution

Definition

Models the number of events that occur independently over a fixed range, such as an interval of time or space. The probability of an event within any given range of a certain size does not change.

X is the number of events in a fixed range

x is a non-negative integer (0, 1, 2, ...)

λ is the theoretical mean of events in a fixed range

P(X = x) = [λ^x * e^(-λ)] / [x!]
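
A minimal sketch of the Poisson formula. The mean of 2 events per range is a hypothetical value.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) when events occur at a theoretical mean of lam per fixed range."""
    return lam**x * exp(-lam) / factorial(x)

lam = 2.0                          # hypothetical mean number of events
p_none = poisson_pmf(0, lam)       # probability of no events at all
p_three = poisson_pmf(3, lam)      # probability of exactly three events
print(round(p_none, 4), round(p_three, 4))

# Summing over enough values of x gets very close to 1.
total = sum(poisson_pmf(x, lam) for x in range(50))
```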

Term

Population

Definition

The set of individuals or objects of interest to an investigator.

Term

Population mean

Definition

A parameter. The average of all individuals in a population.

Term

Positively skewed

Definition

Skewed to the right. Skewed distribution that is higher on the left.

Term

Probability

Definition

The proportion of times that the outcome would occur in an infinite number of trials.

Term

Probability experiment

Definition

We don't know what is going to happen in any one individual trial, but we can keep track of the long-run distribution of outcomes.

Term

Probability Mass Function (PMF)

Definition

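Gives the probability that a discrete random variable takes each possible value. For the geometric distribution, it is used to calculate the probability that the first success will occur on trial x.

P(X = x) = p*(1 - p)^(x - 1)

The cumulative probability that the first success occurs on or before trial x:

P(X ≤ x) = 1 - (1 - p)^x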
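
The two formulas can be sketched for the geometric distribution and checked against each other: adding up P(X = k) for k from 1 to x should give P(X ≤ x). The example of waiting for a six on a fair die is an illustrative assumption.

```python
def geometric_pmf(x, p):
    """P(X = x): the first success occurs on trial x."""
    return p * (1 - p) ** (x - 1)

def geometric_cdf(x, p):
    """P(X <= x): the first success occurs on or before trial x."""
    return 1 - (1 - p) ** x

p = 1 / 6                                   # chance of rolling a six
p_first_six_on_third = geometric_pmf(3, p)
p_six_within_three = geometric_cdf(3, p)
summed = sum(geometric_pmf(k, p) for k in range(1, 4))
print(round(p_first_six_on_third, 4), round(p_six_within_three, 4))
```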

Term

Quantitative variable

Definition

A variable that takes numerical values on a scale. May be discrete or continuous.

Term

Quartile

Definition

Specific percentiles. Useful descriptive measures of the distribution of data. Used in the construction of boxplots. Includes the first, second, and third quartiles.

Term

R

Definition

A software program that is used for statistics.

Term

Random sampling

Definition

Ensures that we avoid systematic bias in the samples.

Term

Range

Definition

A measure of variability. The maximum value minus the minimum value. Does not provide much information.

Term

Relative frequency

Definition

Frequency divided by n. The proportion of observations in a category.

Term

Response variable

Definition

The variable of interest in an experiment; what we look for changes in.

Term

Sample

Definition

A subset of individuals selected from a population.

Term

Sample mean

Definition

A statistic. The average of all observations in a sample.

Term

Sample points

Definition

Individual outcomes of probability experiments. Exclusive; no two points can occur on the same trial.

Term

Sample space (S)

Definition

A list of all possible outcomes of a probability experiment. Exhaustive; there are no possible outcomes not included in the sample space.

Term

Sample variance (s²)

Definition

A measure of variability. The average squared deviation, dividing by n - 1 rather than n. Will give an answer in units squared.

s² = [Σ(x_i - x bar)²] / (n - 1)
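
A worked sketch of the sample variance and the standard deviation on made-up data, compared with the standard library's `statistics.variance` and `statistics.stdev`, which also divide by n - 1.

```python
from math import sqrt
from statistics import stdev, variance

data = [2, 4, 4, 6, 9]                     # hypothetical observations
n = len(data)
x_bar = sum(data) / n

# Sum of squared deviations divided by n - 1.
s_squared = sum((x - x_bar) ** 2 for x in data) / (n - 1)
s = sqrt(s_squared)                        # standard deviation, in the original units

print(s_squared, variance(data))
print(round(s, 4), round(stdev(data), 4))
```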

Term

Side-by-side bar chart

Definition

A bar chart where data for categories are represented by bars placed side by side.

Term

Simple Random Sampling (SRS)

Definition

One of the simplest and most important types of random sampling. Each individual in the population has the same likelihood of being selected for the sample.

Term

Skewed distribution

Definition

When the distribution is stretched off to one side. Includes positive and negative skewness.

Term

Squared deviation

Definition

A measure of variability. The square of deviation.

Term

Stacked bar chart

Definition

A bar chart where categories are represented by stacking bars on top of each other.

Term

Standard deviation (s)

Definition

The square root of the variance. Cannot be negative.

s = √s²

Term

Statistic

Definition

A numerical characteristic of a sample.

Term

Statistical inference

Definition

Making statements about population parameters based on sample statistics.

Term

Stem

Definition

In a stemplot, groups of data based on the second to last digit in the data points, that is, the digits before the final one (each data point is written to the same number of decimal points). The stems are listed in ascending order in a column, with the leaves going off to the right.

Term

Stemplot

Definition

aka Stem-and-leaf display

A way of illustrating quantitative variable data. The data is sorted into stems and leaves based on the last two digits. The leaves are listed as single digits (the last digit in the data point) to the right of their stem. Must include a legend for the stems. Includes split-stem and back-to-back stemplots.

Term

Strata

Definition

Groups from which samples are taken in stratified random sampling.

Term

Stratified random sampling

Definition

The population is divided into strata and random samples are taken from each stratum.

Term

Symmetric distribution

Definition

aka Bell-shaped distribution

Distribution that is roughly the same on either side of the median. Includes normal distribution.

Term

T-test

Definition

A hypothesis test that determines if there is a significant difference between means, such as the means of two groups. If there is, it suggests a relationship between the explanatory and response variables.

Term

Third quartile

Definition

aka 75th percentile

The top line of the box in a boxplot. Included in the five-number summary.

Term

Trimmed mean

Definition

A measure of central tendency. A certain percentage of the largest and smallest observations are omitted from calculations, resulting in a mean less sensitive to extreme values.

Term

Uniform distribution

Definition

Distribution that is constant over the entire range.

Term

Unimodal distribution

Definition

Distribution with one peak.

Term

Union

Definition

∪

One event or another event (or both) has occurred in one sample point.

Term

Variability

Definition

The dispersion of a variable.

Var(X) = E[(X - μ)²] = Σ (x - μ)² p(x)
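
Read as Var(X) = E[(X - μ)²] = Σ (x - μ)² p(x), the variance of a discrete random variable can be sketched for one roll of a fair die. The die is an illustrative assumption.

```python
from fractions import Fraction

# p(x) for a fair six-sided die.
die_pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Expected value: sum of x * p(x).
mu = sum(x * p for x, p in die_pmf.items())

# Variance: sum of squared distances from mu, weighted by p(x).
var = sum((x - mu) ** 2 * p for x, p in die_pmf.items())
print(mu, var)
```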

Term

Voluntary response

Definition

When individuals volunteer themselves to be included in a sample. Results tend to be biased, because they measure only the people who chose to volunteer.

Term

Whisker

Definition

An extension up and down from a boxplot indicating the minimum and maximum values if they lie within 1.5 times the length of the box (the IQR) from the edges of the box; values outside this range are outliers.

Term

Weibull distribution

Definition

Distribution with a peak near the left and skewed towards the right.

Term

Weighted mean

Definition

A measure of central tendency. A mean where some observations are given more weight in calculations.

Term

Z-score

Definition

A unitless measure of how many standard deviations a point is away from the mean. Positive means above the mean, negative means below.

z_i = [x_i - x bar] / s
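
A short sketch of the z-score formula on made-up data, using the sample standard deviation from `statistics.stdev`.

```python
from statistics import mean, stdev

data = [2, 4, 4, 6, 9]          # hypothetical observations
x_bar = mean(data)              # sample mean
s = stdev(data)                 # sample standard deviation (divides by n - 1)

def z_score(x):
    """How many standard deviations x lies above (+) or below (-) the mean."""
    return (x - x_bar) / s

z_values = [z_score(x) for x in data]
print([round(z, 2) for z in z_values])
```

The largest observation gets a positive z-score and the smallest a negative one, and the z-scores of the whole data set sum to zero.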
