Term
descriptive measures |
|
Definition
numbers that are used to describe data sets |
|
|
Term
measures of central tendency (measures of center) |
|
Definition
descriptive measures that indicate where the center, or most typical value, of a data set lies
often called averages |
|
|
Term
mean |
|
Definition
sum of the observations divided by the number of observations |
|
|
Term
median |
|
Definition
arrange the data in increasing order.
If the number of observations is odd, then the median is the observation exactly in the middle of the ordered list.
If the number of observations is even, then the median is the mean of the two middle observations in the ordered list
In both cases, if we let n denote the number of observations, then the median is at position (n+1)/2 in the ordered list |
|
|
Term
mode |
|
Definition
find the frequency of each value in the data set
if no value occurs more than once, then the data set has no mode
otherwise, any value that occurs with the greatest frequency is a mode of the data set |
|
|
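The three measures of center defined above can be sketched in a few lines of Python. This is a minimal illustration that follows the card definitions; the function names and the sample data are made up for the example.

```python
from collections import Counter

def mean(data):
    # sum of the observations divided by the number of observations
    return sum(data) / len(data)

def median(data):
    ordered = sorted(data)              # arrange the data in increasing order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]             # odd n: the middle observation
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: mean of the two middle observations

def modes(data):
    counts = Counter(data)              # frequency of each value
    top = max(counts.values())
    if top == 1:
        return []                       # no value occurs more than once: no mode
    return sorted(v for v, c in counts.items() if c == top)

sample = [2, 3, 3, 5, 7, 10]            # hypothetical data
print(mean(sample))    # 5.0
print(median(sample))  # 4.0
print(modes(sample))   # [3]
```

`modes` returns a list because a data set can have more than one mode.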
Term
resistant measure |
|
Definition
not sensitive to the influence of a few extreme observations (e.g. median, but not mean) |
|
|
Term
trimmed mean |
|
Definition
more resistant mean
created by removing a percentage of the smallest and largest observations before computing the mean |
|
|
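A small sketch of a trimmed mean, to show why it resists extreme observations. It assumes the given percentage is removed from each end of the ordered data and that the count to remove is rounded down; textbooks differ on these details, and the data are hypothetical.

```python
def trimmed_mean(data, percent):
    # Assumption: `percent` of the observations is dropped from EACH end,
    # rounding the count down; percent must be below 50.
    ordered = sorted(data)
    k = int(len(ordered) * percent / 100)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

sample = [1, 4, 5, 5, 6, 6, 7, 8, 9, 99]   # hypothetical data with one extreme value
print(sum(sample) / len(sample))  # 15.0  (ordinary mean, pulled up by 99)
print(trimmed_mean(sample, 10))   # 6.25  (10% trimmed mean)
```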
Term
sample mean |
|
Definition
for a variable x, the mean of the observations for a sample is called a sample mean and is denoted x̄ (an x with a bar over it)
mean of the sample data |
|
|
Term
measures of variation (measures of spread) |
|
Definition
descriptive measures that indicate the amount of variation or spread in a data set |
|
|
Term
range |
|
Definition
difference between the maximum (largest) and minimum (smallest) observations |
|
|
Term
standard deviation |
|
Definition
measures variation by indicating how far, on average, the observations are from the mean |
|
|
Term
deviations from the mean |
|
Definition
how far each observation is from the mean |
|
|
Term
sum of squared deviations |
|
Definition
the sum of the squared deviations from the mean
gives a measure of the total deviation from the mean for all the observations |
|
|
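The measures of variation above build on one another, which a short computation makes visible. This sketch assumes the sample form of the standard deviation (dividing the sum of squared deviations by n - 1), which the cards do not spell out; the data are hypothetical.

```python
import math

sample = [2, 4, 4, 6, 9]                      # hypothetical data
xbar = sum(sample) / len(sample)              # sample mean

data_range = max(sample) - min(sample)        # maximum minus minimum
deviations = [x - xbar for x in sample]       # how far each observation is from the mean
ss = sum(d ** 2 for d in deviations)          # sum of squared deviations
s = math.sqrt(ss / (len(sample) - 1))         # assumption: sample standard deviation (n - 1)

print(data_range)    # 7
print(deviations)    # [-3.0, -1.0, -1.0, 1.0, 4.0]
print(ss)            # 28.0
print(round(s, 4))   # 2.6458
```

The deviations always sum to zero, which is why they are squared before being added up.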
Term
Chebyshev's rule |
|
Definition
valid for all data sets and implies, in particular, that at least 89% of the observations lie within three standard deviations to either side of the mean |
|
|
Term
empirical rule |
|
Definition
If the distribution of the data set is approximately bell shaped, then we can apply this rule, which implies, in particular, that roughly 99.7% of the observations lie within 3 standard deviations to either side of the mean |
|
|
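The 89% figure in Chebyshev's rule comes from the general bound 1 - 1/k² for k standard deviations, evaluated at k = 3. The sketch below computes that bound and checks it against a hypothetical data set; the helper names are illustrative, and the sample standard deviation is assumed.

```python
import statistics

def chebyshev_bound(k):
    # minimum fraction of observations within k standard deviations of the mean
    return 1 - 1 / k ** 2

def fraction_within(data, k):
    m = statistics.mean(data)
    s = statistics.stdev(data)        # assumption: sample standard deviation
    inside = [x for x in data if abs(x - m) <= k * s]
    return len(inside) / len(data)

# Empirical-rule percentages, which apply only to roughly bell-shaped data
EMPIRICAL = {1: 0.68, 2: 0.95, 3: 0.997}

print(chebyshev_bound(2))             # 0.75
print(round(chebyshev_bound(3), 3))   # 0.889

sample = [1, 2, 2, 3, 3, 3, 4, 4, 5, 30]   # hypothetical, far from bell shaped
for k in (2, 3):
    # Chebyshev's bound holds for any data set, whatever its shape
    assert fraction_within(sample, k) >= chebyshev_bound(k)
```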
Term
percentiles |
|
Definition
divide a data set into hundredths, or 100 equal parts |
|
|
Term
deciles |
|
Definition
divide a data set into tenths, or 10 equal parts |
|
|
Term
quintiles |
|
Definition
divide a data set into fifths, or 5 equal parts |
|
|
Term
quartiles |
|
Definition
divide a data set into quarters, or 4 equal parts |
|
|
Term
first quartile (Q1) |
|
Definition
the number that divides the bottom 25% from the top 75% |
|
|
Term
second quartile (Q2) |
|
Definition
the number that divides the bottom 50% from the top 50%
median |
|
|
Term
third quartile (Q3) |
|
Definition
number that divides the bottom 75% from the top 25% |
|
|
Term
interquartile range (IQR) |
|
Definition
the difference between the first and third quartiles (Q3 - Q1) |
|
|
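Quartiles and the IQR can be sketched as medians of the lower and upper halves of the ordered data. Conventions for quartiles vary between textbooks and software; this example assumes the middle observation is left out of both halves when the number of observations is odd. Function names and data are hypothetical.

```python
def median(ordered):
    # median of an already sorted list
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(data):
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + n % 2:]    # assumption: skip the middle value when n is odd
    return median(lower), median(ordered), median(upper)

sample = [1, 3, 4, 6, 7, 8, 10, 12]   # hypothetical data
q1, q2, q3 = quartiles(sample)
print(q1, q2, q3)   # 3.5 6.5 9.0
print(q3 - q1)      # 5.5  (interquartile range)
```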
Term
|
Definition
|
|
Term
outliers |
|
Definition
observations that fall well outside the overall pattern of data |
|
|
Term
|
Definition
|
|
Term
|
Definition
|
|
Term
potential outliers |
|
Definition
observations that fall below the lower limit or above the upper limit |
|
|
Term
boxplot (box-and-whisker diagram) |
|
Definition
based on the five-number summary and can be used to provide a graphical display of the center and variation of a data set |
|
|
Term
constructing a boxplot procedure |
|
Definition
1. determine the quartiles
2. determine potential outliers and the adjacent values
3. draw a horizontal axis on which the numbers obtained in steps 1 and 2 can be located. Above this axis, mark the quartiles and the adjacent values with vertical lines.
4. connect the quartiles to make a box, and then connect the box to the adjacent values with lines
5. plot each potential outlier with an asterisk |
|
|
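Steps 1 and 2 of the procedure above produce the numbers that the drawing steps need. The sketch below computes them under two assumptions that the cards do not state: the limits are placed 1.5 IQRs below Q1 and above Q3 (a common convention), and the adjacent values are the most extreme observations still inside those limits. Quartiles are taken as medians of the lower and upper halves; all names and data are hypothetical.

```python
def median(ordered):
    # median of an already sorted list
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def boxplot_numbers(data):
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    # Step 1: quartiles (assumption: middle value excluded from the halves when n is odd)
    q1 = median(ordered[:half])
    q2 = median(ordered)
    q3 = median(ordered[half + n % 2:])
    # Step 2: limits, potential outliers, adjacent values (assumption: 1.5 * IQR rule)
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    inside = [x for x in ordered if lower_limit <= x <= upper_limit]
    outliers = [x for x in ordered if x < lower_limit or x > upper_limit]
    return {
        "quartiles": (q1, q2, q3),
        "limits": (lower_limit, upper_limit),
        "adjacent_values": (inside[0], inside[-1]),
        "potential_outliers": outliers,
    }

sample = [1, 3, 4, 6, 7, 8, 10, 30]   # hypothetical data
for name, value in boxplot_numbers(sample).items():
    print(name, value)
# quartiles (3.5, 6.5, 9.0)
# limits (-4.75, 17.25)
# adjacent_values (1, 10)
# potential_outliers [30]
```

With these numbers, the box runs from 3.5 to 9.0, the whiskers reach 1 and 10, and 30 is plotted on its own.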
Term
whiskers |
|
Definition
two lines emanating from the box in a boxplot |
|
|
Term
population mean (mean of a variable) |
|
Definition
for a variable, x, the mean of all possible observations for the entire population |
|
|
Term
population standard deviation (standard deviation of a variable) |
|
Definition
for a variable, x, the standard deviation of all possible observations for the entire population |
|
|
Term
parameter |
|
Definition
a descriptive measure for a population |
|
|
Term
statistic |
|
Definition
a descriptive measure for a sample |
|
|
Term
standardized variable |
|
Definition
always has a mean of 0 and standard deviation of 1
the standardized version of a variable x is obtained by first subtracting from x its mean and then dividing by its standard deviation |
|
|
Term
z-score |
|
Definition
for an observed value of a variable, x, the corresponding value of the standardized variable z is called the z-score of the observation. The term standard score is often used instead of z-score.
A negative z-score indicates that the observation is below the mean, whereas a positive score indicates that the observation is above the mean |
|
|
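Standardizing can be checked numerically: subtract the mean, divide by the standard deviation, and the resulting z-scores have mean 0 and standard deviation 1. This sketch uses hypothetical sample data and assumes the sample standard deviation; the same property holds with population values as long as one convention is used throughout.

```python
import statistics

def z_scores(data):
    m = statistics.mean(data)
    s = statistics.stdev(data)        # assumption: sample standard deviation (n - 1)
    return [(x - m) / s for x in data]

sample = [2, 4, 4, 6, 9]              # hypothetical data, mean 5
z = z_scores(sample)

print([round(v, 3) for v in z])       # [-1.134, -0.378, -0.378, 0.378, 1.512]
print(abs(statistics.mean(z)) < 1e-9)        # True: mean of the z-scores is 0
print(abs(statistics.stdev(z) - 1) < 1e-9)   # True: their standard deviation is 1
```

Observations below the mean (2, 4, 4) get negative z-scores and those above it (6, 9) get positive ones.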