Term
Distribution |
Definition
the distribution of a quantitative variable slices up all the possible values of the variable into equal-width bins and gives the number of values (or counts) falling into each bin. |
|
|
Term
Histogram (relative frequency histogram) |
|
Definition
a histogram uses adjacent bars to show the distribution of a quantitative variable. Each bar represents the frequency (or relative frequency) of values falling in its bin. |
|
|
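The binning described in the two cards above can be sketched in a few lines. This is a minimal illustration, not a standard routine: the function names `bin_counts` and `relative_frequencies`, the half-open bin convention, and the sample data are all assumptions made for the example.

```python
def bin_counts(values, low, width, n_bins):
    """Count how many values fall into each equal-width bin.

    Bin i covers [low + i*width, low + (i+1)*width). Values outside
    the covered range are ignored in this sketch.
    """
    counts = [0] * n_bins
    for v in values:
        i = int((v - low) // width)
        if 0 <= i < n_bins:
            counts[i] += 1
    return counts


def relative_frequencies(counts):
    """Convert bin counts to proportions of the total count."""
    total = sum(counts)
    return [c / total for c in counts]


# Hypothetical data, binned into [0, 5), [5, 10), [10, 15).
data = [1, 2, 2, 3, 5, 7, 8, 8, 9, 12]
counts = bin_counts(data, low=0, width=5, n_bins=3)
print(counts)                        # [4, 5, 1]
print(relative_frequencies(counts))  # [0.4, 0.5, 0.1]
```

The counts are the bar heights of a histogram; the proportions are the bar heights of a relative frequency histogram.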
Term
Gap |
Definition
a region of the distribution where there are no values |
|
|
Term
Stem-and-leaf display |
Definition
shows quantitative data values in a way that sketches the distribution of the data |
|
|
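One way to build such a display for small non-negative integers is sketched below. It assumes the tens digit is the stem and the ones digit is the leaf; the function name `stem_and_leaf` and the data are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return display lines for non-negative integers.

    Assumes stem = tens part and leaf = ones digit.
    """
    leaves = defaultdict(list)
    for v in sorted(values):
        leaves[v // 10].append(v % 10)

    lines = []
    for stem in range(min(leaves), max(leaves) + 1):
        # Stems with no leaves are still listed so gaps stay visible.
        digits = "".join(str(d) for d in leaves.get(stem, []))
        lines.append(f"{stem} | {digits}".rstrip())
    return lines


for line in stem_and_leaf([12, 15, 21, 27, 27, 43]):
    print(line)
# 1 | 25
# 2 | 177
# 3 |
# 4 | 3
```

The empty stem 3 shows a gap in the data, which a plain sorted list would hide.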
Term
Dotplot |
Definition
graphs a dot for each case against a single axis |
|
|
Term
Shape |
Definition
to describe the shape of a distribution, look for: single vs. multiple modes; symmetry vs. skewness; outliers and gaps |
|
|
Term
Center |
Definition
the place in the distribution of a variable that you'd point to if you wanted to attempt the impossible by summarizing the entire distribution with a single number; measures of center include the mean and median |
|
|
Term
Spread |
Definition
a numerical summary of how tightly the values are clustered around the center; measures of spread include the IQR and standard deviation |
|
|
Term
Mode |
Definition
a hump or local high point in the shape of the distribution of a variable; the apparent location of modes can change as the scale of a histogram is changed |
|
|
Term
Unimodal |
Definition
having one mode; a useful term for describing the shape of a histogram when it's generally mound-shaped. Distributions with two modes are bimodal; those with more than two modes are multimodal |
|
|
Term
Uniform |
Definition
a distribution that's roughly flat |
|
|
Term
Tails |
Definition
the parts of a distribution that typically trail off on either side |
|
|
Term
Skewed |
Definition
a distribution is skewed if it's not symmetric and one tail stretches out farther than the other |
|
|
Term
Outliers |
Definition
extreme values that don't appear to belong with the rest of the data |
|
|
Term
Median |
Definition
the middle value, with half of the data above it and half below it; usually paired with the IQR |
|
|
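A minimal sketch of finding the middle value follows. It assumes the common convention of averaging the two middle values when the count is even; the function name `median` and the data are illustrative.

```python
def median(values):
    """Middle value of the sorted data.

    With an even count, this sketch averages the two middle values.
    """
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2


print(median([5, 1, 3]))     # 3
print(median([4, 1, 3, 2]))  # 2.5
```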
Term
Range |
Definition
difference between the lowest and highest values in a data set |
|
|
Term
Quartile |
Definition
the median and quartiles divide data into four parts with equal numbers of data values |
|
|
Term
Interquartile range (IQR) |
Definition
the IQR is the difference between the first and third quartiles |
|
|
Term
Percentile |
Definition
the nth percentile is the number that falls above n% of the data |
|
|
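The idea can be checked from the other direction: given a value, compute what percent of the data falls below it. This is only a sketch; textbooks and software differ on how ties and interpolation are handled, and the name `percent_below` and the scores are hypothetical.

```python
def percent_below(values, x):
    """Percent of the data values that are strictly less than x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


scores = [55, 60, 62, 70, 71, 75, 80, 85, 90, 95]
print(percent_below(scores, 85))  # 70.0
```

Under this convention, a score of 85 sits at the 70th percentile of these ten scores, because 7 of the 10 values fall below it.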
Term
5-number summary |
Definition
reports the minimum value, Q1, median, Q3, and maximum value |
|
|
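The range, quartiles, IQR, and summary from the cards above can be computed together. The sketch below assumes one common quartile convention: Q1 and Q3 are the medians of the lower and upper halves, with the middle value left out of both halves when the count is odd. Other conventions give slightly different quartiles. The function name `five_number_summary` and the data are illustrative.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: quartiles are medians of the lower and upper halves,
    excluding the middle value when the count is odd.
    """
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return (s[0], median(lower), median(s), median(upper), s[-1])


data = [2, 4, 4, 5, 7, 9, 10, 12]
minimum, q1, med, q3, maximum = five_number_summary(data)

print((minimum, q1, med, q3, maximum))  # (2, 4.0, 6.0, 9.5, 12)
print(maximum - minimum)                # range: 10
print(q3 - q1)                          # IQR: 5.5
```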
Term
Mean |
Definition
found by summing all the data values and dividing by the count; usually paired with the standard deviation |
|
|
Term
Resistant |
Definition
a calculated summary is said to be resistant if outliers have only a small effect on it |
|
|
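A quick comparison shows what resistance looks like in practice. The data are made up for illustration: adding one extreme value moves the mean a long way but barely moves the median.

```python
from statistics import mean, median

data = [3, 4, 5, 6, 7]
with_outlier = data + [100]  # one extreme value added

print(mean(data), median(data))  # 5 5
print(median(with_outlier))      # 5.5
print(mean(with_outlier))        # roughly 20.8
```

The median shifts by half a unit while the mean roughly quadruples, so in this example the median behaves as a resistant summary and the mean does not.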
Term
Variance |
Definition
the sum of squared deviations from the mean, divided by the count minus 1 |
|
|
Term
Standard deviation |
Definition
the square root of the variance |
|
|
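The mean, variance, and standard deviation cards translate directly into code. This is a minimal sketch with illustrative function names and made-up data; it uses the count minus 1 as the divisor, as the variance card states.

```python
import math


def mean(values):
    """Sum of the data values divided by the count."""
    return sum(values) / len(values)


def variance(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(data))                # 5.0
print(variance(data))            # 32 / 7, about 4.57
print(standard_deviation(data))  # about 2.14
```

Here the squared deviations sum to 32 and the count is 8, so the variance is 32 / 7.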