| Term 
 | Definition 
 
        | A countable subset of the population.  A set of actual observations.  When the population is uncountable, we draw a sample of observations from the population. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | A complete set of events in which we are interested.  Often uncountable, or infinite. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Numerical values summarizing sample data. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Numerical values summarizing population data. N.B.
 Statistics describe the SAMPLE data
 Parameters describe the POPULATION data
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | A sample in which every member of the population has an equal chance of inclusion in the sample.  If a sample is truly random, then statistics help us estimate the parameters of our population. |  | 
        |  | 
        
        | Term 
 
        | Measurement/Quantitative Data |  | Definition 
 
        | Data obtained by measuring objects or events.  Uses some form of instrument for measuring the variable in question. |  | 
        |  | 
        
        | Term 
 
        | Categorical/Frequency/Count Data |  | Definition 
 
        | Statements that count the frequencies or totals for various categories. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Assigning numbers to objects |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | The characteristic of the relationship between objects and the numbers we assign them while measuring.  There are four "scales of measurement":
 1. nominal scale
 2. ordinal scale
 3. interval scale
 4. ratio scale
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Doesn't truly "scale" items in any dimension, but NAMES (labels) them. The numbers used have no real meaning other than differentiating between the items.
 ex: identifying football players by the numbers on their jerseys.
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | A scale where numbers are used to place the items in order along a continuum. ex: class ranking. N.B.  The numbers strictly order the values, but do not indicate the size of the difference between any two of them.  We can know who came first and who came second, but not how far apart they were.
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | A scale in which differences between scale points represent legitimate values.  Equal distances between two objects represent equal values. Ex: the difference between 1 deg. and 11 deg. is the same as the difference between 12 deg. and 22 deg.
 But we don't know anything about the ratios between two scale points since the zero point is arbitrary.  40 degrees is not twice as hot as 20 degrees.
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Has a TRUE ZERO. Temperature in Celsius or Fahrenheit is NOT a ratio scale since 0 degrees is an arbitrary point about which numbers are assigned.  The zero must represent some physical reality, ex:
 0 km/hr
 Ratios are meaningful!  40 km/hr IS twice as fast as 20 km/hr.
 |  | 
        |  | 
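The difference between interval and ratio scales can be checked numerically. The sketch below converts the cards' examples to other units; the helper names and the choice of Fahrenheit and miles per hour are illustrative assumptions, not part of the original cards.

```python
# Interval scale: the zero point is arbitrary, so ratios change with the unit.
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

# Ratio scale: zero means "no speed" in every unit, so ratios are preserved.
KM_PER_MILE = 1.609344

def kmh_to_mph(kmh):
    return kmh / KM_PER_MILE

temp_ratio_c = 40 / 20
temp_ratio_f = celsius_to_fahrenheit(40) / celsius_to_fahrenheit(20)
speed_ratio_kmh = 40 / 20
speed_ratio_mph = kmh_to_mph(40) / kmh_to_mph(20)

print("temperature ratio in C:", temp_ratio_c)
print("temperature ratio in F:", round(temp_ratio_f, 3))
print("speed ratio in km/hr:", speed_ratio_kmh)
print("speed ratio in mph:", round(speed_ratio_mph, 3))
```

The temperature ratio depends on the unit chosen, while the speed ratio stays at 2 in both units.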
        
        | Term 
 | Definition 
 
        | properties of objects or events that take on different values. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Randomly assigning participants to groups or conditions, so that the groups do not differ systematically before treatment. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Indicates summation.  Summation rules (C is a constant):
 sigma(X-Y) = sigma(X) - sigma(Y)
 sigma(CX) = C*sigma(X)
 sigma(X+Y) = sigma(X) + sigma(Y)
 |  | 
        |  | 
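The three summation rules can be verified on a small example. This is a minimal sketch; the lists X and Y and the constant C are made-up values chosen only for illustration.

```python
# Hypothetical paired scores; any two equal-length lists of numbers would do.
X = [2, 4, 6, 8]
Y = [1, 3, 5, 7]
C = 3  # an arbitrary constant

sum_of_differences = sum(x - y for x, y in zip(X, Y))
sum_of_sums = sum(x + y for x, y in zip(X, Y))
sum_of_scaled = sum(C * x for x in X)

# Each comparison checks one rule: the left side sums term by term,
# the right side combines the separate sums.
print(sum_of_differences == sum(X) - sum(Y))
print(sum_of_sums == sum(X) + sum(Y))
print(sum_of_scaled == C * sum(X))
```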
        
        | Term 
 | Definition 
 
        | A distribution in which values of the dependent variable are tabled or plotted against the frequency with which they occurred. |  | 
        |  | 
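A frequency distribution can be tabled directly from raw scores. The sketch below uses `collections.Counter` from the Python standard library; the scores are invented for illustration.

```python
from collections import Counter

# Hypothetical raw scores on some dependent variable.
scores = [3, 5, 5, 2, 3, 5, 4, 2, 5]

# Counter maps each distinct value to the number of times it occurred.
frequency = Counter(scores)

print("value  frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
```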
        
        | Term 
 | Definition 
 
        | A graphical display of a frequency distribution in the form of a histogram.  Preserves the actual values obtained, and visually represents the frequency with which they occurred. |  | 
        |  | 
        
        | Term 
 
        | EDA: Exploratory Data Analysis
 |  | Definition 
 
        | A set of techniques developed to present data in visually meaningful ways.  ex. a histogram. |  | 
        |  | 
        
        | Term 
 
        | Leading/Most Significant digits |  | Definition 
 
        | leftmost digits of a number |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | The vertical axis of a stem and leaf display - given by the leading digits. |  | 
        |  | 
        
        | Term 
 
        | Trailing/Less Significant digits |  | Definition 
 
        | digits to the right of the leading digits |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | horizontal axis of a stem and leaf display: contains the trailing or less significant digits for each leading digit on the stem. |  | 
        |  | 
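The stem and leaf cards above can be tied together with a short sketch. It assumes two-digit whole-number scores, so that the tens digit is the leading digit (stem) and the units digit is the trailing digit (leaf); the function name and data are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group values by leading digit (stem); trailing digits become leaves."""
    stems = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # tens digit, units digit
        stems[stem].append(leaf)
    return dict(stems)

data = [12, 15, 21, 23, 23, 28, 34, 30]
display = stem_and_leaf(data)

# Stems run down the vertical axis, leaves run along the horizontal axis.
for stem in sorted(display):
    print(stem, "|", "".join(str(leaf) for leaf in display[stem]))
```

Because every leaf is printed, the original values can still be read back from the display, which is what distinguishes it from an ordinary histogram.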
        
        | Term 
 | Definition 
 
        | A type of graph that accumulates adjacent values into intervals, and plots the intervals as rectangles whose heights show the frequency of observation. |  | 
        |  | 
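As a rough illustration of how a histogram accumulates values into intervals, the sketch below counts whole-number scores in equal-width intervals and prints a text version of the bars. The function name, the data, and the interval width of 5 are assumptions made for the example.

```python
def interval_counts(values, start, width, n_intervals):
    """Count how many values fall in each of n_intervals equal-width intervals."""
    counts = [0] * n_intervals
    for v in values:
        index = (v - start) // width
        if 0 <= index < n_intervals:
            counts[index] += 1
    return counts

# Hypothetical whole-number scores between 25 and 39.
ages = [25, 27, 31, 33, 34, 34, 36, 39, 29, 30]
counts = interval_counts(ages, start=25, width=5, n_intervals=3)

for i, count in enumerate(counts):
    low = 25 + 5 * i
    print(f"{low}-{low + 4}: " + "#" * count)
```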
        
        | Term 
 
        | Real Limits (Upper and Lower) |  | Definition 
 
        | Lowest and highest possible values which could be classified as belonging to a given interval. ex:  Interval
 1. 25-29
 2. 30-34
 3. 35-39
 The real lower limit of interval (2.) is 29.5 (halfway between 29 and 30) and its real upper limit is 34.5 (halfway between 34 and 35).
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | The center of the interval - average of the upper and lower limits. |  | 
        |  | 
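Real limits and midpoints follow mechanically from the stated interval limits. The sketch below assumes scores are recorded to the nearest whole unit, so each real limit lies half a unit beyond the stated limit; the function names are hypothetical.

```python
def real_limits(low, high, unit=1):
    """Return (real lower limit, real upper limit) for a stated interval."""
    half = unit / 2
    return low - half, high + half

def midpoint(low, high):
    """Center of the interval: the average of its upper and lower limits."""
    return (low + high) / 2

lower, upper = real_limits(30, 34)
print(lower, upper)             # 29.5 34.5
print(midpoint(30, 34))         # 32.0
print(midpoint(lower, upper))   # 32.0, same center from the real limits
```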
        
        | Term 
 | Definition 
 
        | Extreme point that stands quite removed from the rest of the data.  Often due to error. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | distribution having the same shape about both sides of the center |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | A distribution having two distinct peaks |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | A distribution having a single peak |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | describes the number of meaningful peaks in a distribution |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | A measure of the degree to which a distribution is asymmetrical.
 Positively skewed: distribution trails off to the right of the peak
 Negatively skewed: distribution trails off to the left of the peak
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | CUMULATIVE frequency counting as you move across intervals from the outside in.  (Add frequencies cumulatively, beginning at both ends, until the summations meet in the middle.) |  | 
        |  | 
        
        | Term 
 
        | Measures of Central Tendency |  | Definition 
 
        | various statistical measures used to describe where the middle of a data distribution lies |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | The most commonly occurring score, or the most populous interval.  If two adjacent terms have equal and greatest frequencies, then we average the two.  If two nonadjacent terms share these properties, then the data are said to be bimodal.
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | The score corresponding to the point having 50% of the observations below it, and 50% above, when displayed in numeric order. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Describes where in an ordered series the median lies: median location = (N+1)/2
 ex: if N=83, then the median location is 42, meaning the median is the 42nd number in the ordered series.
 |  | 
        |  | 
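The median location formula can be turned into a small routine. This is a sketch under the usual convention that a location ending in .5 means averaging the two neighbouring scores; the function names and sample data are made up for illustration.

```python
import math

def median_location(n):
    """Position of the median in an ordered series of n scores (1-based)."""
    return (n + 1) / 2

def median(scores):
    ordered = sorted(scores)
    location = median_location(len(ordered))
    # For a whole-number location both indices coincide; for x.5 they are
    # the two scores on either side of the middle.
    lower = ordered[math.floor(location) - 1]
    upper = ordered[math.ceil(location) - 1]
    return (lower + upper) / 2

print(median_location(83))   # 42.0
print(median([5, 1, 3]))     # 3.0
print(median([4, 1, 3, 2]))  # 2.5
```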
        
        | Term 
 | Definition 
 
        | Most common measure of central tendency. The sum of the scores divided by the number of scores.
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | the degree to which individual data points are distributed around the mean. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | distance between the lowest and highest scores.  may give a distorted image of the data due to outliers. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | The range of the middle 50% of the data: distance between the 25th percentile and the 75th percentile.  Avoids the range's problem of being dependent on outlying data. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | samples that have had a certain amount of the data from each tail removed.  statistics calculated from a trimmed sample are called trimmed statistics. |  | 
        |  | 
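A trimmed statistic can be illustrated with a trimmed mean. The sketch below removes a given percentage of scores from each tail before averaging; the function names, the 10% trim, and the data (which include one extreme score) are assumptions for the example.

```python
def trimmed_sample(scores, percent):
    """Drop `percent` percent of the scores from each tail of the ordered data."""
    ordered = sorted(scores)
    k = len(ordered) * percent // 100  # number of scores removed per tail
    return ordered[k:len(ordered) - k]

def mean(scores):
    return sum(scores) / len(scores)

scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
trimmed = trimmed_sample(scores, 10)

print(mean(scores))    # 7.5, pulled upward by the extreme score 30
print(trimmed)         # [2, 3, 4, 5, 6, 7, 8, 9]
print(mean(trimmed))   # 5.5
```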
        
        | Term 
 | Definition 
 
        | Taking the deviations of each X from the mean and averaging them to get a measure of average variability.  DOESN'T WORK!!!  Will ALWAYS equal zero unless you take the absolute value of these deviations, because the positive deviations cancel the negative deviations. |  | 
        |  | 
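The cancellation is easy to confirm numerically. The following sketch uses made-up scores whose mean is a whole number, so the arithmetic is exact.

```python
# Hypothetical scores with mean 5.
scores = [2, 4, 4, 6, 9]
mean = sum(scores) / len(scores)

deviations = [x - mean for x in scores]

# Positive and negative deviations cancel, so their average is zero.
mean_deviation = sum(deviations) / len(scores)

# Taking absolute values first gives a usable measure of spread.
mean_absolute_deviation = sum(abs(d) for d in deviations) / len(scores)

print(deviations)               # [-3.0, -1.0, -1.0, 1.0, 4.0]
print(mean_deviation)           # 0.0
print(mean_absolute_deviation)  # 2.0
```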
        
        | Term 
 | Definition 
 
        | Common measure of variability. Obtained by AVERAGING the SQUARED deviations about the mean rather than the absolute values of the deviations (for a sample, the sum of squared deviations is divided by N - 1).
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | variance of the true population.  usually estimated, rarely computed. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Positive square root of the variance.  The sample standard deviation is denoted "s" and the population standard deviation is denoted sigma. |  | 
        |  | 
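The variance and standard deviation cards can be checked against Python's standard `statistics` module, which provides `variance`/`stdev` (sample, dividing by N - 1) and `pvariance`/`pstdev` (population, dividing by N). The data below are invented, and the hand-rolled helper is only a sketch.

```python
import math
import statistics

def sum_of_squared_deviations(scores):
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)

scores = [2, 4, 4, 6, 9]
n = len(scores)
ss = sum_of_squared_deviations(scores)

sample_variance = ss / (n - 1)      # treats the scores as a sample
population_variance = ss / n        # treats the scores as the whole population
s = math.sqrt(sample_variance)      # sample standard deviation
sigma = math.sqrt(population_variance)

# The standard library should agree with the hand calculation.
print(math.isclose(sample_variance, statistics.variance(scores)))
print(math.isclose(population_variance, statistics.pvariance(scores)))
print(math.isclose(s, statistics.stdev(scores)))
print(math.isclose(sigma, statistics.pstdev(scores)))
```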
        
        | Term 
 | Definition 
 
        | A property of a statistic whose long-range average is not equal to the parameter it estimates. |  | 
        |  | 
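Bias can be made concrete by averaging a statistic over every possible sample. The sketch below assumes a tiny hypothetical population and samples of size 2 drawn with replacement, each equally likely; it compares dividing the sum of squared deviations by N with dividing by N - 1, using exact fractions.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]  # hypothetical population
mu = Fraction(sum(population), len(population))
sigma_squared = sum((x - mu) ** 2 for x in population) / len(population)

def variance(sample, divisor):
    mean = Fraction(sum(sample), len(sample))
    return sum((x - mean) ** 2 for x in sample) / divisor

# All 9 equally likely samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# "Long-range average" of each estimator, taken over every possible sample.
avg_dividing_by_n = sum(variance(s, 2) for s in samples) / len(samples)
avg_dividing_by_n_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)

print("population variance:", sigma_squared)
print("average when dividing by N:", avg_dividing_by_n)
print("average when dividing by N - 1:", avg_dividing_by_n_minus_1)
```

In this setup the estimator that divides by N averages to less than the population variance, so it is biased, while dividing by N - 1 averages to the population variance exactly.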
        
        | Term 
 | Definition 
 
        | graphical method used to represent the dispersion of a sample. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | The points that cut off the bottom and top  quarter of a distribution |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Range between the two hinges.  (Interquartile Range) |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | In a box plot, a line joining the hinge with the farthest data point whose distance from the hinge is NO MORE THAN 1.5 times the H-spread. |  | 
        |  |
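The box plot quantities above can be computed in a few lines. Conventions for locating hinges vary between textbooks and software; the sketch below assumes one common convention, hinge location = (floor of the median location + 1) / 2, counted in from each end of the ordered data. Function names and data are hypothetical.

```python
import math

def hinge_location(n):
    """Assumed convention: (floor(median location) + 1) / 2."""
    median_loc = (n + 1) / 2
    return (math.floor(median_loc) + 1) / 2

def value_at(ordered, location):
    """Value at a 1-based location; a location ending in .5 averages neighbours."""
    lower = ordered[math.floor(location) - 1]
    upper = ordered[math.ceil(location) - 1]
    return (lower + upper) / 2

def box_plot_summary(scores):
    ordered = sorted(scores)
    loc = hinge_location(len(ordered))
    lower_hinge = value_at(ordered, loc)        # counted from the bottom
    upper_hinge = value_at(ordered[::-1], loc)  # counted from the top
    h_spread = upper_hinge - lower_hinge
    low_limit = lower_hinge - 1.5 * h_spread
    high_limit = upper_hinge + 1.5 * h_spread
    inside = [x for x in ordered if low_limit <= x <= high_limit]
    return {
        "hinges": (lower_hinge, upper_hinge),
        "h_spread": h_spread,
        "whisker_ends": (inside[0], inside[-1]),
        "outliers": [x for x in ordered if x < low_limit or x > high_limit],
    }

summary = box_plot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
for name, value in summary.items():
    print(name, value)
```

With these ten scores the whiskers stop at 1 and 9, and the score 30 lies beyond 1.5 H-spreads from the upper hinge, so it would be plotted separately as an outlier.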