Term
Descriptive statistics |
|
Definition
-refers to how large volumes of data are converted into useful, readily understood information by summarizing their important characteristics |
|
|
Term
Inferential statistics |
|
Definition
-uses a sample from a population to make probabilistic statements about the characteristics of a population |
|
|
Term
Population |
|
Definition
-includes ALL the members of a particular group |
|
|
Term
Sample |
|
Definition
-small subset of the population from which conclusions about the population are drawn |
|
|
Term
Parameter |
|
Definition
Describes a characteristic of a population |
|
|
Term
Sample statistic |
|
Definition
describes a characteristic of a sample (drawn from a population) |
|
|
Term
relative frequency distribution |
|
Definition
shows the percentage of a distribution's outcomes in each interval |
|
|
Term
cumulative frequency distribution |
|
Definition
shows the percentage of observations less than the upper bound of each interval |
|
|
Term
Frequency distribution |
|
Definition
a tabular illustration of data categorized into a relatively small number of intervals or classes |
|
|
Term
Cumulative relative frequency |
|
Definition
the PROPORTION of total observations that are less than the upper bound of the interval |
|
|
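The frequency-distribution cards above can be tied together with a small sketch. It assumes equal-width intervals that include their lower bound and exclude their upper bound (with the maximum value placed in the last interval); the helper name `frequency_table` and the return data are hypothetical.

```python
def frequency_table(data, lower, width, n_intervals):
    """Return rows of (interval, absolute, relative, cumulative relative)."""
    counts = [0] * n_intervals
    for x in data:
        # Interval i covers [lower + i*width, lower + (i+1)*width);
        # anything at the very top is clamped into the last interval.
        i = min(int((x - lower) // width), n_intervals - 1)
        counts[i] += 1

    total = len(data)
    rows, running = [], 0
    for i, count in enumerate(counts):
        running += count  # cumulative absolute frequency so far
        lo = lower + i * width
        rows.append(((lo, lo + width), count, count / total, running / total))
    return rows


# Hypothetical returns, in percent
returns = [-4.2, -1.5, 0.3, 1.1, 2.8, 3.0, 4.9, 5.5, 7.2, 9.6]
table = frequency_table(returns, lower=-5.0, width=5.0, n_intervals=3)

for interval, absolute, relative, cumulative in table:
    print(interval, absolute, f"{relative:.0%}", f"{cumulative:.0%}")
```

The cumulative relative frequency of the last interval is always 100%, which is a quick check that every observation was counted.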
Term
Measurement scales |
|
Definition
1) Nominal - only names make sense --> categorize data but do not rank them
2) Ordinal - order makes sense --> sort data into categories that are ranked, but tell us nothing about the MAGNITUDE of the difference between categories
3) Interval - intervals make sense --> differences between scale values are equal, so values can be added and subtracted meaningfully
4) Ratio - ratios make sense --> has all the properties of an interval scale plus a true (absolute) zero |
|
|
Term
Histogram |
|
Definition
-graphical equivalent of a frequency distribution -bar chart -advantage: allows us to quickly see where most of the observations lie! |
|
|
Term
Frequency polygon |
|
Definition
-graphically illustrates data from a frequency distribution -plot the frequency of each interval against the MIDPOINT of each interval -shows how absolute frequency varies with each interval |
|
|
Term
Measures of Central Tendency |
|
Definition
Arithmetic mean, median, mode, weighted mean, geometric mean, harmonic mean |
|
|
Term
Arithmetic mean |
|
Definition
-the sum of all the observations in a data set divided by the total number of observations -arithmetic mean of a sample is the best estimate of the value of the next observation -population and sample means are arithmetic means -ONLY measure where sum of the deviations of each value from the mean is always zero -data set only has ONE arithmetic mean -interval and ratio data sets have an arithmetic mean -use if gauging performance over a single period |
|
|
Term
Geometric mean |
|
Definition
-used to average rates of change over time or to calculate the growth rate of a variable over a period -if returns are constant over time, the geometric mean equals the arithmetic mean -the greater the variability of returns over time, the more the arithmetic mean exceeds the geometric mean -gives the constant per-period return that would produce the same cumulative return as the actual series of returns -measures how investment returns are linked over time, so use it when estimating returns over more than one period |
|
|
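A minimal sketch of the compounding idea in the geometric mean card, assuming returns are given as decimals (0.10 for 10%). The function names and the three-period return series are hypothetical.

```python
import math


def arithmetic_mean(returns):
    return sum(returns) / len(returns)


def geometric_mean_return(returns):
    # Link the periods by compounding (1 + r), then take the n-th root.
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


returns = [0.10, -0.05, 0.20]  # hypothetical annual returns
am = arithmetic_mean(returns)
gm = geometric_mean_return(returns)

# Compounding at the geometric mean reproduces the actual cumulative growth.
cumulative = math.prod(1 + r for r in returns)
print(f"AM={am:.4f} GM={gm:.4f} cumulative growth={cumulative:.4f}")
```

Because these returns vary from period to period, the arithmetic mean comes out above the geometric mean; with a constant series such as `[0.05, 0.05]` the two coincide.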
Term
Harmonic mean |
|
Definition
-used by investors to find the average cost of shares purchased over time: how much the investor paid on average per share when spending an equal amount of money on each purchase -formula essentially gives the total amount spent on shares divided by the number of shares bought -AM > GM > HM whenever the observations are not all equal (all three are equal when they are) --> A comes before G comes before H |
|
|
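The cost-averaging interpretation of the harmonic mean can be checked with a short sketch. It assumes the same amount is invested at each purchase and that fractional shares are allowed; the prices and the amount are hypothetical.

```python
def harmonic_mean(values):
    # n divided by the sum of reciprocals; values must be positive.
    return len(values) / sum(1 / v for v in values)


prices = [8.0, 10.0, 12.5]  # hypothetical share prices at each purchase
amount = 1000.0             # equal amount invested each time

shares_bought = sum(amount / p for p in prices)
total_spent = amount * len(prices)
average_cost = total_spent / shares_bought  # total spent / shares bought

arithmetic = sum(prices) / len(prices)
geometric = (prices[0] * prices[1] * prices[2]) ** (1 / 3)

print(f"average cost per share = {average_cost:.4f}")
print(f"harmonic mean of prices = {harmonic_mean(prices):.4f}")
```

The average cost per share matches the harmonic mean of the purchase prices, and for these unequal prices the ordering AM > GM > HM holds.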
Term
Weighted mean |
|
Definition
-mean in which different observations are given different proportional influence on the mean -arithmetic mean assigns equal weight to every observation, so it is sensitive to extreme values |
|
|
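As an illustration of the weighted mean, here is a sketch that treats a portfolio return as the weighted mean of asset returns. The weights and returns are hypothetical, and the function divides by the sum of the weights so they do not have to add up to 1.

```python
def weighted_mean(values, weights):
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


asset_returns = [0.08, 0.03, 0.01]  # hypothetical returns: stocks, bonds, cash
weights = [0.6, 0.3, 0.1]           # hypothetical portfolio weights

portfolio_return = weighted_mean(asset_returns, weights)
equal_weighted = weighted_mean(asset_returns, [1, 1, 1])
print(f"weighted={portfolio_return:.4f} equal-weighted={equal_weighted:.4f}")
```

With equal weights the result reduces to the ordinary arithmetic mean.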
Term
Median |
|
Definition
-value of the middle item of a data set once it has been ordered -less affected by extreme values than the mean |
|
|
Term
Mode |
|
Definition
-value occurring most frequently in a data set -bimodal, trimodal, etc |
|
|
Term
Quantile |
|
Definition
-values which divide the distribution such that there is a given proportion of observations at or below the quantile -if value for location of quantile is not a whole number, use LINEAR INTERPOLATION -->estimates an unknown value on the basis of two known values that surround it |
|
|
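The linear interpolation step can be sketched as follows. The card does not say how the quantile's location is computed, so this assumes the common textbook convention of location = (n + 1) * y / 100 for the y-th percentile of n sorted observations; other conventions exist and give slightly different answers. The data are hypothetical.

```python
def percentile(data, y):
    """y-th percentile using the (n + 1) * y / 100 location convention (an assumption)."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * y / 100   # 1-based position in the sorted data
    lower_pos = int(location)      # whole-number part of the position
    fraction = location - lower_pos

    if lower_pos < 1:
        return ordered[0]
    if lower_pos >= n:
        return ordered[-1]

    lower_val = ordered[lower_pos - 1]
    upper_val = ordered[lower_pos]
    # Linear interpolation between the two observations surrounding the location.
    return lower_val + fraction * (upper_val - lower_val)


data = [10, 12, 13, 15, 17, 17, 18, 19, 23, 24]  # hypothetical observations
q1, q2, q3 = (percentile(data, y) for y in (25, 50, 75))
print(q1, q2, q3)
```

Here the first quartile sits at position 2.75, so it lies three quarters of the way from the 2nd observation (12) to the 3rd (13).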
Term
Quartiles |
|
Definition
-divide the distribution into quarters -25% of the data lies at or below the first quartile, 50% at or below the second (the median), 75% at or below the third, and 100% at or below the fourth |
|
|
Term
Quintiles |
|
Definition
-divide the distribution into fifths -20%,40%,60%,80%,100% |
|
|
Term
Deciles |
|
Definition
-divide the data into tenths |
|
|
Term
Percentiles |
|
Definition
-divide the distribution into hundredths |
|
|
Term
Dispersion |
|
Definition
-measure of variability of a random variable around its central tendency -how spread out the data is -measures of dispersion: Range, MAD, standard deviation and variance |
|
|
Term
Range |
|
Definition
-Range=Max value - Min value -high range means data is more dispersed |
|
|
Term
Mean absolute deviation (MAD) |
|
Definition
-average of the ABSOLUTE values of deviations of observations in a data set from their mean -take absolute values because the signed deviations from the mean always sum to 0 |
|
|
Term
Population variance |
|
Definition
-average of the squared deviations around the mean -computed using all members of a population -expressed in SQUARED units of the variable, so it is hard to interpret directly |
|
|
Term
Population standard deviation |
|
Definition
-standard deviation is the positive square root of the variance -same units as random variable -by squaring all deviations from the mean, the standard deviation attaches a greater weight to larger deviations from the mean |
|
|
Term
Sample variance |
|
Definition
-computed from a subset (sample) of the total population -differs from population variance in the denominator, which is n-1 instead of n |
|
|
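The dispersion measures on the cards above (range, MAD, variance, standard deviation) can be compared side by side. This is a plain-Python sketch on hypothetical data; the helper name `dispersion_measures` is an assumption.

```python
import math


def dispersion_measures(data):
    n = len(data)
    mean = sum(data) / n
    deviations = [x - mean for x in data]
    sum_sq = sum(d * d for d in deviations)  # sum of squared deviations
    return {
        "range": max(data) - min(data),
        "mad": sum(abs(d) for d in deviations) / n,
        "population_variance": sum_sq / n,        # divide by n
        "population_std": math.sqrt(sum_sq / n),
        "sample_variance": sum_sq / (n - 1),      # divide by n - 1
        "sample_std": math.sqrt(sum_sq / (n - 1)),
        "sum_of_deviations": sum(deviations),     # zero, hence the absolute values in MAD
    }


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations
measures = dispersion_measures(data)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

The only difference between the population and sample variance is the denominator, so the sample figure is always the larger of the two for the same data.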
Term
Chebyshev's inequality |
|
Definition
-proportion of observations within k standard deviations of the mean is AT LEAST 1 - 1/(k^2) for all k>1
-gives a MINIMUM (lower bound), not an exact value, for the proportion of observations in a data set that fall within k standard deviations of the mean -ADVANTAGE: holds for samples, populations and for discrete and continuous data regardless of the shape of the distribution |
|
|
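A quick check of Chebyshev's bound on a deliberately skewed data set. This sketch uses the population standard deviation (dividing by n) and hypothetical data; the function names are assumptions.

```python
import math


def chebyshev_lower_bound(k):
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def proportion_within(data, k):
    """Actual share of observations within k standard deviations of the mean."""
    n = len(data)
    mean = sum(data) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    return sum(1 for x in data if abs(x - mean) <= k * std) / n


skewed = [1, 1, 1, 2, 2, 3, 4, 10, 25, 60]  # hypothetical, heavily right-skewed

for k in (1.5, 2, 3):
    bound = chebyshev_lower_bound(k)
    actual = proportion_within(skewed, k)
    print(f"k={k}: at least {bound:.2%}, actual {actual:.2%}")
```

The actual proportion is never below the bound, and is usually well above it, which is why the bound is a minimum rather than an estimate.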
Term
Coefficient of variation |
|
Definition
-how much dispersion exists relative to the mean of a distribution: CV = standard deviation / mean --> how much risk we are taking per unit of return -allows for direct comparisons of dispersion across different data sets! |
|
|
Term
Sharpe ratio |
|
Definition
-measures excess return per unit of risk: (portfolio return - risk-free rate) / standard deviation of portfolio returns -tells us whether a portfolio's returns are due to smart investment decisions or a result of excess risk -issue: for portfolios with negative Sharpe ratios, increasing risk moves the ratio closer to zero, so the Sharpe ratio increases -standard deviation is an appropriate measure of risk only for investments and strategies that have an approximately symmetric return distribution |
|
|
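The two risk-adjusted measures above can be sketched with the standard library. This assumes the sample standard deviation (`statistics.stdev`) as the risk measure; all return series and the risk-free rate are hypothetical.

```python
import statistics


def coefficient_of_variation(returns):
    # Risk per unit of mean return.
    return statistics.stdev(returns) / statistics.mean(returns)


def sharpe_ratio(returns, risk_free_rate):
    # Excess return per unit of risk.
    excess = statistics.mean(returns) - risk_free_rate
    return excess / statistics.stdev(returns)


fund = [0.02, 0.04, 0.06]  # hypothetical periodic returns
print(f"CV={coefficient_of_variation(fund):.2f}")
print(f"Sharpe={sharpe_ratio(fund, 0.01):.2f}")

# The negative-Sharpe issue: same mean return, but `riskier` has twice the dispersion.
calmer = [-0.03, -0.01, 0.01]
riskier = [-0.05, -0.01, 0.03]
sharpe_calmer = sharpe_ratio(calmer, 0.01)
sharpe_riskier = sharpe_ratio(riskier, 0.01)
print(f"calmer={sharpe_calmer:.2f} riskier={sharpe_riskier:.2f}")
```

Both of the last two portfolios underperform the risk-free rate by the same amount, yet the riskier one ends up with the higher (less negative) Sharpe ratio, which is the ranking problem the card warns about.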
Term
Skewness |
|
Definition
-refers to the extent to which a distribution is not symmetrical -skewness above 0.5 or below -0.5 is considered significant |
|
|
Term
Symmetrical distribution |
|
Definition
-data equally dispersed to the right and left of the mean -skew=0 -mean=median=mode |
|
|
Term
Positively skewed distribution |
|
Definition
-characterized by many outliers in the upper region (right tail); long tail on right side -Mean>Median>Mode -skewness > 0 |
|
|
Term
Negatively skewed distribution |
|
Definition
-characterized by many outliers in the lower region (left tail); long tail on left side -Mode>Median>Mean -skewness < 0 |
|
|
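The sign conventions on the skewness cards can be illustrated with a sketch. The cards do not give a formula, so this assumes the simple moment-based definition (third central moment divided by the cubed population standard deviation), without the small-sample adjustment some textbooks apply. The data are hypothetical.

```python
import statistics


def skewness(data):
    """Moment-based skewness: m3 / m2**1.5 (an assumed, unadjusted definition)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n  # third central moment
    return m3 / m2**1.5


right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]    # one large outlier in the right tail
left_skewed = [-x for x in right_skewed]    # mirror image
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(name, f"skew={skewness(data):.3f}",
          f"mean={statistics.mean(data):.2f}", f"median={statistics.median(data):.2f}")
```

In the right-skewed set the outlier pulls the mean above the median, and the mirror-image set shows the opposite ordering.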
Term
Kurtosis |
|
Definition
-measures the extent to which a distribution is more or less peaked than a normal distribution -normal distribution --> K=3 -excess kurtosis = kurtosis - 3 -excess kurtosis with an absolute value greater than 1 is considered significant |
|
|
Term
Leptokurtic distribution |
|
Definition
-more peaked with fatter tails --> more extreme outliers -kurtosis > 3 -excess kurtosis > 0 -someone who leapt had to jump really high |
|
|
Term
Platykurtic distribution |
|
Definition
-less peaked and has thinner tails -kurtosis < 3 -excess kurtosis < 0 -flatter like a platypus! |
|
|
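The kurtosis cards can be illustrated the same way. This sketch assumes the simple moment-based definition (fourth central moment divided by the squared population variance), without a small-sample adjustment; both data sets are hypothetical.

```python
def kurtosis(data):
    """Moment-based kurtosis: m4 / m2**2 (an assumed, unadjusted definition)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in data) / n  # fourth central moment
    return m4 / m2**2


def excess_kurtosis(data):
    return kurtosis(data) - 3  # a normal distribution has kurtosis 3


flat = [1, 2, 3, 4, 5]                          # evenly spread, thin tails
fat_tailed = [-10, -1, 0, 0, 0, 0, 0, 0, 1, 10]  # peaked center, extreme outliers

print(f"flat: excess kurtosis = {excess_kurtosis(flat):.2f}")
print(f"fat-tailed: excess kurtosis = {excess_kurtosis(fat_tailed):.2f}")
```

The evenly spread set comes out platykurtic (negative excess kurtosis), while the set with a tall center and a few extreme values comes out leptokurtic (positive excess kurtosis).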