| Term 
 | Definition 
 
        | tabular presentation of statistical data using tally marks. Columns include interval, tallies, and (absolute, relative, cumulative absolute, cumulative relative) frequency |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | bar chart--representation of absolute frequency distribution--intervals on x axis |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | midpoint of each interval from the histogram is plotted instead--connect all points |  | 
        |  | 
        
        | Term 
 
        | measures of central tendency |  | Definition 
 
        | identify center, or average, of data sets--mean, median, mode, etc. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | sum observations/N --number in population |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | sum sample observations/n --number in sample *used to make inferences about populations* |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | sum of observations/number of observations *only measure of central tendency for which sum of deviations of observations from the mean equals 0 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | assign a weight to each observation and sum the products (Ex: w1*x1 + w2*x2 etc.), where the weights sum to 1; if they do not, divide by the sum of the weights rather than the number of observations |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | midpoint of data set when arranged in ascending order |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | value that occurs most frequently in data set |  | 
        |  | 
        
        | Term 
 
        | unimodal/bimodal/trimodal |  | Definition 
 
        | data set may have multiple modes |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | 1 + Rg = [(1+R1) * (1+R2) * ... * (1+Rn)]^(1/n) *always less than or equal to arithmetic mean *used when calculating investment returns over multiple periods or when measuring compound growth rates |  | 
        |  | 
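
As a rough illustration of the geometric mean formula above, the following Python sketch compounds a list of periodic returns and compares the result with the arithmetic mean. The function names and the sample returns are made up for this example.

```python
import math

def geometric_mean_return(returns):
    """Return Rg where 1 + Rg = [(1+R1) * ... * (1+Rn)]^(1/n)."""
    growth = math.prod(1.0 + r for r in returns)  # compounded growth factor
    return growth ** (1.0 / len(returns)) - 1.0

def arithmetic_mean(values):
    """Sum of observations divided by number of observations."""
    return sum(values) / len(values)

# Hypothetical returns: +50% in one period, -50% in the next.
sample_returns = [0.50, -0.50]
print(geometric_mean_return(sample_returns))  # negative: wealth ends at 0.75 of its start
print(arithmetic_mean(sample_returns))        # zero, which overstates the realized growth
```

The geometric mean never exceeds the arithmetic mean, and the gap widens as returns become more volatile.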
        
        | Term 
 | Definition 
 
        | N/[(1/X1)+(1/X2)+(1/X3)..etc.] *used for computations such as average cost of shares purchased over time when the same amount is invested each period (average price per share) |  | 
        |  | 
        
        | Term 
 | Definition 
 
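        | harmonic < geometric < arithmetic (for positive observations that are not all equal; the three means coincide when every observation is the same) |  | 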
        |  | 
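
A small Python sketch of the harmonic mean as an average cost per share, together with a check of the ordering of the three means. It assumes the same dollar amount is invested at each price; the prices, the amount, and the function names are hypothetical.

```python
import math

def harmonic_mean(values):
    """N divided by the sum of reciprocals."""
    return len(values) / sum(1.0 / x for x in values)

def geometric_mean(values):
    """Nth root of the product of N positive values."""
    return math.prod(values) ** (1.0 / len(values))

def arithmetic_mean(values):
    return sum(values) / len(values)

def average_cost_per_share(amount_per_purchase, prices):
    """Total dollars spent divided by total shares bought."""
    total_spent = amount_per_purchase * len(prices)
    total_shares = sum(amount_per_purchase / p for p in prices)
    return total_spent / total_shares

# Hypothetical: invest $1,000 at $10 per share, then $1,000 at $20 per share.
prices = [10.0, 20.0]
print(harmonic_mean(prices))                   # about 13.33
print(average_cost_per_share(1000.0, prices))  # same value as the harmonic mean
print(harmonic_mean(prices) < geometric_mean(prices) < arithmetic_mean(prices))
```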
        
        | Term 
 | Definition 
 
        | value at or below which a stated proportion of the data in a distribution lies (EX: quartile, quintile, decile, percentile) 99th percentile--measured from left side |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | Ly = (n+1) * y/100, where n = number of observations and y = the percentile (y = 75 if you're looking for the 3rd quartile) |  | 
        |  | 
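
The location formula can be sketched in Python as follows. Linear interpolation between neighboring observations when Ly is not a whole number is an assumption of this sketch (the card only gives the location), and the data set and function names are hypothetical.

```python
def percentile_location(n, y):
    """Position Ly = (n + 1) * y / 100 in the sorted data (1-based)."""
    return (n + 1) * y / 100.0

def percentile_value(data, y):
    """Value at the y-th percentile, interpolating between neighbors (assumed rule)."""
    ordered = sorted(data)
    loc = percentile_location(len(ordered), y)
    lower = int(loc)      # whole part of the 1-based position
    frac = loc - lower    # fractional part used for interpolation
    if lower < 1:
        return ordered[0]
    if lower >= len(ordered):
        return ordered[-1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

# Hypothetical data set with n = 11 observations.
data = [8, 10, 12, 13, 15, 17, 17, 18, 19, 23, 24]
print(percentile_location(len(data), 75))  # 9.0, so the 9th sorted observation
print(percentile_value(data, 75))          # third quartile
```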
        
        | Term 
 | Definition 
 
        | variability around central tendency; risk in financial terms |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | high observation - low observation |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | average of abs values of deviations of observations from the mean |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | average of the squared deviations from the mean---- sum of [(Xi- mean)^2]/ N |  | 
        |  | 
        
        | Term 
 
        | population standard deviation |  | Definition 
 
        | square root of the population variance-- st. dev > MAD |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | sum of squared deviations of observations from the sample mean/(n-1) *n-1 is used to improve the statistical properties of s^2 as an estimator of population variance *for samples, mean is referred to as X with bar above instead of mu. *if we used sum of squared deviations/n, sample variance would be a biased estimator of population variance |  | 
        |  | 
        
        | Term 
 
        | sample standard deviation (s) |  | Definition 
 
        | square root of sample variance (s^2) |  | 
        |  | 
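
To make the dispersion measures above concrete, here is a minimal Python sketch computing MAD, population variance, and sample variance for one hypothetical data set. The function names are illustrative only.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def mean_absolute_deviation(xs):
    """Average of absolute deviations from the mean."""
    m = mean(xs)
    return sum(abs(x - m) for x in xs) / len(xs)

def population_variance(xs):
    """Sum of squared deviations from the mean, divided by N."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical observations.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean_absolute_deviation(data))          # 1.5
print(math.sqrt(population_variance(data)))   # 2.0, larger than MAD
print(sample_variance(data))                  # 32/7, larger than the population variance of 4
```

Dividing by n - 1 rather than n gives the larger sample figure, which is what corrects the downward bias mentioned on the sample variance card.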
        
        | Term 
 | Definition 
 
        | for ANY shape distribution based on sample or population data, gives us the minimum percentage of the observations that lie within k standard deviations from the mean-- 1-1/k^2 for all k > 1 *applies to any distribution, if we know dist. is normal--we can be more precise |  | 
        |  | 
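
A quick Python sketch of the 1 - 1/k^2 bound for a few values of k. The function name is hypothetical.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1.0 - 1.0 / k ** 2

for k in (1.5, 2, 3, 4):
    print(k, chebyshev_min_fraction(k))
```

For k = 2 the bound is 75%, compared with roughly 95% when the distribution is known to be normal.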
        
        | Term 
 
        | coefficient of variation (CV) |  | Definition 
 
        | *allows us to compare measures of dispersion between two distributions with different means---CV = st. dev of x/average value of x *lower number means less risk per unit of expected return |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | measures excess return received per unit of risk: (Rp - Rf)/st. dev. of Rp *larger numbers are better--more excess return per unit of risk |  | 
        |  | 
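
The coefficient of variation and the excess-return-per-unit-of-risk ratio (the Sharpe ratio) can be compared side by side in a short Python sketch. The two assets, the risk-free rate, and the function names are hypothetical.

```python
def coefficient_of_variation(stdev, mean):
    """Risk per unit of average return: lower is better."""
    return stdev / mean

def sharpe_ratio(portfolio_return, risk_free_rate, stdev):
    """Excess return per unit of risk: higher is better."""
    return (portfolio_return - risk_free_rate) / stdev

# Hypothetical assets: (mean return, standard deviation of return).
asset_a = (0.10, 0.20)
asset_b = (0.05, 0.08)
risk_free = 0.03

for name, (mu, sd) in (("A", asset_a), ("B", asset_b)):
    print(name, coefficient_of_variation(sd, mu), sharpe_ratio(mu, risk_free, sd))
```

In this made-up case asset B has the lower CV but asset A has the higher Sharpe ratio, so the two measures need not rank assets the same way.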
        
        | Term 
 | Definition 
 
        | non-symmetry in distributions |  | 
        |  | 
        
        | Term 
 
        | positive/negative skewness |  | Definition 
 
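        | positive skewness indicates many outliers in the right tail (long right tail); negative skewness indicates many outliers in the left tail |  | 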
        |  | 
        
        | Term 
 
        | skewness & central tendency |  | Definition 
 
        | mean is most affected by outliers that cause skewness--it will be furthest to the skewed side *for positive skewness: mean > median > mode; for negative skewness the order reverses: mean < median < mode *mean is pulled further up or down by averaging outliers |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | leptokurtosis--more steeply peaked than normal distribution with further reaching tails--deviations from mean will likely be very small or very large |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | any kurtosis different from that of a normal distribution--normal distributions have kurtosis of 3--therefore, kurtosis - 3 measures excess kurtosis (either positive or negative) *Positive excess kurtosis (leptokurtosis) is an indicator of risk because fatter tails allow for large negative outcomes |  | 
        |  | 
        
        | Term 
 
        | which mean for investment returns? |  | Definition 
 
        | for past annual returns, geometric mean is used to compute average annual compound return--the return that, if compounded over the same number of periods, would lead to the same increase in wealth--and it is the better estimator of compound returns over multiple future periods---HOWEVER, arithmetic mean is the best estimator of next year's (single-period) return |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | uncertain outcome quantity/number |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | observed value of a random variable |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | single outcome or set of outcomes |  | 
        |  | 
        
        | Term 
 
        | mutually exclusive events |  | Definition 
 
        | cannot happen at the same time (heads--tails) |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | those that include all possible outcomes--probabilities add to 1 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | the probability of the event = |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | derived examining past data |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | determined using formal reasoning and inspection |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | involves use of personal judgment EX: probability of my gf being the best looking in the room |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | odds of 1 to 6 for an event mean a probability of 1/(1+6) = 1/7 that it occurs---1 yes for every 6 nos |  | 
        |  | 
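
Converting between odds and probabilities can be sketched in Python with exact fractions. The function names are hypothetical.

```python
from fractions import Fraction

def odds_for_to_probability(a, b):
    """Odds of 'a to b' for an event imply probability a / (a + b)."""
    return Fraction(a, a + b)

def probability_to_odds_for(p):
    """Odds for an event with probability p, expressed as p / (1 - p)."""
    p = Fraction(p)
    return p / (1 - p)

print(odds_for_to_probability(1, 6))             # 1/7
print(probability_to_odds_for(Fraction(1, 7)))   # 1/6, i.e. odds of 1 to 6
```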
        
        | Term 
 | Definition 
 
        | joint probability of A & B |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | probability of A, given B |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | P(AB) = P(A | B) * P(B)---multiply the conditional probability by the unconditional probability |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | mutually exclusive: P(AB) is impossible, so P(A or B) = P(A) + P(B)   non-mutually exclusive: P(AB) is possible, so we must subtract it: P(A or B) = P(A) + P(B) - P(AB) *remember Venn diagram |  | 
        |  | 
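
The multiplication and addition rules can be checked with a standard 52-card deck in a short Python sketch. The choice of events (A = heart, B = face card) and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def joint_probability(p_a_given_b, p_b):
    """Multiplication rule: P(AB) = P(A given B) * P(B)."""
    return p_a_given_b * p_b

def probability_a_or_b(p_a, p_b, p_ab=Fraction(0)):
    """Addition rule: P(A or B) = P(A) + P(B) - P(AB); P(AB) = 0 if mutually exclusive."""
    return p_a + p_b - p_ab

p_heart = Fraction(13, 52)
p_face = Fraction(12, 52)
p_heart_given_face = Fraction(3, 12)   # 3 of the 12 face cards are hearts

p_both = joint_probability(p_heart_given_face, p_face)
print(p_both)                                       # 3/52
print(probability_a_or_b(p_heart, p_face, p_both))  # 22/52, printed in lowest terms as 11/26
```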
        
        | Term 
 | Definition 
 
        | sum of probabilities (that sum to 1) * respective outcomes *gives us a "best guess" of an outcome of a random variable *a type of weighted mean |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | equals the sum of probability * squared deviation from the mean for each event--similar to normal variance but we weight each squared deviation by the probability of it occurring EX: .10(1.80 - 1.28)^2 + .90 (1.6 - 1.28)^2 = variance |  | 
        |  | 
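
A minimal Python sketch of expected value and probability-weighted variance. The outcomes and probabilities below are invented for the example and are not taken from the cards.

```python
def expected_value(probabilities, outcomes):
    """Sum of probability * outcome; the probabilities must sum to 1."""
    return sum(p * x for p, x in zip(probabilities, outcomes))

def probability_weighted_variance(probabilities, outcomes):
    """Sum of probability * squared deviation from the expected value."""
    mu = expected_value(probabilities, outcomes)
    return sum(p * (x - mu) ** 2 for p, x in zip(probabilities, outcomes))

# Hypothetical distribution of an uncertain quantity.
probs = [0.25, 0.50, 0.25]
values = [1.0, 2.0, 3.0]
print(expected_value(probs, values))                 # 2.0
print(probability_weighted_variance(probs, values))  # 0.5
```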
        
        | Term 
 
        | unconditional probability |  | Definition 
 
        | regardless of the outcome --probability calculation considering all alternative outcomes and associated probabilities (this is an OR problem) |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | E{[Ri - E(Ri)][Rj - E(Rj)]} R= observed return of asset E(R)= expected return for that asset *measures how one random variable moves with another random variable---Cov(Ri,Rj) = Corr (Ri,Rj) * stdev (Ri) * stdev (Rj) |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | *standardizes covariance so it is comparable across pairs, always between -1 and +1-- corr(Ri,Rj) = Cov (Ri,Rj)/[stdev(Ri) * stdev(Rj)] *measures strength of linear relationship between 2 variables-- r = 0 means no linear relationship |  | 
        |  | 
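
The link between covariance and correlation can be sketched in Python. This version treats each paired observation as equally likely (a population-style calculation); the return series and function names are hypothetical.

```python
import math

def mean(xs):
    return sum(xs) / len(xs)

def covariance(ri, rj):
    """Average product of deviations from each mean (equally likely observations)."""
    mi, mj = mean(ri), mean(rj)
    return sum((a - mi) * (b - mj) for a, b in zip(ri, rj)) / len(ri)

def correlation(ri, rj):
    """Covariance divided by the product of the two standard deviations."""
    sd_i = math.sqrt(covariance(ri, ri))  # covariance of a series with itself is its variance
    sd_j = math.sqrt(covariance(rj, rj))
    return covariance(ri, rj) / (sd_i * sd_j)

# Hypothetical return series: asset j always moves twice as much as asset i.
r_i = [0.10, 0.20, 0.30]
r_j = [0.20, 0.40, 0.60]
print(covariance(r_i, r_j))
print(correlation(r_i, r_j))        # close to +1: perfect positive linear relationship
print(correlation(r_i, r_j[::-1]))  # close to -1 when the second series is reversed
```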
        
        | Term 
 | Definition 
 | 
        |  | 
        
        | Term 
 | Definition 
 | 
        |  | 
        
        | Term 
 | Definition 
 | 
        |  | 
        
        | Term 
 | Definition 
 
        | allows us to update a set of prior probabilities in response to new information gained--- insert image---decision tree pg. 217 |  | 
        |  | 
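
As a sketch of how prior probabilities are updated, the Python below applies Bayes' formula to a two-state case. The scenario (economy good or poor, stock up or not) and all of its numbers are hypothetical and are not the decision tree referenced on the card.

```python
def bayes_posterior(prior, p_info_given_event, p_info_given_not_event):
    """P(event | info) = P(info | event) * P(event) / P(info)."""
    # Total probability of observing the new information.
    p_info = p_info_given_event * prior + p_info_given_not_event * (1.0 - prior)
    return p_info_given_event * prior / p_info

# Hypothetical inputs: prior P(good economy) = 0.6,
# P(stock up | good) = 0.7, P(stock up | poor) = 0.2.
posterior = bayes_posterior(0.6, 0.7, 0.2)
print(posterior)  # about 0.84: seeing the stock rise raises P(good economy) from 0.6
```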
        
        | Term 
 | Definition 
 | 
        |  | 
        
        | Term 
 
        | combination formula (binomial) |  | Definition 
 
        | insert image *used when number of labels = 2 TI: to calculate number of different groups of 3 stocks from a list of 8 stocks use 8 [2nd] [nCr] 3 [=] |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | insert image *TI: to calculate number of differently ordered groups of 3 that can be selected from list of 8, use 8 [2nd] [nPr] 3 [=] |  | 
        |  | 
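
The two calculator examples above (groups of 3 stocks from a list of 8, unordered and ordered) can be reproduced with Python's standard library, which provides `math.comb` and `math.perm` in Python 3.8 and later. The helper that spells out the factorial formula is included only for comparison.

```python
import math

def combinations_by_formula(n, r):
    """nCr = n! / ((n - r)! * r!): order does not matter."""
    return math.factorial(n) // (math.factorial(n - r) * math.factorial(r))

print(math.comb(8, 3))                # 56 unordered groups of 3 from 8
print(combinations_by_formula(8, 3))  # same result from the factorial formula
print(math.perm(8, 3))                # 336 ordered groups of 3 from 8
```

Each unordered group of 3 can be arranged in 3! = 6 ways, which is why the permutation count is 6 times the combination count.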
        
        | Term 
 
        | counting problem guidelines |  | Definition 
 | 
        |  | 
        
        | Term 
 
        | continuous random variable |  | Definition 
 
        | one for which the number of possible outcomes is infinite, even if lower and upper bounds exist EX: amount of daily rainfall in inches |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | one for which number of possible outcomes can be counted, and for each possible outcome, there's a positive probability EX: number of days it rains in a month |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | p(x)> 0 if the event can occur |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | even if x can occur p(x) = 0 because it's a single point along a line of infinite outcomes. However, we can assign positive probability to a range, say p(5 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | p(x)=    denotes probability of some event. all possible events must sum to 1 |  | 
        |  | 
        
        | Term 
 
        | probability density function (pdf) |  | Definition 
 
        | f(x)=  probability function for a continuous distribution--probability that x falls within a range of outcomes |  | 
        |  | 
        
        | Term 
 
        | cumulative distribution function (cdf) |  | Definition 
 
        | F(x)=  gives cumulative probability for all possible outcomes up to x EX: F(6) = P(X <= 6)
 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | number of successes in a given number of trials where outcome can be success or failure |  | 
        |  | 
        
        | Term 
 
        | discrete uniform random variable |  | Definition 
 
        | one for which the probabilities for all possible outcomes for a discrete random variable are equal |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | p(x) = [n! / ((n-x)! x!)] * p^x * (1-p)^(n-x)   *used when only two outcomes are possible--pulling x black beans from n draws of a bowl of black & white |  | 
        |  | 
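
A short Python sketch of the binomial probability function, assuming each trial is independent with the same success probability p. The function name and the example values are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) = nCx * p^x * (1 - p)^(n - x)."""
    return math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)

# Hypothetical: probability of exactly 2 successes in 5 trials with p = 0.5.
print(binomial_pmf(2, 5, 0.5))  # 0.3125

# The probabilities over all possible x should sum to 1 (up to rounding).
print(sum(binomial_pmf(x, 10, 0.3) for x in range(11)))
```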
        
        | Term 
 
        | up transition probability & down transition probability for stock price movement |  | Definition 
 
        | If the probability of the stock price rising is .3 in any given period, then the probability of a down transition is (1-.3) = .7.   Similarly, an up factor of 1.01 * current price is paired with a down factor of current price/1.01 |  | 
        |  | 
        
        | Term 
 
        | continuous uniform distribution |  | Definition 
 
        | equal probabilities for all ranges of outcomes between lower limit a and upper limit b. This distribution is shaped like a rectangle. Simplest distribution. |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | distribution of a single random variable |  | 
        |  | 
        
        | Term 
 
        | multivariate distribution |  | Definition 
 
        | relationship between two or more normally distributed variables. 
 for a portfolio of 4 assets, a multivariate distribution can be described by 4 means, 4 standard deviations and .5n(n-1) = 6 correlations
 |  | 
        |  | 
        
        | Term 
 
        | confidence intervals (normal distribution) |  | Definition 
 
        | 68% = mean ± 1s   90% = mean ± 1.65s   95% = mean ± 1.96s   99% = mean ± 2.58s   s = sample standard deviation |  | 
        |  | 
        
        | Term 
 
        | standard normal distribution |  | Definition 
 
        | normal distribution with mean = 0 and stdev = 1   *standardization is the process of converting an observed value for a random variable to its z-value |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | z = (observation - population mean) / standard deviation   *z-score is the number of standard deviations the observation is from the mean |  | 
        |  | 
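
Standardizing an observation and looking up its cumulative probability can be sketched with `statistics.NormalDist` from Python's standard library (Python 3.8+). The observation, mean, and standard deviation below are hypothetical.

```python
from statistics import NormalDist

def z_score(observation, population_mean, stdev):
    """Number of standard deviations the observation lies from the mean."""
    return (observation - population_mean) / stdev

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Hypothetical: observation of 130 from a population with mean 100 and stdev 15.
z = z_score(130, 100, 15)
print(z)                       # 2.0
print(standard_normal.cdf(z))  # probability of a value at or below the observation
```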
        
        | Term 
 
        | Roy's safety first criterion |  | Definition 
 
        | states that the optimal portfolio minimizes the probability that the return falls below some minimum acceptable threshold RL   SFRatio = [E(Rp) - RL] / s   *higher numbers are better--meaning threshold is more standard deviations away from the expected return   *SFR is the number of stdevs the threshold lies below the mean |  | 
        |  | 
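
A minimal Python sketch of the safety-first ratio, with the shortfall probability computed under the added assumption that portfolio returns are normally distributed. The two portfolios, the threshold, and the function names are hypothetical.

```python
from statistics import NormalDist

def safety_first_ratio(expected_return, threshold_return, stdev):
    """SFRatio = [E(Rp) - RL] / s."""
    return (expected_return - threshold_return) / stdev

def shortfall_probability(sf_ratio):
    """P(return < threshold), assuming normally distributed returns."""
    return NormalDist().cdf(-sf_ratio)

threshold = 0.03
# Hypothetical portfolios: (expected return, standard deviation).
portfolios = {"A": (0.12, 0.15), "B": (0.10, 0.10)}

for name, (mu, sd) in portfolios.items():
    sfr = safety_first_ratio(mu, threshold, sd)
    print(name, sfr, shortfall_probability(sfr))
```

Portfolio B has the higher ratio in this made-up case and therefore the lower chance of falling below the threshold, even though its expected return is lower.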
        
        | Term 
 | Definition 
 
        | generated by the function e^x where x is normally distributed   skewed to the right with a lower bound of 0   *useful for modeling assets whose values cannot drop below 0 |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | annual, semi-annual,  monthly, etc |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | e^r - 1, where r = stated annual rate   EX: 10%; e^0.1 - 1 = 0.105171 (10.5171%)   *to find what continuously compounded rate we would use to match a given holding period return, take the natural log of (1 + HPR)   EX: HPR = 12%: ln(1.12) = 11.333% |  | 
        |  | 
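
Both directions of the continuous compounding conversion can be checked in a few lines of Python. The function names are hypothetical; the rates are the 10% and 12% figures used on the card.

```python
import math

def effective_annual_rate(stated_rate):
    """Effective rate under continuous compounding: e^r - 1."""
    return math.exp(stated_rate) - 1.0

def continuously_compounded_rate(holding_period_return):
    """Continuously compounded rate matching a holding period return: ln(1 + HPR)."""
    return math.log(1.0 + holding_period_return)

print(effective_annual_rate(0.10))         # about 0.105171
print(continuously_compounded_rate(0.12))  # about 0.113329

# Round trip: converting back recovers the original holding period return.
print(effective_annual_rate(continuously_compounded_rate(0.12)))
```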
        
        | Term 
 | Definition 
 
        | computer simulation used to value assets by incorporating different risk factors and running 1000s of iterations to develop a mean and stdev   *used to value complex securities *simulate profits/losses of trading strategies *value portfolios of assets with non normal return dist |  | 
        |  | 
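
A toy Python sketch of the idea: draw many random scenarios for one risk factor and summarize the simulated values with a mean and stdev. It assumes, purely for illustration, a single period in which the log return is normally distributed, so the terminal price is lognormal; every parameter and name below is hypothetical.

```python
import math
import random

def simulate_terminal_values(start_price, mu, sigma, trials, seed=0):
    """Terminal prices start * e^x, with x drawn from Normal(mu, sigma)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    return [start_price * math.exp(rng.gauss(mu, sigma)) for _ in range(trials)]

def mean_and_stdev(values):
    """Sample mean and sample standard deviation (n - 1 divisor)."""
    m = sum(values) / len(values)
    variance = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, math.sqrt(variance)

values = simulate_terminal_values(100.0, mu=0.05, sigma=0.20, trials=10_000, seed=42)
average, stdev = mean_and_stdev(values)
print(average, stdev)
print(min(values) > 0)  # lognormal values never fall below zero
```

The output is only as good as the assumed distribution and parameters, which is the limitation noted for Monte Carlo methods generally.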
        
        | Term 
 
        | limitations of Monte Carlo |  | Definition 
 
        | limited to accuracy of assumptions and pricing model used   statistical and not analytic analysis--cannot provide the insights of analytic analysis |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | similar to Monte Carlo, using past changes in risk factors to estimate how changes affect security prices   randomly enacts these historical changes based on a forward-looking model   *limited because past changes may not reflect future events   *no "what if" analysis--Monte Carlo has this |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | each item in the population being studied has an equal likelihood of being included in the sample |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | selecting every nth member from a population |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | difference between sample statistic (mean, stdev) and its corresponding population parameter |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | distribution of sample statistics drawn from many same size samples from a population   EX: distribution sample means |  | 
        |  | 
        
        | Term 
 
        | stratified random sampling |  | Definition 
 
        | separate the population into classified groups--we pull a certain number from each group in the proportion that that group represents the whole |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | observations taken over a period of time at specific and equally spaced time intervals |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | sample of observations taken at a single point in time   EX: EPS of all NASDAQ companies today |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | observations over time of multiple characteristics of the same entity |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | observations over time of the same characteristic for many entities |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | for simple random samples of size n from a population with mean mu and finite variance var (whatever its distribution), the sampling distribution of the sample mean x(bar) approaches a normal prob distribution with mean mu and variance var/n as n becomes large   n >= 30 is generally a large enough sample   *n is the number of observations in each sample, not the number of samples drawn |  | 
        |  | 
        
        | Term 
 
        | standard error of the sample mean |  | Definition 
 
        | standard deviation of the distribution of the sample mean   σx(bar) = σ / n^(1/2) when the population stdev is known; estimated by sx(bar) = s / n^(1/2) when it is not   Remember:   σ = population stdev   s = sample stdev   se = st dev of the distribution of sample means |  | 
        |  | 
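
The standard error and a confidence interval built from it can be sketched in Python. The sample figures and function names are hypothetical, and the 1.96 multiplier assumes a 95% interval with a large sample.

```python
import math

def standard_error(stdev, n):
    """Standard deviation of the sample mean: stdev / sqrt(n)."""
    return stdev / math.sqrt(n)

def confidence_interval(sample_mean, stdev, n, reliability_factor=1.96):
    """Sample mean plus or minus reliability factor * standard error."""
    se = standard_error(stdev, n)
    return sample_mean - reliability_factor * se, sample_mean + reliability_factor * se

# Hypothetical sample: mean 50, sample stdev 20, n = 100 observations.
print(standard_error(20.0, 100))             # 2.0
print(confidence_interval(50.0, 20.0, 100))  # roughly (46.08, 53.92)
```

Quadrupling the sample size only halves the standard error, because n enters through its square root.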
        
        | Term 
 | Definition 
 
        | *appropriate for use with small samples (n < 30) from a normally distributed population   *use when population variance is unknown; z is used when population variance is known
   defined by degrees of freedom = n-1   similar to the normal distribution with fatter tails (more conservative confidence intervals), less peaked   as degrees of freedom increase, approaches the normal distribution |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | use z-statistic if pop variance is known (any n for a normal population, n >= 30 otherwise)   use t-statistic (more conservative) when pop variance is unknown (z is also acceptable when n >= 30)   non-normal population with n < 30: neither statistic applies |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | repeatedly using the same database to search for patterns until one that "works" is discovered |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | statistical significance of the pattern was overestimated because of data mining |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | some data is systematically excluded from the analysis because of lack of availability   sample is not truly random |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | EX: mutual fund databases only include data for those that still exist, and exclude those that cease to exist due to closure or merger |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | test using sample data that was not available on the test date |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | using a period that is too short or too long   too short--may be circumstantial phenomena   too long--underlying economic factors may have changed |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | 1. state hypothesis 2. select appropriate test statistic 3. specify the level of significance 4. state the decision rule regarding the hypothesis 5. collect the sample and calculate the sample stats 6. make a decision regarding the hypothesis 7. make an investment decision based on results of the test |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | H0: the statement that we test but in actuality want to prove wrong |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | rejection of null hypothesis when it is actually true |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | failure to reject the null hypothesis when it is actually false |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | the probability of making a type I error   EX: 5% for 95% confidence intervals |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | probability of correctly rejecting the null hypothesis when it's false   1-P(type II error) |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | not necessarily economic significance--there are other factors to be considered |  | 
        |  | 
        
        | Term 
 | Definition 
 
        | the probability of obtaining a test statistic that would lead to a rejection of the null hypothesis, assuming the null hypothesis is true |  | 
        |  |
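
To tie the hypothesis testing cards together, here is a minimal Python sketch of a two-tailed z-test for a mean. It assumes the population standard deviation is known, and it computes the p-value in the common way, as the probability under the null hypothesis of a test statistic at least as extreme as the one observed. All inputs and names are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_two_tailed(sample_mean, hypothesized_mean, population_stdev, n):
    """Return (z statistic, two-tailed p-value) for H0: mean = hypothesized_mean."""
    standard_error = population_stdev / math.sqrt(n)
    z = (sample_mean - hypothesized_mean) / standard_error
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical test: H0 mean = 10, sample mean 10.4, population stdev 2, n = 100.
z, p_value = z_test_two_tailed(10.4, 10.0, 2.0, 100)
significance_level = 0.05  # accepted probability of a type I error

print(z, p_value)
print("reject H0" if p_value < significance_level else "fail to reject H0")
```

With these numbers the statistic is about 2 and the p-value is a little under 5%, so the null is rejected at the 5% level but would not be at the 1% level.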