Term
frequency distribution |
Definition
tabular presentation of statistical data, using tally marks. Columns include interval, tallies, and frequency (absolute, relative, cumulative absolute, cumulative relative) |
|
|
Term
histogram |
Definition
bar chart--representation of absolute frequency distribution--intervals on x axis |
|
|
Term
frequency polygon |
Definition
midpoint of each interval from the histogram is plotted instead--connect all points |
|
|
Term
measures of central tendency |
|
Definition
identify center, or average, of data sets--mean, median, mode, etc. |
|
|
Term
population mean |
Definition
sum observations/N --number in population |
|
|
Term
sample mean |
Definition
sum sample observations/n --number in sample *used to make inferences about populations* |
|
|
Term
arithmetic mean |
Definition
sum of observations/number of observations *only measure of central tendency for which sum of deviations of observations from the mean equals 0 |
|
|
Term
weighted mean |
Definition
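assign a weight to each observation, with the weights summing to 1: weighted mean = w1*x1 + w2*x2 + ... + wn*xn (if the weights do not sum to 1, divide by the sum of the weights, not by the number of observations) |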
|
|
Term
median |
Definition
midpoint of data set when arranged in ascending order |
|
|
Term
mode |
Definition
value that occurs most frequently in data set |
|
|
Term
unimodal/bimodal/trimodal |
|
Definition
data set may have multiple modes |
|
|
Term
geometric mean |
Definition
1 + Rg = [(1+R1) * (1+R2) * ... * (1+Rn)]^(1/n) *always less than or equal to arithmetic mean *used when calculating investment returns over multiple periods or when measuring compound growth rates |
|
|
Term
harmonic mean |
Definition
N/[(1/X1) + (1/X2) + ... + (1/XN)] *used for computations such as average cost of shares purchased over time--average price per share |
|
|
Term
|
Definition
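harmonic mean <= geometric mean <= arithmetic mean (for positive values; all three are equal only when every observation is the same) |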
|
|
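A minimal sketch of the three means, using Python's standard `statistics` module. The returns and share prices are hypothetical figures chosen only for illustration.

```python
from statistics import mean, geometric_mean, harmonic_mean

# Hypothetical annual returns (illustration only).
returns = [0.10, -0.05, 0.20]

# Geometric mean return: take the n-th root of the compounded
# growth factors (1 + R), then subtract 1.
growth_factors = [1 + r for r in returns]
geometric_return = geometric_mean(growth_factors) - 1
arithmetic_return = mean(returns)

# Harmonic mean: average price per share when the same dollar amount
# is invested at each (hypothetical) price.
prices = [8.0, 9.0, 10.0]
harmonic_price = harmonic_mean(prices)
geometric_price = geometric_mean(prices)
arithmetic_price = mean(prices)

print(f"arithmetic return {arithmetic_return:.4%}, geometric return {geometric_return:.4%}")
print(f"harmonic {harmonic_price:.4f} <= geometric {geometric_price:.4f} <= arithmetic {arithmetic_price:.4f}")
```

Because the prices are not all equal, the ordering is strict in this example.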
Term
quantile |
Definition
value at or below which a stated proportion of the data in a distribution lies (EX: quartile, quintile, decile, percentile) 99th percentile--measured from left side |
|
|
Term
|
Definition
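Ly = (n+1) * y/100, where n = number of observations and y = the percentile sought (y = 75 if you're looking for the 3rd quartile); Ly is the position of the value in the data sorted in ascending order |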
|
|
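A short sketch of the position formula Ly = (n+1) * y/100. The linear interpolation between neighbouring observations when Ly is not a whole number is an assumption here (a common convention), and the data are hypothetical.

```python
def quantile(data, y):
    """Return the y-th percentile using the position L_y = (n + 1) * y / 100."""
    ordered = sorted(data)
    n = len(ordered)
    location = (n + 1) * y / 100   # 1-based position in the sorted data
    lower = int(location)          # whole-number part of the position
    fraction = location - lower    # fractional part, used to interpolate
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    # Interpolate linearly between the two neighbouring observations.
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])


observations = [8, 10, 12, 13, 15, 17, 17, 18, 19, 23, 24]  # hypothetical, n = 11
third_quartile = quantile(observations, 75)  # position (11 + 1) * 0.75 = 9
print(third_quartile)
```

With 11 observations the 3rd quartile sits exactly at the 9th sorted value, so no interpolation is needed in this example.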
Term
dispersion |
Definition
variability around central tendency; risk in financial terms |
|
|
Term
range |
Definition
high observation - low observation |
|
|
Term
mean absolute deviation (MAD) |
Definition
average of abs values of deviations of observations from the mean |
|
|
Term
population variance |
Definition
average of the squared deviations from the mean: sum of [(Xi - mean)^2] / N |
|
|
Term
population standard deviation |
|
Definition
square root of the population variance-- st. dev > MAD |
|
|
Term
sample variance (s^2) |
Definition
sum of squared deviations of observations from the sample mean/(n-1) *n-1 is used to improve the statistical properties of s^2 as an estimator of population variance *for samples, mean is referred to as X with bar above instead of mu. *if we used sum of squared deviations/n, sample variance would be a biased estimator of population variance |
|
|
Term
sample standard deviation (s) |
|
Definition
square root of sample variance (s^2) |
|
|
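A minimal sketch comparing the dispersion measures above with the standard `statistics` module. The data set is hypothetical.

```python
from statistics import mean, pvariance, pstdev, variance, stdev

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations

mu = mean(data)

# Mean absolute deviation: average of |x - mean|.
mad = sum(abs(x - mu) for x in data) / len(data)

# Population measures divide the sum of squared deviations by N.
population_variance = pvariance(data)
population_sd = pstdev(data)

# Sample measures divide by n - 1 instead.
sample_variance = variance(data)
sample_sd = stdev(data)

print(f"MAD {mad}, population variance {population_variance}, population sd {population_sd}")
print(f"sample variance {sample_variance:.4f}, sample sd {sample_sd:.4f}")
```

Dividing by n - 1 makes the sample variance larger than the population-style figure computed from the same numbers.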
Term
Chebyshev's inequality |
Definition
for ANY shape distribution based on sample or population data, gives us the minimum percentage of the observations that lie within k standard deviations from the mean-- 1-1/k^2 for all k > 1 *applies to any distribution, if we know dist. is normal--we can be more precise |
|
|
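The bound 1 - 1/k^2 can be tabulated directly. A small sketch (the function name is just a label for this example):

```python
def chebyshev_lower_bound(k):
    """Minimum share of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


for k in (1.25, 1.5, 2, 3, 4):
    print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of observations")
```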
Term
coefficient of variation (CV) |
|
Definition
*allows us to compare measures of dispersion between two distributions with different means---CV = st. dev of x/average value of x *lower number means less risk per unit of expected return |
|
|
Term
Sharpe ratio |
Definition
measures units of excess return received per unit of risk: (Rp - Rf)/st. dev. of portfolio returns. larger numbers are better--more excess return per unit of risk |
|
|
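A brief sketch of the two ratios. All inputs are hypothetical and the function names are labels for this example only.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Dispersion per unit of mean: CV = st. dev / mean."""
    return std_dev / mean_value


def sharpe_ratio(portfolio_return, risk_free_rate, std_dev):
    """Excess return per unit of risk: (Rp - Rf) / st. dev."""
    return (portfolio_return - risk_free_rate) / std_dev


# Hypothetical assets: A has the lower CV, so less risk per unit of return.
cv_a = coefficient_of_variation(0.05, 0.10)
cv_b = coefficient_of_variation(0.09, 0.12)

# Hypothetical portfolio: 12% return, 3% risk-free rate, 18% st. dev.
sharpe = sharpe_ratio(0.12, 0.03, 0.18)

print(f"CV A = {cv_a:.2f}, CV B = {cv_b:.2f}, Sharpe = {sharpe:.2f}")
```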
Term
skewness |
Definition
non-symmetry in distributions |
|
|
Term
positive/negative skewness |
|
Definition
positive skew: many outliers in the right (upper) tail; negative skew: many outliers in the left (lower) tail |
|
|
Term
skewness & central tendency |
|
Definition
mean is most affected by outliers that cause skewness--it will be furthest to the skewed side *for positive skewness: mean > median > mode (reversed for negative skewness: mean < median < mode) *mean is pulled further up or down by averaging outliers |
|
|
Term
kurtosis |
Definition
leptokurtosis--more steeply peaked than normal distribution with further reaching tails--deviations from mean will likely be very small or very large |
|
|
Term
excess kurtosis |
Definition
any kurtosis different from that of a normal distribution--normal distributions have kurtosis of 3--therefore, kurtosis - 3 gives the excess kurtosis (either positive or negative) *Positive excess kurtosis (leptokurtosis) is an indicator of risk because fatter tails allow for large negative outcomes |
|
|
Term
which mean for investment returns? |
|
Definition
for past annual returns, geometric mean is used to compute the average annual compound return--the return that, if compounded over the same number of periods, would lead to the same increase in wealth--and it is the better estimator of compound returns over multiple future periods---HOWEVER, arithmetic mean is the best estimator of next year's (single-period) return |
|
|
Term
random variable |
Definition
uncertain outcome quantity/number |
|
|
Term
outcome |
Definition
observed value of a random variable |
|
|
Term
event |
Definition
single outcome or set of outcomes |
|
|
Term
mutually exclusive events |
|
Definition
cannot happen at the same time (heads--tails) |
|
|
Term
exhaustive events |
Definition
those that include all possible outcomes--probabilities add to 1 |
|
|
Term
|
Definition
the probability of the event = |
|
|
Term
empirical probability |
Definition
derived examining past data |
|
|
Term
a priori probability |
Definition
determined using formal reasoning and inspection |
|
|
Term
subjective probability |
Definition
involves use of personal judgment EX: probability of my gf being the best looking in the room |
|
|
Term
|
Definition
odds of 1 to 6 for an event = probability of 1/(1+6) = 1/7 that it occurs--1 yes for every 6 nos |
|
|
Term
|
Definition
joint probability of A & B |
|
|
Term
|
Definition
probability of A, given B |
|
|
Term
multiplication rule of probability |
Definition
P(AB) = P(A|B) * P(B)--multiply the conditional probability of A given B by the probability of B |
|
|
Term
addition rule of probability |
Definition
P(A or B) *mutually exclusive events: P(AB) is impossible, so P(A or B) = P(A) + P(B) *non-mutually exclusive events: P(AB) is possible and would be counted twice, so we must subtract it: P(A or B) = P(A) + P(B) - P(AB) *remember venn diagram |
|
|
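A small worked example of the multiplication and addition rules. Every probability below is a hypothetical input.

```python
# Hypothetical inputs.
p_a = 0.30           # P(A)
p_b = 0.40           # P(B)
p_a_given_b = 0.50   # P(A|B)

# Multiplication rule: joint probability of A and B.
p_ab = p_a_given_b * p_b

# Addition rule for events that can occur together: subtract the overlap
# so it is not counted twice.
p_a_or_b = p_a + p_b - p_ab

# Addition rule for mutually exclusive events: the overlap is zero.
p_c, p_d = 0.25, 0.35
p_c_or_d = p_c + p_d

print(f"P(AB) = {p_ab:.2f}, P(A or B) = {p_a_or_b:.2f}, P(C or D) = {p_c_or_d:.2f}")
```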
Term
expected value |
Definition
sum of probabilities (that sum to 1) * respective outcomes *gives us a "best guess" of an outcome of a random variable *a type of weighted mean |
|
|
Term
variance of a random variable |
Definition
equals sum of probability * squared deviation from the mean for each outcome--similar to normal variance but we weight each squared deviation by the probability of it occurring EX: .10(1.80 - 1.28)^2 + .90(1.6 - 1.28)^2 = variance |
|
|
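A minimal sketch of the probability-weighted mean and variance. The outcomes and probabilities are hypothetical.

```python
# Hypothetical (probability, outcome) pairs; the probabilities sum to 1.
scenarios = [(0.25, 1.00), (0.50, 2.00), (0.25, 3.00)]

# Expected value: probability-weighted average of the outcomes.
expected_value = sum(p * x for p, x in scenarios)

# Variance: probability-weighted average of squared deviations from
# the expected value.
variance_x = sum(p * (x - expected_value) ** 2 for p, x in scenarios)

print(f"E(X) = {expected_value}, Var(X) = {variance_x}")
```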
Term
unconditional probability |
|
Definition
regardless of the outcome --probability calculation considering all alternative outcomes and associated probabilities (this is an OR problem) |
|
|
Term
covariance |
Definition
Cov(Ri,Rj) = E{[Ri - E(Ri)][Rj - E(Rj)]}, where R = observed return of the asset and E(R) = expected return for that asset *measures how one random variable moves with another random variable *Cov(Ri,Rj) = Corr(Ri,Rj) * stdev(Ri) * stdev(Rj) |
|
|
Term
correlation coefficient |
Definition
*makes covariance usable by standardizing it: Corr(Ri,Rj) = Cov(Ri,Rj)/[stdev(Ri) * stdev(Rj)], which always lies between -1 and +1 *measures strength of linear relationship between 2 variables--Corr = 0 means no linear relationship |
|
|
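A sketch of covariance and correlation computed from a probability model. The three scenarios and their returns are hypothetical.

```python
from math import sqrt

# Hypothetical scenarios: (probability, return on asset i, return on asset j).
scenarios = [
    (0.3, 0.20, 0.10),
    (0.4, 0.10, 0.02),
    (0.3, 0.00, 0.00),
]

exp_i = sum(p * ri for p, ri, _ in scenarios)
exp_j = sum(p * rj for p, _, rj in scenarios)

# Cov(Ri, Rj) = E{[Ri - E(Ri)][Rj - E(Rj)]}
covariance = sum(p * (ri - exp_i) * (rj - exp_j) for p, ri, rj in scenarios)

# Standard deviations from the probability-weighted variances.
sd_i = sqrt(sum(p * (ri - exp_i) ** 2 for p, ri, _ in scenarios))
sd_j = sqrt(sum(p * (rj - exp_j) ** 2 for p, _, rj in scenarios))

# Correlation rescales covariance to the range -1 to +1.
correlation = covariance / (sd_i * sd_j)

print(f"Cov = {covariance:.6f}, Corr = {correlation:.4f}")
```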
Term
|
Definition
|
|
Term
|
Definition
|
|
Term
|
Definition
|
|
Term
Bayes' formula |
Definition
allows us to update a set of prior probabilities in response to new information gained: P(A|B) = P(B|A) * P(A) / P(B) (see decision tree pg. 217) |
|
|
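A minimal sketch of a Bayesian update for a two-state case, using P(A|B) = P(B|A) * P(A) / P(B). The prior and the two likelihoods are hypothetical, and the function name is a label for this example.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # Total probability of observing B across both states.
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    return likelihood_if_true * prior / evidence


# Hypothetical: 60% prior that the economy expands; a stock rises with
# probability 80% in an expansion and 30% otherwise. The stock rose.
posterior = bayes_update(prior=0.60, likelihood_if_true=0.80, likelihood_if_false=0.30)
print(f"updated probability of expansion: {posterior:.2%}")
```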
Term
|
Definition
|
|
Term
combination formula (binomial) |
|
Definition
nCr = n!/[(n-r)! * r!] *used when number of labels = 2 TI: to calculate number of different groups of 3 stocks from a list of 8 stocks use 8 [2nd] [nCr] 3 [=] |
|
|
Term
permutation formula |
Definition
nPr = n!/(n-r)! *TI: to calculate number of differently ordered groups of 3 that can be selected from list of 8, use 8 [2nd] [nPr] 3 [=] |
|
|
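The two calculator examples can be checked with `math.comb` and `math.perm` (available from Python 3.8):

```python
from math import comb, perm

# Groups of 3 stocks from a list of 8, order ignored: n! / ((n - r)! * r!)
groups = comb(8, 3)

# Ordered selections of 3 from 8: n! / (n - r)!
ordered_groups = perm(8, 3)

print(groups, ordered_groups)
```

Each unordered group of 3 can be arranged in 3! = 6 ways, which is why the permutation count is 6 times the combination count.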
Term
counting problem guidelines |
|
Definition
|
|
Term
continuous random variable |
|
Definition
one for which the number of possible outcomes is infinite, even if lower and upper bounds exist EX: amount of daily rainfall in inches |
|
|
Term
discrete random variable |
Definition
one for which number of possible outcomes can be counted, and for each possible outcome, there's a positive probability EX: number of days it rains in a month |
|
|
Term
|
Definition
p(x)> 0 if the event can occur |
|
|
Term
|
Definition
even if x can occur p(x) = 0 because it's a single point along a line of infinite outcomes. However, we can assign positive probability to a range, say p(5 |
|
|
Term
probability function |
Definition
p(x)= denotes probability of some event. all possible events must sum to 1 |
|
|
Term
probability density function (pdf) |
|
Definition
f(x)= probability function for a continuous distribution--used to find the probability that x falls within a range of outcomes |
|
|
Term
cumulative distribution function (cdf) |
|
Definition
F(x)= gives cumulative probability for all possible outcomes up to x EX: F(6)= P(X< or equal to 6) |
|
|
Term
binomial random variable |
Definition
number of successes in a given number of trials where outcome can be success or failure |
|
|
Term
discrete uniform random variable |
|
Definition
one for which the probabilities for all possible outcomes for a discrete random variable are equal |
|
|
Term
binomial probability function |
Definition
p(x) = [n! / ((n-x)! * x!)] * p^x * (1-p)^(n-x)
*used when only two outcomes are possible--pulling x black beans from n draws of a bowl of black & white |
|
|
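A short sketch of the binomial probability function. The number of trials and the success probability are hypothetical.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical: 5 draws, each with a 50% chance of a black bean.
n_trials, p_success = 5, 0.5
p_two_black = binomial_pmf(2, n_trials, p_success)
total = sum(binomial_pmf(x, n_trials, p_success) for x in range(n_trials + 1))

print(f"P(X = 2) = {p_two_black}, sum over all x = {total}")
```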
Term
up transition probability &
down transition probability
for stock price movement |
|
Definition
If the probability of stock price rising is .3 any given period, then the probability of a down transition is (1-.3).
Similarly, an up factor of 1.01 * current price will be offset by a down factor of current price/1.01 |
|
|
Term
continuous uniform distribution |
|
Definition
equal probabilities for all ranges of outcomes between lower limit a and upper limit b. This distribution is shaped like a rectangle. Simplest distribution. |
|
|
Term
univariate distribution |
Definition
distribution of a single random variable |
|
|
Term
multivariate distribution |
|
Definition
relationship between two or more normally distributed variables.
for a portfolio of 4 assets, a multivariate distribution can be described by 4 means, 4 standard deviations and .5n(n-1) = 6 correlations |
|
|
Term
confidence intervals (normal distribution) |
|
Definition
68% = mean +/- 1s
90% = mean +/- 1.65s
95% = mean +/- 1.96s
99% = mean +/- 2.58s
s= sample standard deviation
|
|
|
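The multipliers above are rounded critical values of the standard normal distribution. A quick check with `statistics.NormalDist` (the helper name is a label for this example):

```python
from statistics import NormalDist


def critical_z(confidence):
    """Two-sided z multiplier that leaves (1 - confidence) / 2 in each tail."""
    tail = (1 - confidence) / 2
    return NormalDist().inv_cdf(1 - tail)


for level in (0.68, 0.90, 0.95, 0.99):
    print(f"{level:.0%}: mean +/- {critical_z(level):.3f} s")
```

The 68% figure is an approximation: one standard deviation either side of the mean covers slightly more than 68% of a normal distribution.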
Term
standard normal distribution |
|
Definition
normal distribution with mean = 0 and stdev = 1
*standardization is the process of converting an observed value for a random variable to its z-value |
|
|
Term
z-score (z-value) |
Definition
z = (observation - population mean) / standard deviation
*z-score is the number of standard deviations observation is from the mean |
|
|
Term
Roy's safety first criterion |
|
Definition
states that the optimal portfolio minimizes the probability that the return falls below some minimum acceptable threshold RL
SFRatio = [E(Rp) - RL] / s
*higher numbers are better--meaning threshold is more standard deviations away from the expected return
*SFR is number of stdevs below the mean |
|
|
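A minimal sketch of ranking portfolios by the safety-first ratio. The portfolios and threshold are hypothetical, and the shortfall probability assumes normally distributed returns.

```python
from statistics import NormalDist


def safety_first_ratio(expected_return, threshold_return, std_dev):
    """Number of standard deviations the threshold lies below the expected return."""
    return (expected_return - threshold_return) / std_dev


# Hypothetical portfolios: name -> (expected return, standard deviation).
portfolios = {"A": (0.12, 0.20), "B": (0.08, 0.10)}
threshold = 0.03  # hypothetical minimum acceptable return

ratios = {
    name: safety_first_ratio(exp_ret, threshold, sd)
    for name, (exp_ret, sd) in portfolios.items()
}
best = max(ratios, key=ratios.get)

for name, ratio in ratios.items():
    # Under the normality assumption, P(return < threshold) = N(-SFRatio).
    shortfall = NormalDist().cdf(-ratio)
    print(f"{name}: SFRatio = {ratio:.2f}, shortfall probability = {shortfall:.2%}")
print("preferred:", best)
```

The higher ratio corresponds to the lower shortfall probability, so the ranking by either measure is the same.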
Term
lognormal distribution |
Definition
generated by the function e^x where x is normally distributed
skewed to the right with a lower bound of 0
*useful for modeling assets whose values cannot drop below 0 |
|
|
Term
discrete compounding |
Definition
annual, semi-annual, monthly, etc |
|
|
Term
continuous compounding |
Definition
e^r - 1 = effective annual rate under continuous compounding
r= stated annual rate
EX: 10%; e^0.1 - 1 = 0.105171 = 10.5171%
*to find what continuously compounded rate we would use to match a given holding period return---use the ln key: rate = ln(1 + HPR)
EX: HPR = 12.5%: ln(1.125) = 11.778%
|
|
|
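Both directions of the conversion in a few lines. The 12.5% holding period return is an assumed input, chosen because ln(1.125) gives the 11.778% figure.

```python
from math import exp, log

# Effective annual rate from a 10% stated rate, compounded continuously.
stated_rate = 0.10
effective_rate = exp(stated_rate) - 1

# Continuously compounded rate that matches a 12.5% holding period return.
holding_period_return = 0.125
continuous_rate = log(1 + holding_period_return)

print(f"effective rate {effective_rate:.6f}, continuous rate {continuous_rate:.5f}")
```

The two operations are inverses: e raised to the continuous rate, minus 1, recovers the holding period return.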
Term
Monte Carlo simulation |
Definition
computer program used to value assets by incorporating different risk factors and running 1000s of iterations to develop a mean and stdev
*used to value complex securities
*simulate profits/losses of trading strategies
*value portfolios of assets with non normal return dist |
|
|
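A toy sketch of the idea, not a pricing model. It assumes a single risk factor (a normally distributed monthly return) with hypothetical parameters, and a fixed seed so the run is repeatable.

```python
import random
from statistics import mean, stdev


def simulate_terminal_values(start_value, mu, sigma, periods, trials, seed=0):
    """Simulate terminal asset values from independent normal period returns."""
    rng = random.Random(seed)  # fixed seed: the same draws on every run
    results = []
    for _ in range(trials):
        value = start_value
        for _ in range(periods):
            # Assumed return model: each period's return ~ Normal(mu, sigma).
            value *= 1 + rng.gauss(mu, sigma)
        results.append(value)
    return results


# Hypothetical inputs: 1% mean monthly return, 5% monthly st. dev., 12 months.
values = simulate_terminal_values(100.0, 0.01, 0.05, periods=12, trials=5000)
print(f"mean {mean(values):.2f}, stdev {stdev(values):.2f}")
```

The output is only as good as the assumed return model, which is the first limitation usually cited for the technique.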
Term
limitations of Monte Carlo |
|
Definition
limited to accuracy of assumptions and pricing model used
statistical and not analytic analysis--cannot provide the insights of analytic analysis |
|
|
Term
historical simulation |
Definition
similar to Monte Carlo, using past changes in risk factors to estimate how changes affect security prices
randomly enacts these historical changes based on a forward-looking model
*limited because past changes may not reflect future events
*no "what if" analysis--Monte Carlo has this |
|
|
Term
simple random sampling |
Definition
each item in the population being studied has an equal likelihood of being included in the sample |
|
|
Term
systematic sampling |
Definition
selecting every nth member from a population |
|
|
Term
sampling error |
Definition
difference between sample statistic (mean, stdev) and its corresponding population parameter |
|
|
Term
sampling distribution |
Definition
distribution of sample statistics drawn from many same size samples from a population
EX: distribution of sample means |
|
|
Term
stratified random sampling |
|
Definition
separate the population into classified groups--we pull a certain number from each group in the proportion that that group represents the whole |
|
|
Term
time-series data |
Definition
observations taken over a period of time at specific and equally spaced time intervals |
|
|
Term
cross-sectional data |
Definition
sample of observations taken at a single point in time
EX: EPS of all NASDAQ companies today |
|
|
Term
longitudinal data |
Definition
observations over time of multiple characteristics of the same entity |
|
|
Term
panel data |
Definition
observations over time of the same characteristic for many entities |
|
|
Term
central limit theorem |
Definition
for simple random samples of size n from a population with mean mu and finite variance var, the sampling distribution of the sample mean x(bar) approaches a normal prob distribution with mean mu and variance var/n as n becomes large, whatever the shape of the population distribution
n >= 30 is usually a large enough sample
*n is the number of observations in each sample, not the number of samples drawn |
|
|
Term
standard error of the sample mean |
|
Definition
standard deviation of the distribution of the sample mean
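σx(bar) = σ / n^(1/2) when the population stdev is known; estimated by s / n^(1/2) when it is not
Remember:
σ = population stdev
s = sample stdev
se = st dev of the distribution of sample means (each mean taken from a sample of size n) |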
|
|
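A small simulation sketch of the standard error. It assumes a skewed population (exponential with mean 1 and st. dev. 1), a sample size of 36, and a fixed seed. All of these are illustrative choices.

```python
import random
from math import sqrt
from statistics import mean, stdev


def standard_error(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return stdev(sample) / sqrt(len(sample))


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 36                   # observations in each sample

# Draw many samples of size n and record each sample's mean.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(2000)
]

# The spread of the sample means should be close to sigma / sqrt(n) = 1/6.
print(f"st dev of sample means: {stdev(sample_means):.4f}")
print(f"sigma / sqrt(n):        {1 / sqrt(n):.4f}")

# Standard error estimated from one hypothetical sample.
one_sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(f"s / sqrt(n) for one sample: {standard_error(one_sample):.4f}")
```

The population here is far from normal, yet the sample means cluster around the population mean with spread near sigma / sqrt(n).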
Term
Student's t-distribution |
Definition
*appropriate for using with sample size N<30
*use when population variance is unknown
Z is used when population variance is known
defined by Degrees of Freedom = n-1
similar to the normal distribution with fatter tails (more conservative confidence intervals), less peaked
as degrees of freedom increase, approaches the normal distribution |
|
|
Term
|
Definition
use z-statistic if pop variance is known for n>30
use t-statistic (more conservative) when pop variance is unknown for n>30
non-normal with n<30 does not work! |
|
|
Term
data mining |
Definition
repeatedly using the same database to search for patterns until one that "works" is discovered |
|
|
Term
data-mining bias |
Definition
statistical significance of the pattern was overestimated because of data mining |
|
|
Term
sample selection bias |
Definition
some data is systematically excluded from the analysis because of lack of availability
sample is not truly random |
|
|
Term
survivorship bias |
Definition
EX: mutual fund databases only include data for those that still exist, and exclude those that cease to exist due to closure or merger |
|
|
Term
look-ahead bias |
Definition
test using sample data that was not available on the test date |
|
|
Term
time-period bias |
Definition
using a period that is too short or too long
too short--may be circumstantial phenomena
too long--underlying economic factors may have changed |
|
|
Term
hypothesis testing procedure |
Definition
1. state hypothesis
2. select appropriate test statistic
3. specify the level of significance
4. state the decision rule regarding the hypothesis
5. collect the sample and calculate the sample stats
6. make a decision regarding the hypothesis
7. make a decision based on results of the test |
|
|
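The steps above can be sketched for one simple case: a two-tailed z-test of a mean with known population standard deviation. The sample figures are hypothetical and the function name is a label for this example.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_test(sample_mean, hypothesized_mean, population_sd, n, alpha=0.05):
    """Return (z statistic, critical value, p-value, reject H0?)."""
    standard_normal = NormalDist()
    # Test statistic: distance from the hypothesized mean in standard errors.
    z_stat = (sample_mean - hypothesized_mean) / (population_sd / sqrt(n))
    # Decision rule: reject H0 if |z| exceeds the critical value for alpha.
    critical_value = standard_normal.inv_cdf(1 - alpha / 2)
    # p-value: probability of a statistic at least this extreme if H0 is true.
    p_value = 2 * (1 - standard_normal.cdf(abs(z_stat)))
    return z_stat, critical_value, p_value, abs(z_stat) > critical_value


# Hypothetical: H0 says the mean return is 0; a sample of 100 returns has
# mean 1.5%, and the population st. dev. is taken to be 5%.
z, crit, p, reject = two_tailed_z_test(0.015, 0.0, 0.05, 100)
print(f"z = {z:.2f}, critical = {crit:.2f}, p-value = {p:.4f}, reject H0: {reject}")
```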
Term
null hypothesis |
Definition
H0: the statement that we test but in actuality want to prove wrong
|
|
|
Term
type I error |
Definition
rejection of null hypothesis when it is actually true |
|
|
Term
type II error |
Definition
failure to reject the null hypothesis when it is actually false |
|
|
Term
significance level |
Definition
the probability of making a type I error
EX: 5% for 95% confidence intervals |
|
|
Term
power of a test |
Definition
probability of correctly rejecting the null hypothesis when it's false
1-P(type II error) |
|
|
Term
statistical significance |
Definition
not necessarily economic significance--there are other factors to be considered |
|
|
Term
p-value |
Definition
the probability of obtaining a test statistic that would lead to a rejection of the null hypothesis, assuming the null hypothesis is true |
|
|