Shared Flashcard Set

Details

gen bus 303-stats Mullins
stats exam 1
50
Business
Undergraduate 2
09/28/2011

Cards

Term
population definition & what is the name of the numerical measure that describes a characteristic of a population?
Definition
collection of all members of a group
parameter
Term
sample definition and what is the numerical measure that describes a characteristic of a sample?
Definition
a portion of the population selected for analysis
statistic
Term
inferential statistics
Definition
drawing conclusions about a population based only on sample data.
Term
descriptive statistics
Definition
collecting, summarizing, and presenting data.
Term
discrete vs continuous
1. # people in the room
2. time of commute
3. height
4. td's scored by pack
5. weight
Definition
both are types of numerical (quantitative) data
1. discrete
2. continuous
3. continuous
4. discrete
5. continuous
Term
categorical (qualitative) vs numerical (quantitative)
1. marital status
2. defects per hour
3. voltage
4. eye color
Definition
1. categorical
2. numerical & discrete
3. numerical & continuous
4. categorical
Term
nominal vs ordinal vs interval vs ratio data
1. 1st, 2nd places in a race
2. temperature f/C
3. money
4. height
5. age
6. type of car owned
7. student's letter grades
8. service quality rating
9. standardized exam score
Definition
qualitative (nominal, ordinal) vs. quantitative (interval, ratio)
nominal: categories (no ordering or direction)
ordinal: ordered categories (rankings, ratings, order or scaling)
interval: differences between measurements but no true zero
ratio: differences between measurements, true zero exists
1. ordinal
2. interval
3. ratio (you can have absolutely no money)
4. ratio
5. ratio
6. nominal
7. ordinal
8. ordinal
9. interval
Term
how are nominal/ordinal/interval/ratio graphed?
qualitative aka categorical (nominal/ordinal) vs quantitative aka numerical (interval/ratio)
Definition
categorical: summary table (tabulating data); bar chart, pie chart, Pareto chart (graphing data)
numerical: ordered array and stem-and-leaf display; histogram, polygon, and ogive (all built from frequency distributions and cumulative distributions)
Term
I measure 2 students and use their resulting scores to make a statement comparing them. Identify the scale of measurement used:
1. I can only say that the two students are different
2. I can say that one student scored 6 points higher than the other
3. I can say that one student scored higher than the other, but I can't specify how much higher.
4. I can say that the score for one student is 2x the score of the other.
Definition
1. nominal
2. interval
3. ordinal
4. ratio
Term
which is an example of qualitative data?
1. social security number
2. score on multiple choice exam
3. height, in meters
4. number of square feet of carpet laid
Definition
1. social security number (it is a label that identifies a person, not a measured quantity)
Term
which of the following is an example of quantitative data?
1. number on a baseball uniform
2. serial number on a one dollar bill
3. number of dependents you claim on your income tax form
Definition
3. number of dependents you claim on your income tax form
Term
which one is not an example of descriptive statistics?
1. histogram
2. estimate of number of alaska residents who have visited canada
3. table summarizing data collected in a sample
4. proportion of mailed out surveys completed and returned
Definition
2. estimate of the number of alaska residents who have visited canada
inferential statistics: drawing conclusions about a population based on sample results
Term
ordered array
is it useful for large or small sets of data?
Does it help identify outliers?
Definition
a sequence of data ranked in order.
shows range
provides some signals about variability
may help identify outliers
if data array is large, the ordered array is less useful
Term
stem and leaf diagram
Definition
a simple way to see distribution details in a data set
Term
frequency distribution
Definition
a tabulation of the number of occurrences of each score value or measurement
why use it: it summarizes numerical data, it condenses the raw data into a more useful form, and it allows for a quick visual interpretation of the data
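A minimal sketch of tabulating a frequency distribution in Python. The scores are made-up illustration data, and the names `frequency` and `relative_frequency` are chosen here, not taken from the course.

```python
from collections import Counter

# Made-up exam scores, for illustration only.
scores = [70, 80, 80, 90, 70, 80]

# Frequency: number of occurrences of each score value.
frequency = Counter(scores)

# Relative frequency: each count divided by the total number of observations.
relative_frequency = {
    value: count / len(scores) for value, count in frequency.items()
}

# Print the table: value, frequency, relative frequency.
for value in sorted(frequency):
    print(value, frequency[value], f"{relative_frequency[value]:.2f}")
```

The six raw scores condense to three rows, which is the "more useful form" the card describes.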
Term
the histogram
Definition
graph of the data in a frequency distribution is called a histogram
the class boundaries are shown on the horizontal axis, the vertical axis is either the frequency, relative frequency or percentage, bars of the appropriate heights are used to represent the number of observations within each class
width of bars represents width of class interval
Term
scatter diagrams
Definition
used to examine possible relationships between two numerical variables
Term
time series plot
Definition
used to study patterns in the values of a variable over time; time is usually measured on the horizontal axis
Term
measures of central tendency: arithmetic mean
Definition
most common measure of central tendency
advantage: uses the actual numerical values of all the data
disadvantage: affected by extreme values (outliers)
Term
point estimate
Definition
a one-number estimate of the value of a population parameter (for example, a sample mean used to estimate the population mean)
Term
median
Definition
advantage: less sensitive to extreme values, can be used for ordinal data
disadvantage: based on less information than the mean
median position = (n+1)/2 in the ordered data; this is only the position of the median in the ranked data, not the value of the median
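A short sketch of the position rule in Python. The commute times are made-up data. Averaging the two middle values when the position ends in .5 is the usual convention and is assumed here.

```python
def median_position(n):
    """1-based position of the median in the ranked data: (n + 1) / 2."""
    return (n + 1) / 2


def median(values):
    ranked = sorted(values)
    pos = median_position(len(ranked))
    lower = ranked[int(pos) - 1]      # value at the whole-number part of pos
    if pos == int(pos):               # odd n: the position is a whole number
        return lower
    upper = ranked[int(pos)]          # even n: average the two middle values
    return (lower + upper) / 2


# Made-up commute times in minutes, for illustration only.
commute_minutes = [52, 24, 31, 44, 39, 29, 35]

print(median_position(len(commute_minutes)))  # position, not the median itself
print(median(commute_minutes))                # value found at that position
```

With 7 values the position is 4, and the 4th ranked value (35) is the median.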
Term
mode
Definition
value that occurs most often
adv: not affected by extreme values, can be used for either numerical or categorical data
disadvantage: ignores much information in the data
there may be no mode
there may be several modes
Term
which is best measure of location of "center"
1. if outliers exist
2. when using categorical data
3. if outliers don't exist
Definition
1. median
2. mode
3. mean
Term
box & whisker plot
how to find position of 1st, 2nd and 3rd quartiles in ranked data
Definition
Q1 position = (n+1)/4
Q2 position = (n+1)/2
Q3 position = 3(n+1)/4
advantage: can be used when the data contain extreme values
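A sketch of the quartile position formulas in Python. The data are made up. When a position is not a whole number, this sketch interpolates linearly between the two neighboring ranked values; that is an assumption, and a textbook may round the position instead.

```python
def quartile_positions(n):
    """1-based positions of Q1, Q2, Q3 in the ranked data."""
    return (n + 1) / 4, (n + 1) / 2, 3 * (n + 1) / 4


def value_at(ranked, pos):
    """Value at a 1-based position, interpolating linearly (assumed rule)."""
    lo = int(pos)
    frac = pos - lo
    if frac == 0:
        return ranked[lo - 1]
    return ranked[lo - 1] + frac * (ranked[lo] - ranked[lo - 1])


def quartiles(values):
    if len(values) < 3:
        raise ValueError("this sketch needs at least 3 values")
    ranked = sorted(values)
    return tuple(value_at(ranked, p) for p in quartile_positions(len(ranked)))


# Made-up data, already ranked, for illustration only.
data = [11, 12, 13, 16, 16, 17, 18, 21, 22]

q1, q2, q3 = quartiles(data)
print(quartile_positions(len(data)))  # positions in the ranked data
print(q1, q2, q3)                     # values at those positions
```

For n = 9 the positions are 2.5, 5 and 7.5, so Q1 falls halfway between the 2nd and 3rd ranked values.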
Term
geometric mean & geometric rate of return
Definition
geometric mean: used to measure the rate of change of a variable over time
geometric mean rate of return (ROR): measures the status of an investment over time
geometric mean = (X1 x X2 x ... x Xn)^(1/n)
ROR = [(1+R1) x (1+R2) x ... x (1+Rn)]^(1/n) - 1, where Ri is the rate of return in period i
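A minimal sketch of both formulas in Python. The two yearly returns are made-up numbers chosen to show how the geometric rate of return can differ from the arithmetic average.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


def geometric_rate_of_return(returns):
    """[(1+R1) x (1+R2) x ... x (1+Rn)]^(1/n) - 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1


# Made-up returns: +50% in year 1, then -50% in year 2.
yearly_returns = [0.50, -0.50]

arithmetic_return = sum(yearly_returns) / len(yearly_returns)
geometric_return = geometric_rate_of_return(yearly_returns)

print(f"arithmetic: {arithmetic_return:.3f}")
print(f"geometric:  {geometric_return:.3f}")
```

The arithmetic average is 0, yet $100 invested ends at $75 (100 x 1.5 x 0.5). The geometric rate of return is negative, which matches what actually happened to the investment.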
Term
geometric vs arithmetic returns which is better?
Definition
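geometric: it accounts for compounding, so it reflects the actual growth of an investment over several periods; the arithmetic mean overstates that growth when returns vary from period to period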
Term
measure of variation: Range
disadvantages?
Definition
the simplest measure of variation
difference between the largest and the smallest values in a set of data
disadvantages:
ignores the way in which data are distributed
Term
measures of variation: interquartile range
Definition
some outlier problems can be eliminated by using the interquartile range: the highest and lowest valued observations are dropped and the range is calculated from the remaining values (the middle 50%)
interquartile range = Q3 - Q1
Term
the variance
Definition
average of squared deviations of values from the mean.
for population: σ^2 = Σ (Xi - μ)^2 / N
for sample: s^2 = Σ (xi - x̄)^2 / (n - 1)
Term
standard deviation
Definition
is the square root of the variance
most commonly used measure of variation
shows variation about the mean
has the same units as the original data
population: σ = sqrt[ Σ (Xi - μ)^2 / N ]
sample: s = sqrt[ Σ (xi - x̄)^2 / (n - 1) ]
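A sketch of the variance and standard deviation formulas in Python, written out by hand so each step is visible. The data are made up. Python's standard `statistics` module offers `pvariance`, `variance`, `pstdev` and `stdev` for the same calculations.

```python
import math


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    x_bar = sum(values) / len(values)
    return sum((x - x_bar) ** 2 for x in values) / (len(values) - 1)


# Made-up data, for illustration only. The mean is 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = population_variance(data)
samp_var = sample_variance(data)

# The standard deviation is the square root of the variance.
pop_sd = math.sqrt(pop_var)
samp_sd = math.sqrt(samp_var)

print(pop_var, pop_sd)
print(round(samp_var, 3), round(samp_sd, 3))
```

The sample version divides by n - 1 instead of N, so it comes out slightly larger than the population version for the same numbers.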
Term
measures of variation: summary characteristics
Definition
the more the data are spread out, the greater the range, variance, and standard deviation.
if the values are all the same (no variation) all these measures will be zero
none of these measures are ever negative
Term
advantages of variance and standard deviation
Definition
each value in the data set is used in the calculation
values far from the mean are given extra weight (because the deviations from the mean are squared)
Term
coefficient of variation
Definition
measures variation relative to mean
always in %
can be used to compare two or more sets of data measured in different units
shows risk in stocks
CV = (standard deviation / mean) x 100%
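A small sketch of the coefficient of variation in Python, using the sample standard deviation. The two price series are made up; they have the same standard deviation but different means.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean, as a percentage."""
    return stdev(values) / mean(values) * 100


# Made-up stock prices in dollars, for illustration only.
stock_a = [45, 50, 55]    # mean 50, sample standard deviation 5
stock_b = [95, 100, 105]  # mean 100, sample standard deviation 5

cv_a = coefficient_of_variation(stock_a)
cv_b = coefficient_of_variation(stock_b)

print(f"stock A: {cv_a:.1f}%")
print(f"stock B: {cv_b:.1f}%")
```

Both stocks vary by the same dollar amount, but stock A varies more relative to its price, so its CV is higher.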
Term
z scores
Definition
we use the standard deviation to standardize scores.
a z score is a measure of distance from the mean in terms of standard deviation units
it is the difference between a value and the mean, divided by the standard deviation: Z = (X - mean) / standard deviation
a z score above 3.0 or below -3.0 is considered an outlier
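A minimal sketch of the z score calculation in Python. The mean of 5 and standard deviation of 2 are made-up values, and the function names are chosen here for illustration.

```python
def z_score(value, mean, standard_deviation):
    """Distance of a value from the mean, in standard deviation units."""
    return (value - mean) / standard_deviation


def is_outlier(z):
    """Flag z scores above 3.0 or below -3.0."""
    return abs(z) > 3.0


# Made-up summary values, for illustration only.
mean, sd = 5, 2

for x in (9, 12, -2):
    z = z_score(x, mean, sd)
    print(x, z, is_outlier(z))
```

A value of 9 sits 2 standard deviations above the mean, so it is not flagged. A value of 12 sits 3.5 standard deviations above the mean, so it is.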
Term
left skewed
median > mean or median < mean?
Definition
median>mean
Term
the empirical rule
Definition
if the data distribution is approximately bell-shaped, then the interval
mean ± 1 standard deviation contains about 68% of the values in the population or the sample
mean ± 2 standard deviations contains about 95%
mean ± 3 standard deviations contains about 99.7%
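The three percentages can be checked against an exact normal (bell-shaped) distribution. This is a sketch using `NormalDist` from Python's standard `statistics` module; real data that are only approximately bell-shaped will match these figures only roughly.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, f"{share_within(k):.1%}")
```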
Term
chebyshev rule
Definition
regardless of how the data are distributed, at least (1 - 1/k^2) x 100% of the values will fall within k standard deviations of the mean (for k > 1)
at least 56% data within 1.5 S of mean
at least 75% data within 2 S of mean
at least 89% data within 3 S of mean
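A short sketch that reproduces the three bounds above from the formula. The function name is chosen here for illustration.

```python
def chebyshev_lower_bound(k):
    """Minimum share of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the rule is stated for k > 1")
    return 1 - 1 / k**2


for k in (1.5, 2, 3):
    print(k, f"at least {chebyshev_lower_bound(k):.0%}")
```

These are minimums that hold for any distribution, which is why they are lower than the empirical rule's figures for bell-shaped data.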
Term
in general, which of the following descriptive summary measures cannot be easily approximated from a box and whisker plot?
a. variance
b. the range
c. the interquartile range
d. the median
Definition
A. the variance
Term
sample covariance
Definition
measures the strength of the linear relationship between two variables (called bivariate data). it is a non-standardized measure of the joint variance of the two variables
only concerned with the strength of the relationship
no causal effect is implied
cov(x,y) = Σ (x - x̄)(y - ȳ) / (n - 1)
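A sketch of the sample covariance formula in Python. The paired data are made up. The second call rescales x (as if changing its unit of measurement) to show how the covariance changes with the units.

```python
def sample_covariance(xs, ys):
    """Sum of (x - x mean)(y - y mean), divided by n - 1."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)


# Made-up paired observations, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

cov_original = sample_covariance(xs, ys)

# Same data with x expressed in a unit 100 times smaller.
xs_rescaled = [x * 100 for x in xs]
cov_rescaled = sample_covariance(xs_rescaled, ys)

print(cov_original)
print(cov_rescaled)
```

The relationship between the variables is identical in both cases, but the covariance is 100 times larger after rescaling.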
Term
cov(x,y) > 0: x and y move in ___ direction
cov(x,y) < 0: x and y move in ___ direction
cov(x,y) = 0: x and y are ____
Definition
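same
opposite
independent (more precisely, there is no linear relationship; zero covariance alone does not guarantee independence)
covariance depends on the units of measurement of x and y, so it cannot be used to compare the relative strength of the relationship between variables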
Term
coefficient of correlation
Definition
measures the relative strength of the linear relationship between two variables
sample coefficient of correlation: r = cov(x,y) / (Sx Sy)
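A sketch of the sample coefficient of correlation in Python, built from the covariance and the two sample standard deviations. The paired data are made up. Rescaling x leaves r unchanged, which is what makes r comparable across data sets.

```python
import math
from statistics import stdev


def sample_covariance(xs, ys):
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """r = cov(x, y) / (Sx * Sy), using sample standard deviations."""
    return sample_covariance(xs, ys) / (stdev(xs) * stdev(ys))


# Made-up paired observations, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = correlation(xs, ys)
r_rescaled = correlation([x * 100 for x in xs], ys)  # change the units of x

print(round(r, 4))
print(round(r_rescaled, 4))
```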
Term
features of correlation coefficient
Definition
population = ρ (rho), sample = r
unit-free, standardized measure
ranges between -1 and 1
the closer to -1 the stronger the negative linear relationship
the closer to 1 the stronger the positive linear relationship
the closer to 0 the weaker the linear relationship
Term
a correlation of -.32 is stronger than .30
Definition
TRUE: strength depends on the absolute value, and |-.32| > |.30|
Term
True or false: Descriptive statistics are used to draw conclusions about a population based on sample data
Definition
false; inferential statistics are used to draw conclusions about a population based on sample data
Term
Which of the following is false? A Pareto diagram:
1. is a bar chart where categories are shown in descending order of frequency
2. is used to portray numerical data on an interval scale
3. is often shown with a cumulative polygon
4. is used to separate the "vital few" from the "trivial many"
Definition
2. it is false that a Pareto diagram is used to portray numerical data on an interval scale
Pareto diagrams are used to portray categorical data
Term
You would like to represent the distribution of students in a class based on class. Which is the best for presenting the data?
1. pie chart
2. stem and leaf
3. scatter plot
4. time series plot
Definition
pie chart
Term
t/f
unlike a grouped frequency distribution, a stem and leaf plot usually preserves original data values
Definition
true
Term
t/f
scatter diagrams are used to examine possible relationships between numerical and categorical data
Definition
false; scatter diagrams are used to examine possible relationships between two numerical variables
Term
a priori classical probability vs empirical classical probability vs subjective probability
Definition
a priori = each outcome is equally likely, so p(y) = p(x) for any two outcomes x and y
empirical = based on observed data, like a relative frequency
subjective = an individual judgment or opinion about the probability of occurrence
Term
the probability of at least one head in two flips is:
1. .33
2. .5
3. .75
4. 1
Definition
.75
P(at least one head) = 1 - P(no heads)
P(no heads) = .5 x .5 = .25, so 1 - .25 = .75
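The answer can be checked by listing every outcome of two flips. This is a sketch that assumes a fair coin, so all four outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two flips: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

# Outcomes that contain at least one head.
favorable = [outcome for outcome in outcomes if "H" in outcome]

p_at_least_one_head = Fraction(len(favorable), len(outcomes))

# Complement rule: 1 - P(no heads), where P(no heads) = 1/2 x 1/2.
p_no_heads = Fraction(1, 2) * Fraction(1, 2)

print(p_at_least_one_head)
print(1 - p_no_heads)
```

Both approaches give 3/4, which is .75.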