Term
Least-squares regression line |
Definition
The linear fit that matches the pattern of a set of paired data as closely as possible. Out of all possible linear fits, the least-squares regression line is the one that has the smallest possible value for the sum of the squares of the residuals.
|
|
|
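As a concrete illustration of the least-squares criterion, here is a minimal Python sketch (assuming NumPy; the data values are made up) that computes the slope and intercept minimizing the sum of squared residuals:

import numpy as np

# Made-up paired data (x, y)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Closed-form least-squares slope and intercept
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

# Sum of squared residuals; no other line gives a smaller value
residuals = y - (intercept + slope * x)
print(slope, intercept, np.sum(residuals ** 2))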
Term
Explanatory (independent) variable |
Definition
In regression, the explanatory or independent variable is the one that is supposed to "explain" the other. For example, in examining crop yield versus quantity of fertilizer applied, the quantity of fertilizer would be the explanatory or independent variable, and the crop yield would be the dependent variable. In experiments, the explanatory variable is the one that is manipulated; the one that is observed is the dependent variable.
|
|
|
Term
Dependent variable |
Definition
1. (Mathematics) A mathematical variable whose value is determined by the value assumed by an independent variable.
2. (Statistics) The observed variable in an experiment or study whose changes are determined by the presence or degree of one or more independent variables. |
|
|
Term
Extrapolation |
Definition
Estimating the value of a variable at times that have not yet been observed. This estimate may be reasonably reliable for short times into the future, but for longer times the estimate is liable to become less accurate. |
|
|
Term
Interpolation |
Definition
Given a set of bivariate data (x, y), to impute a value of y corresponding to some value of x at which there is no measurement of y is called interpolation, if the value of x is within the range of the measured values of x. If the value of x is outside the range of measured values, imputing a corresponding value of y is called extrapolation.
|
|
|
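A small sketch of the distinction, assuming NumPy: np.interp imputes y at an x that lies within the range of measured x values (interpolation); a query outside that range would call for extrapolation, which np.interp does not attempt (it simply returns the endpoint value).

import numpy as np

x_measured = np.array([0.0, 1.0, 2.0, 3.0])
y_measured = np.array([0.0, 2.0, 4.0, 6.0])

# Interpolation: x = 1.5 lies within the range of measured x values
print(np.interp(1.5, x_measured, y_measured))   # 3.0

# x = 5.0 lies outside the measured range; imputing y there is extrapolation.
# np.interp just clamps to the endpoint value (6.0) rather than extending the trend.
print(np.interp(5.0, x_measured, y_measured))   # 6.0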
Term
Variable |
Definition
A numerical value or a characteristic that can differ from individual to individual. See also categorical variable, qualitative variable, quantitative variable, discrete variable, continuous variable, and random variable.
|
|
|
Term
Quantitative variable |
Definition
A variable that takes numerical values for which arithmetic makes sense, for example, counts, temperatures, weights, amounts of money, etc. For some variables that take numerical values, arithmetic with those values does not make sense; such variables are not quantitative. For example, adding and subtracting social security numbers does not make sense. Quantitative variables typically have units of measurement, such as inches, people, or pounds.
|
|
|
Term
Qualitative variable |
Definition
A variable whose values are adjectives, such as colors, genders, nationalities, etc. |
|
|
Term
Population |
Definition
A collection of units being studied. Units can be people, places, objects, epochs, drugs, procedures, or many other things. Much of statistics is concerned with estimating numerical properties (parameters) of an entire population from a random sample of units from the population.
|
|
|
Term
Sample |
Definition
A collection of units from a population.
|
|
|
Term
Levels of measurement |
Definition
In statistics and quantitative research methodology, levels of measurement or scales of measure are types of data that arise in the theory of scale types developed by the psychologist Stanley Smith Stevens. The types are nominal, ordinal, interval, and ratio. |
|
|
Term
Ordinal data |
Definition
In statistics, ordinal data is a statistical data type consisting of numerical scores that exist on an ordinal scale, i.e., a scale on which the values can be ranked in order but the differences between values are not necessarily meaningful. |
|
|
Term
Descriptive statistics |
Definition
The discipline of quantitatively describing the main features of a collection of data, or the quantitative description itself. |
|
|
Term
Statistical inference |
Definition
In statistics, statistical inference is the process of drawing conclusions from data that are subject to random variation, for example, observational errors or sampling variation.
|
|
|
Term
Normal distribution |
Definition
A random variable X has a normal distribution with mean m and standard error s if for every pair of numbers a ≤ b, the chance that a < (X−m)/s < b is
P(a < (X−m)/s < b) = area under the standard normal curve between a and b.
If there are numbers m and s such that X has a normal distribution with mean m and standard error s, then X is said to have a normal distribution or to be normally distributed. If X has a normal distribution with mean m = 0 and standard error s = 1, then X is said to have a standard normal distribution. The notation X ~ N(m, s²) means that X has a normal distribution with mean m and standard error s; for example, X ~ N(0, 1) means X has a standard normal distribution.
|
|
|
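To make the standardization concrete, this sketch (assuming SciPy is available; m, s, a, and b are made-up values) computes P(a < (X−m)/s < b) as the area under the standard normal curve between a and b:

from scipy.stats import norm

m, s = 10.0, 2.0    # mean and standard error of X
a, b = -1.0, 1.0    # bounds in standard units

# Area under the standard normal curve between a and b
print(norm.cdf(b) - norm.cdf(a))    # about 0.683

# Equivalently, the chance that X lies between m + a*s and m + b*s
print(norm.cdf(m + b * s, loc=m, scale=s) - norm.cdf(m + a * s, loc=m, scale=s))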
Term
Standard deviation (SD) |
Definition
The standard deviation of a set of numbers is the rms (root mean square) of the set of deviations between each element of the set and the mean of the set. See also sample standard deviation.
|
|
|
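A minimal Python sketch of that recipe (assuming NumPy; this is the rms form, dividing by n, whereas the sample standard deviation divides by n - 1):

import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

deviations = data - data.mean()             # deviation of each element from the mean
sd = np.sqrt(np.mean(deviations ** 2))      # rms of the deviations
print(sd)                                   # 2.0, the same value as np.std(data)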
Term
Scatterplot |
Definition
a way to visualize bivariate data. A plot of pairs of measurements on a collection of "individuals" (which need not be people). For example, suppose we record the heights and weights of a group of 100 people. The scatterplot of those data would be 100 points. Each point represents one person's height and weight. In a scatterplot of weight against height, the x-coordinate of each point would be height of one person, the y-coordinate of that point would be the weight of the same person. In a scatterplot of height against weight, the x-coordinates would be the weights and the y-coordinates would be the heights.
|
|
|
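A brief sketch, assuming Matplotlib and made-up height and weight values, of a scatterplot of weight against height in which each point represents one person:

import matplotlib.pyplot as plt

heights = [150, 160, 165, 170, 175, 180]    # x-coordinate: height of each person (cm)
weights = [55, 60, 68, 72, 78, 85]          # y-coordinate: weight of the same person (kg)

plt.scatter(heights, weights)
plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.title("Weight against height")
plt.show()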
Term
Correlation |
Definition
A measure of linear association between two (ordered) lists. Two variables can be strongly correlated without having any causal relationship, and two variables can have a causal relationship and yet be uncorrelated.
|
|
|
Term
Coefficient of determination |
|
Definition
Denoted R² and pronounced "R squared," it indicates how well data points fit a line or curve. It is a statistic used in the context of statistical models whose main purpose is either the prediction of future outcomes or the testing of hypotheses, on the basis of other related information. It provides a measure of how well observed outcomes are replicated by the model, as the proportion of total variation of outcomes explained by the model. |
|
|
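As a rough sketch of the "proportion of variation explained" reading (assuming NumPy and a simple least-squares line; the data are made up):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Fit a least-squares line and compute the fitted values
slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x

ss_res = np.sum((y - y_hat) ** 2)       # variation left unexplained by the model
ss_tot = np.sum((y - y.mean()) ** 2)    # total variation of the observed outcomes
print(1 - ss_res / ss_tot)              # R squared, close to 1 for these data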
Term
Correlation coefficient (r) |
Definition
The correlation coefficient r is a measure of how nearly a scatterplot falls on a straight line. The correlation coefficient is always between −1 and +1. To compute the correlation coefficient of a list of pairs of measurements (X, Y), first transform X and Y individually into standard units. Multiply corresponding elements of the transformed pairs to get a single list of numbers. The correlation coefficient is the mean of that list of products.
|
|
|
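A minimal sketch of that recipe (assuming NumPy; standard units here use the SD that divides by n, so the mean of the products equals the usual correlation coefficient):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

# Transform each list into standard units (subtract the mean, divide by the SD)
zx = (x - x.mean()) / x.std()
zy = (y - y.mean()) / y.std()

r = np.mean(zx * zy)    # mean of the products of corresponding standard units
print(r)                # matches np.corrcoef(x, y)[0, 1]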
Term
Residual |
Definition
The difference between a datum and the value predicted for it by a model. In linear regression of a variable plotted on the vertical axis onto a variable plotted on the horizontal axis, a residual is the "vertical" distance from a datum to the line. Residuals can be positive (if the datum is above the line) or negative (if the datum is below the line). Plots of residuals can reveal computational errors in linear regression, as well as conditions under which linear regression is inappropriate, such as nonlinearity and heteroscedasticity. If linear regression is performed properly, the sum of the residuals from the regression line must be zero; otherwise, there is a computational error somewhere.
|
|
|
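A short sketch of the zero-sum check mentioned above, assuming NumPy and a least-squares line fit to made-up data:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.9, 4.2, 5.8, 8.3, 9.7])

slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)     # "vertical" distance from each datum to the line

print(residuals)            # positive above the line, negative below
print(residuals.sum())      # approximately zero for a correct least-squares fit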
Term
Causation (causal relation) |
Definition
Two variables are causally related if changes in the value of one cause the other to change. For example, if one heats a rigid container filled with a gas, that causes the pressure of the gas in the container to increase. Two variables can be associated without having any causal relation, and even if two variables have a causal relation, their correlation can be small or zero.
|
|
|
Term
Regression curve (regression line) |
Definition
A smooth curve fitted to the set of paired data in regression analysis; for linear regression the curve is a straight line. |
|
|
Term
Interval data |
Definition
Data that is divided into ranges and in which the distance between the intervals is meaningful. |
|
|