Term
What can go wrong in data mining? Which problem is worse? |
|
Definition
1. Learning things that aren't true: patterns may not represent any underlying value.
2. Learning things that are true but not useful.
Data may also be at the wrong level of detail.
#1 is worse. |
|
|
Term
What are the data mining styles? Describe them. |
|
Definition
Hypothesis testing - answer questions or gain understanding
Directed - construct a model that explains or predicts one or more target variables
Undirected - find overall patterns that are not tied to a particular target |
|
|
Term
Why is hypothesis testing hard with long-held beliefs? |
|
Definition
data reflects whatever assumptions have been made in the past |
|
|
Term
What are the hypothesis tests? |
|
Definition
test/control tests
A/B tests
champion/challenger tests |
|
|
Term
For hypothesis testing, what is a test/control test? |
|
Definition
Uses a test group and a control group (treatments).
Choose an overall group and randomly divide it into test and control groups.
Any significant difference between them can be attributed to the treatment. |
|
|
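The random division described in the test/control card can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function names `split_test_control` and `response_rate`, the 50/50 split, the fixed seed, and the 1000 customer IDs are all assumptions made for the example.

```python
import random


def split_test_control(population, test_fraction=0.5, seed=0):
    """Randomly divide one overall group into a test group and a control group."""
    rng = random.Random(seed)      # fixed seed only so the sketch is repeatable
    shuffled = list(population)
    rng.shuffle(shuffled)          # random order removes any selection bias
    cut = int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]


def response_rate(responders, group):
    """Fraction of a group whose members appear in the set of responders."""
    return sum(1 for member in group if member in responders) / len(group)


# Hypothetical overall group: customer IDs 0..999.
test_group, control_group = split_test_control(range(1000))
```

Because both groups come from the same overall group and differ only by chance, a difference in `response_rate` between them that passes a significance test can be attributed to the treatment.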
Term
For hypothesis testing, what is an A/B test? |
|
Definition
Compares 2 or more treatments
Determines the effect of minor changes
Associated with direct marketing and web-based retailing
Paired tests |
|
|
Term
For hypothesis testing, what is a champion/challenger test? |
|
Definition
Common form of A/B testing: compares a new treatment (challenger) to the existing treatment (champion).
The new model isn't adopted until it is proven better than the old one. |
|
|
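One way to make "proven better" concrete is a one-sided two-proportion z-test on response rates. The card does not name a test, so this is an assumed choice; the function names and the 1.645 threshold (roughly a 5% one-sided significance level) are also assumptions for illustration.

```python
import math


def z_statistic(champ_hits, champ_n, chall_hits, chall_n):
    """Pooled two-proportion z statistic for challenger rate minus champion rate."""
    champ_rate = champ_hits / champ_n
    chall_rate = chall_hits / chall_n
    pooled = (champ_hits + chall_hits) / (champ_n + chall_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / champ_n + 1 / chall_n))
    if std_err == 0:
        return 0.0  # no variation at all, so no evidence either way
    return (chall_rate - champ_rate) / std_err


def should_adopt_challenger(champ_hits, champ_n, chall_hits, chall_n,
                            z_threshold=1.645):
    """Keep the champion unless the challenger is significantly better."""
    return z_statistic(champ_hits, champ_n, chall_hits, chall_n) > z_threshold
```

The default is to keep the champion: a challenger that is only slightly ahead, or ahead on too small a sample, does not clear the threshold.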
Term
Directed data mining focuses on ___ variables that are targets |
|
Definition
|
|
Term
___ looks for patterns that explain the target values |
|
Definition
|
|
Term
When is predictive modeling used? |
|
Definition
when the target comes from a timeframe later than the inputs |
|
|
Term
When is profile modeling used? |
|
Definition
when the target and inputs come from the same timeframe |
|
|
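The distinction between predictive and profile modeling comes down to comparing two timeframes. A small sketch, assuming each timeframe is summarized by a single date (the last date covered by the inputs and the first date covered by the target); the function name `modeling_style` and the example dates in the test are hypothetical.

```python
from datetime import date


def modeling_style(input_end: date, target_start: date) -> str:
    """Label a model by where the target sits in time relative to the inputs."""
    if target_start > input_end:
        return "predictive"  # target comes from a later timeframe than the inputs
    return "profile"         # target and inputs come from the same timeframe
```

For example, inputs ending in December with a target observed the following January would count as predictive, while a target measured inside the input window would count as profile.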
Term
What is the goal of undirected data mining? |
|
Definition
to find overall patterns, then identify if those patterns are useful |
|
|
Term
What are the data mining tasks? |
|
Definition
Preparing data for mining
exploratory data analysis
binary response modeling
classification of discrete values & predictions
Estimation of numeric values
finding clusters and associations
applying a model to new data |
|
|