Term
|
Definition
Sort documents into user-defined classes |
|
|
Term
|
Definition
Automatically identify positive and negative terms in a document. Useful for political polls and marketing. |
|
|
Term
|
Definition
Calculate the frequency of n-grams in a given language that usually signal spam. |
|
|
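As a rough illustration of n-gram frequency counting, here is a minimal sketch. It assumes word-level n-grams and simple whitespace tokenization; the function name `ngram_counts` and the sample text are made up for this example.

```python
from collections import Counter

def ngram_counts(text, n=2):
    """Count word n-grams in lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    # Slide a window of n tokens across the text.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)

# Hypothetical sample message, not real data.
counts = ngram_counts("free money now get free money", n=2)
print(counts.most_common(3))
```

Comparing these counts between a spam collection and a ham collection is one way to find n-grams that are typical of spam.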
Term
Rule based spam identification |
|
Definition
Filters spam using rules that assign weights to certain n-grams; once a message's total weight passes some threshold, it's identified as spam. |
|
|
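The weighted-rules idea can be sketched in a few lines. Everything specific here is an assumption for illustration: the rule table `RULES`, its weights, and the `THRESHOLD` value are invented, and matching is plain substring counting.

```python
# Hypothetical rule table: n-gram -> weight. The values are made up.
RULES = {"free money": 2.5, "act now": 1.5, "click here": 1.0, "meeting": -1.0}
THRESHOLD = 3.0  # assumed cutoff, chosen arbitrarily

def spam_score(message, rules=RULES):
    """Add up the weight of every rule n-gram found in the message."""
    text = " ".join(message.lower().split())  # normalize case and spacing
    # Substring counting is crude: "meeting" also matches "meetings".
    return sum(weight * text.count(ngram) for ngram, weight in rules.items())

def is_spam(message, threshold=THRESHOLD):
    """Flag the message once its score passes the threshold."""
    return spam_score(message) > threshold

print(is_spam("FREE money!!! Act now"))
print(is_spam("Meeting moved to 3pm"))
```

Someone has to write and maintain the rule table by hand, which is why this approach tends to lag behind new spam patterns.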
Term
Statistical approach spam identification |
|
Definition
These learn from a large set of examples--one spam set, one ham set. They can adapt based on what emails are marked as spam by all or specific users. |
|
|
Term
Rule based identification drawbacks |
|
Definition
They are, by nature, one step behind spammers because a pattern has to be identified first and by that time, the spam is already out. |
|
|
Term
|
Definition
A training set and a test set that are labeled in advance with the correct answers. |
|
|
Term
Supervised learning method |
|
Definition
1. Label a corpus of articles with the desired categories to make training and test sets. 2. Apply machine learning software to the labeled training set to produce a model that summarizes what's been learned. 3. Generate predictions for the test set to evaluate the model. 4. Deploy the model on new, unlabeled data. |
|
|
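The four steps can be sketched end to end with a toy learner. This is only an illustration under stated assumptions: the corpus is invented, and the "model" is a deliberately simple word-vote scheme standing in for real machine learning software. The names `split`, `train`, `predict`, and `accuracy` are hypothetical.

```python
import random
from collections import Counter

def split(labeled, test_fraction=0.25, seed=0):
    """Step 1: divide the labeled corpus into training and test sets."""
    items = labeled[:]
    random.Random(seed).shuffle(items)
    cut = int(len(items) * (1 - test_fraction))
    return items[:cut], items[cut:]

def train(training_set):
    """Step 2: the 'model' maps each word to the label it most often had."""
    votes = {}
    for text, label in training_set:
        for word in text.lower().split():
            votes.setdefault(word, Counter())[label] += 1
    default = Counter(label for _, label in training_set).most_common(1)[0][0]
    return {w: c.most_common(1)[0][0] for w, c in votes.items()}, default

def predict(model, text):
    """Steps 3 and 4: let the known words in the text vote on a label."""
    word_label, default = model
    tally = Counter(word_label[w] for w in text.lower().split() if w in word_label)
    return tally.most_common(1)[0][0] if tally else default

def accuracy(model, test_set):
    """Step 3: share of test items whose prediction matches the label."""
    return sum(predict(model, t) == label for t, label in test_set) / len(test_set)

# Invented toy corpus, labeled by hand.
corpus = [
    ("win free money now", "spam"), ("free prize click now", "spam"),
    ("cheap pills free shipping", "spam"), ("claim your prize money", "spam"),
    ("lunch meeting at noon", "ham"), ("project meeting notes", "ham"),
    ("notes from the lunch talk", "ham"), ("see you at the meeting", "ham"),
]
training_set, test_set = split(corpus)
model = train(training_set)
print("test accuracy:", accuracy(model, test_set))
```

The test set is never shown to `train`, so its accuracy gives an estimate of how the model might behave once deployed.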
Term
|
Definition
There are no predefined categories; instead, the method clusters articles that have similar properties, like being about sports. It's less costly because you don't have to sit someone down to label every single document, but the clusters may not be intuitive, and clustering solutions are difficult to evaluate. |
|
|
Term
|
Definition
Looks at the most relevant properties of spam |
|
|
Term
Kitchen sink feature engineering |
|
Definition
Use many features in the hope that some will be relevant and useful. Make every word a feature and choose a machine learning method that is good at focusing on a few important features and ignoring irrelevant ones. |
|
|
Term
Hand-crafted strategy of feature engineering |
|
Definition
Carefully and thoughtfully identify a small set of features that are likely to be relevant. The downside is that you have to choose the features yourself. |
|
|
Term
Naive Bayes for document classification |
|
Definition
Take a word. Count how often that word appears in spam and how often in ham, and calculate that ratio. Then calculate the odds ratio (ham/total over spam/total). Combine the |
|
|
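The card above is cut off, so the combination step is not stated. The sketch below assumes the standard naive Bayes approach: add up log-probabilities for each word per class, with add-one smoothing. That choice, the toy documents, and the names `train` and `classify` are assumptions for illustration, not taken from the card.

```python
import math
from collections import Counter

def train(docs):
    """docs: list of (text, label) pairs, label is 'spam' or 'ham'.
    Assumes at least one document of each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    doc_counts = Counter()
    for text, label in docs:
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, doc_counts, vocab

def classify(text, model):
    """Return the label with the higher log-probability score."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)  # class prior
        for word in text.lower().split():
            if word not in vocab:
                continue  # skip words never seen in training
            # Add-one smoothing so unseen word/label pairs are not zero.
            p = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

# Invented toy training data.
model = train([
    ("win free money now", "spam"), ("free prize click now", "spam"),
    ("lunch meeting at noon", "ham"), ("project meeting notes", "ham"),
])
print(classify("free money", model))
print(classify("meeting notes", model))
```

Working in log space turns the product of many small probabilities into a sum, which avoids numeric underflow on long documents.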
Term
|
Definition
Pretend you're dealing with an unstructured set of data that ignores syntax and topic structure. Put all the words of a document in a bag, draw a word, and calculate which document it's most likely to have come from. |
|
|
Term
|
Definition
Error-driven learning. It predicts outcomes and then adjusts the weights when it makes the wrong prediction. Initially the weights are uninformative, but over time it builds up an ability to associate features with outcomes. It's a network with two layers: one node for each possible input feature and one for each possible outcome (spam and ham). |
|
|
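The error-driven update can be sketched as follows. This is a minimal version under assumptions: each distinct word is one input feature, the update step is fixed at 1.0, and the training examples are invented. The names `predict` and `train` are hypothetical.

```python
from collections import defaultdict

OUTCOMES = ("spam", "ham")  # one output node per outcome

def predict(weights, features):
    """Pick the outcome whose weights over the active features sum highest."""
    scores = {o: sum(weights[o][f] for f in features) for o in OUTCOMES}
    # On a tie, max() returns the first outcome listed in OUTCOMES.
    return max(OUTCOMES, key=lambda o: scores[o])

def train(examples, epochs=5):
    """Adjust weights only when the current prediction is wrong."""
    # All weights start at 0.0, so early predictions are uninformative.
    weights = {o: defaultdict(float) for o in OUTCOMES}
    for _ in range(epochs):
        for text, gold in examples:
            features = set(text.lower().split())
            guess = predict(weights, features)
            if guess != gold:
                for f in features:
                    weights[gold][f] += 1.0   # strengthen the right outcome
                    weights[guess][f] -= 1.0  # weaken the wrong one
    return weights

# Invented toy training data.
examples = [
    ("win free money now", "spam"), ("lunch meeting at noon", "ham"),
    ("free prize click now", "spam"), ("project meeting notes", "ham"),
]
weights = train(examples)
print(predict(weights, {"meeting", "tomorrow"}))
```

When a prediction is already correct the weights stay untouched, so learning stops once every training example is classified correctly.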
Term
|
Definition
How do people learn regular and irregular forms of words? |
|
|
Term
|
Definition
Start with good performance on some task, then get substantially worse, and then gradually get better again. |
|
|
Term
|
Definition
A test in which kids are given a made-up noun, "wug," to see if they can determine the plural form. |
|
|
Term
|
Definition
Quantity: keep it short and sweet. Not TMI. Quality: Don't lie or be sarcastic. Relation: Say things that are pertinent to the question. Manner: Be clear, brief, and orderly. |
|
|
Term
|
Definition
A robot that was an expert at moving shapes around. Like, REALLY good. This showed that AI can be successful, but only in a very controlled setting and within a specific domain. |
|
|
Term
|
Definition
A man who knows no Chinese sits in a room with a rule book written in English. The input is in Chinese; he looks up the symbols in the rule book and outputs perfect Chinese. Does he know Chinese? Does the room know Chinese? |
|
|
Term
|
Definition
A therapy model that wasn't very good at her job. |
|
|
Term
|
Definition
The logical aspects of language and its meaning |
|
|
Term
|
Definition
How context contributes to meaning |
|
|