Term
|
Definition
Removes drift, prevents aliasing, and/or emphasizes a particular frequency range |
|
|
Term
Notch (band-stop) filter |
Definition
Removes a particular frequency range, e.g. 60 Hz |
|
|
Term
Type I error (alpha) |
Definition
The probability of rejecting the null hypothesis when it is actually true |
|
|
Term
Covariance matrix |
Definition
\Sigma = E[(X - \mu)(X - \mu)'] = (1/n) \sum_i (x_i - \mu)(x_i - \mu)' |
|
|
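A minimal MATLAB sketch of this formula on made-up data (variable names chosen for illustration), assuming observations are stored one per column; note that MATLAB's built-in cov expects observations in rows and uses 1/(n-1), so a factor of (n-1)/n is needed to compare:
% Toy data: 3 variables, 500 samples, one observation per column
X = randn(3, 500);
n = size(X, 2);
mu = mean(X, 2);                          % 3 x 1 mean vector
Xc = X - repmat(mu, 1, n);                % center each observation
Sigma = (1 / n) * (Xc * Xc');             % 3 x 3 covariance, 1/n convention
% MATLAB's cov uses rows as observations and 1/(n-1); rescale to compare
maxDiff = max(max(abs(Sigma - cov(X') * (n - 1) / n)));   % ~0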
Term
Center the data by subtracting the mean (PCA) |
Definition
m = mean(hw3data, 2); centeredData = hw3data - repmat(m, 1, numSamples); |
|
|
Term
Compute Covariance Matrix (PCA) |
|
Definition
C = (1 / (numSamples )) * centeredData * centeredData'; |
|
|
Term
Compute Eigenvectors of Covariance Matrix (and sort by decreasing eigenvalue) |
|
Definition
[V, D] = eig(C); [V, D] = eigsort(V, D); |
|
|
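eigsort looks like a course-provided helper; a minimal sketch of the same sort-by-decreasing-eigenvalue step using only built-in functions, assuming C is the covariance matrix from the card above:
[V, D] = eig(C);                             % columns of V are eigenvectors of C
[evals, order] = sort(diag(D), 'descend');   % largest eigenvalue (variance) first
V = V(:, order);                             % reorder eigenvectors to match
D = diag(evals);                             % sorted eigenvalues back on the diagonal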
Term
Reconstruct a data vector from its first N principal components (PCA) |
Definition
V(:,1:N)*projectedData(1:N) + m (multiply the first N principal-component coefficients by the first N eigenvectors and add the mean back) |
|
|
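The projection step that produces projectedData is not shown in the deck; a minimal sketch of projection and reconstruction, assuming centeredData, V, and m come from the earlier PCA cards, samples are stored as columns, and N is the number of components kept:
N = 10;                                            % number of components to keep
projectedData = V(:, 1:N)' * centeredData;         % N x numSamples coefficients
% Reconstruct every sample from its first N coefficients, then add the mean back
reconstructed = V(:, 1:N) * projectedData + repmat(m, 1, size(centeredData, 2));
% Reconstruct only the first sample (a single column), as in the card above
firstSample = V(:, 1:N) * projectedData(:, 1) + m;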
Term
What does A\B do in MATLAB? |
Definition
A\B is the matrix division of A into B, roughly the same as inv(A)*B (but computed without explicitly forming the inverse) |
|
|
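A small sketch of this equivalence on a made-up square system (A and b are arbitrary); backslash solves the system directly rather than forming the inverse:
A = [2 1; 1 3];  b = [1; 2];      % toy 2 x 2 system A*x = b
x1 = A \ b;                       % matrix division: solve A*x = b
x2 = inv(A) * b;                  % same result, but slower / less accurate in general
maxDiff = max(abs(x1 - x2));      % ~0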
Term
What does the minimum square error solution minimize? |
|
Definition
\sum_{(x,y) pairs} (y-mx-b)^2 |
|
|
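A sketch tying this to the previous card: for a tall (over-determined) system, A\B returns the least-squares solution, i.e. the m and b that minimize this sum. The data below are made up for illustration:
x = (0:0.1:5)';                                % toy predictor values
y = 2*x + 1 + 0.3*randn(size(x));              % noisy line with true m = 2, b = 1
A = [x, ones(size(x))];                        % design matrix for y = m*x + b
coeffs = A \ y;                                % least-squares estimates [m; b]
sse = sum((y - coeffs(1)*x - coeffs(2)).^2);   % the quantity being minimized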
Term
What are the rules of the Nelder-Mead simplex algorithm? |
Definition
* Reflect the point with the highest WSS through the centroid (center) of the simplex
* If this produces the lowest WSS (best point), expand the simplex and reflect further
* If this is just a good point, start at the top and reflect again
* If this is the highest WSS (worst point), compress the simplex and reflect closer |
|
|
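A minimal sketch of the geometry behind these rules for a two-parameter fit (three-point simplex); the WSS surface and the reflect/expand/contract factors here are illustrative, not taken from any course code:
wss = @(p) (p(1) - 3)^2 + (p(2) + 1)^2;        % toy WSS surface, minimum at (3, -1)
simplex = [0 0; 1 0; 0 1];                     % three points (rows), two parameters
vals = [wss(simplex(1,:)); wss(simplex(2,:)); wss(simplex(3,:))];
[worstVal, worst] = max(vals);                 % point with the highest WSS
centroid = mean(simplex(setdiff(1:3, worst), :), 1);            % center of the other points
reflected  = centroid + 1.0 * (centroid - simplex(worst, :));   % reflect through centroid
expanded   = centroid + 2.0 * (centroid - simplex(worst, :));   % best point: reflect further
contracted = centroid + 0.5 * (centroid - simplex(worst, :));   % worst point: reflect closer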
Term
What are the three activation functions? |
|
Definition
1) sigmoid, 2) linear, 3) threshold |
|
|
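A sketch of the three as MATLAB anonymous functions (the particular threshold location is an illustrative choice):
sigmoid   = @(x) 1 ./ (1 + exp(-x));   % smooth squashing of the net input into (0, 1)
linearAct = @(x) x;                    % output equals the net input
threshold = @(x) double(x >= 0);       % hard 0/1 step, here placed at zero
% e.g. sigmoid(0) = 0.5, linearAct(0.3) = 0.3, threshold(-2) = 0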
Term
What is a decision boundary? |
Definition
The boundary between the region where the neuron outputs 0 and the region where it outputs 1 (for threshold units); for linear/sigmoid units it is where the output crosses 0.5 |
|
|
Term
When does Nelder Mead finish running? |
|
Definition
The rules are repeated until the convergence criteria are met. The simplex moves over the WSS surface and should contract around the minimum. |
|
|
Term
What does fminsearch do to find a good fit to data? |
|
Definition
It minimizes the mean square error of the fit |
|
|
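A hedged usage sketch: fminsearch runs Nelder-Mead on whatever objective function it is handed; here the objective is the mean square error of a line fit to made-up data:
x = (0:0.1:5)';  y = 2*x + 1 + 0.3*randn(size(x));   % toy data, true m = 2, b = 1
mseFit = @(p) mean((y - p(1)*x - p(2)).^2);          % p = [m, b]
bestP = fminsearch(mseFit, [0, 0]);                  % start the simplex at m = 0, b = 0
% bestP(1) ~ 2 and bestP(2) ~ 1 once the simplex has contracted around the minimum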
Term
Why is the weight vector perpendicular to the decision boundary given by the network? |
|
Definition
Because the decision boundary is the set of inputs where the net input w'x + bias = 0; for any two points x_1 and x_2 on that boundary, w'(x_1 - x_2) = 0, so the weight vector is orthogonal to every direction lying within the boundary |
|
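A quick numerical check of this for a made-up 2-input unit (w and bias chosen arbitrarily): any two points on the boundary w'x + bias = 0 differ by a vector orthogonal to w:
w = [2; -1];  bias = 0.5;                    % example weights and bias
% Two points on the decision boundary w'*x + bias = 0 (here x2 = 2*x1 + 0.5)
p1 = [0; 0.5];
p2 = [1; 2.5];
onBoundary = [w'*p1 + bias, w'*p2 + bias];   % both 0
dotProd = w' * (p2 - p1);                    % 0: the boundary direction is orthogonal to w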
Term
How can the bias be treated in a network? |
Definition
bias weight can be dealt with as a weight from a unit with activation always 1 |
|
|
Term
What is the perceptron learning rule? |
|
Definition
For each training example, update every weight (including the bias weight): w_i^new = w_i^old + (target - output) x_i |
|
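A minimal sketch of the rule applied to a small, made-up linearly separable problem (logical AND), with the bias treated as a weight on a constant input of 1 as in the earlier bias card; the epoch count is arbitrary:
X = [0 0; 0 1; 1 0; 1 1];          % toy inputs, one example per row
t = [0; 0; 0; 1];                  % targets: logical AND (linearly separable)
Xb = [X, ones(4, 1)];              % append a constant 1 input for the bias weight
w = zeros(3, 1);                   % weights, including the bias weight
for epoch = 1:100
    for i = 1:4
        out = double(Xb(i, :) * w >= 0);       % threshold unit output (0 or 1)
        w = w + (t(i) - out) * Xb(i, :)';      % perceptron learning rule
    end
end
predictions = double(Xb * w >= 0);             % equals t once the rule has converged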
Term
Why does the Perceptron Learning Rule make sense? |
|
Definition
Because when the output is wrong, the update moves the weight vector toward inputs that should give 1 (adding the input) and away from inputs that should give 0 (subtracting the input), pushing the net input toward the correct side; when the output is correct, (target - output) = 0 and nothing changes |
|
Term
When does the Perceptron Learning Rule converge? |
|
Definition
When the training data are linearly separable; the Perceptron Convergence Theorem then guarantees a solution in a finite number of steps |
|
Term
Can the Perceptron Learning Rule work for every case? |
|
Definition
No, the Perceptron Learning Rule only works for linearly separable problems |
|
|
Term
What does the perceptron learning rule do when the inputs are classified correctly? |
|
Definition
Nothing: (target - output) = 0, so the weights are left unchanged |
|
Term
What does the perceptron learning rule do for inputs that have output 0 but should be 1? |
|
Definition
It moves the weight vector towards the input by adding the input vector to the weight vector |
|
|
Term
What does the perceptron learning rule do for inputs that have output 1 but should be 0? |
|
Definition
It moves the weight vector away from the input by subtracting the input vector from the weight vector: w_i^new = w_i^old + (target - output) x_i, where (target - output) = -1 |
|
|
Term
What is the Perceptron Convergence Theorem? |
|
Definition
for any data set which is linearly separable, the PLA is guaranteed to find a solution in a finite number of steps |
|
|
Term
What does the perceptron learning rule do for non-linearly separable problems? |
|
Definition
It will not converge (the weight vector continually bounces around) |
|
|
Term
What is the pocket algorithm? |
|
Definition
The pocket algorithm is a variant of the PLA: it stores the best solution found so far and only replaces the stored weights if the new solution is better than the previous one |
|
|
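A minimal sketch of the pocket idea layered on the perceptron loop sketched earlier (Xb, t, and the scoring rule -- the number of correctly classified training examples -- are assumptions carried over from that sketch):
w = zeros(3, 1);  pocketW = w;                       % current and pocketed weights
pocketScore = sum(double(Xb * pocketW >= 0) == t);   % correct classifications so far
for epoch = 1:100
    for i = 1:4
        out = double(Xb(i, :) * w >= 0);
        w = w + (t(i) - out) * Xb(i, :)';            % ordinary PLA update
        score = sum(double(Xb * w >= 0) == t);
        if score > pocketScore                       % keep only strictly better weights
            pocketW = w;  pocketScore = score;
        end
    end
end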
Term
Why do we need to have multi-layer perceptrons with non-linear activation units? |
|
Definition
Because a single-layer perceptron can only handle linearly separable problems (e.g., it cannot compute XOR), and stacking purely linear units still computes a linear function; non-linear activation units in hidden layers are needed to form non-linear decision boundaries |
|
Term
What is a commonly used activation function? |
|
Definition
f(x) = 1/(1+e^{-x}) = \sigma(x) (the sigmoid) |
|
|
Term
What is a gradient vector? |
|
Definition
The vector of partial derivatives with respect to each parameter: \nabla f(x_1, x_2) = [df/dx_1, df/dx_2]' |
|
|
Term
What is the gradient descent algorithm? |
|
Definition
Repeatedly step each parameter along the negative gradient: x_i^new = x_i^old - \eta (df/dx_i), where \eta is the learning rate |
|
|
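A minimal sketch of the update on a toy two-parameter function (the function, learning rate, and iteration count are all illustrative):
f     = @(x) (x(1) - 3)^2 + 2*(x(2) + 1)^2;   % toy function to minimize
gradf = @(x) [2*(x(1) - 3); 4*(x(2) + 1)];    % its gradient vector
eta = 0.1;                                    % learning rate
x = [0; 0];                                   % starting point
for step = 1:200
    x = x - eta * gradf(x);                   % x_new = x_old - eta * gradient
end
% x approaches [3; -1], the minimum of f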
Term
What is the derivative of the sigmoid function? |
|
Definition
\sigma'(x) = \sigma(x) (1 - \sigma(x)) |
|
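A quick numerical check of the identity against a centered finite difference:
sigma = @(x) 1 ./ (1 + exp(-x));
x = -3:0.5:3;
analytic = sigma(x) .* (1 - sigma(x));               % sigma'(x) = sigma(x)(1 - sigma(x))
h = 1e-5;
numeric = (sigma(x + h) - sigma(x - h)) / (2 * h);   % finite-difference estimate
maxErr = max(abs(analytic - numeric));               % ~1e-11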
Term
What are the disadvantages of the Nelder Mead algorithm? |
|
Definition
It can converge to a local rather than global minimum, it can be slow or stall when there are many parameters, and it uses no gradient information, so there is no general convergence guarantee |
|