Machine learning algorithms
- Supervised learning
The right answers are already given for a set of inputs, and you need to predict the answer for a new input.
Regression problem - predict a continuous output value given a set of input values.
Classification problem - from the labeled examples you can learn an equation (a decision boundary) that does the classification for you: a point in one dimension, a line in two dimensions, a plane in three, and so on - depending also on how scattered the data is.
- Unsupervised learning
Here you only have unlabeled data; you have to figure out what clusters form in it and attach meaning to them (a quick clustering sketch follows).
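A minimal sketch of what that can look like in practice, using k-means - my own choice of algorithm here, not something these notes prescribe (numpy assumed):

    import numpy as np

    def kmeans(X, k, iters=100, seed=0):
        # pick k distinct data points as the initial centroids
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), k, replace=False)]
        for _ in range(iters):
            # assign each point to its nearest centroid
            dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # move each centroid to the mean of its points (keep it if empty)
            centroids = np.array([X[labels == j].mean(axis=0) if (labels == j).any()
                                  else centroids[j] for j in range(k)])
        return labels, centroids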
Learning algorithms:
- Linear regression: h(x) = t0 + t1*x - univariate linear regression.
h - hypothesis function.
cost function J(t0,t1) = (1/2m) * ∑_{i=1}^{m} (h(x_i) - y_i)^2, to be minimized (also known as the squared error function).
Plotted in three dimensions against t0 and t1, this cost function is a bowl (it is convex), and the bottom of the bowl gives the ideal values of t0 and t1.
The goal is to find t0 and t1 with minimal effort / CPU, even when m (the number of training examples) is very large.
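A small numpy sketch of the hypothesis and cost function exactly as written above (the data x, y is made up for illustration):

    import numpy as np

    def h(x, t0, t1):
        # hypothesis: h(x) = t0 + t1 * x
        return t0 + t1 * x

    def J(t0, t1, x, y):
        # squared error cost: (1/2m) * sum over i of (h(x_i) - y_i)^2
        m = len(x)
        return ((h(x, t0, t1) - y) ** 2).sum() / (2 * m)

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([2.0, 4.0, 6.0])
    print(J(0.0, 2.0, x, y))  # 0.0 for a perfect fit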
Gradient descent:
- repeat until convergence, updating both parameters simultaneously: tj := tj - alpha * ∂/∂tj J(t0,t1), for j = 0 and 1, where alpha is the learning rate.
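A sketch of batch gradient descent for univariate linear regression, with the partial derivatives of J written out (alpha and the toy data are my own illustrative choices):

    import numpy as np

    def gradient_descent(x, y, alpha=0.01, iters=5000):
        # batch gradient descent for h(x) = t0 + t1 * x
        m = len(x)
        t0, t1 = 0.0, 0.0
        for _ in range(iters):
            err = (t0 + t1 * x) - y        # prediction error for each example
            d0 = err.sum() / m             # dJ/dt0
            d1 = (err * x).sum() / m       # dJ/dt1
            t0, t1 = t0 - alpha * d0, t1 - alpha * d1  # simultaneous update
        return t0, t1

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([2.0, 4.0, 6.0])
    print(gradient_descent(x, y, alpha=0.1))  # approaches (0.0, 2.0)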
Feature scaling / mean normalization: bring features onto similar scales (e.g. subtract the mean, divide by the range or standard deviation) so gradient descent converges faster.
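A quick sketch of mean normalization, here dividing by the standard deviation (dividing by the range works too):

    import numpy as np

    def mean_normalize(X):
        # per column: x := (x - mean) / std; assumes no zero-variance columns
        mu = X.mean(axis=0)
        sigma = X.std(axis=0)
        return (X - mu) / sigma, mu, sigma

    X = np.array([[2104.0, 3.0], [1600.0, 2.0], [2400.0, 4.0]])
    X_scaled, mu, sigma = mean_normalize(X)  # columns now have mean 0, std 1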
Normal equation method.
- theta = (X^T X)^-1 * X^T * y - solves for the parameters in one step, with no iterations and no feature scaling needed, but it gets slow when the number of features is large since inverting X^T X is roughly O(n^3).
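The same closed form in numpy; using pinv instead of a plain inverse is my own precaution for when X^T X is not invertible:

    import numpy as np

    def normal_equation(X, y):
        # theta = (X^T X)^-1 X^T y
        return np.linalg.pinv(X.T @ X) @ X.T @ y

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([2.0, 4.0, 6.0])
    X = np.column_stack([np.ones_like(x), x])  # column of ones for t0
    print(normal_equation(X, y))               # ~[0., 2.]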