1. Definitions
- Hypothesis: The logistic regression hypothesis is defined as h_theta(x) = g(theta' * x), where g is the sigmoid function g(z) = 1 / (1 + e^(-z)).
- Cost Function: J(theta) = (1/m) * sum_{i=1..m} [ -y(i)*log(h_theta(x(i))) - (1 - y(i))*log(1 - h_theta(x(i))) ] + (lambda/(2*m)) * sum_{j=1..n} theta_j^2
Watch out! Do NOT include theta_0 in the regularization term above, because the theta_0 parameter should not be regularized.
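To make the definitions concrete, here is a minimal sketch that evaluates the hypothesis and the regularized cost directly; the tiny dataset and theta values are made up purely for illustration:

% Toy example (made-up numbers, just to illustrate the formulas above)
X = [1 0.5 1.2;          % first column is the intercept feature x_0 = 1
     1 2.0 0.3;
     1 1.1 0.9];
y = [1; 0; 1];
theta = [0.1; 0.2; -0.3];
lambda = 1;
m = length(y);

h = 1 ./ (1 + exp(-(X * theta)));                        % hypothesis h_theta(x)
J = (1/m) * sum(-y .* log(h) - (1 - y) .* log(1 - h));   % unregularized cost
J = J + (lambda/(2*m)) * sum(theta(2:end).^2);           % penalty skips theta(1), i.e. theta_0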
2. Gradient Descent Algorithm
Interestingly, this update rule looks identical to the one for regularized linear regression; the only difference is that the hypothesis h_theta(x) is now the sigmoid of theta' * x.
Repeat {
    theta_0 := theta_0 - alpha * (1/m) * sum_{i=1..m} (h_theta(x(i)) - y(i)) * x_0(i)
    theta_j := theta_j - alpha * [ (1/m) * sum_{i=1..m} (h_theta(x(i)) - y(i)) * x_j(i) + (lambda/m) * theta_j ]    (for j = 1, 2, ..., n)
}
These updates can be rewritten using this vectorization:
    h = sigmoid(X * theta)
    grad = (1/m) * X' * (h - y) + (lambda/m) * [0; theta(2:end)]
    theta := theta - alpha * grad
Here is the code in Octave/MATLAB:

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
% J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
% theta as the parameter for regularized logistic regression and the
% gradient of the cost w.r.t. to the parameters.
% Initialize some useful values
m = length(y); % number of training examples
n = length(theta); % number of parameters (equals the number of columns of X)
% Cost Function J:
h = sigmoid(X*theta);
J = (-1/m) * sum((y .* log(h)) + ((1 - y) .* log(1-h)));
t = theta(2:n);
J = J + (lambda/(2*m)) * (t' * t); %regularization on cost function
% gradient vector:
grad = (1/m) * (X' * (h - y));
grad(2:n) = grad(2:n) + (lambda/m)*t; %regularization on gradients (theta_0 is not regularized)
end
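With costFunctionReg returning both the cost and the gradient, the regularized gradient descent updates from section 2 can be run as a simple loop. This is only a sketch: the learning rate alpha and the number of iterations are arbitrary values you would tune by hand, and an off-the-shelf optimizer such as fminunc can minimize the same cost function instead.

% Minimal gradient descent sketch (alpha and num_iters are arbitrary choices,
% not tuned values; X is assumed to already contain the intercept column).
alpha = 0.1;
num_iters = 400;
theta = zeros(size(X, 2), 1);

for iter = 1:num_iters
    [J, grad] = costFunctionReg(theta, X, y, lambda);
    theta = theta - alpha * grad;   % simultaneous update of all parameters
end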
function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic
%regression parameters theta
% p = PREDICT(theta, X) computes the predictions for X using a
% threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)
p = (X*theta >=0); % identical to p = sigmoid(X*theta) >= 0.5;
end
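A quick sanity check, assuming X, y, and the learned theta are already in the workspace, is to compare the predictions against the training labels:

p = predict(theta, X);                            % predicted labels (0 or 1)
fprintf('Train accuracy: %f\n', mean(double(p == y)) * 100);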
function g = sigmoid(z)
%SIGMOID Compute sigmoid function
% g = SIGMOID(z) computes the sigmoid of z.
g = 1 ./ (1 + exp(-z));
end
3. Demo
I implemented regularized logistic regression to predict whether microchips from a fabrication plant pass quality assurance (QA). During QA, each microchip goes through various tests to ensure it is functioning correctly. Suppose you are the product manager of the factory and you have the test results for some microchips on two different tests. From these two tests, you would like to determine whether the microchips should be accepted or rejected. To help you make the decision, you have a dataset of test results on past microchips, from which you can build a logistic regression model.
Below is the plot of the training examples with the decision boundary found by my regularized logistic regression algorithm with lambda=1. The input to the sigmoid function, theta'*x, is a cubic polynomial in the two test scores.
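Since the two raw test scores are not linearly separable, they are mapped to polynomial terms before training. Below is a sketch of such a feature-mapping helper; the name mapFeature and the fixed degree of 3 are my assumptions here, chosen to match the cubic boundary described above:

function out = mapFeature(x1, x2)
%MAPFEATURE Map two input features to all polynomial terms up to degree 3
% (hypothetical helper; the degree is an assumption matching the cubic
% boundary above). The first column of the output is the intercept term.
degree = 3;
out = ones(size(x1(:,1)));          % x_0 = 1
for i = 1:degree
    for j = 0:i
        out(:, end+1) = (x1.^(i-j)) .* (x2.^j);   % term x1^(i-j) * x2^j
    end
end
end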