Regression with Gradient Descent in Low-level Matlab

现在的位置: 首页 > 综合 > 正文

Regression with Gradient Descent in Low-level Matlab

2013年08月23日 ⁄ 综合 ⁄ 共 2683字 ⁄ 字号小中大 ⁄ 评论关闭

from: http://www.csrdu.org/nauman/2010/06/25/regression-with-gradient-descent-in-low-level-matlab/

I just finished writing my first machine learning algorithm in
Matlab. The algorithm is based on gradient descent search for estimating
parameters of linear regression (but can be easily extended to
quadratic or even higher-dimensional polynomials). It’s fairly easy if
you know the theory behind the model. So, first a very brief theory
portion. This isn’t a tutorial on statistics. Go read a book if you
don’t know about regression.

Just to recall: Regression comes in when you want to estimate a response variable ( $y$
) given explanatory variables ( $x_i$
). The goal for linear regression is to come up with values for parameters
( $/theta_i$
) such that you get a ‘best possible fit’ for each jth instance in the dataset of m instances:

$y^j = /sum_{i=1}^m{/theta_i /times x_i^j}$

In matrix notation, that comes simply to:

$Y = /theta^T X$

(For ease, we assume that $x_1$
is a vector of all 1s thus making $/theta_1$
the y-intercept).

The idea of gradient descent is to come up with the best (locally) values of $/theta$
. It does this by repeatedly updating the value of $/theta$
using the formula:

$/theta_i = /theta_i - /alpha * /sum_{j=1}^m{(/theta^T X^j - Y^j) X_i^j}$

If you have no idea what this is or if you want to know this in-depth, read till the end.

Here’s the Matlab code for this whole procedure I wrote at first. Note the use of W for $/theta$
:

% Gradient descent algo for linear regression % author: Nauman (recluze@gmail.com) %set the data X=[1 1 1 1 1 1 1; 22 49 80 26 40 54 91]; Y=[20 24 42 22 23 26 55]; hold on; plot(X(2,:),Y, 'x'); % set the actual values of W W = [5.775 0.474]'; YAct = (W' * X); % GRADIENT DESCENT W = zeros(2,1); % initialize W to all zeros m = size(X,2); % number of instances n = size(W,1); % number of parameters alpha = 0.000025; % gradient descent step size disp('Starting Weights:'); W % 1) do the loop for W estimation using gradient descent ------------ for iter = 1 : 20 % let's just do 5 iterations for now ... not til convergence % for each iteration for i = 1 : n % looping for each parameter W(i) % find the derivative of J(W) for all instances sum_j = 0; for j = 1 : m hW_j = 0; for inner_i = 1 : n hW_j = hW_j + W(inner_i) * X(inner_i,j); end sum_j = sum_j + ( (hW_j - Y(j)) * X(i,j) ) ; end % calculate new value of W(i) W(i) = W(i) - alpha * sum_j; end % plot the thingy newY = (W' * X); gColor = 0.05 * iter; plot(X(2,:),newY, 'Color', [0 gColor 0]); % end 1) ------------ end % output the final weights disp ('Final calculated weights'); W % output actual weights from formula in Red W = [5.775 0.474]' plot(X(2,:),YAct, 'Color',[1 0 0]); % finish off hold off;

Then I figured that I was just doing loops and didn’t really need the
matrices. So, I went back and thought about the whole nested loop thing
and replaced the loop portion with this:

for i = 1 : n % looping for each parameter W(i) % calculate new value of W(i) W(i) = W(i) - alpha * sum( ((W'*X) - Y) .* X(i,:) ); end

Seems suspiciously like the original formula, doesn’t it? Here’s what
the plot look like. The red line shows the linear regression line
calculated from a closed-form formula and the other lines show
estimations of $/theta$
with the darkest colored ones being the initial guess and lighter ones are successive guesses.

And finally, as promised, here’s where you can learn all about this stuff: machine learning video lectures from Stanford
.

【上篇】uva 11549 – Calculator Conundrum （Floyd判圈法）
【下篇】根据客户的需求，做的一个WFCI和JSP的扩展

作者: narcotic

该日志由 narcotic 于11年前发表在综合分类下，最后更新于 2013年08月23日.
转载请注明: Regression with Gradient Descent in Low-level Matlab | 学步园 +复制链接

抱歉!评论已关闭.

学步园

Regression with Gradient Descent in Low-level Matlab

作者: narcotic

书签

最新文章New

本站推荐

返回首页