Help with orthogonal least squares line fit with slope constraint...

If I have a group of points, I know I can do a least-squares line fit
in the orthogonal sense by doing an eigenvector-type approach.  The
code I've been using for this is:

    Xm = mean(X);                % mean point (each row of X is a point)
    A = cov(X);
    [V,D] = eig(A);
    [ignore,imx] = max(diag(D)); % choose the eigenvector with the largest eigenvalue
    t = 4;
    x = Xm + V(:,imx)'*t;        % two points on the fitted line, either side of the mean
    t = -t;
    x2 = Xm + V(:,imx)'*t;
    slope = (x2(2)-x(2))/(x2(1)-x(1));
    b = ((x2(2)+x(2))-(slope.*(x(1)+x2(1))))./2;

Now, I am looking to do a best fit line through the data, again in an
(orthogonal) least-squares sense, but this time where the slope of the
fitting line is a pre-determined (fixed) value.  Is there a simple
solution to this problem?  I am currently getting an approximate
solution by rotating the first line about Xm until its slope is as
desired, but of course this doesn't always work well.  I have
considered doing a coordinate transform to make the predetermined
slope direction the x-axis, then doing a standard (non-orthogonal)
regression where the only uncertainty is in the vertical direction,
but I'm guessing there's probably a better way...
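
Concretely, the transform I'm picturing would be something like this
(just a sketch, with m0 the required slope and X the n-by-2 point array):

    theta = atan(m0);
    Xr = X*[cos(theta) -sin(theta); sin(theta) cos(theta)]; % rotate points by -theta
    c = mean(Xr(:,2));   % best horizontal line in the rotated frame
    b = c/cos(theta);    % intercept back in the original frame; the fit is y = m0*x + b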

Any thoughts?  Thanks.
-SL
srjm72499 (54)
12/13/2007 8:21:13 PM

So after you transform the data so that the only direction of 
uncertainty points in the y direction, what do you want to know 
about it?

Are you trying to minimise:

sum((X(:,2) - (slope*X(:,1) + b)).^2) where slope is known?

If so, it's b = Xm(2) - slope * Xm(1);

because the least squares solution always passes through 
the mean.
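
In code that's just something like (with X your n-by-2 array of 
points and slope the given value):

Xm = mean(X);                          % the fit must pass through the mean
b  = Xm(2) - slope*Xm(1);              % intercept of the constrained fit
rvert = X(:,2) - (slope*X(:,1) + b);   % vertical residuals
rorth = rvert/sqrt(1+slope^2);         % orthogonal distances differ only by a constant factor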

Cheers,
Andrew

P.S.

slope = V(2,imx)/V(1,imx);

P.P.S. The regress function is less complicated than your 
eigenvector approach.

Coeffs = regress(X(:,2), [ones(size(X,1),1), X(:,1)]);
b = Coeffs(1); slope = Coeffs(2);





Steve <srjm72499@frontiernet.net> wrote in message 
<8a0a561b-f3b1-4dcd-8c61-20d0fd4f7cf7@i29g2000prf.googlegroups.com>...
> but this time where the slope of the fitting line is a pre-determined
> (fixed) value.  Is there a simple solution to this problem?
awbsmith (68)
12/14/2007 3:24:26 AM
Thanks for your reply.  However, I had always thought that the regress
function does a standard regression and assumes that there is no error
in the x-direction; that is, the error is the vertical distance to the
fitting function (although I certainly could be mistaken).  I am
looking for an error metric where the error is the orthogonal distance
of each point to the line (so there could be errors in both the x and
y positions of the points).  However, after reading your post
(especially the part about passing through the mean), I am not
convinced that the two solutions are different when the slope is given
(although they certainly are when no parameters of the line are
specified - just think about a bunch of points oriented near-vertical).
I think I'll do a check to see if the ordinary regression is also the
orthogonal regression when the slope is given...  Thanks again.
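
Something like this, say (with X the n-by-2 points and m the given slope):

    Xm = mean(X);
    b_vert = Xm(2) - m*Xm(1);                               % intercept from the ordinary (vertical) fit
    orthSS = @(b) sum((X(:,2) - m*X(:,1) - b).^2)/(1+m^2);  % sum of squared orthogonal distances
    b_orth = fminsearch(orthSS, 0);                         % numerically minimize over the intercept
    % b_vert and b_orth should come out essentially equal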
SL
srjm72499 (54)
12/14/2007 5:05:42 PM
FWIW, the distance di of a point (xi,yi) to a horizontal line
(L): y = c is di = |yi-c|.  Note that this is not true if the
line (L) is not horizontal.  That means that working with distances
is equivalent to working with the algebraic y coordinates for a
horizontal line.

- Minimizing the least-squares sum(di^2) gives c = mean(yi).
- Minimizing sum(di) gives c = median(yi).
- Minimizing max({di}) gives c = (max(yi)+min(yi))/2.
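
For instance, with some made-up yi:

y = [1 2 2 3 10]';            % arbitrary sample
c_ls  = mean(y);              % minimizes sum((y - c).^2)
c_l1  = median(y);            % minimizes sum(abs(y - c))
c_inf = (max(y)+min(y))/2;    % minimizes max(abs(y - c))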

Bruno
brunoluong (348)
12/14/2007 6:17:32 PM
Steve <srjm72499@frontiernet.net> wrote in message 
<f8634c9c-5932-454e-a975-5ea9b0931133@t1g2000pra.googlegroups.com>...
> I think I'll do a check to see if the ordinary regression is also the
> orthogonal regression when the slope is given...  Thanks again.
----------
Hello Steve,

  In the case where the slope of the line is fixed at a predetermined value, 
the solution to the best linear fit to a set of points minimizing the mean 
square y-differences is the same as the one minimizing the mean square 
orthogonal distances, namely the unique line with the required slope which 
runs through the mean point of the set.  It is therefore a very simple 
problem.  It is only when the line's slope is also allowed to vary that the 
solutions to the regression problem and the minimum mean square 
orthogonal distance problem become different.

  The latter of these two problems is more difficult than the regression 
problem, because it involves finding the eigenvectors of a matrix.  
Incidentally, the eigenvectors you found can also be obtained from the 
singular value decomposition, using matlab's 'svd' function.  If X is the 
n by 2 matrix of given points in the set, then

 n = size(X,1);
 Xm = mean(X);           % the mean point
 A = X - repmat(Xm,n,1); % offset X to have mean (0,0)
 [U,S,V] = svd(A);       % columns of V are the eigenvectors of A'*A

will give the desired eigenvectors as the columns of V, and conveniently, 
the eigenvector with the largest eigenvalue is guaranteed to be the first 
column of V.  Note also that when the eigenvalues are nearly equal, as 
would be true of a nearly circular distribution, the 'svd' computation is 
somewhat more accurate than the 'eig' function for this problem (as was 
once pointed out to me by John D'Errico.)
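
From there the fitted line itself follows directly, for instance:

 slope = V(2,1)/V(1,1);   % first column of V gives the direction of the fit
 b = Xm(2) - slope*Xm(1); % the orthogonal fit passes through the mean point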

Roger Stafford

12/14/2007 6:58:13 PM