### Can someone explain PROC TRANSREG's BoxCox outputs

Let's suppose I run PROC TRANSREG with the following model and output
statements:

   model BoxCox(y / lambda=-3 to 3 by 0.20) =
         identity(intercept) / method=univariate;
   output out=BoxCoxOut residuals predicted;

I get a transformation parameter of lambda=3 from my data. Now I
examine the output data set, and I see that, for example, at one point
y takes on the value 15. The output data set also contains Ty, which is
the transformed value of y, and I can work out where its value of
1124.6666667 comes from. The BoxCox formula is (y**lambda-1)/lambda, so
substituting y=15 and lambda=3, I get 1124.6666667. Nice.
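
As a quick check of that arithmetic, here is a minimal DATA step sketch. The variable names are only illustrative, and lambda is simply hard-coded to the value reported above:

```sas
/* Box-Cox transform of a single value, computed by hand */
data _null_;
   y      = 15;
   lambda = 3;                          /* value reported by TRANSREG above */
   ty     = (y**lambda - 1) / lambda;   /* (15**3 - 1)/3 = 3374/3 */
   put ty=;
run;
```

The value written to the log should be about 1124.67, in agreement with the Ty column.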

I also see in the output data set a column Py, which I assume is the
predicted value, and Ry, which is the residual. What is this Py? It
seems to be a constant in this simple case, and it is NOT the mean of
y substituted into the BoxCox formula (y**lambda-1)/lambda. Someone,
please explain what Py is in this output data set. Thanks!

--
Paige Miller
paige\dot\miller \at\ kodak\dot\com


Posted by paige.miller, 3/30/2007 8:00:54 PM

> What is this Py? [...] it is NOT the mean of
> y substituted into the BoxCox formula (y**lambda-1)/lambda.

That is correct, it is not. Py is the predicted value of Ty, not of y.
With only an intercept in the model there is a single predicted value
for every observation, which is why Py looks constant. In the following
example, which has x as a predictor, Py is equal to intercept + beta*x.

Here is an example. HTH

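/* Simulate data in which log(y) is linear in x plus normal noise */
data x;
   do x = 1 to 8 by 0.025;
      y = exp(x + normal(7));
      output;
   end;
run;

/* Box-Cox transformation of y, regressed on x */
proc transreg data=x ss2 details;
   model BoxCox(y / lambda=-3 to 3 by 0.20) =
         identity(x) / method=univariate;
   output out=BoxCoxOut residuals predicted;
run;

/* Estimates copied from the TRANSREG output of this run */
title 'intercept=0.01551366 Identity(x)=0.99578183';
title2 'py=0.01551366 + 0.99578183*x';

/* Recompute the prediction by hand; py2 should match Py */
data BoxCoxOut;
   set BoxCoxOut;
   py2 = 0.01551366 + 0.99578183*x;
run;

/* Print Py next to py2 (py: selects every variable starting with py) */
proc print data=BoxCoxOut;
   var py:;
run;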
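
For the intercept-only model in the original question, a similar check is possible. If Py is the least-squares prediction of Ty, then with no predictors it should equal the mean of Ty, which is generally not the Box-Cox transform of the mean of y. The sketch below only illustrates that difference. It assumes made-up y values and lambda fixed at 3; the data set name toy and its contents are hypothetical and do not come from this thread.

```sas
/* Hypothetical toy data; lambda = 3 is assumed, not estimated */
data toy;
   input y @@;
   ty = (y**3 - 1) / 3;     /* Box-Cox transform of each y with lambda = 3 */
   datalines;
12 15 18 21
;
run;

/* Mean of the transformed values versus the transform of the mean */
proc sql;
   select mean(ty)             as mean_ty,
          (mean(y)**3 - 1) / 3 as boxcox_of_mean_y
   from toy;
quit;
```

The two columns differ whenever y is not constant. Comparing mean_ty, computed on the real data, with the Py column from TRANSREG would be one way to confirm or refute the assumption that Py is the mean of Ty.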


Posted by shiling99, 3/30/2007 8:37:16 PM

I'm not COMPLETELY sure, but I bet it has something to do with the fact
that the mean of the transform is not the transform of the mean.

e.g.

   X      log10(X)
   10     1
   100    2
   1000   3

mean(X) = 1110/3 = 370
mean(log10(X)) = 2
10^2 = 100, not equal to 370

but if X is constant

   X      log10(X)
   10     1
   10     1
   10     1

mean(X) = 10
mean(log10(X)) = 1
10^1 = 10

so that has me puzzled....
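
The first log example above can be reproduced with a few lines of SAS. This is only a sketch, and the data set name logdemo and the column names are made up for illustration:

```sas
/* Mean of log10(x) versus log10 of the mean, for x = 10, 100, 1000 */
data logdemo;
   input x @@;
   logx = log10(x);
   datalines;
10 100 1000
;
run;

proc sql;
   select mean(x)          as mean_x,
          mean(logx)       as mean_logx,
          10**mean(logx)   as back_transformed
   from logdemo;
quit;
```

This should print mean_x = 370, mean_logx = 2 and back_transformed = 100, matching the hand calculation.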

HTH a little, anyway.

Peter

Posted by flom, 3/30/2007 10:00:10 PM

Thanks to those who replied. I think I understand what is happening
now.

--
Paige Miller
paige\dot\miller \at\ kodak\dot\com


Posted by paige.miller, 4/3/2007 1:13:35 PM
