Can someone explain PROC TRANSREG's BoxCox outputs?


Let's suppose I run PROC TRANSREG with the following model and output
statements:

    model BoxCox(y/lambda=-3 TO 3 by 0.20)=
          identity(intercept) method=univariate;
    output out=BoxCoxOut residuals predicted;

I get a transformation parameter of lambda=3 from my data. Now I
examine the output data set, and I see that, for example at one point,
y takes on the value 15. Now the output data set contains Ty, which is
the transformed value of y. I can compute where the Ty of 1124.6666667
comes from. The BoxCox formula is (y**lambda-1)/lambda, so
substituting y=15 and lambda=3, I get 1124.6666667. Nice.
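
As a quick check of that arithmetic, here is a minimal sketch of the same calculation in a DATA step. It is not taken from TRANSREG itself. The variable names are made up for illustration, and the lambda = 0 branch (natural log) is the usual Box-Cox limiting case rather than something stated in this thread.

```sas
/* Sketch: reproduce Ty for one observation by hand.           */
/* y = 15 and lambda = 3 are the values quoted in the post.    */
data _null_;
   y = 15;
   lambda = 3;
   if lambda ne 0 then ty = (y**lambda - 1) / lambda;  /* Box-Cox, lambda ne 0    */
   else ty = log(y);                                   /* limiting case, lambda=0 */
   put ty= 14.7;                                       /* write ty to the SAS log */
run;
```

Since (15**3 - 1)/3 = 3374/3, the value written to the log should be about 1124.67.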

I also see in the output data set a column Py, which I assume is the
predicted value, and Ry, which is the residual. What is this Py? It
seems to be a constant in this simple case, and it is NOT the mean of
y substituted into the BoxCox formula (y**lambda-1)/lambda. Someone,
please explain what Py is in this output data set. Thanks!

--
Paige Miller
paige\dot\miller \at\ kodak\dot\com

Reply paige.miller (581) 3/30/2007 8:00:54 PM

> What is this Py?

> it is NOT the mean of
> y substituted into the BoxCox formula (y**lambda-1)/lambda.

That is correct, it is not. Py is the predicted value of Ty, not of y.
With an intercept-only model there is only one predicted value, which
is why Py looks constant in your case. With a regressor, Py is equal to
intercept + beta*x, as in the following example.

Here is an example.
HTH

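   /* Simulate y so that log(y) is linear in x (seed 7 for repeatability) */
   data x;
      do x = 1 to 8 by 0.025;
         y = exp(x + normal(7));
         output;
      end;
   run;

   /* Fit the Box-Cox model; Ty, Py, and Ry are written to BoxCoxOut */
   proc transreg data=x ss2 details;
      model BoxCox(y / lambda=-3 to 3 by 0.20) =
            identity(x) / method=univariate;
      output out=BoxCoxOut residuals predicted;
   run;

   /* The coefficients below are copied from the output of this run */
   title 'intercept=0.01551366 Identity(x)=0.99578183';
   title2 'py=0.01551366 + 0.99578183*x';

   /* Recompute the prediction by hand as py2 */
   data BoxCoxOut;
      set BoxCoxOut;
      py2 = 0.01551366 + 0.99578183*x;
   run;

   /* Print Py next to py2; the two columns should agree */
   proc print data=BoxCoxOut;
      var py:;
   run;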
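
A possible way to make the same check without typing the coefficients in by hand is to refit Ty on x with ordinary least squares and compare the predictions with Py. This is only a sketch. It assumes BoxCoxOut from the example above still holds Ty, Py, and x, and the names RegCheck and py_reg are made up for illustration.

```sas
/* Sketch: refit Ty on x by ordinary least squares and compare with Py. */
/* Assumes BoxCoxOut from the example above (variables Ty, Py, x).      */
proc reg data=BoxCoxOut;
   model Ty = x;
   output out=RegCheck p=py_reg;   /* py_reg: hypothetical name for the OLS prediction */
run;
quit;

/* Py and py_reg should agree if Py is the OLS prediction of Ty */
proc print data=RegCheck(obs=10);
   var x Ty Py py_reg;
run;
```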

Reply shiling99 (640) 3/30/2007 8:37:16 PM


I'm not COMPLETELY sure, but I bet it has something to do with the fact
that the mean of the transform is not the transform of the mean.

e.g. (logs are base 10)

X       log X
10      1
100     2
1000    3

mean(X) = 1110/3 = 370
mean(log(X)) = 2
10^2 = 100, which is not equal to 370

but if X is constant

10        1
10        1
10        1

mean(X) = 10
mean(log(X)) = 1
10 ^ 1 = 10

so that has me puzzled....
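
The same effect can be seen with the Box-Cox transform itself. The sketch below uses made-up y values and sets lambda = 3 by hand (nothing here is estimated by TRANSREG). If the intercept-only fit is ordinary least squares on Ty, its single predicted value would be the mean of Ty, which would explain why Py does not match the formula applied to the mean of y.

```sas
/* Toy data (hypothetical); lambda = 3 is set by hand, not estimated */
data toy;
   input y @@;
   ty = (y**3 - 1) / 3;   /* Box-Cox transform with lambda = 3 */
   datalines;
5 10 15 20
;
run;

/* Mean of y and mean of Ty */
proc means data=toy mean;
   var y ty;
run;

/* Box-Cox transform of the mean of y, for comparison */
data _null_;
   ybar = 12.5;                      /* mean of 5, 10, 15, 20 */
   t_of_mean = (ybar**3 - 1) / 3;
   put t_of_mean=;
run;
```

For these four values the mean of Ty is about 1041.33, while the transform of the mean of y (12.5) is about 650.71.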

HTH a little, anyway.

Peter
Reply flom (915) 3/30/2007 10:00:10 PM

Thanks to those who replied. I think I understand what is happening
now.

--
Paige Miller
paige\dot\miller \at\ kodak\dot\com

Reply paige.miller (581) 4/3/2007 1:13:35 PM
