
#### Re: missing numerical values = - infinity?

```
The main problem revealed in this thread is the non-intuitive
results of applying a comparison operator to missing values
for numeric variables.  There seems to be no major complaint
about how missing values are treated in an arithmetic
expression.

So, what if SAS introduced a new class of comparison
operators, like the below?

X := Y        X :^= Y
X :< Y        X :^< Y
X :> Y        X :^> Y
X :>= Y
X :<= Y

Let's say all of these comparison operators would generate a
TRUE or FALSE only when both X and Y are not missing.
Otherwise they would result in a missing value.  That is, in
the assignment statements below, Z could be ., 0, or 1.

Z = X:=Y ;
Z = X:^=Y ;

This would have no deleterious effect on extant programs, but
would provide a welcome flexibility in dealing with a long-
standing problem.  There would be no need for

if nmiss(x,y)=0 then result = (X>Y);

Instead you could have:

Result = (x:>y);

Introducing these operators to the OP's example of

if birth_weight :< 2500 then low_birth_weight=1;
else if birth_weight :>= 2500 then low_birth_weight=0;

would result in low_birth_weight = . if birth_weight is missing.
Just what the OP wanted.

Is this notion of enough interest to be thought through
more thoroughly, with the possibility of adding it to the
SASware Ballot?

Regards,
Mark

> -----Original Message-----
> From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
> Peter Flom
> Sent: Tuesday, December 29, 2009 4:08 PM
> To: SAS-L@LISTSERV.UGA.EDU
> Subject: Re: missing numerical values = - infinity?
>
> Dale McLerran <stringplayer_2@YAHOO.COM> wrote
> >Peter,
> >
> >I don't like creating additional variables to contain information
> >about the missing values.  It increases file size and complexity.
> >I much prefer having the information about different kinds of
> >missing values coded directly with the variable when the response
> >is missing.
> >
> >Dale
> >
>
> I can see the point of that.
>
> I guess it's partly a matter of taste, partly what you are used to
> doing, and partly a question of how often you need these different
> missing values.
>
> Peter
>
> Peter L. Flom, PhD
> Statistical Consultant
> Website: http://www DOT statisticalanalysisconsulting DOT com/
> Writing; http://www.associatedcontent.com/user/582880/peter_flom.html
```
mkeintz
12/30/2009 12:10:42 AM
comp.soft-sys.sas 142828 articles. 3 followers.
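
The effect of the proposed `:>` operator can already be approximated with existing functions. What follows is a minimal sketch, not a SAS feature: the data set name `demo` and the variables `plain` and `guarded` are made up for illustration. It contrasts an ordinary comparison, where a missing value sorts below every number, with a version that returns missing whenever either operand is missing.

```sas
/* Sketch: emulate the proposed X :> Y with features SAS already has. */
data demo;
  input x y;

  /* Ordinary comparison: a missing value compares lower than any number, */
  /* so the result is always 0 or 1, even when an operand is missing.     */
  plain = (x > y);

  /* Guarded comparison: NMISS counts the missing numeric arguments.      */
  /* Compare only when none are missing; otherwise return missing.        */
  if nmiss(x, y) = 0 then guarded = (x > y);
  else guarded = .;

  datalines;
1 2
3 2
. 2
3 .
. .
;
run;

proc print data = demo;
run;
```

For the first two rows both columns agree (0, then 1). For the three rows with a missing operand, `plain` is still 0 or 1 while `guarded` is missing, which is the three-valued result the post asks for. The OP's birth weight case is the same pattern with a constant 2500 in place of `y`.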


```
Hi, all,

Thought this code might be of interest, especially with regard to Dan
Nordlund's post with a reference about floating point reps:

------------------------------------------- Code -------------------------------------------------

options noovp noerrorabend nocenter mprint mprintnest ls = 80 ps = 54
compress = yes spool nosymbolgen nomlogic source source2;
title "SAS v&sysvlong on &sysscpl";

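data missings ( keep = x xhex xaddrl );
  /* special missing values ._ and .A-.Z, plus ordinary missing (from '. ') */
  do ade = rank( '_' ), rank( ' ' ), rank( 'A' ) to rank( 'Z' );
    x = input( '.' || byte( ade ), ??best. );
    link output;   /* assumed: the LINK calls were lost from the posted code */
  end;
  /* a range of ordinary numeric values, from -BIG to BIG */
  do a = -constant( 'BIG' ), -1, -constant( 'SMALL' ),
         0,
         constant( 'SMALL' ), .1, .2, .3, .4, .5, .6, .7, .8, .9, 1,
         2, 5, 9, 10, constant( 'BIG' );
    x = a;
    link output;
  end;
  stop;

  /* subroutine: dump the 8 bytes of x as stored in memory, then write a row */
  output:
  xaddrl = addrlong( x );   /* assumed: address of x, needed by PEEKCLONG */
  xhex = put( peekclong( xaddrl, 8 ), $hex16. );
  output;
  return;
run;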

proc print width = min data = missings;
var xhex x;
format x best32.;
run;

---------------------------------- Output ------------------------------------------

SAS v9.01.01M3P020206 on XP_PRO           01:04 Wednesday, December 30, 2009   1

Obs          xhex                              x

1    0000000000D2FFFF                        _
2    0000000000D1FFFF                        .
3    0000000000BEFFFF                        A
4    0000000000BDFFFF                        B
5    0000000000BCFFFF                        C
6    0000000000BBFFFF                        D
7    0000000000BAFFFF                        E
8    0000000000B9FFFF                        F
9    0000000000B8FFFF                        G
10    0000000000B7FFFF                        H
11    0000000000B6FFFF                        I
12    0000000000B5FFFF                        J
13    0000000000B4FFFF                        K
14    0000000000B3FFFF                        L
15    0000000000B2FFFF                        M
16    0000000000B1FFFF                        N
17    0000000000B0FFFF                        O
18    0000000000AFFFFF                        P
19    0000000000AEFFFF                        Q
20    0000000000ADFFFF                        R
21    0000000000ACFFFF                        S
22    0000000000ABFFFF                        T
23    0000000000AAFFFF                        U
24    0000000000A9FFFF                        V
25    0000000000A8FFFF                        W
26    0000000000A7FFFF                        X
27    0000000000A6FFFF                        Y
28    0000000000A5FFFF                        Z
29    FFFFFFFFFFFFEFFF     -1.7976931348623E308
30    000000000000F0BF                       -1
31    0000000000001080    -2.2250738585072E-308
32    0000000000000000                        0
33    0000000000001000     2.2250738585072E-308
34    9A9999999999B93F                      0.1
35    9A9999999999C93F                      0.2
36    333333333333D33F                      0.3
37    9A9999999999D93F                      0.4
38    000000000000E03F                      0.5
39    333333333333E33F                      0.6
40    666666666666E63F                      0.7
41    9A9999999999E93F                      0.8
42    CDCCCCCCCCCCEC3F                      0.9
43    000000000000F03F                        1
44    0000000000000040                        2
45    0000000000001440                        5
46    0000000000002240                        9
47    0000000000002440                       10
48    FFFFFFFFFFFFEF7F      1.7976931348623E308
```
droide
12/30/2009 6:06:53 AM
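
The `xhex` column lists the eight bytes in memory order, which on a little-endian Windows host is least significant byte first. A minimal sketch, assuming that byte order, reverses three of the strings from the listing and reads them back with the `HEX16.` informat. The names `decode`, `memhex`, `ieeehex`, and `value` are illustrative only.

```sas
/* Sketch: turn memory-order hex (as in the xhex column) back into numbers. */
data decode;
  length memhex ieeehex $16;
  input memhex $16.;

  /* Reverse the byte order: take the two-digit pairs from right to left. */
  ieeehex = ' ';
  do i = 15 to 1 by -2;
    ieeehex = cats( ieeehex, substr( memhex, i, 2 ) );
  end;

  /* HEX16. reads 16 hex digits as a floating-point representation. */
  value = input( ieeehex, hex16. );
  drop i;

  datalines;
000000000000F03F
9A9999999999B93F
0000000000002440
;
run;

proc print data = decode;
  format value best32.;
run;
```

On such a host the three values should read back as 1, 0.1, and 10, matching observations 43, 34, and 47 in the listing. The reversed string for 0.1, `3FB999999999999A`, also shows why 0.1 has no exact binary representation: the repeating `9` digits are cut off and rounded in the last place.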
```
"Keintz, H. Mark" <mkeintz@WHARTON.UPENN.EDU> wrote in message
> The main problem revealed in this thread is the non-intuitive
> results of applying a comparison operator to missing values
> for numeric variables.  There seems to be no major complaint
> about how missing values are treated in an arithmetic
> expression.
>
> So, what if SAS introduced a new class of comparison
> operators, like the below?
>
>  X := Y        X :^= Y
>  X :< Y        X :^< Y
>  X :> Y        X :^> Y
>  X :>= Y
>  X :<= Y
>
> Let's say all of these comparison operators would generate a
> TRUE or FALSE only when both X and Y are not missing.
> Otherwise they would result in a missing value.  That is, in
> the assignment statements below, Z could be ., 0, or 1.

Not sure I see the point - at the moment, missing values and zeroes are
FALSE, positive and negative numbers are TRUE.  If your new comparison
operators resulted in a missing value, the comparison would evaluate to
FALSE, and SAS would presumably slide in a zero.  I couldn't guess how many
programs would be affected if this behavior were changed.

I don't think it's possible to make a language "intuitive" to everyone.  I
worked with a guy once who got angrily agitated when it turned out that the
NODUP option of PROC SORT did not give him what he wanted.  He should have
used NODUPKEY.  He couldn't be bothered to look it up, just assumed that SAS
would do what he wanted regardless of what he typed.

The lesson here is to RTFM.  Missing values are pretty thoroughly
documented, and have been for upwards of 30 years.

```
Lou
12/30/2009 2:15:05 PM
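
The truth rule Lou describes is easy to check. In this minimal sketch (the data set `truth` and the variable `verdict` are illustrative names), a bare numeric value in an IF condition is treated as false when it is zero or missing, including a special missing value such as `.A`, and as true otherwise.

```sas
/* Sketch: how a bare numeric value behaves as an IF condition. */
data truth;
  length verdict $5;
  do x = ., .A, 0, -1, 2;
    /* zero and every kind of missing value take the ELSE branch */
    if x then verdict = 'TRUE';
    else verdict = 'FALSE';
    output;
  end;
run;

proc print data = truth;
run;
```

Only -1 and 2 come out TRUE. A comparison operator that returned a missing value would therefore take the ELSE branch of any IF statement it appeared in, so the third state would show up only when the result is assigned to a variable.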