Empty columns

  • Follow


Hi,

I am new here, and it is my first post on this newsgroup.

I am using :
SAS 9.1.3 Service Pack 3
XP_PRO plate-forme

My Goal :
I have a table with 400 columns and thousands lines.
I know some of the columns are empty.
I would like to write a SAS program which display anywhere the name of
the columns which are empty.

My problem :
The columns have not a similar name, for example the first is "year",
the second is "abcd" etc and I do not want to write a line for each
column like this
[code]
do obsnum=1 to 1 000 000;
   if (year=".")

end;
[/code]

I know there is a way of typing only one time the name of the columns,
using an array, like this :
[code]
array  cols{11} $200 STATE COUNTY PR_TYPE LAST_NM FIRST_NM
                        DEGREE TAX_ID ADDR CITY ZIP PHONE   ;
[/code]
But Is there any other way, smarter than typing 400 columns names ?

PS : I am a french guy and I apologize for mistakes.

0
Reply finiderir3 (1) 4/16/2007 1:39:46 PM

Run a proc means on your data set.  Any empty variables will have
minimum and maximum equal to missing.  Create a subset using the mean
dataset, and a where statement to isolate any such variables.

HTH,
Reeza


0
Reply fkhurshed (247) 4/16/2007 1:50:18 PM


You may want to try coding like the following:

  array Num(*) _numeric_;
  array Str(*) $ _character_;
  do i=1 to dim(Num);
    if Num(i) eq . then do;
        ...
    end;
  end;
  do i=1 to dim(Str);
    if Str(i) eq ' ' then do;
        ...
    end;
  end;

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L@LISTSERV.UGA.EDU] On Behalf Of
finiderir3@gmail.com
Sent: Monday, April 16, 2007 8:40 AM
To: SAS-L@LISTSERV.UGA.EDU
Subject: Empty columns

Hi,

I am new here, and it is my first post on this newsgroup.

I am using :
SAS 9.1.3 Service Pack 3
XP_PRO plate-forme

My Goal :
I have a table with 400 columns and thousands lines.
I know some of the columns are empty.
I would like to write a SAS program which display anywhere the name of
the columns which are empty.

My problem :
The columns have not a similar name, for example the first is "year",
the second is "abcd" etc and I do not want to write a line for each
column like this [code] do obsnum=1 to 1 000 000;
   if (year=".")

end;
[/code]

I know there is a way of typing only one time the name of the columns,
using an array, like this :
[code]
array  cols{11} $200 STATE COUNTY PR_TYPE LAST_NM FIRST_NM
                        DEGREE TAX_ID ADDR CITY ZIP PHONE   ;
[/code]
But Is there any other way, smarter than typing 400 columns names ?

PS : I am a french guy and I apologize for mistakes.


-----Message Disclaimer-----

This e-mail message is intended only for the use of the individual or
entity to which it is addressed, and may contain information that is
privileged, confidential and exempt from disclosure under applicable law.
If you are not the intended recipient, any dissemination, distribution or
copying of this communication is strictly prohibited. If you have
received this communication in error, please notify us immediately by
reply email to Connect@principal.com and delete or destroy all copies of
the original message and attachments thereto. Email sent to or from the
Principal Financial Group or any of its member companies may be retained
as required by law or regulation.

Nothing in this message is intended to constitute an Electronic signature
for purposes of the Uniform Electronic Transactions Act (UETA) or the
Electronic Signatures in Global and National Commerce Act ("E-Sign")
unless a specific statement to the contrary is included in this message.

While this communication may be used to promote or market a transaction
or an idea that is discussed in the publication, it is intended to provide
general information about the subject matter covered and is provided with
the understanding that The Principal is not rendering legal, accounting,
or tax advice. It is not a marketed opinion and may not be used to avoid
penalties under the Internal Revenue Code. You should consult with
appropriate counsel or other advisors on all matters pertaining to legal,
tax, or accounting obligations and requirements.
0
Reply Huang.JS (242) 4/16/2007 1:59:19 PM

Here is a solution that is similar to Reeza's but a little more automated.
Please note that this will not work with any character data. Character
variables will have to be dealt with differently. There may be an sql
solution but I am not the person to give you that answer.

data a;
input a b c d;
cards;
1 . 3 4
2 . 4 5
3 . 5 6
proc means noprint data=a ;
     output out=a nmiss=;
       var _numeric_;
run;

proc transpose data=a out=a(where=(col1=0));
proc print;
run;

By the way, I saw no mistakes with your English. You wouldn't even want to
try my French, however.

Nat Wooding
Environmental Specialist III
Dominion, Environmental Biology
4111 Castlewood Rd
Richmond, VA 23234
Phone:804-271-5313, Fax: 804-271-2977



             "finiderir3@gmail
             .com"
             <finiderir3@GMAIL                                          To
             .COM>                     SAS-L@LISTSERV.UGA.EDU
             Sent by: "SAS(r)                                           cc
             Discussion"
             <SAS-L@LISTSERV.U                                     Subject
             GA.EDU>                   Empty columns


             04/16/2007 09:39
             AM


             Please respond to
             "finiderir3@gmail
                   .com"
             <finiderir3@GMAIL
                   .COM>






Hi,

I am new here, and it is my first post on this newsgroup.

I am using :
SAS 9.1.3 Service Pack 3
XP_PRO plate-forme

My Goal :
I have a table with 400 columns and thousands lines.
I know some of the columns are empty.
I would like to write a SAS program which display anywhere the name of
the columns which are empty.

My problem :
The columns have not a similar name, for example the first is "year",
the second is "abcd" etc and I do not want to write a line for each
column like this
[code]
do obsnum=1 to 1 000 000;
   if (year=".")

end;
[/code]

I know there is a way of typing only one time the name of the columns,
using an array, like this :
[code]
array  cols{11} $200 STATE COUNTY PR_TYPE LAST_NM FIRST_NM
                        DEGREE TAX_ID ADDR CITY ZIP PHONE   ;
[/code]
But Is there any other way, smarter than typing 400 columns names ?

PS : I am a french guy and I apologize for mistakes.



-----------------------------------------
CONFIDENTIALITY NOTICE:  This electronic message contains
information which may be legally confidential and/or privileged and
does not in any case represent a firm ENERGY COMMODITY bid or offer
relating thereto which binds the sender without an additional
express written confirmation to that effect.  The information is
intended solely for the individual or entity named above and access
by anyone else is unauthorized.  If you are not the intended
recipient, any disclosure,  copying, distribution, or use of the
contents of this information is prohibited and may be unlawful.  If
you have received this electronic  transmission in error, please
reply immediately to the sender that you have received the message
in error, and delete it.  Thank you.
0
Reply nathaniel.wooding (1453) 4/16/2007 2:02:35 PM

One of many solutions:

%let cvars=;
%let nvars=;

proc sql;
  select compress(name) into :cvars separated by " " from
  sashelp.vcolumn
  where libname="SASHELP" and memname="CLASS" and type="char";
  select compress(name) into :nvars separated by " " from
  sashelp.vcolumn
  where libname="SASHELP" and memname="CLASS" and type="num";

quit;

data all;
  set sashelp.class;
  array nums(*) &nvars;
  array chars(*) &cvars;
  do i=1 to dim(nums);
     put nums(i)=;
  end;
  do i=1 to dim(chars);
     put chars(i)=;
  end;
run;

I think you can adapt that to your needs.
Gerhard






On Mon, 16 Apr 2007 06:39:46 -0700, finiderir3@gmail.com
<finiderir3@GMAIL.COM> wrote:

>Hi,
>
>I am new here, and it is my first post on this newsgroup.
>
>I am using :
>SAS 9.1.3 Service Pack 3
>XP_PRO plate-forme
>
>My Goal :
>I have a table with 400 columns and thousands lines.
>I know some of the columns are empty.
>I would like to write a SAS program which display anywhere the name of
>the columns which are empty.
>
>My problem :
>The columns have not a similar name, for example the first is "year",
>the second is "abcd" etc and I do not want to write a line for each
>column like this
>[code]
>do obsnum=1 to 1 000 000;
>   if (year=".")
>
>end;
>[/code]
>
>I know there is a way of typing only one time the name of the columns,
>using an array, like this :
>[code]
>array  cols{11} $200 STATE COUNTY PR_TYPE LAST_NM FIRST_NM
>                        DEGREE TAX_ID ADDR CITY ZIP PHONE   ;
>[/code]
>But Is there any other way, smarter than typing 400 columns names ?
>
>PS : I am a french guy and I apologize for mistakes.
0
Reply gerhard.hellriegel (2531) 4/16/2007 2:11:24 PM

On Mon, 16 Apr 2007 06:39:46 -0700, finiderir3@gmail.com
<finiderir3@GMAIL.COM> wrote:

>Hi,
>
>I am new here, and it is my first post on this newsgroup.
>
>I am using :
>SAS 9.1.3 Service Pack 3
>XP_PRO plate-forme
>
>My Goal :
>I have a table with 400 columns and thousands lines.
>I know some of the columns are empty.
>I would like to write a SAS program which display anywhere the name of
>the columns which are empty.
....
Hi,
Sometimes searching the archive helps. I found several "elegant" solutions
to a bit more generalized problem, i.e., finding the number of missings for
all the variables. Follow this thread:
http://www.listserv.uga.edu/cgi-bin/wa?A2=ind0109D&L=sas-l&D=1&H=0&O=D&T=1&P=19370
Cheers,
Chang
0
Reply chang_y_chung (1988) 4/16/2007 2:49:42 PM

You might find a approach similar to this helpful.  This is something
like a solution I saw here on SAS-L

proc format;
   value allmiss .=. other=1;
   value $allmiss ' '=' ' other='1';
   run;
data work.test;
   set sashelp.class;
   retain missN missC;
   call missing(missN,missC);
   if _n_ eq 2 then call missing(of _character_);
   if _n_ eq 5 then call missing(of _numeric_);
   run;
ods trace on;
ods output OneWayFreqs=OneWayFreqs(keep=table frequency where=(frequency=._));
proc freq;
   tables _all_;* / missing;
   format  _character_ $allmiss. _numeric_ allmiss.;
   run;
ods trace off;
%let allMiss=;
proc sql noprint;
   select scan(table,-1,' ') into :allmiss separated by ' '
   from OneWayFreqs;
   quit;
   run;
%put _user_;

On 4/16/07, finiderir3@gmail.com <finiderir3@gmail.com> wrote:
> Hi,
>
> I am new here, and it is my first post on this newsgroup.
>
> I am using :
> SAS 9.1.3 Service Pack 3
> XP_PRO plate-forme
>
> My Goal :
> I have a table with 400 columns and thousands lines.
> I know some of the columns are empty.
> I would like to write a SAS program which display anywhere the name of
> the columns which are empty.
>
> My problem :
> The columns have not a similar name, for example the first is "year",
> the second is "abcd" etc and I do not want to write a line for each
> column like this
> [code]
> do obsnum=1 to 1 000 000;
>   if (year=".")
>
> end;
> [/code]
>
> I know there is a way of typing only one time the name of the columns,
> using an array, like this :
> [code]
> array  cols{11} $200 STATE COUNTY PR_TYPE LAST_NM FIRST_NM
>                        DEGREE TAX_ID ADDR CITY ZIP PHONE   ;
> [/code]
> But Is there any other way, smarter than typing 400 columns names ?
>
> PS : I am a french guy and I apologize for mistakes.
>
0
Reply datanull (3058) 4/16/2007 2:57:31 PM

On Mon, 16 Apr 2007 10:57:31 -0400, data _null_; <datanull@GMAIL.COM> wrote:

>You might find a approach similar to this helpful.  This is something
>like a solution I saw here on SAS-L
>
>proc format;
>   value allmiss .=. other=1;
>   value $allmiss ' '=' ' other='1';
>   run;
>data work.test;
>   set sashelp.class;
>   retain missN missC;
>   call missing(missN,missC);
>   if _n_ eq 2 then call missing(of _character_);
>   if _n_ eq 5 then call missing(of _numeric_);
>   run;
>ods trace on;
>ods output OneWayFreqs=OneWayFreqs(keep=table frequency where=(frequency=._));
>proc freq;
>   tables _all_;* / missing;
>   format  _character_ $allmiss. _numeric_ allmiss.;
>   run;
>ods trace off;
>%let allMiss=;
>proc sql noprint;
>   select scan(table,-1,' ') into :allmiss separated by ' '
>   from OneWayFreqs;
>   quit;
>   run;
>%put _user_;
....
Hi,
The combination of the ods output destination and proc freq seems to be the
winner here. Even the formats are not necessary! An implementation is below.
(winXP sas 9.1.3 sp4) HTH.
Cheers,
Chang

/* test data - height and name are all missing */
data one;
  set sashelp.class;
  name = "";
  height = .;
  if _n_ < 4 then age = .;
run;

/* list vars with all missing */
ods listing close;
ods output oneWayFreqs=f(keep=table frequency cumFrequency);
  proc freq data=one;
    tables _all_/missprint; /* this option is critical! */
  run;
ods output close;
ods listing;

proc sql;
  select cats(tranwrd(table, "Table ", " ")) as var
         label="all missing on:"
  from   f
  group by calculated var
  having frequency = sum(frequency) and cumFrequency=.;
quit;
/* on log
all missing on:
---------------
Height
Name
*/
0
Reply chang_y_chung (1988) 4/16/2007 4:15:40 PM

I used the FORMAT because I believe it will speed up the PROC FREQ
step considerably when the data are large and especially with high
cardinality.

On 4/16/07, Chang Chung <chang_y_chung@hotmail.com> wrote:
> On Mon, 16 Apr 2007 10:57:31 -0400, data _null_; <datanull@GMAIL.COM> wrote:
>
> >You might find a approach similar to this helpful.  This is something
> >like a solution I saw here on SAS-L
> >
> >proc format;
> >   value allmiss .=. other=1;
> >   value $allmiss ' '=' ' other='1';
> >   run;
> >data work.test;
> >   set sashelp.class;
> >   retain missN missC;
> >   call missing(missN,missC);
> >   if _n_ eq 2 then call missing(of _character_);
> >   if _n_ eq 5 then call missing(of _numeric_);
> >   run;
> >ods trace on;
> >ods output OneWayFreqs=OneWayFreqs(keep=table frequency where=(frequency=._));
> >proc freq;
> >   tables _all_;* / missing;
> >   format  _character_ $allmiss. _numeric_ allmiss.;
> >   run;
> >ods trace off;
> >%let allMiss=;
> >proc sql noprint;
> >   select scan(table,-1,' ') into :allmiss separated by ' '
> >   from OneWayFreqs;
> >   quit;
> >   run;
> >%put _user_;
> ...
> Hi,
> The combination of the ods output destination and proc freq seems to be the
> winner here. Even the formats are not necessary! An implementation is below.
> (winXP sas 9.1.3 sp4) HTH.
> Cheers,
> Chang
>
> /* test data - height and name are all missing */
> data one;
>  set sashelp.class;
>  name = "";
>  height = .;
>  if _n_ < 4 then age = .;
> run;
>
> /* list vars with all missing */
> ods listing close;
> ods output oneWayFreqs=f(keep=table frequency cumFrequency);
>  proc freq data=one;
>    tables _all_/missprint; /* this option is critical! */
>  run;
> ods output close;
> ods listing;
>
> proc sql;
>  select cats(tranwrd(table, "Table ", " ")) as var
>         label="all missing on:"
>  from   f
>  group by calculated var
>  having frequency = sum(frequency) and cumFrequency=.;
> quit;
> /* on log
> all missing on:
> ---------------
> Height
> Name
> */
>
0
Reply datanull (3058) 4/16/2007 4:40:57 PM

This test program creates a test dataset (NUM2 and CHAR1 are all
missing), then uses SQL first to generate some code snippets (macro
var NAMES), then to run the snippets (CREATE TABLE).  Output dataset
COUNTS has the same number of variables as input dataset TEST.  The
values will be 1 if the sum of the number of missing values equals the
number of observations in the dataset (i.e., all missing), 0
otherwise.

Regards,
Frank DiIorio

data test;
input num1 num2 char1 $;
datalines;
1 . .
.. . .
2 . .
;

proc sql noprint;
     select catx(' ', '(sum(missing(', name, ")) = count(*)) as ",
name)
     into :names separated by ', '
     from dictionary.columns
     where libname='WORK' & memname='TEST'
     ;
     create table counts as
     select &names.
     from test
     ;
quit;

On Apr 16, 8:39 am, "finider...@gmail.com" <finider...@gmail.com>
wrote:
> My Goal :
> I have a table with 400 columns and thousands lines.
> I know some of the columns are empty.
> I would like to write a SAS program which display anywhere the name of
> the columns which are empty.
>
> I know there is a way of typing only one time the name of the columns,
> using an array, like this :
> [code]
> array  cols{11} $200 STATE COUNTY PR_TYPE LAST_NM FIRST_NM
>                         DEGREE TAX_ID ADDR CITY ZIP PHONE   ;
> [/code]
> But Is there any other way, smarter than typing 400 columns names ?
>
> PS : I am a french guy and I apologize for mistakes.


0
Reply frank_diiorio (39) 4/17/2007 1:29:41 AM

Hi,

A similar question was previously posted.
Try this code and tell me whether it does not work:

%macro drop (data = );

	proc contents 	data = &data noprint out = list_var (keep = name);
	run;

	data _null_;
		set list_var;
		call symput("Var"!!compress(_n_), compress(name));
		call symput('N',compress(_n_));
	run;

	%global drop_list;
	%let drop_list = ;

	%do i = 1 %to &N;

		proc sql noprint;
			select count(distinct &&var&i ) into: NBVAL from &data;
		quit;

		%if &nbval = 0 %then %let drop_list = &drop_list &&var&i ;

	%end;

	/*data out;
		set &data;
		drop &drop_list;
	run;*/

	%put &drop_list;

%mend drop;

I suppose it would be also possible to use a data _null_ step and a
call execute statement once the data set list_var is produced.

Regards,

Catherine.

0
Reply cat.b41 (102) 4/19/2007 4:58:48 AM

10 Replies
32 Views

(page loaded in 0.231 seconds)

Similiar Articles:













7/26/2012 3:07:34 AM


Reply: