read and merge lots of txt files, macro?

  • Follow


1000+ txt files. two types. Type one file contains a list of varible
names (character var). Type two file contains columns of real data
with ','. I need to read in all txt files into sas files, pair up the
one has Var names with the one has read number. Eventually I will
creat new dataset with columns of data and the first row is the var
names.
For example, the file 'D_Ctable_3_2000' is like:

                                        title wording...
                        (blank line)
                                 (blank line )
         VARIABLE    FIELD   DEC.    VARIABLE
                NAME    TYPE   POS.    DESCRIPTION
  ---------------  -------  ----
--------------------------------------
(row8)    Sales      Numeric     0  blah,blah
             Prices     Numeric     0  blah,blah
             Year       Numeric     0  blah,blah
            MONTH    Numeric     0  blah,blah
             State      Numeric     0  blah,blah
           market      Numeric     0  blah,blah

Then the second type of files are numbers only, column by column, such
as 'C_table_3_2000'

1,2,3,4,5,6,
11,22,33,44,55,66
..............

I want to read in all files, pair 'D_C_table_3_2000' with
"C_table_3_2000", create 'table_3_2000' which contains both var names
and values. My whole set of data files are titled as
'D_C_table_3_2000', 'D_H_table_6_2008', etc..

I have a macro which saves file names into a excel file with name and
path. It just I need to subsitute Proc Import with Data step since I
want to use "firstobs=8" "missover" 'infile var $ 1-17', this kind of
things.

Millions of thanks!!!
0
Reply lvhome2009 (4) 1/19/2010 10:53:58 PM

I am not sure what good an excel file would do you.
It also seems strange that your filenames do not have extensions.

If you make a macro that takes as input the filename and then reads
the descriptor file and generates code to read the data file.
Something like this might work.

  %macro input( file );

     data attrib;
       infile "D_&file" firstobs=3D8 truncover end=3Deof;
       attrib name $32 type $10 dec 8 label $40. ;
       input name type dec label $40.;
       file "&file..sas" ;
       if _n_=3D1 then put
          "date &file;"
         /"infile '" "&file" "' dsd ;"
         /'input'
       ;
       put name @;
       if upcase(type) ne 'NUMERIC' then put '$ ' @;
       if eof then put ';' / 'run;';
     run;
     %inc "&file..sas";
  %mend;

Then read the list of files and call it once for each file pair.

filename files pipe "dir /b D_*";
data _null_;
   infile files truncover;
   input file $40.;
   call execute( '%input(' || file || ')' );
run;


On Jan 19, 5:53=A0pm, slightmoon <lvhome2...@gmail.com> wrote:
> 1000+ txt files. two types. Type one file contains a list of varible
> names (character var). Type two file contains columns of real data
> with ','. I need to read in all txt files into sas files, pair up the
> one has Var names with the one has read number. Eventually I will
> creat new dataset with columns of data and the first row is the var
> names.
> For example, the file 'D_Ctable_3_2000' is like:
>
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 title wording...
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 (blank line)
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0(blank=
 line )
> =A0 =A0 =A0 =A0 =A0VARIABLE =A0 =A0FIELD =A0 DEC. =A0 =A0VARIABLE
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 NAME =A0 =A0TYPE =A0 POS. =A0 =A0DESCRIPT=
ION
> =A0 --------------- =A0------- =A0----
> --------------------------------------
> (row8) =A0 =A0Sales =A0 =A0 =A0Numeric =A0 =A0 0 =A0blah,blah
> =A0 =A0 =A0 =A0 =A0 =A0 =A0Prices =A0 =A0 Numeric =A0 =A0 0 =A0blah,blah
> =A0 =A0 =A0 =A0 =A0 =A0 =A0Year =A0 =A0 =A0 Numeric =A0 =A0 0 =A0blah,bla=
h
> =A0 =A0 =A0 =A0 =A0 =A0 MONTH =A0 =A0Numeric =A0 =A0 0 =A0blah,blah
> =A0 =A0 =A0 =A0 =A0 =A0 =A0State =A0 =A0 =A0Numeric =A0 =A0 0 =A0blah,bla=
h
> =A0 =A0 =A0 =A0 =A0 =A0market =A0 =A0 =A0Numeric =A0 =A0 0 =A0blah,blah
>
> Then the second type of files are numbers only, column by column, such
> as 'C_table_3_2000'
>
> 1,2,3,4,5,6,
> 11,22,33,44,55,66
> .............
>
> I want to read in all files, pair 'D_C_table_3_2000' with
> "C_table_3_2000", create 'table_3_2000' which contains both var names
> and values. My whole set of data files are titled as
> 'D_C_table_3_2000', 'D_H_table_6_2008', etc..
>
> I have a macro which saves file names into a excel file with name and
> path. It just I need to subsitute Proc Import with Data step since I
> want to use "firstobs=3D8" "missover" 'infile var $ 1-17', this kind of
> things.
>
> Millions of thanks!!!

0
Reply Tom 1/20/2010 1:09:09 AM


Sorry, Tom, Let me try one more time. For each quarter in each year, I
have two types of data.

D_RC_0300.txt is like this:

                    FILE LAYOUT FOR SCHEDULE RC-TABLE
                              DATA DELIMITED BY COMMA


       VARIABLE      FIELD       DEC.    VARIABLE
              NAME     TYPE        POS.    DESCRIPTION
         --------------      -------         ----
--------------------
         SYSTEM      Numeric     0          System Code
                DIST      Numeric    0          District Code
           ASSOC      Numeric     0          Association Code
           MONTH      Numeric     0          Month of Report
              YEAR     Numeric     0          Year of Report
          UNINUM      Numeric     0         System, District, and
Association
                                                         codes
concatenated
             CASH      Numeric     0          Cash
         MKTINV       Numeric     0          Marketable Investments
         ACTREC      Numeric     0          Accounts Receivable
         ACRLNS      Numeric     0          Accrual Loans and Leases
net of
                                                         unearned
income and unapplied
                                                         loan payments

RC_0300.txt file looks like this:

1,2,3,4,5,6,7,8,9,10
11,22,33,44,55,66,77,88,99,100
111,222,333,444,555,66,77,888,999,102
...........

The output data file I want to create is gonna look like this:
RC_MERGE_0300.sas7bdat

STSTEM, DIST, ASSOC, MONTH, YEAR, CASH, MKTINV, ACTREC, ACRLNS
1,2,3,4,5,6,7,8,9,10
11,22,33,44,55,66,77,88,99,100
111,222,333,444,555,66,77,888,999,102
.......

The problem is I have 20 pairs of tables (one with Var names, one with
numbers) in each quarter in each year (10 years) so I cannot merge
datafiles manually. I am gonna post my Macro here. It did not run
after I change Proc Import to Data step.
0
Reply slightmoon 1/20/2010 8:01:01 PM

2 Replies
529 Views

(page loaded in 0.105 seconds)

Similiar Articles:













7/22/2012 7:17:34 AM


Reply: