Re: read and merge lots of txt files, macro?

  • Follow


This is not code I have time to write, but what it sounds like is you want
to write a multistep process.

First, query the directory listing from an infile PIPE to get the files that
have the layouts (D_ files?).  Or if you know what they'll be called due to
them having a year/quarter indicator or something, you can skip that step
and read them in with a wildcard or something.

Then, write code to parse those files into the layout, and write input code
with it.

Then, use that layout [either via SQL SELECT INTO or %include] to use the
input code, creating the infile name from the earlier directory listing
and/or from the layout file's name.

Make sense?
The first part would just be inputting the layout with a fixed layout that
was intelligent enough to determine what's really a valid line and what's
just header junk.  Let's say you called the first column 'name' and second
'numchar' (numeric/charater indicator).

The second part would be something like
data input_code;
 retain inputstring $5000.;
  set input_file;
  if _n_ = 1 then input_file = "input";
  input_file = catx(' ',input_file,name);
  if numchar='character' then
    input_file = catx(' ',input_file,'$');
  call symput('inputstr',input_file);
run;

The third part would be just:
data real_data;
  infile "whatever your filename is";
  input
   &inputstr
  ;
run;

give or take.

-Joe

On Wed, Jan 20, 2010 at 2:01 PM, slightmoon <lvhome2009@gmail.com> wrote:

> Sorry, Tom, Let me try one more time. For each quarter in each year, I
> have two types of data.
>
> D_RC_0300.txt is like this:
>
>                    FILE LAYOUT FOR SCHEDULE RC-TABLE
>                              DATA DELIMITED BY COMMA
>
>
>       VARIABLE      FIELD       DEC.    VARIABLE
>              NAME     TYPE        POS.    DESCRIPTION
>         --------------      -------         ----
> --------------------
>          SYSTEM      Numeric     0          System Code
>                DIST      Numeric    0          District Code
>           ASSOC      Numeric     0          Association Code
>           MONTH      Numeric     0          Month of Report
>              YEAR     Numeric     0          Year of Report
>          UNINUM      Numeric     0         System, District, and
> Association
>                                                         codes
> concatenated
>             CASH      Numeric     0          Cash
>         MKTINV       Numeric     0          Marketable Investments
>         ACTREC      Numeric     0          Accounts Receivable
>         ACRLNS      Numeric     0          Accrual Loans and Leases
> net of
>                                                         unearned
> income and unapplied
>                                                         loan payments
>
> RC_0300.txt file looks like this:
>
> 1,2,3,4,5,6,7,8,9,10
> 11,22,33,44,55,66,77,88,99,100
> 111,222,333,444,555,66,77,888,999,102
> ..........
>
> The output data file I want to create is gonna look like this:
> RC_MERGE_0300.sas7bdat
>
> STSTEM, DIST, ASSOC, MONTH, YEAR, CASH, MKTINV, ACTREC, ACRLNS
> 1,2,3,4,5,6,7,8,9,10
> 11,22,33,44,55,66,77,88,99,100
> 111,222,333,444,555,66,77,888,999,102
> ......
>
> The problem is I have 20 pairs of tables (one with Var names, one with
> numbers) in each quarter in each year (10 years) so I cannot merge
> datafiles manually. I am gonna post my Macro here. It did not run
> after I change Proc Import to Data step.
>
0
Reply snoopy369 (1752) 1/20/2010 9:46:10 PM


0 Replies
161 Views

(page loaded in 0.009 seconds)


Reply: