parsing text

What kind of script would parse this example into correct fields?

Name
Address
City
State
Zip Code

Name 
Address
City
State
Zip code
Reply info 4/14/2004 8:39:14 PM

What's the delimiter, or is it fixed-width?

"jc" <info@realtorpro.biz> wrote in message
news:ce483f0b.0404141239.5fbca17f@posting.google.com...
> What kind of script would parse this example into correct fields?
>
> Name
> Address
> City
> State
> Zip Code
>
> Name
> Address
> City
> State
> Zip code


Reply Matt 4/14/2004 9:26:01 PM


in article ce483f0b.0404141239.5fbca17f@posting.google.com, jc at
info@realtorpro.biz wrote on 4/14/04 1:39 PM:

> What kind of script would parse this example into correct fields?
> 
> Name
> Address
> City
> State
> Zip Code
> 
> Name 
> Address
> City
> State
> Zip code

This is really pretty vague information. Where is the data now: a text file,
a field, another database?
Are there labels for these fields?
Is there always the same number of fields for each record, or do some have a
2nd address, phone numbers, etc.?


It is easier to assist you if we know all of the parameters. In most cases, it
is helpful to see an example of the data being manipulated.

;)

Reply Lee 4/14/2004 9:57:29 PM

I've seen this before. Been a while though. As lame as it sounds (no
disrespect to the original poster implied), the fields are delimited by
carriage returns and the records are delimited by double carriage returns.
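
As a rough illustration of that layout, here is a minimal Python sketch. Python is not mentioned anywhere in this thread, and the function name, the field names, and the sample records are all made up for the example. It assumes every record has exactly five lines and that one or more blank lines separate records.

```python
import re

def parse_records(text, field_names=("name", "address", "city", "state", "zip_code")):
    """Split blank-line-separated blocks into dicts, one field per line."""
    # Normalize Windows (\r\n) and classic Mac (\r) line endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    records = []
    # One or more blank lines mark the break between records.
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = [line.strip() for line in block.split("\n")]
        if len(fields) != len(field_names):
            raise ValueError(
                f"expected {len(field_names)} fields, got {len(fields)}: {fields!r}"
            )
        records.append(dict(zip(field_names, fields)))
    return records

# Hypothetical sample data in the layout described above.
sample = "Ann Lee\n1 Main St\nAustin\nTX\n78701\n\nBob Ray\n2 Oak Ave\nBoise\nID\n83702\n"
for record in parse_records(sample):
    print(record)
```

The length check is there because a record with an extra line (a second address, a phone number) would otherwise shift every later field silently.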



> Lee Smith <canepa6645@charter.net> wrote in
> news:BCA301D8.62DD%canepa6645@charter.net: 

> This is really pretty vague information. Where is the data now: a text
> file, a field, another database?
> Are there labels for these fields?
> Is there always the same number of fields for each record, or do some
> have a 2nd address, phone numbers, etc.?
> 
> 
> It is easier to assist you if we know all of the parameters. In most
> cases, it is helpful to see an example of the data being
> manipulated. 
> 
> ;)
> 

Reply Brent 4/14/2004 11:18:41 PM

In article <Xns94CBA5F924054wbsimonhotmailcom@207.217.125.201>, Brent
Simon <wbsimonNOSPAM@hotmail.com> wrote:

>I've seen this before. Been a while though. As lame as it sounds (no
>disrespect to the original poster implied), the fields are delimited by
>carriage returns and the records are delimited by double carriage returns.


If so, then you could clean this up in BBEdit (or some other robust text
editor) quite easily.

Run a search'n'replace for a double carriage return, change it
(temporarily) to XXX or some other placeholder that doesn't occur in the data.

Run a search'n'replace for a single carriage return, replace it with the
string "," -- double quote, comma, double quote.

Run a search'n'replace for XXX, replace it with the string "[carriage
return]" -- double quote, carriage return, double quote, with no spaces.

Go to the beginning of the file, add a double quote as the first
character.  Go to the end, add one as the last character.

Import your brand new comma delimited file into FM.  It should take less
time to do than it took to type this post.
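
For anyone who would rather script the passes than run them by hand, the same sequence can be sketched in Python. This is only an approximation of the editor steps, not a tested FileMaker workflow: the function name and the NUL placeholder (standing in for XXX) are made up, exactly one blank line between records is assumed, and doubling embedded quotes is the usual CSV convention rather than one of the steps above.

```python
def blocks_to_csv(text, placeholder="\x00"):
    """Mimic the editor passes: mark record breaks, join fields, restore breaks, add outer quotes."""
    # Normalize line endings and drop leading/trailing returns.
    text = text.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    if placeholder in text:
        raise ValueError("placeholder already occurs in the data")
    text = text.replace('"', '""')            # extra step: escape embedded quotes
    text = text.replace("\n\n", placeholder)  # pass 1: double return -> placeholder
    text = text.replace("\n", '","')          # pass 2: single return -> ","
    text = text.replace(placeholder, '"\n"')  # pass 3: placeholder -> quote, return, quote
    return '"' + text + '"\n'                 # pass 4: quote at the start and at the end

# Hypothetical sample data; note the comma inside the first address.
sample = "Ann Lee\n1 Main St, Apt 2\nAustin\nTX\n78701\n\nBob Ray\n2 Oak Ave\nBoise\nID\n83702\n"
print(blocks_to_csv(sample), end="")
```

The placeholder check matters for the same reason XXX has to be chosen with care in the editor: if the marker already appears in the data, pass 3 would split a record in the wrong place.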

Steve Brown
Reply eyebrown 4/15/2004 11:48:50 AM

in article eyebrown-1504040748100001@sdn-ap-029tnnashp0194.dialsprint.net,
eyebrown@mindspring.com at eyebrown@mindspring.com wrote on 4/15/04 4:48 AM:

> If so, then you could clean this up in BBEdit (or some other robust text
> editor) quite easily.
> 
> Run a search'n'replace for a double carriage return, change it
> (temporarily) to XXX or something.
> 
> Run a search'n'replace for a single carriage return, replace it with the
> string ",".  
> 
> Run a search'n'replace for XXX, replace it with the string " [carriage
> return] " -- double quote, carriage return, double quote.
> 
> Go to the beginning of the file, add a double quote as the first
> character.  Go to the end, add one as the last character.
> 
> Import your brand new comma delimited file into FM.  It should take less
> time to do than it took to type this post.

Hi Steve,

Good response if the data is in text format. BBEdit's powerful Find and
Replace and its tools can make cleanup of most files a snap. For the really
tough files, you can't beat its ability to use Grep patterns to really
manipulate text prior to import. IMHO, it is a lot easier to deal with the
variables using BBEdit than to write scripts and calculations in FileMaker
to do it after import.

BTW, I prefer the tab import when dealing with text and I usually use the
pipe characters || to temporarily mark the break between records.
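
A scripted version of that tab-delimited variant might look like the Python sketch below. It is an illustration under assumptions, not anyone's actual workflow: the function name is invented, records are assumed to be separated by exactly one blank line, and the data is assumed to contain neither tabs nor the || marker (the sketch refuses to run if it does).

```python
def blocks_to_tsv(text, marker="||"):
    """Turn blank-line-separated blocks into tab-delimited lines, one record per line."""
    # Normalize line endings and drop leading/trailing returns.
    text = text.replace("\r\n", "\n").replace("\r", "\n").strip("\n")
    if marker in text or "\t" in text:
        raise ValueError("data already contains the marker or a tab")
    text = text.replace("\n\n", marker)       # temporarily mark the record breaks
    text = text.replace("\n", "\t")           # remaining returns are field breaks
    return text.replace(marker, "\n") + "\n"  # restore the record breaks

# Hypothetical sample data.
sample = "Ann Lee\n1 Main St\nAustin\nTX\n78701\n\nBob Ray\n2 Oak Ave\nBoise\nID\n83702\n"
print(blocks_to_tsv(sample), end="")
```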

However, since a script was requested in the original post, it may be that
a script or calculation is what is needed. In which case I think we need to
wait for j.c. to respond.

Lee

Reply Lee 4/15/2004 2:10:40 PM

"Lee Smith" <canepa6645@charter.net> wrote in message
news:BCA3E5EF.6322%canepa6645@charter.net...
> in article eyebrown-1504040748100001@sdn-ap-029tnnashp0194.dialsprint.net,
> eyebrown@mindspring.com at eyebrown@mindspring.com wrote on 4/15/04 4:48 AM:
>
> > If so, then you could clean this up in BBEdit (or some other robust text
> > editor) quite easily.
> >
> > Run a search'n'replace for a double carriage return, change it
> > (temporarily) to XXX or something.
> >
> > Run a search'n'replace for a single carriage return, replace it with the
> > string ",".
> >
> > Run a search'n'replace for XXX, replace it with the string " [carriage
> > return] " -- double quote, carriage return, double quote.
> >
> > Go to the beginning of the file, add a double quote as the first
> > character.  Go to the end, add one as the last character.
> >
> > Import your brand new comma delimited file into FM.  It should take less
> > time to do than it took to type this post.
>
> Hi Steve,
>
> Good response if the data is in text format. BBEdit's powerful Find and
> Replace and its tools can make cleanup of most files a snap. For the really
> tough files, you can't beat its ability to use Grep patterns to really
> manipulate text prior to import. IMHO, it is a lot easier to deal with the
> variables using BBEdit than to write scripts and calculations in FileMaker
> to do it after import.
>
> BTW, I prefer the tab import when dealing with text and I usually use the
> pipe characters || to temporarily mark the break between records.
>
> However, since a script was requested in the original post, it may be that
> a script or calculation is what is needed. In which case I think we need to
> wait for j.c. to respond.
>
> Lee
>
I use BBEdit, and use similar techniques for one-time processes. But GREP
confuses the bejeebers out of me. Right up there with ProcMail recipes.

As long as we're talking Mac (though I didn't see anything about which
platform in the original post), I'll chime in with a comment about
TextSpresso.

Powerful (VERY), flexible, easy to customize filtering tool that prepares
(read: cleans up) text for import. I have a good half-dozen text files I run
through it prior to FMP import. If you have to deal with messy source
material on a regular basis, check out the 30-day free trial at
http://taylor-design.com.

Matt


Reply Matt 4/15/2004 10:55:20 PM

"Lee Smith" wrote...
>
> BTW, I prefer the tab import when dealing with text and I usually use the
> pipe characters || to temporarily mark the break between records.
>
> However, since a script was requested in the original post, it may be that
> a script or calculation is what is needed. In which case I think we need to
> wait for j.c. to respond.
>
> Lee

One advantage to the comma and quotes over tabs is that if there are already
commas in the text, it doesn't confuse the import. I have found that if there
are tabs and commas in the imported file, FileMaker uses both to separate text
if the commas aren't trapped inside quotes.
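
The effect of trapping commas inside quotes can be shown with a small Python sketch using the standard csv module. The sample row is invented, and this demonstrates the general CSV quoting convention only; how FileMaker itself reads such a file is not tested here.

```python
import csv
import io

# Hypothetical record whose address contains a comma.
row = ["Ann Lee", "1 Main St, Apt 2", "Austin", "TX", "78701"]

# Write the row with every field wrapped in double quotes.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(row)
line = buffer.getvalue()
print(line, end="")

# A naive split on commas breaks the address in two;
# a quote-aware reader keeps it as one field.
naive = line.strip().split(",")
parsed = next(csv.reader(io.StringIO(line)))
print(len(naive), "fields from a naive split,", len(parsed), "from the CSV reader")
```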

For doing the same text replace functions on Windows, I've found little to top
BKReplacem. It allows you to build multiple files sets (which is a godsend if
you are doing it on a regular basis with the same replacements) with regular and
grep replacements, and is blazingly fast.

Kent


Reply K 4/15/2004 11:48:31 PM

in article jbFfc.125054$Pk3.26320@pd7tw1no, K&V P at
random_characters@shaw.ca.invalid wrote on 4/15/04 4:48 PM:

> For doing the same text replace functions on Windows, I've found little to top
> BKReplacem. It allows you to build multiple files sets (which is a godsend if
> you are doing it on a regular basis with the same replacements) with regular
> and grep replacements, and is blazingly fast.

This is the first time that I've heard of BKReplacem. Is it comparable to
BBEdit?  What do you mean when you say  "build multiple files sets"? Is it
like saving the Greps in BBEdit? I have a macro in QuicKeys that runs 14
Grep patterns on one of my regular site captures. Is that kind of like what
you mean?

Lee

Reply Lee 4/16/2004 12:14:45 AM

"Lee Smith" wondered...
> K&V P wrote:
>
> > For doing the same text replace functions on Windows, I've found little to
> > top BKReplacem. It allows you to build multiple files sets (which is a
> > godsend if you are doing it on a regular basis with the same replacements)
> > with regular and grep replacements, and is blazingly fast.
>
> This is the first time that I've heard of BKReplacem. Is it comparable to
> BBEdit?  What do you mean when you say  "build multiple files sets"? Is it
> like saving the Greps in BBEdit? I have a macro in QuicKeys that runs 14
> Grep patterns on one of my regular site captures. Is that kind of like what
> you mean?
>
> Lee

It is not a text editor. It is a freeware replace tool. It can operate on a
single file, a predefined list of files, all files in a folder (recursive), or a
wildcard list like any file that matches dailytape*.txt in the \sales\ folder.
Or any combination of those all at once.
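
To make the idea concrete, here is a loose Python sketch of that kind of batch replace. It is not BKReplacem and does not use its syntax. The function name, the .bak suffix, and the demo file names and contents are all assumptions. It also backs up only the files it actually changes, which may differ from how the real tool behaves.

```python
import shutil
import tempfile
from pathlib import Path

def replace_in_files(folder, pattern, replacements, recursive=False, backup=True):
    """Apply (find, replace) pairs in order to every file in folder matching pattern."""
    root = Path(folder)
    paths = root.rglob(pattern) if recursive else root.glob(pattern)
    changed = []
    # sorted() collects the matches before any backup files are created.
    for path in sorted(paths):
        if not path.is_file():
            continue
        original = path.read_text(encoding="utf-8")
        updated = original
        for find, replace in replacements:
            updated = updated.replace(find, replace)
        if updated != original:
            if backup:
                # Keep a copy next to the original, e.g. dailytape1.txt.bak
                shutil.copy2(path, path.with_name(path.name + ".bak"))
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed

# Demo in a throwaway folder so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    sales = Path(tmp) / "sales"
    sales.mkdir()
    (sales / "dailytape1.txt").write_text("xxx and aa\n", encoding="utf-8")
    (sales / "notes.txt").write_text("xxx\n", encoding="utf-8")
    changed = replace_in_files(sales, "dailytape*.txt", [("xxx", "yyyy"), ("aa", "bcb")])
    result = (sales / "dailytape1.txt").read_text(encoding="utf-8")
    untouched = (sales / "notes.txt").read_text(encoding="utf-8")
    backup_exists = (sales / "dailytape1.txt.bak").exists()
    print(result, end="")
```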

You define anything from simple ASCII sets of strings (change xxx to yyyy and aa
to bcb) to find xxx(anystring)yyy and replace it with (anystring) to incredibly
complex greps like changing dates formatted MM/DD/YY to YYYY-MM-DD, including
calculating whether the century should be 19 or 20. Trigger it and off it goes.
It makes a backup copy of any matching files (you can turn backups off if you
want), then performs all the find and replaces sequentially. It can do file xxxx
with one replace set, then files yyy and zzz with replace set B, then any file
in the \whatever\ folder with replace set C. On my Athlon 2400, it will do a
couple of million simple text replacements in about a minute.
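
The date conversion mentioned above can be sketched with a regular expression in Python. This is not the BKReplacem pattern itself. The pivot rule (two-digit years below 30 become 20YY, everything else 19YY) is an assumption chosen for the example, since the rule actually used for the century is not given here.

```python
import re

# Two-digit month, day, and year separated by slashes, not part of a longer number.
DATE_RE = re.compile(r"\b(\d{2})/(\d{2})/(\d{2})\b")

def expand_dates(text, pivot=30):
    """Rewrite MM/DD/YY as YYYY-MM-DD; years below pivot get century 20, others 19."""
    def fix(match):
        month, day, year = match.groups()
        century = "20" if int(year) < pivot else "19"
        return f"{century}{year}-{month}-{day}"
    return DATE_RE.sub(fix, text)

# Hypothetical sample line.
print(expand_dates("Sold 04/15/04, first printed 11/02/87."))
```

Dates that already carry a four-digit year, such as 04/15/2004, do not match the pattern and are left alone.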

I use it daily (triggered by Windows scheduler) to turn just under two million
lines in 12 crude text files into tab delimited text for importing into four
FileMaker files for further processing for our website and bestseller files at a
book store. It runs replace set 1 at 4:30 am, then launches a FileMaker file to
do some stuff and exits, then it runs replace set 2 at about 6:00 am, then
launches another FileMaker file for final processing.

One caveat: GREP slows it down significantly.

There is more detail than I care to worry about here:
http://www.boolean.ca/replace/examples.html

Kent

Simple sample group below. Every two lines are one replace item in the set:


telelist set:

{}t{}e
{}tMRB{}e   : find [tab][LineBreak], replace with [tab]MRB[LineBreak]

{}tMH{}e
{}tMHR{}e   : find [tab]MH[LineBreak], replace with [tab]MHR[LineBreak]

{}tVW{}e
{}tVAN{}e   : find [tab]VW[LineBreak], replace with [tab]VAN[LineBreak]

{}tRC{}e
{}tRAI{}e   : etc. etc. etc.
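
For comparison, the same replace set can be written out as ordered pairs in Python. This is a sketch only: it reads {}t as a tab and {}e as a line break, following the annotations above, and it treats the line break as "\n". The sample rows and the names are invented.

```python
# Each pair is (find, replace); the pairs are applied in order, top to bottom.
TELELIST_SET = [
    ("\t\n", "\tMRB\n"),    # empty last column -> MRB
    ("\tMH\n", "\tMHR\n"),
    ("\tVW\n", "\tVAN\n"),
    ("\tRC\n", "\tRAI\n"),
]

def apply_replace_set(text, replace_set):
    """Run every find/replace pair over the text sequentially."""
    for find, replace in replace_set:
        text = text.replace(find, replace)
    return text

# Hypothetical tab-delimited rows; the code is the last column on each line.
sample = "1001\t\n1002\tMH\n1003\tVW\n1004\tRC\n1005\tXY\n"
print(apply_replace_set(sample, TELELIST_SET), end="")
```

Because the pairs run one after another, a later pair sees the output of the earlier ones, so the order of items in a set can matter.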


Reply K 4/16/2004 4:00:53 AM

M$ Word works fine for this, too.

-- 

Shadenfroh
shadenfroh@yahoo.com

<eyebrown@mindspring.com> wrote in message
news:eyebrown-1504040748100001@sdn-ap-029tnnashp0194.dialsprint.net...
> In article <Xns94CBA5F924054wbsimonhotmailcom@207.217.125.201>, Brent
> Simon <wbsimonNOSPAM@hotmail.com> wrote:
>
> >I've seen this before. Been a while though. As lame as it sounds (no
> >disrespect to the original poster implied), the fields are delimited by
> >carriage returns and the records are delimited by double carriage returns.
>
>
> If so, then you could clean this up in BBEdit (or some other robust text
> editor) quite easily.
>
> Run a search'n'replace for a double carriage return, change it
> (temporarily) to XXX or something.
>
> Run a search'n'replace for a single carriage return, replace it with the
> string ",".
>
> Run a search'n'replace for XXX, replace it with the string " [carriage
> return] " -- double quote, carriage return, double quote.
>
> Go to the beginning of the file, add a double quote as the first
> character.  Go to the end, add one as the last character.
>
> Import your brand new comma delimited file into FM.  It should take less
> time to do than it took to type this post.
>
> Steve Brown


Reply Shadenfroh 5/9/2004 3:21:42 PM
