I'm just trying to import a 2.5 GB text file - what's your experience with this?
A minor flaw is that the 'remaining bytes' count started at a negative
number, counting down even further.
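One possible explanation for the negative count, which nothing in this thread confirms, is that the byte counter is kept in a signed 32-bit integer: 2.5 GB is more than such a counter can hold, so the value would wrap around to a negative number. A minimal Python sketch of that arithmetic (the helper name `as_signed_32` is made up for illustration and says nothing about how FileMaker actually counts):

```python
def as_signed_32(n: int) -> int:
    """Interpret the low 32 bits of n as a signed 32-bit integer."""
    n &= 0xFFFFFFFF                      # keep only the low 32 bits
    return n - 0x100000000 if n >= 0x80000000 else n

# 2.5 GB taken as 2.5 * 10**9 bytes (an assumption; the post does not say)
print(as_signed_32(2_500_000_000))       # -1794967296
```

Anything up to 2147483647 bytes passes through unchanged; larger values come out negative.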
Another flaw might be that the file requires the extension .txt,
.tab or another matching type (Mac OS X 10.3.8) - you can't force a file
selection, e.g. by option-click; you have to rename the file first.
However, you may select any file by activating the 'any file type'
option, then switch over to the tab separated file and select yet
another file to open.
The import has been running for a day now (iBook G4, 800 MHz, on an external
160 GB FireWire drive). Let's see about the results. I guess I'd better
write a Perl script for the desired data mining ;-)
What's your biggest file import?
Regards
Martin
Martin | 1/11/2005 3:17:04 PM

So Martin, just curious -- are these going into records (how many) or one
huge text field? After all, fields are supposed to be able to hold 4GB now.
Bill | 1/11/2005 8:50:15 PM

I have never imported such a big text file. I hope the text file consists of
thousands of different records (lines), and not "one big record of 2.5 GB"?
My first suggestion is to disable all indexing before importing (go
to define fields, options, storage, set indexing to off, AND the automatic
indexing option must be switched OFF as well). Then try importing again; it
should run faster this way. But even with indexing disabled, importing 2.5 GB
of data might take several days. Be patient, it depends on the speed of your
Mac and your disk.
While importing, simply ignore the silly count numbers. Only the result
matters.
After importing, the database size should be a multiple of 2.5 GB. Make a backup
copy, and THEN switch indexing on as needed. The database size will keep
growing, so keep the database on a large partition (your 160 GB should be
large enough, I hope). The first time you index a field (either by setting
the options, or when you perform a find) FileMaker will take a LARGE amount of
time and disk space, so you must be patient again.
Good luck!
Chris
christian | 1/12/2005 7:44:54 AM

PS: It's really none of my business, but still: ...@gmx.de, are you from
Germany? Which city?
Regards, Chris
christian | 1/12/2005 7:48:14 AM

On Tue, 11 Jan 2005 15:50:15 -0500, Bill Marriott wrote:
> So Martin, just curious -- are these going into records (how many) or one
> huge text field? After all, fields are supposed to be able to hold 4GB now.
It's about 10 fields and 30 million records.
... Status: 16 MB remaining (counting downwards by now)
- which means, going by the behavior so far,
that every 3 of these "MB" increase the .fp7 size by 1 GB (*),
resulting in a total .fp7 file size of approximately 8 GB
and a finished import in about two or three more days.
(*) Status yesterday evening: file size 2 GB, "19 MB" remaining,
Status this morning: file size 3 GB, "16 MB" remaining
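The estimate above can be reproduced as a simple linear extrapolation from the two status readings. This is only a sketch: it assumes the file keeps growing at the same rate per counter unit, and the function name is made up for illustration.

```python
def estimate_final_size_gb(size_then, size_now, remaining_then, remaining_now):
    """Linearly extrapolate the final file size from two status readings."""
    # GB of file growth per unit of the "remaining" counter
    growth_per_unit = (size_now - size_then) / (remaining_then - remaining_now)
    return size_now + remaining_now * growth_per_unit

# Yesterday evening: 2 GB, "19 MB" remaining; this morning: 3 GB, "16 MB" remaining
print(round(estimate_final_size_gb(2, 3, 19, 16), 1))  # 8.3
```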
I did a grep on the text version, which finished in a few seconds - I
guess that a perl script will be 1000 times faster than the desired FMP
operation, doing some address statistics.
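As a rough idea of what such a script could look like, here is a minimal sketch in Python rather than Perl. It reads the tab-separated export line by line, so memory use stays small regardless of file size, and counts how often each value occurs in one column. The file name, the column position and the Latin-1 encoding are assumptions; the thread does not describe the export's layout.

```python
from collections import Counter

def count_column(path, column, encoding="latin-1"):
    """Count the values in one column of a tab-separated text file."""
    counts = Counter()
    # Iterating over the file object reads one line at a time.
    with open(path, encoding=encoding, errors="replace") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split("\t")
            if column < len(fields):        # skip short or empty lines
                counts[fields[column]] += 1
    return counts

# Hypothetical usage: the ten most frequent values in the fourth column
# for value, n in count_column("export.tab", 3).most_common(10):
#     print(n, value)
```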
Martin | 1/12/2005 9:25:20 AM

On Wed, 12 Jan 2005 08:44:54 +0100, christian st�ben wrote:
> My first suggestion is, that you disable all indexing before importing
I should have done so - unfortunately two fields were already indexed.
Nevertheless, the indexing would have to be done at some stage anyway.
> While importing, simply ignore silly count numbers. Only the results is
> important.
Not much choice other than to ignore the count numbers - but they are
just stupid.
> The first time you index a field (either by setting
> the options, or when you make a find) Filemaker will take a LARGE amount of
> time and disk space, you must be patient again.
I guess I'll turn on indexing for every field in the field definitions -
otherwise I'll have to wait for ages ten times over. Hm - I
guess doing a single search with an entry in every field might perform
even better.
Thanks,
Martin
Martin | 1/12/2005 9:28:17 AM

On Wed, 12 Jan 2005 08:48:14 +0100, christian st�ben wrote:
> PS: It's really none of my business, but still: ...@gmx.de, are you from
> Germany? Which city?
Freiburg - I'm processing the export of a phone directory CD.
Best regards
Martin
Martin | 1/12/2005 9:29:13 AM

Oh, those still exist: a phone directory CD that can be exported completely?
Which one would that be? (want, want, want!)
Regards, Chris
christian | 1/12/2005 9:56:57 AM

On Wed, 12 Jan 2005 10:56:57 +0100, christian st�ben wrote:
> Oh, those still exist: a phone directory CD that can be exported completely?
> Which one would that be? (want, want, want!)
There's either D-Info or Klicktel - and for both, the patches are available
abroad. By now, though, reverse lookup is possible in Germany as well. The
Telekom CD may be exportable too.
Best regards
Martin
Martin | 1/12/2005 10:12:27 AM

On 11 Jan 2005 15:17:04 GMT, Martin Trautmann wrote:
> I just try to import a 2.5 GB text file - how's your experience here?
Finished by now...
After 44 hours it complained about a full startup disk: FMP used two
swap files within /private/tmp on the local disk, instead of swapping to
the free space on the external drive. It created a 3.80 GB file there, then yet
another one, while the original .fp7 file (3.85 GB) was touched
continuously, updating its last-modified date, but neither it nor
the big or the small (12.3 KB) swap file grew any further.
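A problem like this can be spotted before starting a long import by checking how much space is free on the volume holding the temporary directory. The sketch below uses Python's standard `shutil.disk_usage`. Which directory FileMaker really swaps to, and how much space it needs, are assumptions to be checked on the machine in question; the helper names are made up for illustration.

```python
import shutil
import tempfile

def free_gb(path):
    """Free space on the volume containing path, in GB (10**9 bytes)."""
    return shutil.disk_usage(path).free / 1e9

def has_room(path, needed_gb):
    """True if the volume containing path has at least needed_gb free."""
    return free_gb(path) >= needed_gb

# Hypothetical check: is there room for a swap file of about 4 GB?
tmp = tempfile.gettempdir()
print(f"{tmp}: {free_gb(tmp):.1f} GB free, enough: {has_room(tmp, 4)}")
```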
It continued for four more hours - a total of two days of importing.
Then it finished - and complained that the file is damaged and should
be recovered.
Maybe someone wants to give it another try. My results are
discouraging: slow, flawed - or probably even buggy.
Martin | 1/12/2005 10:05:57 PM

I have loads of spare computing time at the moment and could try this on my PC
as well (the file format is identical to the Mac's). It should work, with
a bit of luck even without "damaged should be recovered". What's that I hear?
Old Schlemihl? Want, want, want?
Joking aside, the offer was meant seriously...
Regards, Chris
PS: The file size doesn't seem to be that dramatic after all. 2.5 GB
of text into 3.85 GB of fp7; I would have expected a considerably bigger file.
christian | 1/13/2005 7:01:45 AM

On Thu, 13 Jan 2005 08:01:45 +0100, christian st�ben wrote:
> I have loads of spare computing time at the moment and could try this on my PC
> as well (the file format is identical to the Mac's). It should work, with
> a bit of luck even without "damaged should be recovered". What's that I hear?
> Old Schlemihl? Want, want, want?
>
> Joking aside, the offer was meant seriously...
Surprisingly enough, the damaged file could be opened - it had about 3 M
records missing and 80000 empty entries, but it was easy to tail the
remaining lines and import them in a further step. Now I'm at 33 M
records, waiting for the indexing to finish. Strangely enough, all
fields had the default option (none, index if required) and claimed to
start indexing when I performed a search in every field, but
ended up not indexed.
Now I've started indexing manually, which is going extremely slowly.
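Splitting off the lines that did not make it into the database can be done with `tail` or, as sketched here in Python, by skipping the first N lines and copying the rest to a new file for a second import. The file names and the line count in the usage comment are made-up placeholders, not the real numbers from this import.

```python
from itertools import islice

def copy_remaining_lines(src, dst, skip):
    """Copy every line after the first `skip` lines of src into dst.

    Returns the number of lines written.
    """
    written = 0
    # Binary mode copies the bytes unchanged, whatever the text encoding is.
    with open(src, "rb") as infile, open(dst, "wb") as outfile:
        for line in islice(infile, skip, None):
            outfile.write(line)
            written += 1
    return written

# Hypothetical usage: everything after the first 30,000,000 lines
# copy_remaining_lines("export.tab", "remaining.tab", 30_000_000)
```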
> PS: The file size doesn't seem to be that dramatic after all. 2.5 GB
> of text into 3.85 GB of fp7; I would have expected a considerably bigger file.
Current state: 4.2 GB with 2 of 11 fields indexed, plus 400 MB in
/private/var/tmp/folders/`id`/FileMaker
So I hear you want a copy of the address CD ;-)
580 MB .bz2 - what's your address?
Best regards
Martin
Martin | 1/13/2005 12:21:28 PM

Martin Trautmann <t-use@gmx.net> wrote:
> On Thu 2005-01-13 (13:22), Martin Trautmann wrote:
> So my recommendation is: Don't use FMP on bigger files if you don't
> really have to. A simple grep finds the desired lines rather fast,
> a simple sed took somewhat longer to modify one
> of the fields (about 16 hours), while a sort is really fast.
Are you using 7v3? Supposedly, some serious problems with import speed
were fixed in the v3 release.
--
To send email, remove the invalid and nospams.
md03NOSPAM | 1/20/2005 3:26:46 PM

On Thu, 20 Jan 2005 07:26:46 -0800, Michael Diehr wrote:
> Martin Trautmann <t-use@gmx.net> wrote:
>
> > On Thu 2005-01-13 (13:22), Martin Trautmann wrote:
> > So my recommendation is: Don't use FMP on bigger files if you don't
> > really have to. A simple grep finds the desired lines rather fast,
> > a simple sed took somewhat longer to modify one
> > of the fields (about 16 hours), while a sort is really fast.
>
> Are you using 7v3?
Yes, I am
... still indexing.
Martin | 1/20/2005 3:55:36 PM