truncate file

Hi!

I know the boost::filesystem library has been turned into a proposal for the
new standard. This library includes some operations on files, and I
wonder why it still does not include truncation of files to a
specified size.

Is there no need to truncate files? It can't be done in standard C++
without rewriting/copying the remaining file contents. But at least in
Windows and Linux I have implemented just that: truncate a file (which
isn't opened by any program) to a given size. Shouldn't this be provided
by the standard library?

Frank

-- 
      [ See http://www.gotw.ca/resources/clcm.htm for info about ]
      [ comp.lang.c++.moderated.    First time posters: Do this! ]

Reply Frank 9/2/2007 6:08:47 PM

On Sun,  2 Sep 2007 18:08:47 CST, Frank Birbacher
<bloodymir.crap@gmx.net> wrote in comp.lang.c++.moderated:

> Hi!
> 
> I know the boost::filesystem library was formed to a proposal for the
> new standard. This library includes some operations on files. And I
> wonder why this still does not include truncation of files to a
> specified size.

The real question, I think, is why should it include this feature?

> Is there not need to truncate files? It can't be done in standard C++
> without rewriting/copying the remaining file contents. But at least in
> Windows and Linux I have implemented just that: truncate a file (which
> isn't opened by any program) to a given size. Shouldn't this be provided
> by the standard library?

Perhaps some programmers in some situations need to truncate files.
You state that there are two operating systems where it is possible to
do so.  Perhaps there are some operating systems where it is not
possible to do so.

> Frank

There are two issues here, as I see it.

The first is that questions about features to be adopted by the C++
standard in the future really belong on the moderated group
news:comp.std.c++, rather than here.

The second is a cost/benefit analysis of the proposed feature.

The benefit question is something like this:  What percentage of C++
programmers have ever needed to perform this function?  50%, 10%, 1%,
less?

The cost question is complex.

Do all operating systems even allow this?  Some might not, for
security or other reasons.  Even on the systems where you have
developed such code, you mention that the file can't be open by any
program.  How can you tell?  Are you sure that every platform that
allows truncation provides a function for your program to determine
whether it will be allowed to truncate a particular file?

On the operating systems that you have used, is there a simple, single
system call that performs the truncation?  Or is there an elaborate
function that must be written?  Is the function different on different
platforms?

When you add something to the standard library, every implementation
must provide it.  That is time and effort and cost which could be
spent on other things.  So to make a serious proposal you need to
provide an estimate of the cost (on all systems), and an estimate of
the number of programs and programmers who would benefit.

-- 
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html

Reply Jack 9/3/2007 11:23:46 PM


On posix there is a simple and efficient "truncate" call (constant
time I believe?). I can't seem to find a win32 truncate call. Does
anyone know if you can efficiently truncate on win32?

> When you add something to the standard library, every implementation
> must provide it.
No...

It would be a crappy library if it takes an operation that is constant
time and forces the programmer to implement it in such a way that it
is linear time. This is phenomenally bad since we are talking about
disk access time. The operation could take hours as opposed to
milliseconds on large files. This would just encourage developers to
use system primitives, such as truncate, directly which defeats the
whole purpose of having a portable library to write on top of.

For systems that don't provide truncate natively, it is trivial to
implement as a linear time operation in terms of existing operations
exposed through the C and C++ standard library. This means that there
is an operation that always works, and works efficiently on at least
some platforms, which allows developers to write both optimal and
portable code. If truncate isn't included in a file system library, it
forces developers to choose between the two: optimal or portable.
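
As a rough illustration of the linear-time fallback described above, here is a minimal sketch that uses only the C and C++ standard library. The name truncate_by_copy and the ".tmp" scratch-file suffix are assumptions made for this example, not part of any proposal; the sketch copies the kept prefix to a temporary file and then swaps it in with remove() and rename().

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: keep only the first new_size bytes of `path`.
// Linear time: the kept prefix is copied to a scratch file, which then
// replaces the original. Returns true on success.
bool truncate_by_copy(const std::string& path, std::streamsize new_size)
{
    const std::string tmp = path + ".tmp";  // assumed scratch name, not collision-safe

    std::ifstream in(path, std::ios::binary);
    if (!in) return false;

    bool ok = false;
    {
        std::ofstream out(tmp, std::ios::binary | std::ios::trunc);
        char buf[4096];
        const std::streamsize cap = static_cast<std::streamsize>(sizeof buf);
        std::streamsize left = new_size;
        while (out && left > 0) {
            in.read(buf, left < cap ? left : cap);
            const std::streamsize got = in.gcount();
            if (got <= 0) break;            // source is shorter than new_size
            out.write(buf, got);
            left -= got;
        }
        out.flush();
        ok = !out.fail();
    }                                       // scratch file is closed here
    in.close();

    if (!ok) {
        std::remove(tmp.c_str());           // clean up the partial copy
        return false;
    }
    // Remove first: rename() onto an existing file is not portable.
    if (std::remove(path.c_str()) != 0) return false;
    return std::rename(tmp.c_str(), path.c_str()) == 0;
}
```

Calling truncate_by_copy("data.bin", 4) on a ten-byte file should leave its first four bytes. The cost grows with the number of bytes kept, which is the disk-time concern raised above for large files.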

So truncate is a good idea for a standard library, but this probably
isn't the place to address the subject. You would probably be better
off submitting a patch to boost, and emailing the lists where they are
working on tr2 or c++0x.


Reply catphive 9/4/2007 4:41:28 AM

In article <1188891513.580440.100470@g4g2000hsf.googlegroups.com>,
catphive@catphive.net says...
> On posix there is a simple and efficient "truncate" call (constant
> time I believe?). I can't seem to find a win32 truncate call. Does
> anyone know if you can efficiently truncate on win32?

Yes -- SetEndOfFile. I tend to agree with your basic premise: just about
any system can provide this for most storage-based kinds of files,
though (of course) it wouldn't work for things like interactive streams.
It's common enough that providing portable access to what's usually available
seems quite reasonable and useful, at least to me.

-- 
     Later,
     Jerry.

The universe is a figment of its own imagination.

Reply Jerry 9/4/2007 7:36:15 PM

On Tue,  4 Sep 2007 04:41:28 CST, catphive@catphive.net wrote in
comp.lang.c++.moderated:

> On posix there is a simple and efficient "truncate" call (constant
> time I believe?). I can't seem to find a win32 truncate call. Does
> anyone know if you can efficiently truncate on win32?
> 
> > When you add something to the standard library, every implementation
> > must provide it.
> No...

Well, yes, every implementation must provide it.  Unless the standard
allows it to either return a failure indication (like rename() or
remove()), or throw an exception.  Even then it is providing a version
that always indicates failure.

> It would be a crappy library if takes an operation that is constant
> time and forces the programmer to implement it in such a way that it
> is linear time. This is phenomenally bad since we are talking about
> disk access time. The operation could take hours as opposed to
> milliseconds on large files. This would just encourage developers to
> use system primitives, such as truncate, directly which defeats the
> whole purpose of having a portable library to write on top of.

[snip]

It would be a crappy library that provided a function to perform every
single possible task in an (implementation-defined and/or platform
specific) constant time that the programmer could otherwise implement
in linear time, or anything other than constant time.

The question still is:  How many programmers actually need/use this,
in what percentage of actual programs?

Every function added to the standard library adds cost to every
implementation.  It must be documented, code reviewed, and tested by
every implementer.

Your logic, and that of the OP, seem to suggest that you believe that
for every case where a (reasonably popular) platform provides an API
to do something faster than writing the equivalent code in standard,
portable C++, the standard should add something to the standard
library to access that platform specific API.

Every single "more efficient" API in the operating system?  No, I
imagine you really wouldn't say that.

So which "more efficient" APIs do you support, and which ones do you
not?  Obviously, only the ones likely to be useful to the greatest
number of programmers and the greatest number of applications.

Which brings us back to the question of how many programmers actually
perform file truncation, and in what percentage of the programs that
they write?  What if it is only 1% of the programmers, and then only
in 5% of their programs.  And in that hypothetical 0.05% of all C++
programs (or whatever actual percentage it might be, higher or lower),
what percentage of their actual execution time do they spend
truncating files?  How much actual useful time would they save?

Are there cases where a program absolutely fails to meet its
requirements because it rewrites and renames a file to truncate it,
and meets them successfully when it uses a "more efficient" platform
specific API?  You can say two things, if so:

1.  Obviously, the program is not maximally portable because it cannot
meet its requirements on a platform that does not provide such a "more
efficient" API to truncate files.

2.  It could be ported to various platforms that do provide such an
API at the expense of writing platform specific functions for
different platforms to call the platform specific API.

If there is only the time and effort to support a certain number of
language extensions and library additions to C++ or any other
language, and there most certainly are limitations in that area, then
triage must be performed.

The list of all possible proposed extensions and additions must be
divided into two categories, those that make the cut for inclusion and
those that do not.

Generally speaking, those that make the cut will be those which add
the most benefit to the greatest number of programs and programmers.

This is not to say that a file truncation function might not be a
worthwhile addition to the C++ library.  It is merely to say that the
way to make a case for it is to come up with a realistic estimate of
the number of programs that would gain, and how much they would gain.

The argument that "It would be a crappy library if takes an operation
that is constant time and forces the programmer to implement it in
such a way that it is linear time" is simply insufficient.  There are
most likely a very large number of operations where such extensions
could be added, in either the Windows or POSIX API, far more than
feasible to cram into the standard.

The way to make a case for your favorite extension to be added to the
language or library is to demonstrate that it would provide more
benefit than all the other favorite extensions that other C++
programmers want.

-- 
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html

Reply Jack 9/5/2007 12:50:46 AM

In article <gm5sd3h29iq3fnpkamh6c04f0qam6ap9gg@4ax.com>,
jackklein@spamcop.net says...

[ ... ]

> It would be a crappy library that provided a function to perform every
> single possible task in an (implementation-defined and/or platform
> specific) constant time that the programmer could otherwise implement
> in linear time, or anything other than constant time.
> 
> The question still is:  How many programmers actually need/use this,
> in what percentage of actual programs?
> 
> Every function added to the standard library adds cost to every
> implementation.  It must be documented, code reviewed, and tested by
> every implementer.

My guess is that it would be used more often, in more programs than,
say, strpbrk. In addition, it would usually be quite a bit simpler to
implement than many of the string functions.

> Your logic, and that of the OP, seem to suggest that you believe that
> for every case where a (reasonably popular) platform provides an API
> to do something faster than writing the equivalent code in standard,
> portable C++, the standard should add something to the standard
> library to access that platform specific API.

It's one thing if ONE reasonably popular platform provides a capability.
In this case, however, essentially _every_ reasonably popular platform
seems to provide the capability -- which I think tends to indicate that
quite a few people really do use the capability on a fairly regular
basis.

> So which "more efficient" APIs to you support, and which ones do you
> not?  Obviously, only the ones likely to be useful to the greatest
> number of programmers and the greatest number of applications.

There's a balance between what's gained and what it costs to achieve
that gain. In this case you gain quite a bit and the cost appears to be
quite minimal. Disk I/O is sufficiently expensive that avoiding copying
entire files is a pretty big gain. In many cases, the function is likely
to be a one-liner that does nothing more than provide the standard name
as a front-end for the capability that's already present.

[ ... ]

> The way to make a case for your favorite extension to be added to the
> language or library is to demonstrate that it would provide more
> benefit that all the other favorite extensions that other C++
> programmers want.

It's not _just_ a matter of showing greater benefit, but also of showing
reasonably low cost, and (being at all honest) being willing to put in
some work on it -- saying "I think we should have this" is a lot less
likely to get results than saying "Here's the language I think should be
added to the standard, and here are sample implementations for 9
operating systems that constitute over 99% of the market."

-- 
     Later,
     Jerry.

The universe is a figment of its own imagination.

Reply Jerry 9/9/2007 12:45:26 AM

>> I know the boost::filesystem library was formed to a proposal for the
>> new standard. This library includes some operations on files. And I
>> wonder why this still does not include truncation of files to a
>> specified size.
>
> The real question, I think, is why should it include this feature?

Because it is a reasonably common file operation.

> Perhaps some programmers in some situations need to truncate files.
> You state that there are two operating systems where it is possible to
> do so.  Perhaps there are some operating systems where it is not
> possible to do so.

Good point. Let's come from the opposite direction.
Why does C++ support any operations on files? Not all environments have the 
concepts of files (and I have programmed on one that does not)
Why does C++ support dynamic memory  new/delete etc? Not all environments 
have dynamic memory (and again I have programmed on one that does not)
If keyboard and mouse functions are excluded on the grounds that not all 
environments have them and it is OS-specific, the same could be said about 
files and dynamic memory.
Not all environments have those. Perhaps C++ should be reduced further still 
to the lowest-common denominator.

> On the operating systems that you have used, is there a simple, single
> system call that performs the truncation?  Or is there an elaborate
> function that must be written?  Is the function different on different
> platforms?
>
> When you add something to the standard library, every implementation
> must provide it.  That is time and effort and cost which could be
> spent on other things.  So to make a serious proposal you need to
> provide an estimate of the cost (on all systems), and an estimate of
> the number of programs and programmers who would benefit.

Given that both rename() and remove() are supplied, I see no reason why some 
form of truncate() should not be supplied.
I would argue that this is a reasonably common operation. In several decades
of programming I have wanted to truncate on a number of occasions.
The alternatives are considerably more inefficient and I am reasonably 
confident that it is implementable for nearly every OS that supports file 
operations.

Stephen Howe 



Reply Stephen 9/11/2007 3:52:28 PM

In article <m6GdnZO_DsqoS3vbnZ2dnUVZ8vmdnZ2d@pipex.net>, Stephen Howe
<sjhoweATdialDOTpipexDOTcom@giganews.com> wrote:

> 
> Given that both rename() and remove() are supplied, I see no reason why some 
> form of truncate() should not be supplied.
> I would argue that this a reasonably common operation. In several decades 
> programming I have wanted to truncate on a number of occasions.
> The alternatives are considerably more inefficient and I am reasonably 
> confident that it is implementable for nearly every OS that supports file 
> operations.
> 
> Stephen Howe 
     rename() and remove() [from <cstdio>, not <algorithm>] are in the
standard library because they are inherited from C89.  I would point out
that not everyone is on POSIX or MS Windows or what have you.  It is
possible to truncate on both Windows and POSIX-compliant filesystems, but
not all the world is POSIX, nor is it Microsoft.  I don't see truncate in
C99 either, BTW.  If you need truncation, use an available OS function; if
there is none, just copy and be done with it.  If you write a wrapper with
some nice name like truncate_file(...), then it's easy [hopefully the
interface works with most OS functions that truncate files] to write these
wrappers with conditional compilation or some build decisions.

Reply Carl 9/11/2007 11:56:29 PM

Hi!

Carl Barron schrieb:
> I don't see truncate in c99
> either. BTW.

Doesn't seem much like an argument. But maybe I should first propose
truncate for the C standard and then argue C++ shall adopt it. :)

> If you need truncation use an available os function if
> not just copy and be done with it.

Copying is not an option, it would take too long for large files
(several GB) and occupy all I/O resources.

>  If you write a wrapper with some
> nice name like truncate_file(...) then its easy to [hopefullly the
> interface works with most os functions to truncate files] write these
> wrappers with conditional compilation or some build decisions.

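That's what I did. On Linux it's a two-liner:

#include <string>
#include <unistd.h>                          // POSIX truncate()
#include <boost/iostreams/positioning.hpp>   // boost::iostreams::stream_offset

void truncateFile(
		std::string const& filename,
		boost::iostreams::stream_offset const newLength
	)
{
	// truncate() returns 0 on success and -1 on failure (errno is set)
	if(truncate(filename.c_str(), newLength) != 0)
		raiseError("could not truncate");  // raiseError is defined elsewhere (not shown)
}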

On Windows you have to:
1. open the file
2. seek to desired position
3. call SetEndOfFile
4. close the file
which is littered with the IMHO broken error reporting facilities of the
Windows API.

@Jack: I'm sorry for not responding and joining the discussion earlier.
Actually I'm overwhelmed by the various opinions and strong arguments on
this issue.

One point: If someone sets up a proposal for this, the standards
committee will take longer to finalize any other proposals. (Apart from
the fact that the deadline for handing in proposals for the upcoming revision
of the standard has already passed.) Personally I'd rather have the new
standard finished soon than have "truncate" accepted at all costs. Anyway I
think the "truncate" functionality is an important operation which C++
currently lacks. There are all kinds of methods to *enlarge* files using
fwrite, fprintf, fputc, streambuf functions and iostreams,
ostream_iterator, and what not. But how can I make a file *smaller* ??
There is not a single way I know of. That's why I think "truncate" is
really missing. It's like a std::vector without resize and erase.

Frank

Reply Frank 9/12/2007 10:56:19 AM

Frank Birbacher wrote:

> One point: If someone sets up a proposal for this the standards
> committee will take longer to finalize any other proposals. (Apart from
> the fact the deadline for handing in proposals for the upcoming revision
> of the standard has already passed) Personally I'd like the new standard
> be finished soon than "truncate" be accepted at all means. Anyway I
> think the "truncate" functionality is an important operation which C++
> currently lacks. There are all kinds of methods to *enlarge* files using
> fwrite, fprintf, fputc, streambuf functions and iostreams,
> ostream_iterator, and what not. But how can I make a file *smaller* ??
> There is not a single way I know of. That's why I think "truncate" is
> really missing. It's like a std::vector without resize and erase.

Is it? Just from my personal experience: While I need I/O operations all the
time (even if only for debugging output), I have never actually needed to
truncate a file in the last ten years. (Except truncating it to zero, i.e.
overwriting it.) You might be working in a different problem domain than I
do, but it looks to me like a very specific problem.

I would consider operations such as "open a window, draw a line" much more crucial
for the general success of C++, but, of course, C++ is a multi-purpose language
and there is no point in putting a feature into the standard library that would
be better solved by a problem-specific library from a third-party vendor. We
don't have an "open window" in the STL, and for good reason, even though a good
part of programs would benefit from it.

What you want is a well-equipped tool-chain that can detect the specific operating
system, and check for the available features you need. I would consider truncation
as a special feature, not one of general interest.

So long,
	Thomas

Reply Thomas 9/12/2007 11:45:49 AM

In article <5kpsu5F4vubaU1@mid.dfncis.de>, Frank Birbacher
<bloodymir.crap@gmx.net> wrote:

> 
> Carl Barron schrieb:
> > I don't see truncate in c99
> > either. BTW.
> 
> Doesn't seem much like an argument. But maybe I should first propose
> truncate for the C standard and then argue C++ shall adopt it. :)
   Not an argument just history:)   It is the reason why rename() and
remove() [from cstdio] are in the c++ standard library.

Reply Carl 9/12/2007 6:22:45 PM
