Standard Interface for Data Compression Algorithms



Many problems in my applications could be solved with this idea (a
common compression interface).
Before getting to work I searched for one and found, at
http://www.ross.net/compression/interface.html, an attempt by the
well-known Ross Williams to create a Standard Interface for Data
Compression Algorithms.
What happened with that? Isn't it a useful idea? What do you think about it?
Does any implementation of this, or of something similar, exist?

Poyo
Reply poyocapo (2) 3/27/2009 3:14:49 PM

IMO, apart from certain narrowly defined uses, something like this is 
not likely to be particularly useful or beneficial.

in the cases where it is useful, normally a specialized means is used to 
multiplex the algo.
in other cases, to make a multiplexed compression algo useful is likely to 
require a specialized set of headers, breaking byte-level compatibility with 
other versions of the same algo (as an example, what if we put a gzip header 
inside a PNG file? or neglect the header when giving deflated data to gzip 
or a gzip clone? what of this "new and improved" header? ...).

....


now, formats like AVI and MKV have used codec multiplexing to good effect, 
but these are examples of solving a specific problem with a specific 
solution, rather than trying to address the whole of the larger problem.

a more narrowly defined problem and solution would be:
here is a multiplexed stream format, use it as you will.

now, this no longer tries to coerce everyone else to go along (given 
this is how "standards" are perceived: people are "expected" to follow 
them, even if they are not particularly beneficial to a project); rather, 
they will use it if in fact it is useful to them.

or such...



Reply cr88192 3/29/2009 9:11:33 PM


On 29 mar, 18:11, "cr88192" <cr88...@hotmail.com> wrote:
> IMO, likely apart from certain narrowly defined uses, something like this is
> not likely to be particularly useful or beneficial.
> ...
> a more narrowly defined problem and solution would be:
> here is a multiplexed stream format, use it as you will.
>
> now, this no longer serves to try to coerce everyone else to go along (given
> this is how "standards" are percieved, people are "expected" for follow
> them, even if they are not particularly beneficial to a project), rather
> they will use it if infact it is useful to them.
>
> or such...

Maybe I have not expressed myself well (English isn't my native
language).
I just want a common API to build an abstraction layer that can wrap
any data compression algorithm.
Maybe the original concept has been misinterpreted due to the name
"Standard Interface for Data Compression Algorithms".

With something like this, the following becomes possible (and that is my goal):
- a high degree of interchangeability between algorithms, allowing
programs to select one of many algorithms, even at runtime, depending
on needs.
- replacing or adding algorithms (improved or new ones) at runtime
- assistance for cataloging, testing, benchmarking and comparing
algorithms.
- assistance for developing new algorithms

I don't want to discuss standards.
I referenced this work just because it was discussed in 1991 by Ross
Williams, Jean-loup Gailly, Dan Bernstein and others.
I have no idea whether it was ever implemented (or whether anything
similar exists, such as a library or another comparable idea, even if
not a standard).
Anyway, I don't entirely agree with the original document, but I think
it contains good ideas.

Poyo
Regards
Reply poyo 3/30/2009 10:07:22 PM

ok. if there is no need to multiplex, then a common approach is to use a 
linked list and function pointers.

so, one uses a function to select the appropriate handler struct for a given 
algo name;
then wrapper functions call the function pointers in the struct with the 
data;
the backend functions handle transferring all this to the relevant back-end 
encoder/decoder functions.
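
The linked-list-plus-function-pointers approach described above might look roughly like the following. This is only a minimal sketch: every name in it (`codec`, `codec_register`, `codec_find`, `codec_encode`, `codec_decode`, `codec_register_builtin`) is made up for illustration, and the two backends ("store", which copies bytes, and a toy run-length coder) are stand-ins for real compressors. It also assumes whole-buffer operation, which sidesteps the streaming question entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler struct: one node per algorithm in a linked list. */
typedef struct codec {
    const char *name;
    /* Both return the number of bytes written to dst, or -1 on error. */
    long (*encode)(const unsigned char *src, size_t n, unsigned char *dst, size_t cap);
    long (*decode)(const unsigned char *src, size_t n, unsigned char *dst, size_t cap);
    struct codec *next;
} codec;

static codec *codec_list = NULL;

/* Push a handler onto the front of the list. */
void codec_register(codec *c) {
    assert(c != NULL && c->name != NULL);
    c->next = codec_list;
    codec_list = c;
}

/* Select the handler struct for a given algorithm name. */
codec *codec_find(const char *name) {
    for (codec *c = codec_list; c != NULL; c = c->next)
        if (strcmp(c->name, name) == 0)
            return c;
    return NULL;
}

/* Wrapper functions: look up the handler, then call through its pointers. */
long codec_encode(const char *name, const unsigned char *src, size_t n,
                  unsigned char *dst, size_t cap) {
    codec *c = codec_find(name);
    return c ? c->encode(src, n, dst, cap) : -1;
}

long codec_decode(const char *name, const unsigned char *src, size_t n,
                  unsigned char *dst, size_t cap) {
    codec *c = codec_find(name);
    return c ? c->decode(src, n, dst, cap) : -1;
}

/* Backend 1: "store" copies the data unchanged in both directions. */
static long store_copy(const unsigned char *src, size_t n, unsigned char *dst, size_t cap) {
    if (n > cap) return -1;
    memcpy(dst, src, n);
    return (long)n;
}

/* Backend 2: toy run-length coding as (count, byte) pairs, count in 1..255. */
static long rle_encode(const unsigned char *src, size_t n, unsigned char *dst, size_t cap) {
    size_t in = 0, out = 0;
    while (in < n) {
        size_t run = 1;
        while (in + run < n && run < 255 && src[in + run] == src[in]) run++;
        if (out + 2 > cap) return -1;
        dst[out++] = (unsigned char)run;
        dst[out++] = src[in];
        in += run;
    }
    return (long)out;
}

static long rle_decode(const unsigned char *src, size_t n, unsigned char *dst, size_t cap) {
    size_t out = 0;
    if (n % 2 != 0) return -1;              /* input must be whole pairs */
    for (size_t in = 0; in < n; in += 2) {
        size_t run = src[in];
        if (out + run > cap) return -1;
        memset(dst + out, src[in + 1], run);
        out += run;
    }
    return (long)out;
}

static codec store_codec = { "store", store_copy, store_copy, NULL };
static codec rle_codec   = { "rle",   rle_encode, rle_decode, NULL };

/* Register the two demo backends once; repeat calls are ignored. */
void codec_register_builtin(void) {
    static int done = 0;
    if (done) return;
    codec_register(&store_codec);
    codec_register(&rle_codec);
    done = 1;
}
```

A caller would run `codec_register_builtin()` once and then pick an algorithm by name at runtime, for example `codec_encode("rle", src, n, dst, cap)`. Adding a new algorithm means filling in one more `codec` struct and registering it; the wrappers do not change.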

of course, really "generalizing" this is a problem (such as how much input 
to gather before processing and producing output, ...).


though not quite the same, I use interfaces like this for implementing my 
VFS system, ...
other uses could include loading and saving various formats (such as 
graphical and audio formats), ...

although, in my case, the specifics are much more hidden (my VFS's frontend 
API looks mostly the same as the file IO API provided in 'stdio.h', ...).


so, one possible approach for such an API is to make it look like a "chained 
VFS", namely, one provides it with a virtual file descriptor representing 
the backend stream (buffer, file, socket, ...), and along with the codec and 
access mode, one gets a new virtual file descriptor representing the 
compressed stream.

although, I think in some cases I had already done similar in my VFS, where 
the codec was partly handled by embedding magic characters in the 'mode' 
string (for example, one could open a file as "rtz" or "wtz" to read or 
write text with deflate compression), but this particular detail was handled 
by the backend, and not by a generalized wrapper.
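
To illustrate the mode-string idea, here is a minimal sketch of a parser for such strings. The only facts taken from the description above are that "rtz" and "wtz" mean reading or writing text with deflate compression; the struct `open_mode`, the function `parse_mode`, and the treatment of `+` and `b` are assumptions made for the example, not the actual VFS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical decoded form of a mode string such as "rtz". */
typedef struct {
    bool read;     /* 'r' or '+' seen */
    bool write;    /* 'w' or '+' seen */
    bool text;     /* 't' seen (cleared again by 'b') */
    bool deflate;  /* 'z' seen: the "magic" codec character */
} open_mode;

/* Returns 0 on success, -1 on an unknown character or a missing r/w. */
int parse_mode(const char *mode, open_mode *out) {
    open_mode m = { false, false, false, false };
    assert(out != NULL);
    if (mode == NULL) return -1;
    for (const char *p = mode; *p != '\0'; p++) {
        switch (*p) {
        case 'r': m.read = true; break;
        case 'w': m.write = true; break;
        case '+': m.read = true; m.write = true; break;
        case 't': m.text = true; break;
        case 'b': m.text = false; break;
        case 'z': m.deflate = true; break;
        default:  return -1;   /* reject anything we do not understand */
        }
    }
    if (!m.read && !m.write) return -1;
    *out = m;
    return 0;
}
```

With this, `parse_mode("rtz", &m)` yields read, text and deflate set, and the backend can decide whether to wrap the stream in a decompressor before handing it back.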

if I were to do something like this explicitly in a VFS, it would probably 
be as "filters":
VFILE *vffilter(VFILE *fd, char *filter, char *mode);

filter would identify the codec, and mode would indicate how the stream is 
desired to be accessed (r, w, r+, w+, rt, wt, ...). "r+" and "w+" based 
modes would be the default for read/write files (would likely assume a 
shared read/write state and buffers), but for sockets there could be "rw", 
which would be similar but assume a separate read and write state.
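
A rough sketch of how such a filter could chain onto another stream follows. Everything here is hypothetical: the real `VFILE` layout is not shown in this thread, so the struct below, `vfopen_mem`, and `vfclose` are invented for the example, the parameters are `const char *` rather than `char *`, only mode "r" is handled, and a toy run-length decoder named "rle" stands in for a real codec such as deflate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical virtual file: a read callback plus backend-private state. */
typedef struct VFILE VFILE;
struct VFILE {
    size_t (*read)(VFILE *f, void *buf, size_t n);  /* returns bytes produced */
    void *state;
};

/* Backend stream: a read-only memory buffer. */
typedef struct { const unsigned char *data; size_t len, pos; } mem_state;

static size_t mem_read(VFILE *f, void *buf, size_t n) {
    mem_state *s = f->state;
    size_t left = s->len - s->pos;
    if (n > left) n = left;
    memcpy(buf, s->data + s->pos, n);
    s->pos += n;
    return n;
}

static VFILE *vfile_new(size_t (*read)(VFILE *, void *, size_t), void *state) {
    VFILE *f = malloc(sizeof *f);
    if (f == NULL || state == NULL) { free(f); free(state); return NULL; }
    f->read = read;
    f->state = state;
    return f;
}

VFILE *vfopen_mem(const void *data, size_t len) {
    mem_state *s = malloc(sizeof *s);
    if (s != NULL) { s->data = data; s->len = len; s->pos = 0; }
    return vfile_new(mem_read, s);
}

/* Filter stream: expands (count, byte) pairs pulled from an inner stream. */
typedef struct { VFILE *inner; size_t run; unsigned char value; } rle_state;

static size_t rle_read(VFILE *f, void *buf, size_t n) {
    rle_state *s = f->state;
    unsigned char *out = buf;
    size_t done = 0;
    while (done < n) {
        if (s->run == 0) {                      /* need the next pair */
            unsigned char pair[2];
            if (s->inner->read(s->inner, pair, 2) != 2) break;
            s->run = pair[0];
            s->value = pair[1];
            continue;
        }
        out[done++] = s->value;
        s->run--;
    }
    return done;
}

/* Wrap fd in a decoding filter; returns NULL for unsupported filter/mode. */
VFILE *vffilter(VFILE *fd, const char *filter, const char *mode) {
    assert(fd != NULL && filter != NULL && mode != NULL);
    if (strcmp(filter, "rle") != 0 || strcmp(mode, "r") != 0) return NULL;
    rle_state *s = malloc(sizeof *s);
    if (s != NULL) { s->inner = fd; s->run = 0; s->value = 0; }
    return vfile_new(rle_read, s);
}

/* Frees one stream only; closing a filter does not close its inner stream. */
void vfclose(VFILE *f) {
    if (f != NULL) { free(f->state); free(f); }
}
```

The caller opens the backend stream, passes it through `vffilter`, and from then on reads decoded bytes from the returned descriptor without knowing which codec sits underneath. Write and read/write modes would need buffering on top of this, which is where the resource cost mentioned in this post comes in.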
....

but, as is, I don't think anything like this is needed in my case.
also, doing compression/decompression in this way could be, in some cases, 
particularly resource intensive ("r+" and "w+" based modes would likely 
require keeping a buffered version of the uncompressed data either in 
memory, or in a specialized page-file).
....

this did bring up a random idea for a compressed filesystem format (as an 
alternative to read/write ZIP), where the filesystem is compressed in its 
external representation, but in its buffered form would essentially 
resemble FAT32. clusters are uncompressed while buffered in memory, and are 
deflated whenever a cluster moves out to disk. the cluster size would be 
kept a little large, mostly to give deflate more chance to compress. 
however, bigger clusters reduce the number of clusters that can be kept in 
memory at any given time: 4kB would allow effective use of memory but 
limited compression, 16kB would offer a tradeoff, and 64kB would allow 
maximizing deflate's effectiveness, but then one can only have 16 clusters 
in memory per MB of RAM used, so likely 16MB or more would be used for the 
file cache, allowing 256 cached clusters...
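
The cluster arithmetic above can be checked with a one-line helper. This is just the division spelled out; the function name `cached_clusters` and the `KB`/`MB` constants are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

#define KB ((size_t)1024)
#define MB (1024 * KB)

/* How many whole uncompressed clusters fit in a cache of cache_bytes. */
size_t cached_clusters(size_t cache_bytes, size_t cluster_bytes) {
    assert(cluster_bytes > 0);
    return cache_bytes / cluster_bytes;
}
```

For 64kB clusters this gives 16 per MB and 256 in a 16MB cache, matching the figures above; the same 16MB cache would hold 1024 clusters of 16kB or 4096 clusters of 4kB.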

or such...


Reply cr88192 3/30/2009 11:18:03 PM
