Storing binary images/files


I have looked at the archives, and while I probably missed it, I can't find anything on the specific issue I have at hand, namely:

I have to deal with millions of individual files (images). I have been able to reduce the file sizes to something between 200 and 500 bytes. The minimum size that a file can have on my Dell Desktop using Windows 7 is 4k. I don't know if that's a Windows requirement, or from whence that 4k minimum derives.

When paying for storage of millions of images, there is a GREAT deal of difference in cost between storing 4k per file and the 200-to-500 bytes per file of even the 8-bit-deep black-and-white files I currently have. By the same token, there is no value in spending energy on "getting to binary" if I am 'locked' to storage blocks of 4k.

I have looked at the "bwpack" function, but I'm not sure that does anything for me outside the MATLAB environment.

Here are my questions:

Is there a method within MATLAB to STORE images/matrices/files as true binary?
Can that storage mechanism overcome the 4k minimum storage size that my OS seems to impose?

Thanks,
Reply Paul 5/21/2010 9:54:05 PM

Paul Skvorc wrote:

> I have to deal with millions of individual files (images). I have been 
> able to reduce the file sizes to something between 200 and 500 bytes. 
> The minimum size that a file can have on my Dell Desktop using Windows 7 
> is 4k. I don't know if that's a Windows requirement, or from whence that 
> 4k minimum derives.

The default NTFS cluster size is 4 KB for volumes up to 16 TB. That isn't 
something you can change without reformatting the volume.
http://support.microsoft.com/kb/140365
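
To put the cluster round-up in numbers, here is a rough back-of-the-envelope sketch. The file count and average file size are illustrative assumptions, not figures from the original post; only the 4 KB cluster size and the 200-to-500 byte range come from the thread.

```matlab
% Rough slack-space estimate; every number here is an illustrative assumption.
nFiles      = 5e6;    % hypothetical count ("millions of files")
fileBytes   = 350;    % assumed average payload, midway between 200 and 500 bytes
clusterSize = 4096;   % 4 KB allocation unit

% Each file occupies a whole number of clusters on disk.
perFileOnDisk = ceil(fileBytes / clusterSize) * clusterSize;

payloadGB = nFiles * fileBytes     / 2^30;
onDiskGB  = nFiles * perFileOnDisk / 2^30;

fprintf('Payload: %.2f GB, allocated: %.2f GB (%.1fx)\n', ...
        payloadGB, onDiskGB, onDiskGB / payloadGB);
```

Under these assumptions the allocated space is 4096/350, or roughly 11.7 times the actual payload, regardless of how many files there are.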

> When paying for storage of millions of images, there is a GREAT deal of 
> difference in cost between storing 4k per file and the 200-to-500 bytes 
> per file of even the 8-bit-deep black-and-white files I currently have. 
> By the same token, there is no value in spending energy on "getting to 
> binary" if I am 'locked' to storage blocks of 4k.

> Is there a method within MATLAB to STORE images/matrices/files as true 
> binary?

What do you mean by "true binary" ? You can write whatever you want using fwrite()

> Can that storage mechanism overcome the 4k minimum storage size that my OS 
> seems to impose?

Have you considered using zip or tar?

http://www.mathworks.com/access/helpdesk/help/techdoc/ref/zip.html
Reply Walter 5/21/2010 10:13:10 PM


Walter Roberson <roberson@hushmail.com> wrote in message <ht70j7$h0l$1@canopus.cc.umanitoba.ca>...
>The default NTFS cluster size is 4 KB for volumes up to 16 TB. That isn't 
>something you can change without reformatting the volume.
>http://support.microsoft.com/kb/140365

Thanks. I suspected so.

>Have you considered using zip or tar?
>http://www.mathworks.com/access/helpdesk/help/techdoc/ref/zip.html

It's not been my experience that zipping individual, small, black-and-white files results in any meaningful reduction in size. Furthermore, the time overhead of unzipping individual files would completely outweigh what small gain might be realized in size. These comments of course assume you're referring to ".zip" files. I don't know what "tar" is, but any "operation", like compression, is going to be time prohibitive for processing individual files.

> What do you mean by "true binary" ? You can write whatever you want using
>fwrite()

I haven't experimented much with fwrite(). Do I understand you correctly that one can, using fwrite():
1) Generate a file that is one bit deep, where:
2) A cell/pixel value is either "0" or "1", rather than 8 bits deep where "black" is 0 and "white" is 255, and in doing so,
3) "Save" seven bits of "space" for every cell/pixel, thereby,
4) Improving storage (not considering the 4k block floor) by 7 parts of 8?
If so, that would be a "big deal" and I will have to invest some time to investigate and become more familiar with fwrite.

Thank you for your help.
Paul
Reply Paul 5/24/2010 8:48:07 AM

Paul Skvorc wrote:
> Walter Roberson <roberson@hushmail.com> wrote in message 
> <ht70j7$h0l$1@canopus.cc.umanitoba.ca>...
>> Paul Skvorc wrote:

>> > When paying for storage of millions of images, there is a GREAT deal 
>> > of difference in cost between storing 4k per file and the 200-to-500 
>> > bytes per file of even the 8-bit-deep black-and-white files I 
>> > currently have.

>> Have you considered using zip or tar?

> It's not been my experience that zipping individual, small, 
> black-and-white files results in any meaningful reduction in size.

zip and tar are both methods of bundling multiple original files into 
one file. This allows you to make efficient use of the storage overhead, 
needing only one round-up per group of files rather than per-file.
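
The same bundling idea can also be done by hand. The following is a minimal sketch, not anything from the original posts: it appends several small images to one container file and records a byte offset for each, so a single image can be fetched with one fseek() and one fread(). The file name and the random stand-in images are hypothetical.

```matlab
% Sketch: append many small images to one container file and keep a
% byte-offset index, so only one cluster round-up is paid for the whole group.
% The file name and the image data are hypothetical.
containerFile = fullfile(tempdir, 'images.bin');

% Stand-in data: three small uint8 "images" of different lengths.
images = {uint8(randi([0 255], 1, 200)), ...
          uint8(randi([0 255], 1, 350)), ...
          uint8(randi([0 255], 1, 500))};

nImages = numel(images);
offsets = zeros(nImages, 1);   % byte position where each image starts
lengths = zeros(nImages, 1);   % number of bytes written for each image

fid = fopen(containerFile, 'w');
assert(fid ~= -1, 'Could not open %s for writing', containerFile);
for k = 1:nImages
    offsets(k) = ftell(fid);
    lengths(k) = fwrite(fid, images{k}, 'uint8');
end
fclose(fid);

% Random access: read back image 2 alone, without reading the others.
fid = fopen(containerFile, 'r');
fseek(fid, offsets(2), 'bof');
img2 = fread(fid, lengths(2), '*uint8')';   % '*uint8' keeps the class as uint8
fclose(fid);

isequal(img2, images{2})   % displays 1 if the round trip preserved the data
```

The offsets and lengths arrays have to be stored somewhere as well (for example in a MAT-file or at the start of the container); without them the individual images cannot be located.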

> Furthermore, the time overhead of unzipping individual files would 
> completely outweigh what small gain might be realized in size. These 
> comments of course assume you're referring to ".zip" files. I don't 
> know what "tar" is, but any "operation", like compression, is going to 
> be time prohibitive for processing individual files.

tar does not do any compression. zip can store files uncompressed.
The experiments I have heard of have shown that for larger files, the 
reduction in disk I/O for using compressed files is a definite win, 
gaining you more than you lose in processing time for the decompression, 
at least when streaming decompression is done.
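
A minimal sketch of uncompressed bundling with MATLAB's tar() and untar() functions, assuming hypothetical folder and file names under tempdir. One caveat: the tar format stores a 512-byte header per member and pads each member's data to a multiple of 512 bytes, so a 300-byte image still occupies 1024 bytes inside the archive. That is better than a 4 KB cluster per file, but it does not reach the raw payload size.

```matlab
% Create a few small stand-in files in a hypothetical source folder.
srcDir = fullfile(tempdir, 'small_images');
if ~exist(srcDir, 'dir'), mkdir(srcDir); end
for k = 1:3
    fid = fopen(fullfile(srcDir, sprintf('img%03d.bin', k)), 'w');
    fwrite(fid, uint8(randi([0 255], 1, 300)), 'uint8');
    fclose(fid);
end

% Bundle every .bin file in srcDir into one archive (no compression).
tarFile = fullfile(tempdir, 'small_images.tar');
tar(tarFile, '*.bin', srcDir);

% Extract again into a separate folder and list what came back.
outDir = fullfile(tempdir, 'small_images_restored');
untar(tarFile, outDir);
d = dir(fullfile(outDir, '*.bin'));
fprintf('%d files restored\n', numel(d));
```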

>> What do you mean by "true binary" ? You can write whatever you want using
>> fwrite()

> I haven't experimented much with fwrite(). Do I understand you correctly 
> that one can, using fwrite():
> 1) Generate a file that is one bit deep, where:
> 2) A cell/pixel value is either "0" or "1", rather than 8 bits deep 
> where "black" is 0 and "white" is 255, and in doing so,
> 3) "Save" seven bits of "space" for every cell/pixel, thereby,
> 4) Improving storage (not considering the 4k block floor) by 7 parts of 8?
> If so, that would be a "big deal" and I will have to invest some time to 
> investigate and become more familiar with fwrite.

No, it is not possible in any of the operating systems that Matlab 
supports to create a file that is "one bit deep". All of the operating 
systems do disk I/O on a minimum of one byte. For example, you cannot 
fseek() to bit #193 of a file, as would be the case if "one bit deep" 
files were supported.

However, fwrite() and fread() in Matlab support reading or writing bits, 
using the 'bitN' "precision" specifier. I have not experimented with it 
myself, but the documentation does _imply_ that it unpacks bytes into 
bits. The documentation does not mention any limitations about having to 
read or write an entire byte at a time, but I would want to experiment 
with that to be sure. If you are storing an entire logical array then 
you probably would not really care about whether Matlab is able to 
internally position at bit boundaries, just that it packs and unpacks 
the entire array correctly.

fwrite(fid, LogicalArray, 'ubit1')
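
A round-trip experiment along those lines might look like the sketch below. It is an untested-by-the-poster illustration: the array size and file name are made up, and the array is converted to uint8 before writing as a precaution. Note that the file holds only the bits, so the image dimensions have to be known or stored separately.

```matlab
% Hypothetical 40-by-50 black-and-white image: 2000 logical pixels.
bw = rand(40, 50) > 0.5;

fname = fullfile(tempdir, 'bw_packed.bin');   % hypothetical file name

% Write one bit per pixel (column-major order, as MATLAB stores arrays).
fid = fopen(fname, 'w');
fwrite(fid, uint8(bw), 'ubit1');
fclose(fid);

% If the bits are packed eight to a byte, 2000 pixels should need 250 bytes.
info = dir(fname);
fprintf('%d pixels stored in %d bytes\n', numel(bw), info.bytes);

% Read the bits back (fread returns doubles here) and restore the shape.
fid = fopen(fname, 'r');
raw = fread(fid, numel(bw), 'ubit1');
fclose(fid);
restored = reshape(logical(raw), size(bw));

isequal(restored, bw)   % displays 1 if every pixel survived the round trip
```

When the pixel count is not a multiple of 8, the last byte is only partly used, so passing the exact element count to fread() avoids picking up padding bits.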
Reply Walter 5/24/2010 3:14:04 PM
