f



Re: A Random Data Compression method -draft #4

Another interesting idea of move to front is instead of encoding value's 
like zero's.... and then later passing huffman over it...

How about directly outputting codes...

Instead of a value table and position table and such... there would be an 
encoding table for the move to front...

the front positions would have the shortest codes... and the later positions 
longer codes...

Decoding happens in opposite... longer codes are looked up... the value 
outputted... and then moved to the front to get a shorter code next time...

Added benefit of this new idea is that outputted codes immediatly get 
shorter codes on the next try...

while with static huffman and perhaps even dynamic huffman it would still 
depend on the ultimate frequencies but not so with move to front...

Hmm kinda interesting... this could also be combined with bubble to front... 
for a quicker implementation but would probably be less compression 
efficient....

Only problem remaining is what prefixes/codes to give the the element of the 
move to front ? that's where dynamic huffman uses a tree to adept to 
frequencies...

and that's why huffman works nicely normally...

With move to front the prefixes could be a simple list for starters:

always ends with a one.

0
01
001
0001
00001
000001

Other idea:

reserve position 0 as a special pattern/compression pattern.

move to front position zero is outputted as 0.
move to front position x is outputted as 1 + value.

This way re-occuring patterns always get the short 0 code.

Other variations could be:

position 0 is outputted as 1
position 1 is outputted as 01
position 2 is outputted as 001
position 3 is outputted as 0001
position 4 is outputted as 00001
position 5 is outputted as 000001
position 6 is outputted as 0000001
position 7 and higher is outputted as 00000001 + value.

This Skybuck's Universal Code can be used for the prefixes.

Savings are achieved when positions 0 to 6 are hit.

Bits are doubled for 7 and higher.

For position 7 and higher cases... perhaps a huffman code could be used... 
but then again that might be used for position 1 to 6 as well... and 
scrapping the idea. and only using position 0.


Let's see original idea:

Random input:

1  2  3  4  5  6  7  8  9  10 11 12
11 10 10 01 01 10 11 01 01 01 01 01

2 bit
MoveToFrontTable
[00,01,10,11]
[0, 101, 110, 111]

First input:
11

First output:
111

Table changed:
[00,01,10,11]
[100,101,110, 0]

Remarkable discovery: move to front is not necessary to move table 
forward... why would you do that ? it apprently assumes values close to each 
other but for random probablt doesn't make sense... so I am just gonna swap 
;) :) like he says ;)
Now I am just gonna modify it... add a bit in front of it and zero the other 
one ;)

Second input:
10

Second output:
110

Table changed:

[00,01,10,11]
[100,101,0, 111]

Third input:
10

Third output:
0

Table stay same.

Fourth input:
01

Fouth output:
101

Table changed:
[00,01,10,11]
[100, 0, 110, 111]

Well it's easy to see where this goes:

Output5: 0
Output6: 110
Output7: 111
Output8: 101
Output9: 0
Output10: 0
Output11: 0
Output12: 0

Final result:
111110010101101111010000

Compared to input:
111010010110110101010101

No saving so far ;)

Perhaps table can be encoded more efficient:

0 is reserved for least occurence.
10 is for second
110 is for third
111 is for fourth...

using this new table... it should be possible to get some kind of saving.

Reading 0 means efficient code.
Reading 1 means read next, if next is zero stop.
Reading 1 means read next, if next is one continue.
Reading 1 means read next, if next is one continue, if next is zero stop.
Reading 1 means read next, if next is one continue, if next is one stop.

So decoding this without conflict should be possible.

Not gonna do a full example again... I've seen enough ;)

Maybe later I gotta go eat ;)

Bye,
  Skybuck.







0
Skybuck
1/10/2011 7:26:07 PM
comp.compression 4695 articles. 0 followers. Post Follow

0 Replies
607 Views

Similar Articles

[PageSpeed] 36

Reply: