
### PPMII BinSumm table


I have studied the paper "PPM: one step to practicality", which is
really interesting, and I have gone through the PPMII source code, which
is very hard to understand. In particular, the array
InitBinEsc[]={0x3CDD,0x1F3F,0x59BF,0x48F3,0x64A1,0x5ABC,0x6632,0x6051}
seems to be the seed used to generate the BinSumm table.

Can anyone explain how Dmitry Shkarin estimated the values in
InitBinEsc[]?
I assume this table is responsible for the better compression of textual
data.

- George Herring



George Herring wrote:
> I have studied the paper "PPM: one step to practicality", which is
> really interesting, and I have gone through PPMII Source code, which is
> very complex to understand, and Especially,
> InitBinEsc[]={0x3CDD,0x1F3F,0x59BF,0x48F3,0x64A1,0x5ABC,0x6632,0x6051}
> array seems to be the seed for generation of BinSumm table
>
> Can any one explain how Dmitry Shkarin estimated values for
> InitBinEsc[].
> I hope these table is responsible for better compression for textual
> data.

PPMII is one of those state-of-the-art PPM implementations that I am
sure many people have had a hard time understanding (me being one of
them).

I haven't really gone into the details of this part of the code, so I
can't offer much help here.

If someone who has unravelled the mystery of this code could post a code
walkthrough, or at least a well-commented version of the code, that
would be great.

Sachin Garg [India]
http://www.sachingarg.com



Looks like an initialization of escape estimation from binary
contexts.  Dmitry's ppmd compressor is, I believe, aimed toward
"textual data" as he puts it.  I would assume that this means he is
initializing the escape estimation from binary contexts with values
generated from typical text sources.

Try setting all the values to 1 or 0 to see if it makes much of
a difference in compression ratios.  (assuming that doesn't cause
the encoder/decoder to fail).
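
A minimal sketch of why the seed values might matter less than they look, assuming the table entries are adaptive fixed-point estimates that get nudged after every coded symbol. The names `BinEstimate`, `kScaleBits` and `kRateBits` are hypothetical and the update rule is a generic exponential-decay estimator, not code taken from PPMII:

```cpp
#include <cstdint>

// Assumed fixed-point scale: a probability of 1.0 is stored as kScale.
constexpr int kScaleBits = 14;
constexpr std::uint32_t kScale = 1u << kScaleBits;
// Assumed adaptation rate: each update moves 1/128 of the way to the target.
constexpr int kRateBits = 7;

// Adaptive estimate of P(symbol found, i.e. no escape) for one class of
// binary contexts. The seed only sets the starting point.
struct BinEstimate {
    std::uint16_t p;  // probability scaled by kScale

    void update(bool found) {
        if (found) {
            // Move toward kScale (symbol was predicted).
            p = static_cast<std::uint16_t>(p + ((kScale - p) >> kRateBits));
        } else {
            // Move toward 0 (escape happened).
            p = static_cast<std::uint16_t>(p - (p >> kRateBits));
        }
    }
};
```

If the real table behaves like this, two very different seeds end up close together after enough symbols, so a flat seed should mainly cost compression at the start of a file. That is only an expectation under the assumptions above; the experiment is still the way to find out.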

- Michael

Sachin Garg wrote:
> George Herring wrote:
> > I have studied the paper "PPM: one step to practicality", which is
> > really interesting, and I have gone through PPMII Source code, which is
> > very complex to understand, and Especially,
> > InitBinEsc[]={0x3CDD,0x1F3F,0x59BF,0x48F3,0x64A1,0x5ABC,0x6632,0x6051}
> > array seems to be the seed for generation of BinSumm table
> >
> > Can any one explain how Dmitry Shkarin estimated values for
> > InitBinEsc[].
> > I hope these table is responsible for better compression for textual
> > data.
>
> PPMII has been one of those state-of-the-art ppm implementations which
> I am sure many must have had a hard time understanding (Me being one of
> them).
>
> I havent really gone into details of this part of code so cant offer
> much help here.
>
> If someone who has unravelled the mystery of this code can post a code
> walkthrough or atleast a well commented version of the code, it will be
> great.
>
> Sachin Garg [India]
> http://www.sachingarg.com



Sachin Garg wrote:
>
> PPMII has been one of those state-of-the-art ppm implementations which
> I am sure many must have had a hard time understanding (Me being one of
> them).
>
> I havent really gone into details of this part of code so cant offer
> much help here.
>
> If someone who has unravelled the mystery of this code can post a code
> walkthrough or atleast a well commented version of the code, it will be
> great.
>

A few months ago I spent a while going over Dmitry's papers and
eventually got the Information Inheritance stuff working.

You can see my code that does PPM with the Information Inheritance trick
(but no SEE) here
http://dclib.sourceforge.net/entropy_encoder_model/entropy_encoder_model_kernel_5.h.html

I think the code is easier to understand than Dmitry's but it is still
complex :)


I have also tried the PPMII code, which is tuned specifically for
textual data. The BinSumm table has dimensions 25x64. Each column is
further subdivided depending on the sequence of symbols.

However, the InitBinEsc values remain a mystery to me. My guess is that
they were obtained through extensive experiments on textual data.
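
To make the table shape concrete, here is a rough sketch of one way eight seed values could fill a 25x64 table. The seeding rule (divide the seed by a row-dependent factor and reuse the eight seeds across each row) and the constant `kBinScale` are assumptions for illustration only; I have not verified them against Dmitry's source:

```cpp
#include <array>
#include <cstdint>

constexpr std::uint16_t kInitBinEsc[8] = {0x3CDD, 0x1F3F, 0x59BF, 0x48F3,
                                          0x64A1, 0x5ABC, 0x6632, 0x6051};
constexpr int kRows = 25;  // table height quoted in this thread
constexpr int kCols = 64;  // table width quoted in this thread
constexpr std::uint32_t kBinScale = 1u << 14;  // assumed probability scale

using BinTable = std::array<std::array<std::uint16_t, kCols>, kRows>;

// Hypothetical seeding rule: the escape seed shrinks as the row index
// grows, and the eight seeds repeat every eight columns.
inline BinTable seedBinTable() {
    BinTable table{};
    for (int row = 0; row < kRows; ++row) {
        for (int col = 0; col < kCols; ++col) {
            const std::uint32_t esc =
                kInitBinEsc[col % 8] / static_cast<std::uint32_t>(row + 2);
            table[row][col] = static_cast<std::uint16_t>(kBinScale - esc);
        }
    }
    return table;
}
```

Under this rule every entry stays below `kBinScale`, because the largest seed (0x6632) is divided by at least 2 before it is subtracted.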

-- Raja Nagulan



Davis wrote:
> Sachin Garg wrote:
> >
> > PPMII has been one of those state-of-the-art ppm implementations which
> > I am sure many must have had a hard time understanding (Me being one of
> > them).
> >
> > I havent really gone into details of this part of code so cant offer
> > much help here.
> >
> > If someone who has unravelled the mystery of this code can post a code
> > walkthrough or atleast a well commented version of the code, it will be
> > great.
> >
>
> A few months ago I spent a while going over Dmitry's papers and
> eventually got the Information Inheritance stuff working.
>
> You can see my code that does PPM with the Information Inheritance trick
> (but no SEE) here
> http://dclib.sourceforge.net/entropy_encoder_model/entropy_encoder_model_kernel_5.h.html

Heh, coincidentally, I had also implemented Information Inheritance in my
code, again without SEE :-)  (But that was a couple of years ago.)

By how much did you scale the inherited value? Dmitry's factor of 4
seems to depend on his SEE model.

> I think the code is easier to understand than Dmitry's but it is still
> complex :)

Another interesting part of Dmitry's code is its speed, and that's
probably where all the complexity comes from. I still sometimes wonder how
much that memory manager really contributes to the speed.

Sachin Garg [India]
http://www.sachingarg.com



Sachin Garg wrote:
> Davis wrote:
>
>>Sachin Garg wrote:
>>
>>>PPMII has been one of those state-of-the-art ppm implementations which
>>>I am sure many must have had a hard time understanding (Me being one of
>>>them).
>>>
>>>I havent really gone into details of this part of code so cant offer
>>>much help here.
>>>
>>>If someone who has unravelled the mystery of this code can post a code
>>>walkthrough or atleast a well commented version of the code, it will be
>>>great.
>>>
>>
>>A few months ago I spent a while going over Dmitry's papers and
>>eventually got the Information Inheritance stuff working.
>>
>>You can see my code that does PPM with the Information Inheritance trick
>>(but no SEE) here
>>http://dclib.sourceforge.net/entropy_encoder_model/entropy_encoder_model_kernel_5.h.html
>
>
> Heh, coincidently, I had also implemented Information Inheritance in my
> code, again without SEE :-)  (But that was a couple of years ago)
>
> By how much did you scaled the inherited value, Dmitry's factor of 4
> seems to be dependent on his SEE model.
>

In the scaling equation I believe I scaled by 5 where Dmitry used 4.
There were also a few other constants that I changed slightly from what
Dmitry used and for me it resulted in slightly better compression.  So
it seems that they do indeed depend on how the entire model works.
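
As a rough illustration of where such a constant enters, here is a sketch in which a symbol added to a new child context starts with a count derived from the parent context's statistics. This is not the formula from Dmitry's paper or from my code; `SymbolStats`, `inheritedCount` and the rounding rule are hypothetical and only show the role of the scale factor:

```cpp
#include <algorithm>
#include <cstdint>

// Statistics of one symbol in the parent context (hypothetical layout).
struct SymbolStats {
    std::uint32_t count;  // occurrences of the symbol in the parent
    std::uint32_t total;  // occurrences of all symbols in the parent
};

// Assumed rule: the new symbol starts with roughly scale * P_parent(symbol),
// rounded to nearest and never below 1.
inline std::uint32_t inheritedCount(SymbolStats parent, std::uint32_t scale) {
    if (parent.total == 0) {
        return 1;  // no parent information, fall back to a plain count of 1
    }
    const std::uint32_t scaled =
        (scale * parent.count + parent.total / 2) / parent.total;
    return std::max<std::uint32_t>(scaled, 1);
}
```

With a parent that has seen the symbol 30 times out of 40, a scale of 4 gives a starting count of 3 and a scale of 5 gives 4, so the constant directly controls how much the child trusts the parent.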

>
>>I think the code is easier to understand than Dmitry's but it is still
>>complex :)
>
>
> Another more interesting part of Dmitry's code is its speed and thatz
> probably why all the complexity comes in. I still sometimes wonder how
> much that memory manager really contributes to the speed.
>

Tell me about it.  I can't believe how fast and small it is :)

I haven't looked at his memory manager, but I would imagine that
making system calls to get/free memory all the time would slow it
down significantly.  In my code I allocate all the memory
at the beginning, and it seems to be a very worthwhile optimization.
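
A minimal sketch of the allocate-everything-up-front idea, assuming fixed-size blocks kept on a free list. `BlockPool` is a hypothetical name and this is neither Dmitry's memory manager nor the allocator in my library, just the general technique:

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Fixed-size block pool: one big allocation at construction, then
// allocate() and release() are O(1) pointer operations.
class BlockPool {
public:
    BlockPool(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(roundUp(blockSize)), storage_(blockSize_ * blockCount) {
        // Thread every block onto the free list.
        for (std::size_t i = 0; i < blockCount; ++i) {
            release(storage_.data() + i * blockSize_);
        }
    }
    BlockPool(const BlockPool&) = delete;
    BlockPool& operator=(const BlockPool&) = delete;

    // Returns a free block, or nullptr when the pool is exhausted.
    void* allocate() {
        if (freeList_ == nullptr) return nullptr;
        Node* node = freeList_;
        freeList_ = node->next;
        return node;
    }

    // Returns a block obtained from allocate() to the pool.
    void release(void* block) {
        freeList_ = new (block) Node{freeList_};
    }

private:
    struct Node { Node* next; };

    // Blocks must be able to hold a free-list node and stay aligned for it.
    static std::size_t roundUp(std::size_t n) {
        const std::size_t unit = sizeof(Node);
        return n < unit ? unit : (n + unit - 1) / unit * unit;
    }

    std::size_t blockSize_;
    std::vector<unsigned char> storage_;
    Node* freeList_ = nullptr;
};
```

For example, `BlockPool pool(24, 1000);` reserves room for 1000 context nodes of 24 bytes once, after which the model never touches the system allocator while coding.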


Davis King wrote:
> Sachin Garg wrote:
> > Davis wrote:
> >
> >>Sachin Garg wrote:
> >>
> >>>PPMII has been one of those state-of-the-art ppm implementations which
> >>>I am sure many must have had a hard time understanding (Me being one of
> >>>them).
> >>>
> >>>I havent really gone into details of this part of code so cant offer
> >>>much help here.
> >>>
> >>>If someone who has unravelled the mystery of this code can post a code
> >>>walkthrough or atleast a well commented version of the code, it will be
> >>>great.
> >>>
> >>
> >>A few months ago I spent a while going over Dmitry's papers and
> >>eventually got the Information Inheritance stuff working.
> >>
> >>You can see my code that does PPM with the Information Inheritance trick
> >>(but no SEE) here
> >>http://dclib.sourceforge.net/entropy_encoder_model/entropy_encoder_model_kernel_5.h.html
> >
> >
> > Heh, coincidently, I had also implemented Information Inheritance in my
> > code, again without SEE :-)  (But that was a couple of years ago)
> >
> > By how much did you scaled the inherited value, Dmitry's factor of 4
> > seems to be dependent on his SEE model.
> >
>
> In the scaling equation I believe I scaled by 5 where Dmitry used 4.
> There were also a few other constants that I changed slightly from what
> Dmitry used and for me it resulted in slightly better compression.  So
> it seems that they do indeed depend on how the entire model works.
>
> >
> >>I think the code is easier to understand than Dmitry's but it is still
> >>complex :)
> >
> >
> > Another more interesting part of Dmitry's code is its speed and thatz
> > probably why all the complexity comes in. I still sometimes wonder how
> > much that memory manager really contributes to the speed.
> >
>
>
> Tell me about it.  I can't believe how fast and small it is :)
>
> I haven't looked at his memory manager but I would imagine that
> making system calls to get/free memory all the time would slow it
> down a significant amount.

> In my code I allocate all the memory
> at the beginning and it seems to be a very worth while optimization.

Dmitry probably used some standard memory management technique, and
trying to understand the code without knowing about it would be wasted
effort. If someone is familiar with that type of memory manager, it
might help in interpreting it.

There is probably an online article somewhere out there describing that
memory manager. :-)

Sachin Garg [India]
http://www.sachingarg.com



Sachin Garg wrote:

>
> Dmitry had probably used some standard memory managment technique and
> trying to understand the code without knowing about it would be wasted
> effort. If someone is familiar with that type of memory manager, it
> might help in interpreting it.
>
> Probably somewhere out there therez an online article describing that
> memory manager. :-)
>
>

Maybe he is doing this: http://www.boost.org/libs/pool/doc/concepts.html

Or something similar.  I haven't looked at his code though so I'm just
guessing.
