So, I’m pretty sure I understand the basic algorithm for arithmetic coding — the details may be fussy, but the overall concept is clear. But I’m completely stumped by the details of the JT version. You see, the format specifies that there may be 1 or 2 probability context tables. The algorithm is fairly clear if there is just one probability context table. But nowhere in the specification can I find out what the frak the second probability context table means. And of course, it seems that basically every arithmetic codec example in the file I’m looking at has that second probability context table.
So, you say, why not check the JT spec’s example source code? And in fact, we find this there:
Int32 iSym; // Symbol
Int32 cCount; // Number of occurrences
Int32 cCumCount; // Cumulative number of occurrences
Int32 iNextCntx = 0; // Next context if this symbol seen
Each Context Entry object has an index to the Next Context in it! Great!
Unfortunately, the identifier iNextCntx appears exactly twice in the source code: once when it is defined, and once when its value is used. Nowhere is the value set. Nor does it correspond to any obvious field in the file’s data structures. Nor do I see any obvious way to derive it.
For instance, here’s the first pair of probability context tables I see in the file I’ve been looking at:
1st count 2
symbol -2 occurs 817 times, value 0
symbol 1 occurs 32 times, value 0
2nd count 2
symbol 1 occurs 244 times, value 0
symbol -2 occurs 31 times, value 0
(Here -2 is the special symbol meaning “insert out-of-band value here”.)
Okay, there is one clue I see here. The symbol count field has value 1124, which is 817+32+244+31. What does the symbol count field mean? The spec says:
When two Probability Context Tables are being used, Symbol Count specifies the number of Symbols to be decoded by the Arithmetic CODEC. There is a subtlety present in the method CodecDriver::addOutputSymbol() when it is passed an Escape symbol. Only if the Codec is using Probability Context Table 0 when it receives an Escape symbol does it emit a Value from the “Out-Of-Band” data array. Because of this subtlety, the number of Symbols decoded can be larger than the number of Values produced, thus the reason for writing this field distinct from Value Element Count.
And indeed, the value element count is 1093, which is 1124 - 31, so that fits. But it doesn’t really provide any additional insight to me. Unless….
Hmmm. What if it switches over to the 2nd context whenever it sees a symbol 1 in the 1st context (32 times), then stays in the 2nd context until it sees a symbol -2? That would make the numbers fit, and explain why you don’t output the out-of-band value if you hit a -2 in the 2nd context.
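To sanity-check that guess against the numbers, here is a minimal sketch of the hypothesized context switching. It is not taken from the JT spec or its sample code: the names (countValues, makeExampleStream, Tally, kEscape) are made up for illustration, and it walks an already-decoded symbol stream rather than doing any real arithmetic decoding.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical names throughout; this only models the guess above.
constexpr std::int32_t kEscape = -2;  // "insert out-of-band value here"

struct Tally {
    int symbolsSeen[2] = {0, 0};  // symbols decoded in context 0 and context 1
    int valuesEmitted = 0;        // values the decoder would output
};

// Walk a stream of already-decoded symbols under the guessed rule:
//   context 0: every symbol yields a value (an escape pulls one from the
//              out-of-band array); symbol 1 also switches to context 1.
//   context 1: symbol 1 yields a value; an escape yields nothing and
//              switches back to context 0.
Tally countValues(const std::vector<std::int32_t>& symbols) {
    Tally t;
    int ctx = 0;
    for (std::int32_t sym : symbols) {
        assert(ctx == 0 || ctx == 1);
        ++t.symbolsSeen[ctx];
        if (ctx == 0) {
            ++t.valuesEmitted;
            if (sym == 1) ctx = 1;
        } else if (sym == kEscape) {
            ctx = 0;  // escape in context 1: no value, just switch back
        } else {
            ++t.valuesEmitted;
        }
    }
    return t;
}

// One synthetic stream (of many possible) whose per-context counts match
// the two tables listed above. The ordering is invented.
std::vector<std::int32_t> makeExampleStream() {
    std::vector<std::int32_t> s(817, kEscape);  // context 0: 817 escapes
    s.push_back(1);                             // context 0: a 1, go to context 1
    s.insert(s.end(), 244, 1);                  // context 1: 244 ones
    for (int i = 0; i < 31; ++i) {
        s.push_back(kEscape);                   // context 1: escape, back to 0
        s.push_back(1);                         // context 0: a 1, go to context 1
    }
    return s;
}
```

Running countValues(makeExampleStream()) sees 849 symbols in context 0 and 275 in context 1, for 1124 symbols total, and emits 1093 values. Those match the symbol count and value element count fields, so the guess is at least consistent with the numbers.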
Update: Let this be a lesson to me: before I do anything with JT, I need to read up fully on the subject in both versions of the spec. In the 9.5 version of the spec, the field marked “reserved” in the 8.1 spec has been relabeled… “next context”. Problem solved!
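With a per-entry next-context field, the switching no longer has to be guessed or hardwired; it becomes a table lookup. The sketch below shows how I assume the field gets used, and is not code from the spec: after a symbol is decoded, the current context becomes that entry's iNextCntx. The struct mirrors the four fields quoted earlier. The helper names (makeEntry, nextContext, makeGuessedTables) are hypothetical, and so are the next-context values and cumulative counts filled in here: they are chosen to reproduce my earlier guess, not read from the file.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same four fields as the entry quoted from the sample source.
struct ContextEntry {
    std::int32_t iSym;           // Symbol
    std::int32_t cCount;         // Number of occurrences
    std::int32_t cCumCount;      // Cumulative number of occurrences
    std::int32_t iNextCntx = 0;  // Next context if this symbol seen
};
using ContextTable = std::vector<ContextEntry>;

// Hypothetical helper to fill in one entry.
ContextEntry makeEntry(std::int32_t sym, std::int32_t count,
                       std::int32_t cumCount, std::int32_t next) {
    ContextEntry e;
    e.iSym = sym;
    e.cCount = count;
    e.cCumCount = cumCount;
    e.iNextCntx = next;
    return e;
}

// Assumed rule: find the decoded symbol in the current table and return
// that entry's iNextCntx. Stays in the current context if not found.
std::int32_t nextContext(const std::vector<ContextTable>& tables,
                         std::int32_t current, std::int32_t sym) {
    assert(current >= 0 &&
           static_cast<std::size_t>(current) < tables.size());
    for (const ContextEntry& e : tables[static_cast<std::size_t>(current)]) {
        if (e.iSym == sym) return e.iNextCntx;
    }
    return current;
}

// The two tables from my file, with next-context values that would
// reproduce the earlier guess. Cumulative counts are taken as the running
// total before each entry, which is also an assumption.
std::vector<ContextTable> makeGuessedTables() {
    return {
        { makeEntry(-2, 817, 0, 0), makeEntry(1, 32, 817, 1) },
        { makeEntry(1, 244, 0, 1), makeEntry(-2, 31, 244, 0) },
    };
}
```

With these guessed values, a 1 seen in context 0 moves to context 1, and a -2 seen in context 1 moves back to context 0; everything else stays put.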
Though it does make you wonder why they send the number of bits needed for this field, when there are only two possible values for it…