JT Version Issues

Does anyone out there have a grasp on what’s going on with the JT file versions?

According to the files themselves, here are the versions of the JT files I have: 8.0 (lots), 8.2, 9.0, 9.1, 9.3, and 9.4. I get pretty decent results from most of the 8.0 files and the 9.0 files. Meanwhile, the two released versions of the JT file standard are labeled 8.1 and 9.5. In other words, I don’t have any JT files that are actually for the exact versions covered by the standard documents.

Meanwhile, ISO/DIS 14306 is an ISO version of the JT standard, apparently based on the 8.1 standard. But the document is 200+ pages longer than the JT 8.1 spec. I think it might include the XT spec, but that’s only approximately 125 pages in the latest version I have. (Both JT and XT strike me as a bit on the vague side compared to STEP, so maybe the extra pages mean they’ve actually defined how things work instead of just vaguely hinting at it.)

Does anyone out there have a solid grasp on the current state of JT? Are most people using 9.5? Or is the ISO standard going to freeze things back closer to 8.1?

(PS I’ve solved the Arithmetic Codec issue I was having, but I don’t have the brainpower tonight to sit down and describe the solution. Of course, now I’ve got a serious Huffman coding issue.)

Utterly Flummoxed By Arithmetic Coding

So, I’m pretty sure I understand the basic algorithm for arithmetic coding — the details may be fussy, but the overall concept is clear. But I’m completely stumped by the details of the JT version. You see, the format specifies that there may be 1 or 2 probability context tables. The algorithm is fairly clear if there is just one probability context table. But nowhere in the specification can I find out what the frak the second probability context table means. And of course, it seems that basically every arithmetic codec example in the file I’m looking at has that second probability context table.
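
For concreteness, here’s the shape of the basic algorithm I have in mind: a textbook integer arithmetic decoder in the Witten/Neal/Cleary style. This is just a sketch of the general technique, not necessarily bit-identical to JT’s variant, and all the names are mine:

#include <cstdint>
#include <vector>

// Textbook integer arithmetic decoder (Witten/Neal/Cleary style).
// A sketch of the general technique only -- JT's variant may differ in
// register width, renormalization details, and count handling.
struct ArithDecoder
{
    std::vector<int> bits;                 // compressed bitstream (stand-in)
    size_t pos = 0;
    uint32_t low = 0, high = 0xFFFF, code = 0;

    uint32_t nextBit() { return pos < bits.size() ? bits[pos++] : 0; }

    void start()                           // prime the 16-bit code register
    {
        for (int i = 0; i < 16; ++i)
            code = 2 * code + nextBit();
    }

    // cumCount[s] = occurrences of all symbols before s; cumCount.back()
    // is the total, which must stay well below 2^14 for a 16-bit coder.
    int decodeSymbol(const std::vector<uint32_t>& cumCount)
    {
        uint32_t total = cumCount.back();
        uint32_t range = high - low + 1;
        uint32_t target = ((code - low + 1) * total - 1) / range;

        // Find the symbol whose cumulative-count slice contains target.
        int sym = 0;
        while (cumCount[sym + 1] <= target)
            ++sym;

        // Narrow [low, high] to that symbol's slice of the interval.
        high = low + (range * cumCount[sym + 1]) / total - 1;
        low  = low + (range * cumCount[sym]) / total;

        // Renormalize: shift settled bits out, pull new ones in.
        for (;;) {
            if (high < 0x8000)      { /* top bit settled at 0 */ }
            else if (low >= 0x8000) { low -= 0x8000; high -= 0x8000; code -= 0x8000; }
            else if (low >= 0x4000 && high < 0xC000)
                                    { low -= 0x4000; high -= 0x4000; code -= 0x4000; }
            else break;
            low = 2 * low; high = 2 * high + 1;
            code = 2 * code + nextBit();
        }
        return sym;
    }
};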

So, you say, why not check the JT spec’s example source code? And in fact, we find this there:

class CntxEntry
{
public:
    Int32 iSym;          // Symbol
    Int32 cCount;        // Number of occurrences
    Int32 cCumCount;     // Cumulative number of occurrences
    Int32 iNextCntx = 0; // Next context if this symbol seen
};

Each Context Entry object has an index to the Next Context in it! Great!

Unfortunately, that iNextCntx symbol appears exactly twice in the source code — once where it is declared, and once where its value is read. Nowhere is it ever assigned. Nor does it correspond to any obvious field in the file’s data structures. Nor do I see any obvious way to derive it.

For instance, here’s the first pair of probability context tables I see in the file I’ve been looking at:

1st count 2
symbol -2 occurs 817 times, value 0
symbol 1 occurs 32 times, value 0
2nd count 2
symbol 1 occurs 244 times, value 0
symbol -2 occurs 31 times, value 0

(Here -2 is the special symbol meaning “insert out-of-band value here”.)

Okay, there is one clue I see here. The symbol count field has value 1124, which is 817 + 32 + 244 + 31. What does the symbol count field mean? Here’s what the spec has to say:

When two Probability Context Tables are being used, Symbol Count specifies the number of Symbols to be decoded by the Arithmetic CODEC. There is a subtlety present in the method CodecDriver::addOutputSymbol() when it is passed an Escape symbol. Only if the Codec is using Probability Context Table 0 when it receives an Escape symbol does it emit a Value from the “Out-Of-Band” data array. Because of this subtlety, the number of Symbols decoded can be larger than the number of Values produced, thus the reason for writing this field distinct from Value Element Count.

And indeed, the value element count is 1093, which is 1124 - 31, so that fits. But it doesn’t really provide any additional insight to me. Unless….

Hmmm. What if it switches over to the 2nd context whenever it sees a symbol 1 in the 1st context (32 times), then stays in the 2nd context until it sees a symbol -2? That would make the numbers fit, and explain why you don’t output the out-of-band value if you hit a -2 in the 2nd context.
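
In code, the guess would look something like this. To be clear, this is pure hypothesis: the switch-on-1 / return-on-minus-2 rules are my invention, and decodeSymbol() is a stand-in for the single-table decode step, not a spec name:

#include <cstdint>
#include <vector>

// Stand-in: decode one symbol value from the given probability context
// table (0 or 1). How JT really does this is exactly the question.
int32_t decodeSymbol(int context);

// Pure hypothesis: symbol 1 in context 0 switches to context 1; the -2
// escape in context 1 switches back without emitting anything; only the
// escape in context 0 consumes an out-of-band value.
std::vector<int32_t> decodeTwoContexts(const std::vector<int32_t>& oobValues,
                                       int32_t symbolCount)
{
    std::vector<int32_t> output;
    int context = 0;
    size_t oobPos = 0;

    for (int32_t i = 0; i < symbolCount; ++i) {
        int32_t sym = decodeSymbol(context);
        if (sym == -2) {                               // the escape symbol
            if (context == 0)
                output.push_back(oobValues[oobPos++]); // emit out-of-band value
            else
                context = 0;                           // back to context 0, emit nothing
        } else {
            output.push_back(sym);
            if (context == 0 && sym == 1)
                context = 1;                           // switch to the 2nd context
        }
    }
    return output;
}

Run against the tables above, that would consume exactly 1124 symbols (817 + 32 in context 0, 244 + 31 in context 1) and produce 1093 values, matching both counts.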

Update: Let this be a lesson to me: before I do anything with JT, I need to read up fully on the subject in both versions of the spec. In the 9.5 version of the spec, the field marked “reserved” in the 8.1 spec has been relabeled… “next context”. Problem solved!

Though it does make you wonder why they send the number of bits needed for this field, when there are only two possible values for it…

… and problems

Where am I now?

Running what I’ve got now on the first ten TriStrip objects in the file I’m looking at gets me three simple triangle-based objects I can handle. It also gets me a bunch of — guess what? — objects compressed with Arithmetic coding rather than Huffman coding! Which means I get to do the hard bits all over again.

What’s worse, my current code can’t even parse the Arithmetic coded objects correctly. In theory, their layout in the file should be exactly the same as the Huffman coded objects. In practice, something is going wrong between reading the probability tables and the out-of-band data.

What’s even worse: I went looking in the JT 9.5 spec to see if I could find the missing fields there. And then I said, wait, is this an Int32 Compressed Data Packet Mk. 2 I’m looking at? So I tried to see which one the TriStrip object uses.

Whoops. JT 9.5 seems to no longer have a TriStrip object. Instead it has a completely different format for storing triangle information.

Well, I guess we’ll burn that bridge when we come to it…

Success!

[Image: A ladder!]
The picture is the first TriStrip data I’ve managed to successfully read and process. I converted it to our internal mesh data structure, wrote it out as an STL file, read it into 3D-Tool, and generated the JPEG from there.

It did take me another day of maneuvering to get to this point, though it was all uninteresting bugs, like using newly created empty vectors for the vertex data instead of the carefully constructed data I’d just imported. For caution’s sake, my code is currently set up to process only the first TriStrip object it finds; last time I checked, there were still parsing errors when I tried to read in the entire file.

As for the meaning of the Primitive List Indices and Vertex Data Indices arrays: I’m currently interpreting the Primitive List Indices as a series of pointers into the Vertex Data Indices, which themselves are pointers into the array of vertex data. So if the Primitive List Indices array starts 0, 6, 12, 18, I interpret that as meaning that Vertex Data Indices 0 to 5 are one TriStrip, 6 to 11 are the next TriStrip, etc. This seems to work well so far!
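
In code, my interpretation looks something like this (the names are mine; the assumption that Primitive List Indices carries one final “end” offset past the last strip is also mine):

#include <array>
#include <cstdint>
#include <vector>

// Sketch of my reading of the two index arrays. Assumes primListIndices
// holds stripCount + 1 offsets, the last pointing one past the end of
// vertexDataIndices -- that detail is my assumption, not the spec's words.
std::vector<std::array<int32_t, 3>> expandTriStrips(
    const std::vector<int32_t>& primListIndices,    // offsets into vertexDataIndices
    const std::vector<int32_t>& vertexDataIndices)  // indices into the vertex array
{
    std::vector<std::array<int32_t, 3>> tris;
    for (size_t s = 0; s + 1 < primListIndices.size(); ++s) {
        int32_t begin = primListIndices[s];
        int32_t end   = primListIndices[s + 1];
        // A strip of n indices yields n - 2 triangles.
        for (int32_t i = begin; i + 2 < end; ++i) {
            int32_t a = vertexDataIndices[i];
            int32_t b = vertexDataIndices[i + 1];
            int32_t c = vertexDataIndices[i + 2];
            // The usual tristrip convention flips winding on odd triangles;
            // whether JT follows it is another assumption of mine.
            if ((i - begin) % 2 == 0) tris.push_back({a, b, c});
            else                      tris.push_back({a, c, b});
        }
    }
    return tris;
}

With the 0, 6, 12, 18 example, the first strip’s six vertex data indices yield four triangles, the next six yield another four, and so on.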

Everything I thought I knew was wrong…

Poring over the JT 9.5 spec, trying to figure out what I might have missed that could explain the weird numbers I get, I discovered that the Int32 Probability Contexts min_value field is prescriptive rather than descriptive. That is to say, its value has been subtracted from every associated value field, so to get the correct value, you need to add it back in.

It’s described as a U32 field, but looking at the values I’m getting makes it clear it should actually be signed. Which means the associated value field should be signed, too. This is going to require some code wrangling, but should get me very different results than I’ve been getting so far. And of course, since the results I’ve been getting were murky and incomprehensible, that’s a Very Good Thing.
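
The fix, as I understand it, boils down to a couple of lines (treating the raw field as a signed bit pattern is my reading, not the spec’s wording):

#include <cstdint>
#include <vector>

// Sketch of the min_value fix. The spec calls the field a U32, but the
// bit pattern is really signed, and it was subtracted from every value
// at encode time -- so reinterpret it and add it back in.
void applyMinValue(std::vector<int32_t>& values, uint32_t minValueBits)
{
    int32_t minValue = static_cast<int32_t>(minValueBits);
    for (int32_t& v : values)
        v += minValue;
}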

Update: BINGO! With improved min value / negative number handling (and Null for the prediction type), the Vertex Data Indices values range from 0 to 1307 — exactly the range I wanted to see! What’s more, the values make MUCH more sense. This I can work with.

What happened to Huffman?

Just decided it would make sense to check the JT 9.5 standard to see if it explained more or was somehow different. And guess what? Huffman coding has been written completely out of the spec. The CODEC types are now Bitlength, Arithmetic, and a new one, “Chopper”. Yet of course my sample JT files still have Huffman coding, so I have to implement it anyway. Argh.

Out of Band Experience

When I last left you, I was trying to figure out how to get the Vertex Data Indices to be reasonable values. In an ideal world, I’d be explaining how to do it now. In this world, though, all I’ve got is some serious confusion.

I think the problem may revolve around out-of-band values used with the Huffman codec. Basically, the idea is to use the sophisticated Huffman routine on just the most common values from the array you are encoding, and then have a special value encoded which says, “Look at the out-of-band array for the next value.” The out-of-band values are stored using another instance of the Int32CDP.
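
So the decode loop ends up looking roughly like this. The -2 escape convention matches what I’ve seen in the files; the helper names (BitReader, HuffmanTree, decodeHuffmanSymbol) are my stand-ins, not spec names:

#include <cstdint>
#include <vector>

struct BitReader;    // stand-ins for my actual reader and Huffman types
struct HuffmanTree;
int32_t decodeHuffmanSymbol(BitReader& bits, const HuffmanTree& tree);

// Sketch of the out-of-band mechanism: common values come straight from
// the Huffman tree, and the -2 escape pulls the next value from the
// separately decoded out-of-band array (itself another Int32CDP).
std::vector<int32_t> decodeWithEscapes(
    BitReader& bits,
    const HuffmanTree& tree,
    const std::vector<int32_t>& oobValues,
    size_t valueCount)
{
    std::vector<int32_t> out;
    size_t oobPos = 0;
    while (out.size() < valueCount) {
        int32_t sym = decodeHuffmanSymbol(bits, tree);
        if (sym == -2)
            out.push_back(oobValues[oobPos++]);  // escape: value lives out of band
        else
            out.push_back(sym);
    }
    return out;
}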

And that second Int32CDP is where the problem lies. The JT specification neglects to say what the prediction mode is for the out-of-band values. And none of the possible modes seem to make sense, based on the numbers I’m seeing.

First, logic seems to say that none of the values in the out-of-band array should be values also found in the Huffman tree — that would just be wasteful. Yet there they are, in most of the scenarios I’ve looked at.

So, let’s just try a few and see what the actual values are. If I use no prediction model at all, just taking the values in the out-of-band array at face value, I get indices ranging in value from 2 to 5048. Those values have the benefit of being all positive and not too ludicrously big. But I don’t see how it makes any sense to start at 2 rather than 0. And 5048 is much bigger than the vertex array, which has 1308 vertices. Even dividing it by 3 still gives you 1682 or so. So this doesn’t seem like a great answer, but at least it’s not too insane.

Lag1 gives a range from -40439 to 1551. 1551 almost sounds sane, but -40439 is not an acceptable answer. Lag2 is -21744 to 1856… the latter seems tantalizingly close to 1854, which is the number of vertex data indices, but the negatives are a big negative. Stride1 starts at -3105930, which is right out, Stride2 at -836303. StripIndex is a more reasonable -9210 to 4478. Ramp goes from 2 to 31501, which I guess makes it the second most plausible.

Huh. I don’t really know what to make of all this information.

Update: I completely missed two prediction types! Xor1 gives us -2908 to 14187, Xor2 -671 to 12282. So neither one of those is it either.
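
For the record, here’s how I currently read the predictor definitions I’ve been trying. The Lag, Xor, Ramp, and Null cases I’m fairly confident about from the spec’s pseudocode; the Stride and StripIndex formulas (and the pass-through of the first four values) are my best reading and could well be off:

#include <cstdint>
#include <vector>

enum class Predictor { None, Lag1, Lag2, Stride1, Stride2, StripIndex, Ramp, Xor1, Xor2 };

// Residual unpacking as I read it: the first four values pass through
// untouched (Stride2 and StripIndex look back four places), and each
// later value is residual + prediction, except the Xor modes.
std::vector<int32_t> unpackResiduals(const std::vector<int32_t>& res, Predictor p)
{
    std::vector<int32_t> v(res.size());
    for (size_t i = 0; i < res.size(); ++i) {
        if (i < 4 || p == Predictor::None) {
            v[i] = res[i];
            continue;
        }
        if (p == Predictor::Xor1) { v[i] = res[i] ^ v[i - 1]; continue; }
        if (p == Predictor::Xor2) { v[i] = res[i] ^ v[i - 2]; continue; }

        int32_t predicted = 0;
        switch (p) {
            case Predictor::Lag1:    predicted = v[i - 1]; break;
            case Predictor::Lag2:    predicted = v[i - 2]; break;
            case Predictor::Stride1: predicted = v[i - 1] + (v[i - 1] - v[i - 2]); break;
            case Predictor::Stride2: predicted = v[i - 2] + (v[i - 2] - v[i - 4]); break;
            case Predictor::StripIndex: {
                int32_t d = v[i - 2] - v[i - 4];     // my best reading; may be off
                predicted = (d > -8 && d < 8) ? v[i - 2] + d : v[i - 2] + 2;
                break;
            }
            case Predictor::Ramp:    predicted = static_cast<int32_t>(i); break;
            default: break;
        }
        v[i] = res[i] + predicted;
    }
    return v;
}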
