XADMaster
Improve DiskDoubler support
Originally reported on Google Code with ID 149
The version of Unarchiver I just downloaded does not seem to decompress
files compressed by DiskDoubler 3.7.7. It returns a message saying "Unknown
data format". If one continues, a blank decompressed file is produced.
I am running Mac OS X 10.5.6. I see that the Preferences include a check box
for DiskDoubler Archive (extensions dd, DD).
Is there something I am doing wrong, or does Unarchiver not decompress
DiskDoubler 3.7.7 files? If so, any suggestions on how to decompress these
files?
MANY thanks,
S.
paracelsus:
Some of those seem to not be compressed at all, but a few of them are apparently in
some really old DiskDoubler format which I have not seen before. Those should come
in useful! Thanks!
muellerjh:
Here are some older Disk Doubler files
* *Attachment: [Age.chap.revised](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-58/Age.chap.revised)*
* *Attachment: [Age_chap_revised](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-58/Age_chap_revised)*
* *Attachment: [Faw.self&laterality(APS.89)](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-58/Faw.self&laterality(APS.89))*
* *Attachment: [Kausler.retirement](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-58/Kausler.retirement)*
* *Attachment: [robertson-ldrc-sept](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-58/robertson-ldrc-sept)*
* *Attachment: [Sam.Brown](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-58/Sam.Brown)*
* *Attachment: [SplitSelf-Brain&Cog.draft](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-58/SplitSelf-Brain&Cog.draft)*
* *Attachment: [winchester-computer-may98](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-58/winchester-computer-may98)*
d235j.1:
Method 2 seems to be very effective on Mac applications and the like, but very ineffective
on text. Probably a custom algorithm designed for MC68000-type code, and similar data?
As for Method 5, I'm hunting for some versions of DiskDoubler that might support it.
May take some time though.
paracelsus:
All right, I have some working code for method 2. Method 5 might also work, but I obviously
haven't been able to test it. All the test cases for method 2 unpack correctly, though.
(I really don't understand method 2 at all, even though the code is extremely simple.
It doesn't look like anything I've ever seen before. It's some kind of adaptive Huffman-like
encoding with first-order contexts, working on individual bytes with no LZ encoding
at all? Weird.)
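For illustration only, here is a minimal sketch in C of the general shape being described:
an adaptive, byte-oriented model keyed on the previous byte, with no LZ step. This is not
DiskDoubler's actual method 2; the structure and function names here are made up, and a real
coder would feed the ranks it produces into an adaptive Huffman coder or similar instead of
storing them directly.

```c
#include <stdint.h>
#include <string.h>
#include <stddef.h>

/* Illustrative sketch only, not DiskDoubler's method 2: a first-order-context
   adaptive byte model. The previous byte selects one of 256 tables, each table
   keeps bytes ordered most-recently-seen first, and the output is the rank of
   each byte within its context. */
typedef struct {
	uint8_t order[256][256];
} Order1Model;

static void InitOrder1Model(Order1Model *m)
{
	for(int context=0;context<256;context++)
	for(int i=0;i<256;i++) m->order[context][i]=(uint8_t)i;
}

/* Replace each input byte by its rank within the previous byte's context, then
   move it to the front of that context. Common (previous, current) byte pairs
   become small rank values, which an entropy coder could then compress well. */
static void EncodeOrder1Ranks(Order1Model *m,const uint8_t *in,uint8_t *ranks,size_t length)
{
	uint8_t context=0;
	for(size_t i=0;i<length;i++)
	{
		uint8_t *table=m->order[context];
		int rank=0;
		while(table[rank]!=in[i]) rank++;
		memmove(&table[1],&table[0],rank); /* shift earlier entries down one slot */
		table[0]=in[i];
		ranks[i]=(uint8_t)rank;
		context=in[i];
	}
}
```

A decoder would mirror this exactly, looking ranks up in the same tables and applying the
same move-to-front update, which is what makes the model adaptive on both sides.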
paracelsus:
Yeah, I was working on the Linux version earlier. It's added now; I forget whether
I've pushed it yet, though.
d235j.1:
Seems like you forgot to add XADXORSumHandle.* to the Xcode project.
d235j.1:
I can't seem to generate any more types of files. Probably need to find AutoDoubler
1.x and DiskDoubler 2.x, as well as 3.1 to 3.7.2.
This decompression code works though. Thanks!
paracelsus:
Yes, they are supported in The Unarchiver too. They basically just wrap several regular
DiskDoubler files into a single file, with some extra headers for filenames and such.
If the app that creates them has any further options for compression methods, it could
be interesting to check those out.
d235j.1:
It appears that DiskDoubler supported multifile archives (one archive containing multiple
files). I will have to test these.
d235j.1:
AutoDoubler may have used those weird formats to compress resources within extensions
and control panels. I shall do further investigation.
paracelsus:
Ideally I'd like to support any format that has been in actual use, so filing a bug
is probably a good idea. However, the main problem is that the entire architecture
of the app isn't designed to handle files that actually use multiple forks for their
compression data. I'd have to figure out some way to deal with that, which I don't
think would be a very high priority. But it's still good to have the files and the
idea here for future reference.
d235j.1:
I'll file a ticket if you think it's worthwhile.
d235j.1:
I will produce some AutoDoubler samples later today. I don't think I have AD 1.0 but
I do have several older versions.
It would be nice if the format of Disk Copy 6 image compression were reverse engineered.
This is quite annoying since it uses a 'bcem' resource to store what appear to be the
compression dictionaries. A simple NDIF Compressed to NDIF converter would be quite useful.
ShrinkWrap is another format that uses these 'bcem' resources, but with its own algorithms
(which Disk Utility cannot handle). Just a thought.
paracelsus:
Also, from the other samples, I've so far gathered three things: There's a bug in recognizing
really old DiskDoubler files, which I fixed. There's also a bug in unpacking really
big Compress files, which I still need to fix. And there are some method 2 files in
there, which will be useful if I manage to reverse engineer the format (I started on
that some time back, but didn't bother going very far since I lacked examples).
Still missing methods 3, 4 and 5 (unless some of those are in here and I missed them
due to crashes). 5 should be the same as 2 with a different parameter, but 3 and 4
are entirely different. I think they might be very bad but fast algorithms, so maybe
only old AutoDoubler uses them, or something? Not sure exactly how one would get a
hold of those.
paracelsus:
OK, I think I have code that handles LZS now. The checksumming is pretty limited for
this format (just an XOR of all bytes), but all the files I have pass it now.
Code for that is checked in, you'd have to build it yourself if you want to test, though.
Going to look at the other samples next.
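For reference, the XOR-of-all-bytes check mentioned above is about as simple as checksums
get. A standalone sketch (the function name here is illustrative; the actual implementation
lives in the XADXORSumHandle files mentioned above):

```c
#include <stdint.h>
#include <stddef.h>

/* XOR every byte together; a file passes the check if the computed value
   matches the one stored in the archive. */
static uint8_t XORSumOfBytes(const uint8_t *bytes,size_t length)
{
	uint8_t sum=0;
	for(size_t i=0;i<length;i++) sum^=bytes[i];
	return sum;
}
```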
d235j.1:
Attached are testcases for DD 3.7.2. I also have 3.7.7 but I'm quite sure it will generate
the same format data.
The Sigma card output is from 3.7.7.
DD 4.0 formats (DD 1, DD 2, and DD 3) are well supported. AD 1 and 2 seem to be, though
I haven't been able to test either.
* *Attachment: [dd-3.7.2-a.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-44/dd-3.7.2-a.zip)* * *Attachment: [dd-3.7.2-b.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-44/dd-3.7.2-b.zip)*
d235j.1:
Attached are testcases for DD 3.1. These are probably of the same type as DD 3.0.1.
* *Attachment: [dd-3.1-a.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-43/dd-3.1-a.zip)* * *Attachment: [dd-3.1-b.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-43/dd-3.1-b.zip)*
d235j.1:
Attached are testcases for DD 3.0.1. This one allowed selection of method A or B.
* *Attachment: [dd-3.0.1-A GOOD.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-42/dd-3.0.1-A GOOD.zip)* * *Attachment: [dd-3.0.1-B GOOD.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-42/dd-3.0.1-B GOOD.zip)*
d235j.1:
Attached are testcases for DD 1.0. It was impossible to use the testcases you provided
for 1.0 Method 2, because Disk Doubler 1.0 tries both methods and uses whichever produces
the smaller file.
* *Attachment: [dd-1.0-A GOOD.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-41/dd-1.0-A GOOD.zip)* * *Attachment: [dd-1.0-B GOOD.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-41/dd-1.0-B GOOD.zip)*
d235j.1:
Here are the test cases.
* *Attachment: [testcases_dd.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-40/testcases_dd.zip)*
paracelsus:
Well, I finally made some progress: It is LZS, but with some kind of extra header, and
with all bytes inverted before and after compression for no good reason. I think it
may also be separating the data into blocks.
I'll keep hacking at it. The test cases will still be useful to confirm the code really
works, once I get that far.
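Assuming the description above is accurate (the payload is plain LZS with every byte
complemented before and after compression), the unpacking side would just wrap a normal
LZS decoder between two inversion passes. A sketch of the inversion step follows; DecodeLZS
in the usage comment is a placeholder for a plain LZS decoder, such as the one sketched
further down where LZS first comes up.

```c
#include <stdint.h>
#include <stddef.h>

/* Complement every byte in place. Run this over the raw compressed data before
   feeding it to a plain LZS decoder, and again over the decoder's output. */
static void InvertBytes(uint8_t *bytes,size_t length)
{
	for(size_t i=0;i<length;i++) bytes[i]=(uint8_t)~bytes[i];
}

/* Hypothetical usage, with DecodeLZS standing in for a normal LZS decoder:
     InvertBytes(compressed,compressedsize);
     long n=DecodeLZS(compressed,compressedsize,output,outputcapacity);
     if(n>=0) InvertBytes(output,(size_t)n);
*/
```

The extra header mentioned above would presumably be parsed and skipped before any of this,
but its layout isn't covered here.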
d235j.1:
A document I found claims that what you're calling DD B was originally DD C and DD B
had been dropped. I'll see if I can find the document. Ideally, I'd need to find an
older version of DD.
d235j.1:
I'm looking for old versions of DD.
The chip on the card is a Stac 9703PC4.
I'll run the testcases as soon as possible. Size isn't an issue, as long as it fits
on a Zip disk :)
paracelsus:
Oh, that is interesting. I put together eight files of various complexity to run through
it. The last ones are a little big; if that turns out to be a problem, then ignore them.
If you have an old version of DD running, could you also see if you can get it to produce
files with other algorithms? Currently I am lacking test cases for methods 2, 3, 4
and 5. No idea what DD might call those, but 2 and 5 are variations on the same one,
and 3 and 4 might be an RLE algorithm used by PackIt, and some kind of Huffman algorithm.
Also looking forward to any test cases for anything else that isn't working yet. Can't
guarantee I'll get it all working, but without files I can't even try.
* *Attachment: [testcases.zip](https://storage.googleapis.com/google-code-attachments/theunarchiver/issue-149/comment-36/testcases.zip)*
d235j.1:
I have a card, and it's in a working Mac. Compression with the card is over ten times
faster than with software.
If you provide me with an archive of test pattern files, I can compress each one.
Also I have some .SIT files which The Unarchiver chokes on, as well as some self-extracting
archives of various types. I'll eventually open tickets for those.
paracelsus:
I suspect it is LZS. The Sigma cards supposedly used the Stac 9703 chip, and according
to a paper I found, the Stac 9703 should implement LZS.
However, the data in the file is not valid LZS. DiskDoubler is kind of notorious for
obfuscating its data in different ways, so it's probably just that. Haven't figured
out how it's working yet, though. The ancient macutils package has an implementation
for this data format, but it is marked as "untested" and does not seem to actually
work either.
Do you have any more files in this format? And do you have the unpacked versions of
any of them? Both would be very useful.
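For reference, plain LZS itself is simple and publicly documented (the Wikipedia article
linked further down describes the bit-level format). Here is a sketch of a decoder following
that public description, assuming MSB-first bit packing; it makes no attempt to deal with
whatever extra header or obfuscation DiskDoubler adds on top.

```c
#include <stdint.h>
#include <stddef.h>

/* Minimal MSB-first bit reader. */
typedef struct { const uint8_t *data; size_t size,pos; int bit; } BitReader;

static int GetBit(BitReader *br)
{
	if(br->pos>=br->size) return -1;
	int bit=(br->data[br->pos]>>(7-br->bit))&1;
	if(++br->bit==8) { br->bit=0; br->pos++; }
	return bit;
}

static int GetBits(BitReader *br,int n)
{
	int value=0;
	while(n--)
	{
		int bit=GetBit(br);
		if(bit<0) return -1;
		value=(value<<1)|bit;
	}
	return value;
}

/* Decode one plain LZS stream: literals are 0 plus 8 bits, matches are 1 plus
   a 7-bit offset (prefix 1) or 11-bit offset (prefix 0), then a length code
   (00=2, 01=3, 10=4, 1100=5, 1101=6, 1110=7, 1111 plus nibbles for 8 and up).
   A 7-bit offset of zero is the end marker. Returns bytes written, or -1. */
static long DecodeLZS(const uint8_t *in,size_t insize,uint8_t *out,size_t outsize)
{
	BitReader br={ in,insize,0,0 };
	size_t outpos=0;

	for(;;)
	{
		int flag=GetBit(&br);
		if(flag<0) return -1;

		if(flag==0) /* literal byte */
		{
			int byte=GetBits(&br,8);
			if(byte<0||outpos>=outsize) return -1;
			out[outpos++]=(uint8_t)byte;
			continue;
		}

		int offset;
		if(GetBit(&br)==1)
		{
			offset=GetBits(&br,7);
			if(offset==0) break; /* end marker */
		}
		else offset=GetBits(&br,11);
		if(offset<=0||(size_t)offset>outpos) return -1;

		int length,code=GetBits(&br,2);
		if(code<0) return -1;
		if(code<3) length=code+2;
		else
		{
			code=GetBits(&br,2);
			if(code<0) return -1;
			if(code<3) length=code+5;
			else
			{
				length=8;
				int nibble;
				while((nibble=GetBits(&br,4))==15) length+=15;
				if(nibble<0) return -1;
				length+=nibble;
			}
		}

		while(length--) /* copy from history; overlapping copies are fine byte-by-byte */
		{
			if(outpos>=outsize) return -1;
			out[outpos]=out[outpos-offset];
			outpos++;
		}
	}
	return (long)outpos;
}
```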
d235j.1:
Perhaps it's just LZS: http://en.wikipedia.org/wiki/Lempel–Ziv–Stac
d235j.1:
It's hard to say. Since this was by Stac, it may have been used in Stacker and possibly
Microsoft DoubleSpace (which they ripped off from Stac).
paracelsus:
Seems it was used elsewhere too, and is simple and documented. I'll have to have a look;
it might be easy to support.
paracelsus:
Interesting. Is the algorithm documented anywhere?