XADMaster
Add support for SHK (Apple IIgs archives)
Originally reported on Google Code with ID 119
Can you add the SHK archive for Apple II / IIgs files?
thanks
Not sure what happened to this, but if anyone is working on it, there's info about NuLib
(a library by Andy McFadden) here, which has been actively developed in recent years:
http://nulib.com/
https://github.com/fadden/nulib2
paracelsus:
All right, the program flow basically goes like this (don't rely on me getting all the
method names correct here):
* Client program requests an archive parser for a file.
-> XADArchiveParser reads the first part of the file, and then goes through the list
of available parsers, running their detection method, until it finds one that matches.
* Client program sets a delegate for the parser, which will receive all the information
from the parser.
* Client program calls [parser parse] to start parsing the archive.
-> The subclass starts scanning through the archive.
-> The subclass builds dictionaries for each entry with as many of the standard keys
it can figure out, and whatever else it needs to keep around for later (like compression
methods and settings, or anything else it wants to make optionally available).
-> The subclass calls [self addEntryWithDictionary:] for each entry it has created.
This causes some further processing of the dictionary to automatically fill in certain
details, and then the dictionary is passed to the client program's delegate class.
* The client program's delegate class either stores the dictionary for later, or it
calls [parser handleForDictionary:] to receive a handle to decompress the file.
-> The subclass may therefore still be calling [self addEntryWithDictionary:] when
its [self handleForDictionary:] is called, or it may happen after parsing has ended.
-> The subclass instantiates a suitable decompression handle for the entry it is given
in handleForDictionary:
-> The subclass usually uses a convenience method to create a sub-handle that reads
just from the part of the file that contains the data stream, like handleAtDataOffsetForDictionary
which uses the XADDataOffset and XADDataLength keys to identify the area to use.
* The client program reads data from the handle.
-> When addEntryWithDictionary: finally returns, the file pointer has probably moved,
because of the reading. Either be aware of this, or use the retainFilePosition: argument
to have it automatically restored.
* Once the archive is entirely parsed, the subclass exits its parse method. The client
might call handleForDictionary: after this point, too.
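The flow above can be modelled in a few lines of plain C. This is only a sketch of the idea, not XADMaster's Objective-C API: the toy archive layout, the `Entry` struct and the names `parse_archive`, `collect_entry` and `read_entry` are all invented for illustration. The parser walks the archive, reports each entry to a callback (the "delegate"), and the client reads the data later using the stored offset and length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an entry dictionary: a name plus the offset and
   length of the entry's data (the role played by XADDataOffset/XADDataLength). */
typedef struct {
    const char *name;
    size_t data_offset;
    size_t data_length;
} Entry;

/* Hypothetical delegate: called once per entry while parsing is still running. */
typedef void (*EntryCallback)(const Entry *entry, void *context);

/* Toy layout (an assumption, not a real format): repeated records of
   [1-byte name length][name][1-byte data length][data].
   Names are kept in static storage, so this sketch is not reentrant. */
size_t parse_archive(const unsigned char *archive, size_t size,
                     EntryCallback delegate, void *context)
{
    static char names[16][32];
    size_t pos = 0, count = 0;

    while (pos < size && count < 16) {
        size_t name_len = archive[pos++];
        if (name_len >= sizeof names[0] || pos + name_len + 1 > size) break;
        memcpy(names[count], archive + pos, name_len);
        names[count][name_len] = '\0';
        pos += name_len;

        size_t data_len = archive[pos++];
        if (pos + data_len > size) break;

        /* Equivalent of building a dictionary and handing it to the delegate. */
        Entry entry = { names[count], pos, data_len };
        delegate(&entry, context);

        pos += data_len;
        count++;
    }
    return count;
}

/* Example delegate that stores entries for later instead of reading at once. */
typedef struct {
    Entry entries[16];
    size_t count;
} EntryList;

void collect_entry(const Entry *entry, void *context)
{
    EntryList *list = context;
    if (list->count < 16) list->entries[list->count++] = *entry;
}

/* Model of asking for a handle after parsing: copy out the entry's bytes. */
size_t read_entry(const unsigned char *archive, const Entry *entry,
                  unsigned char *out, size_t out_size)
{
    size_t n = entry->data_length < out_size ? entry->data_length : out_size;
    memcpy(out, archive + entry->data_offset, n);
    return n;
}
```

Because the entries only record where the data lives, it makes no difference in this model whether the client reads during the callback or after `parse_archive` has returned.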
As for the other issues:
* rawHandleForEntryWithDictionary: is used by XADMacArchiveParser, which is a convenience
class for archives that may contain ditto or MacBinary-format files. SHK is not one
of those, so you should not use it. The name is a bit confusing: it is not used for
all Mac archive formats, but mostly for non-Mac archive formats that use tricks to
store Mac files. Implement handleForDictionary: instead.
* It is highly discouraged to look at file extensions at all. XADMaster supports unpacking
from any abstract stream, so there might not even be a filename. When possible the
detection method should always look at the file contents it is passed for magic numbers
or use other heuristics to find out if the file is one it can handle. Only when this
is entirely impossible should it look for file extensions. Almost all archive parsers
detect by content in this way, which avoids problems with filename conflicts.
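As an illustration of content-based detection, here is a minimal plain C sketch (not XADMaster's actual detection method). It assumes the NuFX master header begins with the six bytes for "NuFile" with the high bit set on every second character; please check that against the NuFX documentation before relying on it. The function name `looks_like_nufx` is made up for this example, and a real detector would likely need to handle more cases.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed signature: "NuFile" with the high bit set on 'u', 'i' and 'e'. */
static const unsigned char NUFX_MAGIC[6] = { 0x4E, 0xF5, 0x46, 0xE9, 0x6C, 0xE5 };

/* Returns 1 if the first bytes of the stream carry the assumed signature.
   No filename is consulted, so this also works for nameless streams. */
int looks_like_nufx(const unsigned char *first_bytes, size_t length)
{
    if (first_bytes == NULL || length < sizeof NUFX_MAGIC) return 0;
    return memcmp(first_bytes, NUFX_MAGIC, sizeof NUFX_MAGIC) == 0;
}
```

A check like this would also sidestep the .sea question, since an Apple IIgS archive and a Macintosh one with the same extension would be told apart by their contents.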
mressl:
Hello there!
I've finally managed to find some time and start working on this. I've been able to
understand NuLib and NuFX, but I'm having a hard time figuring out the Unarchiver.
So far I've built a XADSHKParser class for .shk archives, and the file recognition
is indeed working (tested with XADTest2). But I'm unable to understand the function
of the [x parse] call. I'm also not quite able to understand the [x rawHandleForEntryWithDictionary]
call. What do the entries of this dictionary represent? My biggest issue is understanding
the point where the parser gets to start the extraction of the files.
About NuFX, the library supports individual file access à la fopen so I don't think
it should be very hard to include in The Unarchiver. Some bad/good news: the library
is approx. 560 kB, but I've checked and a lot can be left out as there is no need to
build/modify archives. Also, a lot of the functionality is repeated in XADMaster.
One last issue: NuFX can decode Apple IIGS .sea archives, which unfortunately are not
compatible with the Macintosh .sea archives that The Unarchiver can already decode. Is it
possible to have overlapping file extensions in The Unarchiver?
Finally, I wish I had more time to understand how The Unarchiver works, but I guess
it's easier to just ask for help.
Cheers,
Marc.-
mressl:
Thanks a lot!
paracelsus:
Well, the basic structure for archive support in The Unarchiver is to create a XAD*Parser
class that detects the file format and parses the files it contains. Then, create one
or more XAD*Handle classes that implement the compression algorithms used. The XAD*Parser
will instantiate one of the handle classes as needed. Unfortunately, I haven't gotten
around to documenting this, so you have to look at another parser class and try to
figure it out from there. You might want to look at some simple class, like NSA or
cpio.
Then, add the class to the list in XADArchiveParser.m, and it should magically work.
Use the XADTest2 and XADTest3 command-line utilities while testing; they are quite
handy.
A few notes, though: I really want to avoid external dependencies in XADMaster as far
as possible. If nulib2 is of reasonable size and can just be stuffed into a subdirectory,
that should be fine, though. Also, the abstracted filehandle model for compression
algorithms puts some limitations on what kinds of libraries you can reasonably use.
Basically, data is read from archives in XADMaster using an interface similar to fread()
and company. That means that the unpacking algorithm needs to be able to stop in mid-stream,
and resume later. Some decompression libraries are not built like that, and will write
the entire output in one go.
Well, in the worst case, files for the Apple II should never be big, so this can be
kludged by unpacking the entire file to a memory buffer, and returning a CSMemoryHandle
for reading from this buffer.
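A rough plain C model of that kludge, assuming a decoder that can only produce its whole output in one call. The run-length decoder `unpack_all` and the `MemoryReader` type are hypothetical stand-ins, not nulib2 or CSMemoryHandle themselves: everything is unpacked up front, and reads are then served piecemeal from the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical one-shot decoder: expands (count, value) byte pairs in a
   single call and cannot be paused in mid-stream. */
size_t unpack_all(const unsigned char *packed, size_t packed_len,
                  unsigned char *out, size_t out_capacity)
{
    size_t written = 0;
    for (size_t i = 0; i + 1 < packed_len; i += 2) {
        size_t count = packed[i];
        if (written + count > out_capacity) break;  /* stop rather than overflow */
        memset(out + written, packed[i + 1], count);
        written += count;
    }
    return written;
}

/* Reader over the fully unpacked buffer, with an fread()-like interface. */
typedef struct {
    const unsigned char *data;
    size_t length;
    size_t position;
} MemoryReader;

/* Copies up to `wanted` bytes and advances; returns 0 once the end is reached. */
size_t memory_read(MemoryReader *reader, void *out, size_t wanted)
{
    size_t remaining = reader->length - reader->position;
    size_t n = wanted < remaining ? wanted : remaining;
    memcpy(out, reader->data + reader->position, n);
    reader->position += n;
    return n;
}
```

The cost is holding the whole unpacked file in memory, which is why this only seems reasonable for small files.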
There are plenty of other utility classes for the handles, too. CSHandle is the superclass,
but few classes inherit directly from this. CSStreamHandle implements an interface
for non-seekable streams: A read function, and a reset function to restart from the
start of the stream (and seeking is handled automatically by restarting if needed).
This, in turn, has further convenience classes, such as CSByteStreamHandle for algorithms
that want to return single bytes one by one, and CSBlockStreamHandle for algorithms
that unpack in blocks.
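To make the stream idea concrete, here is a small plain C sketch of a non-seekable decoder that offers only "produce the next byte" and "reset", with seeking layered on top by restarting when the target lies behind the current position. It is a conceptual model only: `RunStream` and its functions are invented for this example and are not the CSStreamHandle or CSByteStreamHandle interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resumable decoder for (count, value) byte pairs. */
typedef struct {
    const unsigned char *packed;
    size_t packed_len;
    size_t in_pos;            /* index of the next unread pair */
    size_t run_left;          /* bytes still to emit from the current run */
    unsigned char run_value;
    size_t out_pos;           /* number of bytes produced so far */
} RunStream;

/* Restart decoding from the beginning of the stream. */
void stream_reset(RunStream *s)
{
    s->in_pos = 0;
    s->run_left = 0;
    s->run_value = 0;
    s->out_pos = 0;
}

/* Produce a single byte; returns 0 at end of stream. All state lives in the
   struct, so decoding can stop here and resume on the next call. */
int stream_next_byte(RunStream *s, unsigned char *out)
{
    while (s->run_left == 0) {
        if (s->in_pos + 1 >= s->packed_len) return 0;
        s->run_left = s->packed[s->in_pos];
        s->run_value = s->packed[s->in_pos + 1];
        s->in_pos += 2;
    }
    s->run_left--;
    s->out_pos++;
    *out = s->run_value;
    return 1;
}

/* fread()-like read built on the byte producer. */
size_t stream_read(RunStream *s, unsigned char *out, size_t wanted)
{
    size_t n = 0;
    while (n < wanted && stream_next_byte(s, out + n)) n++;
    return n;
}

/* Seek: restart if the target is behind us, then decode and discard forward.
   Returns 0 if the stream ends before the target is reached. */
int stream_seek(RunStream *s, size_t target)
{
    unsigned char skipped;
    if (target < s->out_pos) stream_reset(s);
    while (s->out_pos < target) {
        if (!stream_next_byte(s, &skipped)) return 0;
    }
    return 1;
}
```

Backward seeks are expensive in this scheme because they re-decode from the start, but the decoder itself never has to support random access.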
Well, feel free to ask for further help once you have a closer look at it all.
mressl:
I'm interested in working on adding nulib2 support in The Unarchiver. How should I proceed?
paracelsus:
Also: Submit test cases if you want this implemented.
paracelsus:
If someone writes the code for it, sure. Personally I have far too many other things
that have higher priority.