Extra credit
Create a pbsetup-like program that can construct all those files from the htm files, the way pbsetup does. This might help the archiving effort a little.
To clarify the situation: the files can be constructed offline, but doing so seems to rely on some magic inside pbsetup.
htm file header
- matches the regex
<html><body><p>S([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) ([^<]+)<p>
- the second capture group contains the size of the assembled binary in hexadecimal
htm file body is zlib data in hexadecimal
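A minimal Python sketch of matching that header. The pattern is copied from the notes; the function name parse_htm_header and the dictionary keys are made up for illustration, and only the meaning of the second group (size of the assembled binary, in hex) is relied on here.

```python
import re

# Header pattern as quoted in these notes.
HEADER_RE = re.compile(
    r"<html><body><p>S([^ ]+) ([^ ]+) ([^ ]+) ([^ ]+) ([^<]+)<p>"
)

def parse_htm_header(text):
    """Return the raw header groups plus the decoded size, or None if no match."""
    m = HEADER_RE.match(text)
    if m is None:
        return None
    return {
        "groups": m.groups(),                   # all five fields, as strings
        "assembled_size": int(m.group(2), 16),  # second group is hexadecimal
    }
```

The other four groups are returned untouched so the caller can interpret them separately.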
in pbsecsv file, the F entry syntax:
<p> F <type=C|A|S> <platform=V|W|K|L|M> <ver=INT> <uncompressed sz=HEX> <binary md5=HEX> <binary md5 without first byte=HEX>
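The F entry can be parsed and checked against a binary roughly as follows. This is a sketch under assumptions: fields are taken to be whitespace-separated, <ver> is read as a decimal integer, and the names parse_f_entry and matches_binary are hypothetical helpers, not anything pbsetup is known to contain.

```python
import hashlib

def parse_f_entry(line):
    """Parse '<p> F <type> <platform> <ver> <size hex> <md5> <md5 w/o first byte>'."""
    body = line.strip()
    if body.startswith("<p>"):
        body = body[3:]
    parts = body.split()
    if len(parts) != 7 or parts[0] != "F":
        return None
    return {
        "type": parts[1],           # C, A or S
        "platform": parts[2],       # V, W, K, L or M
        "ver": int(parts[3]),       # assumed decimal
        "size": int(parts[4], 16),  # uncompressed size, hexadecimal
        "md5": parts[5].lower(),
        "md5_tail": parts[6].lower(),
    }

def matches_binary(entry, data):
    """Check size, MD5 of the whole binary, and MD5 of the binary minus its first byte."""
    return (
        len(data) == entry["size"]
        and hashlib.md5(data).hexdigest() == entry["md5"]
        and hashlib.md5(data[1:]).hexdigest() == entry["md5_tail"]
    )
```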
in <platform><type>00<ver>.htm file syntax:
<html><body><p>S<compressed sz=HEX> <uncompressed sz=HEX> <compressed binary md5=HEX> <compressed binary md5 without first byte=HEX> <random file name>(<p><zlib data HEX>)+</body></html>
to extract the ZLIB data, the HEX has to be turned into binary format.
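The hex-to-binary step and the decompression could look like the sketch below. It assumes that the hex chunks, concatenated in order, form a single zlib stream; the notes do not say whether each chunk is instead a stream of its own. The function name extract_binary is made up for this example.

```python
import zlib

def extract_binary(htm_text):
    """Decode the hex chunks of an htm file and decompress them with zlib."""
    parts = htm_text.split("<p>")
    # parts[0] is "<html><body>", parts[1] is the "S..." header,
    # everything after that is hex data (the last chunk carries the trailer).
    if len(parts) < 3:
        raise ValueError("no hex chunks found")
    hex_data = "".join(parts[2:]).replace("</body></html>", "")
    hex_data = "".join(hex_data.split())   # drop any whitespace or newlines
    compressed = bytes.fromhex(hex_data)   # HEX -> binary
    return zlib.decompress(compressed)     # assumed: one zlib stream
```

The length of the result can then be compared with the uncompressed size from the header as a sanity check.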
EXT = {l=>so,w=>dll,v=>d64...}
pb/dll/<platform><type>00<ver>.EXT contains the file extracted from the similarly named htm file in the pb/htm folder.
After that, the files are copied:
- S type files into pb/pbsv.EXT
- C type files into pb/pbcl.EXT and pb/pbcls.EXT
- A type files into pb/pbag.EXT and pb/pbags.EXT
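The copy step can be sketched in Python as below. The directory layout and target names come from these notes; the function install_extracted and its parameters are hypothetical, and the caller is assumed to pass the platform, type, version and extension exactly as they appear in the file name.

```python
import shutil
from pathlib import Path

# Target base names per file type, as listed in the notes.
TARGETS = {
    "S": ["pbsv"],
    "C": ["pbcl", "pbcls"],
    "A": ["pbag", "pbags"],
}

def install_extracted(pb_dir, platform, ftype, ver, ext):
    """Copy pb/dll/<platform><type>00<ver>.EXT to its final name(s) in pb/."""
    pb_dir = Path(pb_dir)
    src = pb_dir / "dll" / f"{platform}{ftype}00{ver}.{ext}"
    written = []
    for name in TARGETS[ftype]:
        dst = pb_dir / f"{name}.{ext}"
        shutil.copyfile(src, dst)
        written.append(dst)
    return written
```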
if there is no file for a certain platform, then the checksums in pbsv refer to the htm file itself, not to the file contained within it.