unpackerr

Polish support for ISO9660 file format

Open blackwind opened this issue 2 years ago • 23 comments

  • [ ] Support extracting files larger than 2^32 bytes
  • [x] Don't truncate extracted filenames
  • [x] Preserve casing for extracted filenames
  • [ ] Apply timestamp metadata to extracted files
  • [ ] Extract to root _unpackerred folder instead of creating a subfolder based on ISO's filename
  • [ ] Delete ISO with other archives after extraction if the associated option is enabled
  • [ ] Confirm exotic ISO formats can be extracted (UDF 1.02 seems to be most common)
$ ls -l extracted-by-unpackerr/
total 4432364
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 ARTBOOK
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 GUIDE
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:43 MANUAL
drwxr-xr-x 4 docker everyone       4096 2023-02-01 21:43 OST
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:44 POSTER
-rw-r--r-- 1 docker everyone  243729046 2023-02-01 21:44 SETUP~01.BIN
-rw-r--r-- 1 docker everyone 4294081022 2023-02-01 21:44 SETUP_X-.BIN
-rw-r--r-- 1 docker everyone     896112 2023-02-01 21:44 SETUP_X-.EXE
drwxr-xr-x 2 docker everyone       4096 2023-02-01 21:44 WALLPAPE

$ ls -l extracted-by-winrar/
total 4432364
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 artbook
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 guide
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 manual
drwxr-xr-x 4 docker everyone       4096 2022-12-16 08:20 ost
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 poster
-rw-r--r-- 1 docker everyone 4294081022 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327)-1.bin
-rw-r--r-- 1 docker everyone  243729046 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327)-2.bin
-rw-r--r-- 1 docker everyone     896112 2022-12-16 08:20 setup_x-blades_hd_1.0_(60327).exe
drwxr-xr-x 2 docker everyone       4096 2022-12-16 08:20 wallpapers

blackwind avatar Feb 03 '23 00:02 blackwind

Thank you! I have a few points to make, but I'm super busy and will catch up on this soon!

davidnewhall avatar Feb 03 '23 00:02 davidnewhall

Found this in someone else's log.

unpackerr-2023-02-06T18-44-33.236.log:2023/02/06 03:27:53 Extraction Error: Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD: failed to open iso image: /downloads/tv-sonarr/Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD/ifpd-pokemonoriginss01-bluray.iso: volume descriptor "BEA01" != "CD001"

davidnewhall avatar Feb 07 '23 19:02 davidnewhall

Man, this is a rough one. Back when you opened this issue, I thought I had found another ISO library for Go (seo: golang). Today, I'm only finding 3 libraries, and I seem to be using the 'best' one. None of them support the Joliet extensions, which means file names over 32 characters are out. The bugs you've run into seem to be directly in this library. I don't believe I can fix them myself. I'm also afraid the 4 GB file limitation is built into the library, but I think it may be inadvertently used for extractions when it should be used for compression. Not entirely sure yet.

Question for ya @blackwind .. if I give you a spot to upload, can you send me an ISO file or two that didn't work? I'll try to engage with @kdomanski once I have a reproducible example to share with him.

This is the library I'm using now:

  • https://github.com/kdomanski/iso9660

These are the other two I found:

  • https://github.com/ajcollins0/go-diskfs
  • https://github.com/hooklift/iso9660

EDIT: Found more that are 2+ years old:

  • https://github.com/qeedquan/iso9660
  • https://github.com/dgodd/iso9660
  • https://github.com/jesperkha/iso9660-reader
  • https://github.com/mogaika/udf
  • https://github.com/sjpotter/udf-fs

If anyone finds a good ISO9660 library for Go.. lemme know.

davidnewhall avatar Apr 28 '23 05:04 davidnewhall

Proper support for these files will be a huge time-saver for me, so absolutely, I'm happy to help in any way I can. The one I used in my log (X-Blades_HD-DINOByTES) is a good example of all mentioned issues and is available in the obvious places, but I'll do the legwork if you need me to for whatever reason.

blackwind avatar Apr 28 '23 05:04 blackwind

I'll try that file with a few of these libraries. Will see if anything can extract it.

davidnewhall avatar Apr 28 '23 05:04 davidnewhall

> Found this in someone else's log.
>
> unpackerr-2023-02-06T18-44-33.236.log:2023/02/06 03:27:53 Extraction Error: Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD: failed to open iso image: /downloads/tv-sonarr/Pokemon.Origins.S01.2013.ANiME.DUAL.COMPLETE.BLURAY-iFPD/ifpd-pokemonoriginss01-bluray.iso: volume descriptor "BEA01" != "CD001"

That's a UDF descriptor.

kdomanski avatar May 01 '23 10:05 kdomanski

It's funny, because the ISO9660 library I currently use just released a new version. Literally the only significant change they made was to add an error that says "UDF volumes are not supported." rip. I'm messing with this a little bit today, but I'm not very optimistic. :(

davidnewhall avatar May 07 '23 18:05 davidnewhall

I'm stumped at this point. The only actively maintained libraries I can find do not support 2 or more of:

  • Joliet (Unicode / non-Latin + 100 character file names)
  • Rock Ridge (POSIX timestamps and permissions + 255 byte file names)
  • UDF (DVD ISOs, generally)

I did find 1 old library that has Rock Ridge support, and 1 that supports UDF, but I don't think I found any that support Joliet.

Ideally, the Rock Ridge support can be ported into https://github.com/kdomanski/iso9660 or https://github.com/diskfs/go-diskfs or both.

davidnewhall avatar May 07 '23 19:05 davidnewhall

Funny indeed. Maybe the author got a notification when you mentioned him, and saw your post.

The 1 library with Rock Ridge support that you linked only gets the full filename from the RR data, but not timestamps. Looks like the master branch of the library you use already has RR test fixture added, so full RR support might drop any time. Supporting Joliet might then be redundant for your usecase, we'll see.

As for files larger than 4GB, this requires support for multi-extent descriptors. It's not hard to implement, but it requires a bit of free time.

kdomanski avatar May 07 '23 20:05 kdomanski

> Funny indeed. Maybe the author got a notification when you mentioned him, and saw your post.

Love it. Thanks for stopping by!

> Looks like the master branch of the library you use already has RR test fixture added,

haha, don't be so modest. I see your recent commits (now), and am very pleased!

> Supporting Joliet might then be redundant for your usecase

I hope so. Seems like rock ridge will give us what we're missing.

> As for files larger than 4GB, this requires support for multi-extent descriptors.

Give me a pointer or two? I'm willing to try if you think it might be a worthwhile use of my time. This is probably the last "hurdle."

EDIT: derp moment. Just realized who I replied to earlier. haha

EDIT2: and now realizing the new release you made was probably because of this issue, and that error message you quoted. Thank you :)

davidnewhall avatar May 07 '23 20:05 davidnewhall

> Give me a pointer or two? I'm willing to try if you think it might be a worthwhile use of my time. This is probably the last "hurdle."

ECMA-119 9.1.6: multi-extent flag. ECMA-119 6.5.1 "Each file shall consist of one or more File Sections."

It's not very explicit, but I infer that maybe it means a multi-extent file has several consecutive Directory Records and the flag turned on.

The Linux Kernel's code for this appears to interpret this flag as an indication of the given DE not being the last one for the file.
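
A rough sketch of that interpretation, for illustration: walk the directory records in order, and treat every record with the multi-extent bit (bit 7 of the file flags, `0x80`) as "more sections follow", so a file ends at the first record with the bit cleared. The `record`, `section`, and `file` types and `mergeExtents` are hypothetical and simplified; they are not types from kdomanski/iso9660, and the sizes and block numbers are made up.

```go
package main

import "fmt"

// flagMultiExtent is bit 7 of the file flags byte (ECMA-119 9.1.6).
const flagMultiExtent = 0x80

// record is a simplified, hypothetical directory record.
type record struct {
	Name  string
	LBA   uint32 // location of the extent, in logical blocks
	Size  uint32 // data length; 32 bits, so one extent stays under 4 GiB
	Flags byte
}

// section is one file section (one extent) of a file.
type section struct {
	LBA  uint32
	Size uint32
}

// file is a logical file assembled from one or more sections.
type file struct {
	Name     string
	Sections []section
}

// Size sums the sections in 64 bits so that totals past 2^32 do not wrap.
func (f file) Size() uint64 {
	var total uint64
	for _, s := range f.Sections {
		total += uint64(s.Size)
	}
	return total
}

// mergeExtents groups consecutive records into logical files. A record with
// the multi-extent flag set is not the last record of its file.
func mergeExtents(recs []record) []file {
	var files []file
	open := false // true while the previous record had the flag set
	for _, r := range recs {
		if !open {
			files = append(files, file{Name: r.Name})
		}
		last := &files[len(files)-1]
		last.Sections = append(last.Sections, section{LBA: r.LBA, Size: r.Size})
		open = r.Flags&flagMultiExtent != 0
	}
	return files
}

func main() {
	recs := []record{
		{Name: "big.bin", LBA: 100, Size: 4294965248, Flags: flagMultiExtent},
		{Name: "big.bin", LBA: 2097251, Size: 5000000},
		{Name: "small.txt", LBA: 2099693, Size: 10},
	}
	for _, f := range mergeExtents(recs) {
		fmt.Printf("%s: %d section(s), %d bytes\n", f.Name, len(f.Sections), f.Size())
	}
}
```

With this grouping, `big.bin` comes out as one file of 4299965248 bytes in two sections. Extraction would then read the sections back to back into a single output file.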

> Supporting Joliet might then be redundant for your usecase
>
> I hope so. Seems like rock ridge will give us what we're missing.

Looks like (outside of some edge cases) Linux will use RR and ignore Joliet if both are present.

kdomanski avatar May 07 '23 22:05 kdomanski

@kdomanski You're right, overall this doesn't look too hard. It's going to take me a bit to come up to speed on this, but I've got a couple hours into it now and may be able to get there. Here's where I'm at...

None of the files have dirFlagMultiExtent set in their FileFlags. This image doesn't actually seem to have any files larger than 4 GB, so I will keep looking for one.

de.SystemUse is also empty, so I don't seem to get Rock Ridge file names. It could be that this image has two volumes and doesn't do Rock Ridge. Have you figured out how to access that second volume yet?
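
For context on why an empty System Use field means no Rock Ridge names: the long name lives in `NM` entries inside that field. Below is a minimal sketch of pulling the name out of a System Use byte slice, assuming the usual entry layout (2-byte signature, 1-byte total length, 1-byte version, then for `NM` a flags byte followed by the name bytes). `rockRidgeName` and `nm` are hypothetical helpers, not the library's API, and continuation areas (`CE` entries) are deliberately not followed.

```go
package main

import (
	"errors"
	"fmt"
)

// rockRidgeName concatenates the name bytes of all NM entries found in a
// System Use field. The bool is false when no NM entry is present, which
// is what an empty field produces.
func rockRidgeName(systemUse []byte) (string, bool, error) {
	var name []byte
	found := false
	for len(systemUse) >= 4 {
		if systemUse[0] == 0 {
			break // padding, no more entries
		}
		sig := string(systemUse[:2])
		length := int(systemUse[2])
		if length < 4 || length > len(systemUse) {
			return "", false, errors.New("malformed system use entry")
		}
		if sig == "NM" && length >= 5 {
			// Byte 4 holds the NM flags; the name starts at byte 5.
			name = append(name, systemUse[5:length]...)
			found = true
		}
		systemUse = systemUse[length:]
	}
	return string(name), found, nil
}

// nm builds an NM entry for the example below. Illustration only.
func nm(flags byte, s string) []byte {
	return append([]byte{'N', 'M', byte(5 + len(s)), 1, flags}, s...)
}

func main() {
	// A placeholder entry with an unrelated signature, then a name split
	// across two NM entries (flag bit 0 marks "name continues").
	su := []byte{'Z', 'Z', 4, 1}
	su = append(su, nm(1, "setup_x-blades_hd_")...)
	su = append(su, nm(0, "1.0_(60327).exe")...)

	name, ok, err := rockRidgeName(su)
	fmt.Println(name, ok, err)

	// An empty System Use field yields no name at all.
	_, ok, _ = rockRidgeName(nil)
	fmt.Println("empty field has a name:", ok)
}
```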

Here's my "update" to do some debugging: https://github.com/kdomanski/iso9660/commit/45c0c7da7dddbfbecfe7910a071b01d36e0e7080

I ran this new code against the ISO file mentioned earlier in the thread. Here's the whole output: https://gist.github.com/davidnewhall/b67c6fdf1c942fb8d8026ba1a42fad25

This is what it looks like mounted on my Mac: [screenshot, 2023-05-18]

...which makes me want to ask: Is the volume name exposed by this library yet? (the name in the title)

davidnewhall avatar May 18 '23 09:05 davidnewhall

Hmm, this might be a Joliet-only image. I'll look into the dump you provided.

> Is the volume name exposed by this library yet? (the name in the title)

it is now. ;-) https://github.com/kdomanski/iso9660/releases/tag/v0.3.5

kdomanski avatar May 21 '23 17:05 kdomanski

amazing!

davidnewhall avatar May 21 '23 19:05 davidnewhall

Any further progress on this? Or are we blocked indefinitely?

blackwind avatar Aug 16 '23 06:08 blackwind

No one has ever extracted or created these 'advanced' format ISO images with Go apps. This is all new. kdomanski is the only person that's put together a comprehensive library that will one day provide these features. Today, it does not. I haven't had time to visit this. I have dozens of projects, and this feature is a lot of work, so it will be a while before I'm intrigued enough to spend the time required.

There has been no further progress at this time.

davidnewhall avatar Aug 19 '23 18:08 davidnewhall

If you detected impatience in my tone, none was intended. I appreciate the update and all the work done on this so far.

blackwind avatar Aug 19 '23 19:08 blackwind

Sup. Release v0.4.0 can read Rock Ridge filenames. Looking forward to your feedback (and bug reports 😉 ).

kdomanski avatar Aug 20 '23 13:08 kdomanski

There's probably more I can do here, but I updated the library and pushed some updates. You can download it here https://unstable.golift.io - thanks Kamil!

EDIT: Docker is ready.

davidnewhall avatar Aug 20 '23 18:08 davidnewhall

Unable to test until the Docker image is available, but it sounds like filename truncation and incorrect filename casing are fixed, while the other issues persist. I've marked the completed tasks in the first post.

blackwind avatar Aug 20 '23 19:08 blackwind

Currently just getting "UDF volumes are not supported", which I guess is an improvement over the old behavior.

blackwind avatar Aug 22 '23 01:08 blackwind

What is the image you're testing? UDF is probably another problem that needs a solution.

davidnewhall avatar Aug 22 '23 04:08 davidnewhall

Tried a few, but Stray.v1.5-Razor1911 is a well-sized one for testing the 4GB issue as well.

blackwind avatar Aug 22 '23 04:08 blackwind