easybuild-framework

Correctly detect final path when extracting multiple tarballs

Open Flamefire opened this issue 6 months ago • 1 comments

The current implementation of extract_file detects folders/files from the first tarball when extracting the second. Due to the definition of find_base_dir, it then returns the parent path (usually builddir), which is used as the finalpath for sources. This leads to issues requiring workarounds, e.g. in the Bundle easyblock, where sources from all components are extracted into the same folder.

Fix by storing the old state of the target folder and detecting whether extraction resulted in a single (top-level) folder.
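A minimal sketch of that idea, not the actual implementation in the PR: record the entries of the target folder before extraction, then check whether exactly one new top-level directory appeared afterwards. The names `snapshot` and `detect_final_path` are hypothetical and do not exist in filetools.

```python
import os


def snapshot(target_dir):
    """Return the set of entry names currently present in target_dir."""
    return set(os.listdir(target_dir))


def detect_final_path(target_dir, before):
    """Return the single new top-level directory created since `before`.

    Falls back to target_dir when extraction added zero or several
    top-level entries, or when the only new entry is a regular file.
    """
    new_entries = sorted(set(os.listdir(target_dir)) - before)
    if len(new_entries) == 1:
        candidate = os.path.join(target_dir, new_entries[0])
        if os.path.isdir(candidate):
            return candidate
    return target_dir
```

Taking the snapshot before each extraction means folders left behind by an earlier tarball are no longer counted as output of the current one.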

As part of this work I enhanced the documentation and tests of find_base_dir and added tests for the expected and (previous) error cases of extract_file.

guess_start_dir already handles the fixed behavior: if the start dir would resolve to prefix/<startdir>/<startdir> and that path doesn't exist, prefix/<startdir> is used instead.
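To illustrate the fallback described above, here is a rough sketch. It is an assumption about the shape of the logic, not the code in guess_start_dir; `resolve_start_dir` and its parameters are hypothetical names.

```python
import os


def resolve_start_dir(final_path, start_dir):
    """Join start_dir onto final_path, avoiding a duplicated last component.

    If final_path already ends in start_dir (e.g. builddir/foo) and
    builddir/foo/foo does not exist, return builddir/foo instead.
    """
    candidate = os.path.join(final_path, start_dir)
    if os.path.isdir(candidate):
        return candidate
    last_component = os.path.basename(os.path.normpath(final_path))
    if last_component == os.path.normpath(start_dir):
        return final_path
    # Hypothetical choice: leave the non-existing path for the caller to report
    return candidate
```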

E.g. GTK3 has start_dir set manually for components due to this bug, which would now lead to builddir/foo/foo. IMO this is trivial to detect/fix.

Remark: It is possible to have find_base_dir NOT change into the resulting directory, which means we can do this explicitly when requested instead of undoing it. Not sure if anyone outside the main repo relies on that.

Flamefire avatar Jun 13 '25 11:06 Flamefire

@boegel I replaced the added binaries by creating them in the test (replacing the commit to remove them from the history)

I also replaced all uses of other binaries in that test. The test-method I added should be easily usable for other tests.

What I noticed: Our filetools.make_archive is lacking a way to create an archive from a list of files or contents of a folder, i.e. exclude the parent folder from the hierarchy in the archive. See the new method I added.
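For illustration only, archiving the contents of a folder without the parent folder in the hierarchy can be sketched with the standard tarfile module. This is not the method added in the PR; `archive_folder_contents` is a hypothetical name.

```python
import os
import tarfile


def archive_folder_contents(src_dir, archive_path):
    """Create a gzipped tarball holding the entries of src_dir at top level.

    archive_path should be located outside src_dir so the archive does
    not end up containing itself.
    """
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(src_dir)):
            # arcname drops the src_dir prefix, so entries sit at the archive root
            tar.add(os.path.join(src_dir, name), arcname=name)
    return archive_path
```

Extracting such an archive yields several top-level entries rather than a single folder, which is the kind of input the new extract_file tests need.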

Flamefire avatar Jun 18 '25 08:06 Flamefire

@Flamefire merge conflict to resolve here...

boegel avatar Sep 24 '25 14:09 boegel

Caused by an added test; rebased to fix.

Flamefire avatar Sep 24 '25 14:09 Flamefire

I found an issue with e.g. Clang: Prior to this fix, when extracting multiple sources, the directory of the extracted source could not be uniquely determined, which resulted in changing to the build dir instead inside the extract_step.

We could

  • a) Change into the build dir whenever extracting multiple sources, which I find strange
  • b) Change into the directory of the first extracted source assuming this is the main source
  • c) Change into the directory of the last extracted source which seems to be the intention of the original code although that never(?) worked.

This doesn't matter much because in prepare_step we set and change to start_dir, which is the final_path of the first source and will still be the same.
Clang fails because it does globbing in the extract_step.

FWIW: I've been using this for a while now, installing hundreds of easyconfigs, and Clang was the only issue I found so far. Followup: https://github.com/easybuilders/easybuild-easyblocks/pull/3961

Flamefire avatar Oct 15 '25 14:10 Flamefire