
Support extracting files from RPM archives.

Open LeStahL opened this issue 1 year ago • 1 comment

I use cached_path for downloading executable dependencies. This works perfectly on Windows; on Linux, however, some dependencies are only available as RPM files. Extracting those is not supported by cached_path yet.

The feature could be implemented using the rpmfile module. The following steps would be necessary for this:

  • Implement a function is_rpmfile that tries to parse the file. This function is missing from the rpmfile module mentioned above (this step could, and probably should, be done in that module's repo, which is available at https://github.com/srossross/rpmfile).
  • Use is_rpmfile in the checks at https://github.com/allenai/cached_path/blob/0d250ae06ca48b014cd8f5079ca6165beca58311/cached_path/_cached_path.py#L181 and https://github.com/allenai/cached_path/blob/0d250ae06ca48b014cd8f5079ca6165beca58311/cached_path/_cached_path.py#L201
  • Add RPM extraction code after the block at https://github.com/allenai/cached_path/blob/0d250ae06ca48b014cd8f5079ca6165beca58311/cached_path/_cached_path.py#L245
  • It might also be sensible to add a test verifying the extraction code to https://github.com/allenai/cached_path/blob/main/tests/cached_path_test.py.
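
To make the first step concrete, here is a minimal sketch of what is_rpmfile could look like. It is not code from rpmfile or cached_path. Instead of fully parsing the file as suggested above, it only checks the four magic bytes at the start of the RPM lead (0xED 0xAB 0xEE 0xDB), which is a cheaper but weaker test. The constant name RPM_MAGIC and the exact signature are assumptions made for illustration.

```python
from pathlib import Path
from typing import Union

# Magic number at the start of the RPM "lead" section.
# RPM_MAGIC is a hypothetical name chosen for this sketch.
RPM_MAGIC = b"\xed\xab\xee\xdb"


def is_rpmfile(path: Union[str, Path]) -> bool:
    """Return True if `path` looks like an RPM package.

    Sketch only: this checks the leading magic bytes and does not
    validate the headers or the payload.
    """
    try:
        with open(path, "rb") as f:
            return f.read(len(RPM_MAGIC)) == RPM_MAGIC
    except OSError:
        # Missing, unreadable, or not a regular file: treat as "not an RPM".
        return False
```

Returning a plain bool and swallowing OSError follows the same pattern as the standard library's zipfile.is_zipfile and tarfile.is_tarfile, so a function like this could be used in the same kind of check. A stricter version could additionally try to open the file with rpmfile and return False if parsing fails.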

I'm currently working around this by explicitly checking for RPM files when using cached_path and extracting the cached RPMs myself. It would be much cleaner to rely on cached_path for this, however :)

I could write this code and open a PR if you're interested. Let me know if that's the case.

LeStahL, Dec 07 '23 14:12

Hey @LeStahL, I'd be happy to review a PR for this.

epwalsh, Dec 11 '23 16:12