cached_path
Support extracting files from RPM archives.
I use cached_path for downloading executable dependencies. This works perfectly on Windows; on Linux, however, some dependencies are only available as RPM files. Extracting those is not supported by cached_path yet.
The feature could be implemented using the rpmfile module. The following steps would be necessary for this:
- Implement a function `is_rpmfile` that tries to parse the file and is missing from the rpm module linked above (that step could (should?) be done in the repo of the above module, which is available at https://github.com/srossross/rpmfile).
- Use `is_rpmfile` in the checks at https://github.com/allenai/cached_path/blob/0d250ae06ca48b014cd8f5079ca6165beca58311/cached_path/_cached_path.py#L181 and https://github.com/allenai/cached_path/blob/0d250ae06ca48b014cd8f5079ca6165beca58311/cached_path/_cached_path.py#L201
- Add RPM extraction code after the block at https://github.com/allenai/cached_path/blob/0d250ae06ca48b014cd8f5079ca6165beca58311/cached_path/_cached_path.py#L245
- It might also be sensible to add a test verifying the extraction code to https://github.com/allenai/cached_path/blob/main/tests/cached_path_test.py.
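
To make the steps above more concrete, here is a rough sketch of what the two missing pieces might look like. It is not code from cached_path or rpmfile. The `is_rpmfile` below only checks the four magic bytes at the start of the RPM lead instead of fully parsing the file, and `extract_rpm` is a hypothetical helper name. It assumes that `rpmfile.open`, `getmembers` and `extractfile` behave as shown in the rpmfile README, and that member objects may expose an `isdir` attribute (the code falls back to `False` if they do not).

```python
from pathlib import Path, PurePosixPath

# The RPM lead starts with these four magic bytes.
RPM_LEAD_MAGIC = b"\xed\xab\xee\xdb"


def is_rpmfile(path):
    """Return True if the file at `path` starts with the RPM lead magic.

    Simplification: this only inspects the first four bytes and does not
    parse the rest of the file.
    """
    try:
        with open(path, "rb") as handle:
            return handle.read(len(RPM_LEAD_MAGIC)) == RPM_LEAD_MAGIC
    except OSError:
        # Missing file, directory, permission problem: not a readable RPM.
        return False


def extract_rpm(path, dest):
    """Hypothetical helper: extract the payload of an RPM into `dest`.

    Limitations of this sketch: file modes are not restored and symlink
    entries are not recreated as symlinks.
    """
    import rpmfile  # third-party, imported lazily: pip install rpmfile

    dest_root = Path(dest).resolve()
    dest_root.mkdir(parents=True, exist_ok=True)

    with rpmfile.open(str(path)) as rpm:
        for member in rpm.getmembers():
            # Assumption: directory entries are flagged via `isdir`.
            if getattr(member, "isdir", False):
                continue

            name = member.name
            if isinstance(name, bytes):
                name = name.decode("utf-8")

            # Member names look like "./usr/bin/script"; make them relative.
            relative = PurePosixPath(name.lstrip("/"))
            target = dest_root.joinpath(*relative.parts).resolve()

            # Refuse entries that would land outside the destination.
            if dest_root not in target.parents:
                raise ValueError(f"unsafe path in RPM payload: {name!r}")

            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(rpm.extractfile(member.name).read())
```

In cached_path, `is_rpmfile` would sit next to the existing archive checks, and the extraction would go into its own branch after the existing ones. A stricter `is_rpmfile` in the rpmfile repo could additionally try to parse the headers.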
I'm currently working around this by explicitly checking for RPM files when using cached_path and extracting the cached RPMs myself. It would be much cleaner to rely on cached_path for this, however :)
I could write and PR this code if you're interested. Let me know if that's the case.
Hey @LeStahL I'd be happy to review a PR for this