fs
UnsupportedOperationException when using `copy-tree` with zip file path as source
Hello,
I stumbled over the following problem when trying to copy a sub-tree with `copy-tree` from a jar resource to the file system. You can reproduce the error with the following code snippet in a REPL in the babashka/fs project (the first resource found with the name "META-INF" is in the nrepl jar).
(let [root-source "META-INF"
      target (.toPath (io/file "tmp"))]
  (with-open [fs (FileSystems/newFileSystem
                  (.toURI (io/resource root-source))
                  {})]
    (copy-tree (.getPath fs root-source (into-array String [])) target)))
When evaluating the code snippet above, I get an UnsupportedOperationException:
Execution error (UnsupportedOperationException) at jdk.nio.zipfs.ZipPath/toFile (ZipPath.java:669).
The exception comes from one of the two `(path to rel)` calls inside the `copy-tree` function. The dyadic version of the `path` function converts its arguments to Files to resolve them and then converts the result back to a Path. However, ZipPaths cannot be converted to File instances, so the call fails with an UnsupportedOperationException.
(^Path [parent child]
 (as-path (io/file (as-file parent) (as-file child))))
I tried changing the dyadic definition of `path` as follows:
(^Path [parent child]
 (let [child (as-path child)]
   ;; we check for nil parent to reproduce the behaviour of `io/file` and,
   ;; ultimately, of `(File. ^File parent ^String child)`.
   (if (nil? parent)
     child
     (.resolve (as-path parent) child))))
But it still does not work, because we now get a ProviderMismatchException when calling `Path#resolve`. The reason is that ZipPaths and UnixPaths cannot be combined, resolved, or relativized against each other (I am on macOS, but I guess the same problem occurs on Windows).
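Both failures described so far can be reproduced without babashka/fs. The following is a minimal sketch using plain Java interop. `zip-path-failures` is a hypothetical helper name, and the sketch assumes `zip-file` is a Path to an existing zip or jar on the default file system.

```clojure
(import '(java.net URI)
        '(java.nio.file FileSystem FileSystems Path Paths
                        ProviderMismatchException))

(defn zip-path-failures
  "Hypothetical helper: opens zip-file as a file system and reports how a
  path inside it reacts to .toFile and to being resolved against a path
  from the default file system."
  [^Path zip-file]
  (with-open [^FileSystem zfs (FileSystems/newFileSystem
                               (URI/create (str "jar:" (.toUri zip-file)))
                               {})]
    (let [zip-path (.getPath zfs "META-INF" (make-array String 0))
          target   (Paths/get "tmp" (make-array String 0))]
      {;; what the File-based `path` runs into
       :to-file (try (.toFile zip-path) :ok
                     (catch UnsupportedOperationException _ :unsupported))
       ;; what the Path#resolve-based `path` runs into
       :resolve (try (.resolve target zip-path) :ok
                     (catch ProviderMismatchException _ :provider-mismatch))})))
```

Neither call depends on the entry actually existing in the archive; the exceptions come from the path types alone.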
Finally, I tried using the following function instead of the `Path#resolve` call. The function resolves component by component when the paths cannot be resolved directly.
(defn resolve-path
  "Resolves the child path against the parent path, component by component
  in case of a ProviderMismatchException."
  [^Path parent ^Path child]
  (try
    (.resolve parent child)
    (catch ProviderMismatchException _
      ;; Fall back to resolving each name element by its string form.
      (reduce (fn [^Path path ^Path component]
                (if-let [file-name (.getFileName component)]
                  (.resolve path (str file-name))
                  path))
              parent
              (seq child)))))
It works in my specific case because the child path is always relative and relative ZipPaths are built similarly to UnixPaths. However, I do not believe that it is a good solution, as combinations of Paths from different FileSystems are always ad hoc. That is probably why the JRE raises a ProviderMismatchException in the first place.
I would still like to use `copy-tree` to extract resources from a jar, and I see the following alternatives.
- Copying the `copy-tree` function into my project, replacing the two `(path to rel)` calls with a custom resolving function for the ad hoc resolving of a relative ZipPath against a UnixPath, and renaming it to something like `extract-tree-from-zip-file`.
- Introducing an option to `copy-tree` to pass a custom resolving function, with `path` as the default. In this case, `babashka.fs` could even offer the specific resolving function to copy from a zip file to a Unix (or Windows) directory.
What do you think? Is this use case (copying out of a zip file to the file system) important enough to warrant the introduction of a new option? Would such an option be a good API?
You can have a look at the code above in the following branch: https://github.com/codesmith-gmbh/fs/tree/stan/path-resolve
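For illustration, the first alternative could look roughly like the sketch below. It avoids cross-provider resolution by relativizing inside the zip file system and carrying only the string form of each relative path over to the target file system. The argument list is an assumption and not part of babashka.fs, and the sketch assumes `sub-tree` names a directory inside the archive.

```clojure
(import '(java.net URI)
        '(java.nio.file CopyOption Files FileSystem FileSystems
                        FileVisitOption LinkOption Path StandardCopyOption)
        '(java.nio.file.attribute FileAttribute))

(defn extract-tree-from-zip-file
  "Sketch: copies everything under sub-tree (a path string inside zip-file)
  into target-dir and returns target-dir."
  [^Path zip-file ^String sub-tree ^Path target-dir]
  (with-open [^FileSystem zfs (FileSystems/newFileSystem
                               (URI/create (str "jar:" (.toUri zip-file)))
                               {})]
    (let [root (.getPath zfs sub-tree (make-array String 0))]
      ;; Files/walk visits directories before their contents, so every
      ;; parent directory exists by the time a file is copied.
      (with-open [stream (Files/walk root (make-array FileVisitOption 0))]
        (doseq [^Path source (iterator-seq (.iterator stream))]
          (let [;; relativize within the zip provider, then cross over as a string
                rel  (str (.relativize root source))
                dest (.resolve target-dir rel)]
            (if (Files/isDirectory source (make-array LinkOption 0))
              (Files/createDirectories dest (make-array FileAttribute 0))
              (Files/copy source dest
                          ^"[Ljava.nio.file.CopyOption;"
                          (into-array CopyOption
                                      [StandardCopyOption/REPLACE_EXISTING]))))))
      target-dir)))
```

A call such as `(extract-tree-from-zip-file jar-path "META-INF" (.toPath (io/file "tmp")))` would then place the contents of `META-INF` directly under `tmp`, where `jar-path` stands for whatever archive you are extracting from.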
> I still would like to use copy-tree to extract resources from a jar
copy-tree has been written with normal files in mind.
Have you tried the `unzip` function? Jar files are basically just zip files.
Thank you for the feedback. Yes, I know the `unzip` function, but it does not allow extracting a sub-tree from the zip/jar file.
One possibility for sub-tree extraction could be an option named `sub-tree-root` (or similar) for `unzip` or, more generically, two options, `:include-entry?` and `:resolve-entry`, to control the behaviour of `unzip`. For the normal case, these two options would be
{:include-entry? (constantly true)
 :resolve-entry (fn [^Path output-path ^Path entry-path]
                  (.resolve output-path entry-path))}
and for sub-tree extraction, they would be (where `sub-tree-root` is the Path to be extracted)
{:include-entry? (fn [^Path entry-path]
                   (fs/starts-with? entry-path sub-tree-root))
 :resolve-entry (fn [^Path output-path ^Path entry-path]
                  (.resolve output-path (.relativize sub-tree-root entry-path)))}
Sounds like an idea. With `zip` we already have the option `:path-fn`, which resolves a path to a zip entry. In hindsight, `:path->entry` would have been a better name (and we can support this new name while deprecating the old one). Then we could have `:unzip-entry?` and `:entry->path` to do what you suggest above.
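To make the proposed options concrete, here is a rough sketch of how an unzip loop could consult them. `unzip-sketch` and `sub-tree-opts` are hypothetical names, the argument lists of `:unzip-entry?` and `:entry->path` are assumed to match the `:include-entry?` / `:resolve-entry` proposal above, and this is not how babashka.fs implements `unzip`.

```clojure
(import '(java.io InputStream)
        '(java.nio.file CopyOption Files OpenOption Path Paths
                        StandardCopyOption)
        '(java.nio.file.attribute FileAttribute)
        '(java.util.zip ZipInputStream))

(defn unzip-sketch
  "Sketch: extracts zip-file into dest. Entries are filtered by
  :unzip-entry? and mapped to output paths by :entry->path (both assumed).
  Note: no guard against entry names containing \"..\"."
  [^Path zip-file ^Path dest
   {:keys [unzip-entry? entry->path]
    :or   {unzip-entry? (constantly true)
           entry->path  (fn [^Path out ^Path entry] (.resolve out entry))}}]
  (with-open [in (ZipInputStream.
                  (Files/newInputStream zip-file (make-array OpenOption 0)))]
    (loop []
      (when-let [entry (.getNextEntry in)]
        ;; entry names are turned into default-file-system paths up front,
        ;; so no ZipPath is ever involved
        (let [entry-path (Paths/get (.getName entry) (make-array String 0))]
          (when (unzip-entry? entry-path)
            (let [^Path out (entry->path dest entry-path)]
              (if (.isDirectory entry)
                (Files/createDirectories out (make-array FileAttribute 0))
                (do
                  (Files/createDirectories (.getParent out)
                                           (make-array FileAttribute 0))
                  ;; reads only up to the end of the current entry
                  (Files/copy ^InputStream in out
                              ^"[Ljava.nio.file.CopyOption;"
                              (into-array CopyOption
                                          [StandardCopyOption/REPLACE_EXISTING])))))))
        (recur)))))

(defn sub-tree-opts
  "Options that restrict extraction to sub-tree-root and strip that prefix."
  [^Path sub-tree-root]
  {:unzip-entry? (fn [^Path entry-path]
                   (.startsWith entry-path sub-tree-root))
   :entry->path  (fn [^Path output-path ^Path entry-path]
                   (.resolve output-path
                             (.relativize sub-tree-root entry-path)))})
```

With an empty options map the sketch behaves like a plain unzip. Passing `(sub-tree-opts (Paths/get "META-INF" (make-array String 0)))` would extract only the `META-INF` sub-tree, directly into `dest`.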
Nice. Should I give a PR a try?
yes please!