plexus-archiver

Symbolic links are added as symlinks to the sources JAR instead of being followed

Open mirabilos opened this issue 4 years ago • 13 comments

I’ve got symbolic links in my resources (files that get reused in multiple places in a larger repository). They are correctly followed when creating the main binary JAR but the sources JAR contains the symbolic links themselves¹, which then point to nirvana (i.e. to outside the JAR).

In https://github.com/codehaus-plexus/plexus-archiver/issues/47#issuecomment-308067327 it was suggested to make the “follow symlinks?” mode configurable. Implementing this would completely solve my issue if it is configurable in the necessary places; I believe that implementing it in the manner shown will fix it.

Related: https://issues.apache.org/jira/browse/MRESOURCES-237 where 3.x versions of the maven-resources-plugin cause similar problems by not following symlinks any more (which is why I’m pinning it to a 2.x version in my projects).

Fixing this in plexus-archiver might fix the maven-resources-plugin problem as well… unsure.

¹ Funnily enough, this even causes other software to crash when exploring the resulting archives.

mirabilos • Jan 08 '21 17:01

The reported crash isn't reproducible for me with:

$ mc -V
GNU Midnight Commander 4.8.24
Built with GLib 2.66.2
Using the ncurses library
With builtin Editor
With support for background operations
With mouse support on xterm
With support for X11 events
With internationalization support
With multiple codepages support
Virtual File Systems: cpiofs, tarfs, sfs, extfs, ftpfs, fish
Data types: char: 8; int: 32; long: 64; void *: 64; size_t: 64; off_t: 64;

This needs to be verified with Plexus Archiver.

michael-o • Jan 08 '21 17:01

It’s not about the crash, this is just a funny side effect.

I need to be able to have the maven-source-plugin follow symlinks when creating the sources JAR instead of adding the symlinks themselves into the JAR. This is what this is about.

mirabilos • Jan 08 '21 17:01

Likely both modes would be required: retention and resolution.

michael-o • Jan 08 '21 17:01

Yes, of course, hence the request to make it configurable.

Per resource set, as @petr-ujezdsky suggested in his comment, would probably match best.

mirabilos • Jan 08 '21 17:01

Setting collection.setFollowingSymLinks(true); does not seem to be enough; the resulting JAR file still contains the symbolic link instead of the pointed-to file :(

mirabilos • Jan 08 '21 18:01

Some debugging later…

        Archiver archiver;
        archiver.addDirectory( sourceDirectory, pIncludes, pExcludes );

        // dump every entry the archiver has collected so far
        getLog().info( "resource {{{" );
        ResourceIterator ri = archiver.getResources();
        while ( ri.hasNext() ) {
            getLog().info( "resource: " + fmt( ri.next() ) );
        }
        getLog().info( "resource }}}" );
    }

    // formats an entry as type(octal mode)<name>
    private static String fmt( final ArchiveEntry e ) {
        return String.format( "%d(%o)<%s>", e.getType(), e.getMode(), e.getName() );
    }

… it clearly shows that, despite…

--- a/src/main/java/org/codehaus/plexus/archiver/AbstractArchiver.java
+++ b/src/main/java/org/codehaus/plexus/archiver/AbstractArchiver.java
@@ -367,7 +367,7 @@ public void addFileSet( @Nonnull final FileSet fileSet )
         // The PlexusIoFileResourceCollection contains platform-specific File.separatorChar which
         // is an interesting cause of grief, see PLXCOMP-192
         final PlexusIoFileResourceCollection collection = new PlexusIoFileResourceCollection();
-        collection.setFollowingSymLinks( false );
+        collection.setFollowingSymLinks( true );
 
         collection.setIncludes( fileSet.getIncludes() );
         collection.setExcludes( fileSet.getExcludes() );

…, the collection entry is still a symlink:

[INFO] indeed running the patched plugin
[INFO] resource {{{
[INFO] resource: 2(40755)<>
[INFO] resource: 2(40755)<org>
[INFO] resource: 2(40755)<org/evolvis>
[INFO] resource: 2(40755)<org/evolvis/tartools>
[INFO] resource: 2(40755)<org/evolvis/tartools/mvnparent>
[INFO] resource: 2(40755)<org/evolvis/tartools/mvnparent/examples>
[INFO] resource: 1(100644)<org/evolvis/tartools/mvnparent/InitialiseLogging.java>
[INFO] resource: 1(100644)<org/evolvis/tartools/mvnparent/examples/Main.java>
[INFO] resource: 2(40755)<META-INF>
[INFO] resource: 2(40755)<META-INF/legal>
[INFO] resource: 2(40755)<META-INF/legal/org.evolvis.tartools>
[INFO] resource: 2(40755)<META-INF/legal/org.evolvis.tartools/maven-parent-lib>
[INFO] resource: 3(120777)<META-INF/legal/org.evolvis.tartools/maven-parent-lib/LICENCE>
[INFO] resource }}}

This is about the last file, a symbolic link:

lrwxrwxrwx 1 me me 54 Jan  8 15:35 lib/src/main/resources/META-INF/legal/org.evolvis.tartools/maven-parent-lib/LICENCE -> ../../../../../../../../src/main/ancillary/LICENCE.hdr

I hope the fact that the number of ../ components takes it out of the lib/ subdirectory is not the problem here, because this is how files are shared in multi-module projects…

mirabilos • Jan 08 '21 18:01
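
To make the "number of ../ components" point concrete, here is a small sketch that resolves the link target from the listing above purely lexically, using only java.nio.file. The class and method names (LinkTargetDemo, resolveTarget) are made up for illustration; nothing here is plexus-archiver or plexus-io code, and normalize() only does path arithmetic without touching the file system.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class LinkTargetDemo {
    /**
     * Resolves a relative symlink target against the directory that
     * contains the link, then collapses the ".." components lexically.
     * (Hypothetical helper, for illustration only.)
     */
    static Path resolveTarget(Path link, String target) {
        return link.resolveSibling(target).normalize();
    }

    public static void main(String[] args) {
        Path link = Paths.get(
            "lib/src/main/resources/META-INF/legal/org.evolvis.tartools/maven-parent-lib/LICENCE");
        Path resolved = resolveTarget(link,
            "../../../../../../../../src/main/ancillary/LICENCE.hdr");

        // The link sits eight directories deep (counting lib/), and the
        // target climbs eight levels, so it ends up next to lib/, not in it.
        System.out.println(resolved);
        System.out.println("inside lib/: " + resolved.startsWith("lib"));
    }
}
```

So the target does leave the module's lib/ directory, but it stays inside the repository checkout, which is the usual situation when modules share files.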

This might be due to the plexus-io ResourceFactory:

public static PlexusIoResource createResource( File f, String name, final ContentSupplier contentSupplier,
                                               InputStreamTransformer inputStreamTransformer,
                                               PlexusIoResourceAttributes attributes )
    throws IOException
{
    boolean symbolicLink = attributes.isSymbolicLink();
    return symbolicLink ? new PlexusIoSymlinkResource( f, name, attributes )
        :  new PlexusIoFileResource(f, name, attributes, contentSupplier, inputStreamTransformer);
}

… in PlexusIoFileResourceCollection

        File f = new File( dir, sourceDir );

        PlexusIoResourceAttributes attrs = new FileAttributes( f, cache1, cache2 );
        attrs = mergeAttributes( attrs, f.isDirectory() );

        String remappedName = getName( name );

        PlexusIoResource resource =
            ResourceFactory.createResource( f, remappedName, null, getStreamTransformer(), attrs );

… and FileAttributes

        Map<String, Object> attrs = Files.readAttributes( path, "unix:permissions,gid,uid,isSymbolicLink,mode", LinkOption.NOFOLLOW_LINKS );

… so the collection.setFollowingSymLinks(true); seems to be only about following symbolic links to directories when encountered; additional remapping of symbolic links in plexus-io is needed.

mirabilos • Jan 08 '21 18:01
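
The effect of LinkOption.NOFOLLOW_LINKS described above can be reproduced with plain java.nio.file, without any plexus code. The following is a minimal sketch, assuming a file system that supports symbolic links; the class name SymlinkAttributesDemo and its two helpers are hypothetical and only illustrate why the attributes always say "symlink" when read this way.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

public class SymlinkAttributesDemo {
    /** Reads the attributes of the link itself (the link is not followed). */
    static boolean isLinkNoFollow(Path p) throws IOException {
        return Files.readAttributes(p, BasicFileAttributes.class,
                                    LinkOption.NOFOLLOW_LINKS).isSymbolicLink();
    }

    /** Reads the attributes with default options, i.e. of the link's target. */
    static boolean isLinkFollow(Path p) throws IOException {
        return Files.readAttributes(p, BasicFileAttributes.class).isSymbolicLink();
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-demo");
        Path target = Files.write(dir.resolve("LICENCE.hdr"), "licence text".getBytes());
        // Relative link target, resolved against the directory holding the link.
        Path link = Files.createSymbolicLink(dir.resolve("LICENCE"), target.getFileName());

        System.out.println("NOFOLLOW_LINKS sees a symlink: " + isLinkNoFollow(link));
        System.out.println("following sees a symlink:      " + isLinkFollow(link));
    }
}
```

With NOFOLLOW_LINKS the entry is reported as a symbolic link; with the default options the same path is reported as the regular file it points to. Any factory that branches on isSymbolicLink() therefore needs the second kind of lookup if links are to be resolved.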

I’m currently trying to implement this inside plexus-archiver in a configurable way (support both retention and resolution of symbolic links, as you said), but I have absolutely no idea how to then tie this in with the maven-source-plugin because the Maven Archiver configuration does not expose any plexus-archiver configuration, if there is any.

@michael-o can you help me with that?

mirabilos • Jan 08 '21 20:01

My WIP branch on top of the 4.2.1 release (since that’s what the maven-source-plugin uses) now has this implemented, even extending the ZipArchiverTest.testSymlinkFileSet() case so it tests both retaining and resolving symlinks at creation time.

Even though this necessarily has to wait for a new plexus-io release with that PR applied, review is welcome (including on whether I’m on the right track here)…

mirabilos • Jan 08 '21 20:01

You need to pass this through all layers. All plugins using Maven Archiver expose XML configuration for it.

michael-o • Jan 08 '21 22:01

Michael Osipov dixit:

> You need to pass this through all layers.

I guessed so. Right now, it’s available in the Archiver interface of plexus-archiver as of the current WIP.

> All plugins using Maven Archiver expose XML configuration for it.

“For it” being the Maven Archiver, not the plexus-archiver, right? I saw configuration/XML documentation for that.

I’m a bit overwhelmed if support for configuring plexus-archiver from the Maven Archiver has to be created from scratch first…

bye, //mirabilos


mirabilos • Jan 08 '21 22:01

For me the topic of symlinks is at times quite confusing and hard to follow (no pun intended). It would help if we wrote down the requirements: what is the desired behavior when following symbolic links? Plexus Archiver does have a complex system of including/excluding files, file name mappings, overriding file attributes, etc. Let's say we have the following directory structure (-> denotes symlink)

/regularDir
    /nestedDir
        regularFileC
    regularFileA
    regurlarFileB
regularFileD
symLinkDir -> regularDir
symLinkA -> regularFileD
symLinkB -> regularDir/regularFileA

So if follow links is true, then what should be the result? For files I think it is relatively straightforward. We should apply all exclusions/inclusions, file mappings, etc. to symLinkA and symLinkB. Only when compressing should we include the content of regularFileD and regularDir/regularFileA instead of having symlinks in the archive. But what about the file attributes? And what about directories?

P.S. @mirabilos please keep in mind that the follow-symlinks change you mentioned was applied for Windows, a platform on which Plexus Archiver didn't support symlinks at the time. That is why reverting the change on all systems does not work as expected. It was intended for systems where symlinks are not supported, not for ones where they are supported but should be followed.

plamentotev • Jan 09 '21 07:01

Plamen Totev dixit:

> For me the topic of symlinks is at times quite confusing and hard to follow (no pun intended). It would help if we wrote down the requirements: what is the desired behavior when following symbolic links?

That, if a symlink is encountered, it is handled as if the pointed-to thing were present in its place.

> Let's say we have the following directory structure (-> denotes symlink)
>
> /regularDir
>     /nestedDir
>         regularFileC
>     regularFileA
>     regurlarFileB
> regularFileD
> symLinkDir -> regularDir
> symLinkA -> regularFileD
> symLinkB -> regularDir/regularFileA
>
> So if follow links is true, then what should be the result?

/
    regularDir/             directory
        nestedDir/          directory
            regularFileC    file
        regularFileA        file
        regurlarFileB       file
    regularFileD            file
    symLinkDir/             directory (uid/gid/mode/… copied from /regularDir)
        nestedDir/          directory (uid/gid/mode/… copied from /regularDir/nestedDir)
            regularFileC    file, same contents as /regularDir/nestedDir/regularFileC
        regularFileA        file, same contents as /regularDir/regularFileA
        regurlarFileB       file, same contents as /regularDir/regurlarFileB
    symLinkA                file, same contents as /regularFileD
    symLinkB                file, same contents as /regularDir/regularFileA

One corner case you didn’t mention is symbolic links that don’t resolve to an existing file (including directories and special files) at the time of archival. In follow-symlinks mode, they would be included in the archive as symlinks, because that is what tar does in follow-symlinks mode (I just checked).

I admit I have not tested that with these changes here yet. For my use case, omitting them would be just as good.

> P.S. @mirabilos please keep in mind that the follow-symlinks change you mentioned was applied for Windows, a platform on which Plexus Archiver didn't support symlinks at the time. That is why reverting the change on all systems does not work as expected. It was intended for systems where symlinks are not supported, not for ones where they are supported but should be followed.

Yes, I assumed so, which is why I’ve prepared the PR for plexus-io which would fix this.

Thanks, //mirabilos


mirabilos • Jan 09 '21 14:01
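
The desired "follow symlinks" result for the example layout can be sketched with java.nio.file alone. This is only an illustration of the requested semantics, not the plexus-archiver or plexus-io implementation: the class FollowSymlinksWalk and its methods are hypothetical, file names are spelled as in the thread, and a file system with symlink support is assumed. It walks the tree with FileVisitOption.FOLLOW_LINKS and records each entry under the name it has in the tree, typed by what it resolves to.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.EnumSet;
import java.util.SortedSet;
import java.util.TreeSet;

public class FollowSymlinksWalk {
    /** Builds the example layout from the discussion under root. */
    static void createLayout(Path root) throws IOException {
        Files.createDirectories(root.resolve("regularDir/nestedDir"));
        Files.write(root.resolve("regularDir/nestedDir/regularFileC"), "C".getBytes());
        Files.write(root.resolve("regularDir/regularFileA"), "A".getBytes());
        Files.write(root.resolve("regularDir/regurlarFileB"), "B".getBytes());
        Files.write(root.resolve("regularFileD"), "D".getBytes());
        Files.createSymbolicLink(root.resolve("symLinkDir"), Paths.get("regularDir"));
        Files.createSymbolicLink(root.resolve("symLinkA"), Paths.get("regularFileD"));
        Files.createSymbolicLink(root.resolve("symLinkB"), Paths.get("regularDir/regularFileA"));
    }

    /** Lists entries as "d name", "f name" or "l name" (l = still a symlink). */
    static SortedSet<String> listFollowing(Path root) throws IOException {
        SortedSet<String> entries = new TreeSet<>();
        Files.walkFileTree(root, EnumSet.of(FileVisitOption.FOLLOW_LINKS), Integer.MAX_VALUE,
            new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) {
                    if (!dir.equals(root)) {
                        entries.add("d " + rel(root, dir));
                    }
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                    // attrs describe the link target because links are followed
                    entries.add((attrs.isSymbolicLink() ? "l " : "f ") + rel(root, file));
                    return FileVisitResult.CONTINUE;
                }
            });
        return entries;
    }

    static String rel(Path root, Path p) {
        return root.relativize(p).toString().replace('\\', '/');
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("follow-demo");
        createLayout(root);
        listFollowing(root).forEach(System.out::println);
    }
}
```

In this sketch symLinkDir shows up as a directory with its own copies of the nested entries, and symLinkA and symLinkB show up as plain files, which matches the listing given in the reply above. Attribute handling (uid/gid/mode) and the include/exclude and mapping logic are deliberately left out.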