Improve our approach to javadoc linking

Open ctrueden opened this issue 5 years ago • 3 comments

The pom-scijava-base ancestor POM includes configuration hardcoding several links for the javadoc tool. This configuration makes classes for SciJava-based projects clickable, pointing uniformly to javadoc.scijava.org endpoints aggregating multiple components. For example, the URL https://javadoc.scijava.org/SciJava/ includes javadoc for many components in the scijava GitHub org with groupId org.scijava. The general pattern is "Maven groupId = GitHub org = javadoc.scijava.org endpoint".

Problems

  1. Reproducibility. The javadoc.scijava.org endpoints are not reproducible. They are like SNAPSHOTs, always serving the latest javadoc. Properly, we should be linking to stable API javadoc corresponding to the versions depended upon by each built project. Otherwise, rebuilding the javadoc later will produce a different result, and potentially javadoc build errors if e.g. a class was later removed.

  2. Performance. The more <link> entries hardcoded by pom-scijava-base, the longer everyone's javadoc builds take, because the javadoc tool scrapes the index from each linked URL.

  3. Extensibility. Projects wanting to extend their javadoc with links other than what pom-scijava-base hardcodes must add those links manually to their project POMs.

  4. Separation of concerns. pom-scijava-base should not hardcode links for components from the pom-scijava BOM. Put another way: the list of javadoc links should be defined here in pom-scijava, and should correspond fully to the components managed here.

Solutions

How to extend a project POM with more links

The following block adds more javadoc URL links:

<build>
  <pluginManagement>
    <plugins>
      <plugin>
        <artifactId>maven-javadoc-plugin</artifactId>
        <configuration>
          <links combine.children="append">
            <link>https://javadoc.io/static/commons-io/commons-io/${commons-io.version}</link>
            <link>https://javadoc.io/static/org.ojalgo/ojalgo/${ojalgo.version}</link>
          </links>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>
</build>

How to achieve link reproducibility

Note the use of the wonderful javadoc.io service to link reproducibly to individual component javadoc available from Maven Central! I think javadoc.io is the nicest way forward for reproducible javadoc permalinks. We don't have to deploy our own javadoc infrastructure—just continue working toward publishing as many of our artifacts as possible to Maven Central as they mature.
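
As a minimal sketch of the link pattern used in the POM block above, the following helper assembles a version-pinned javadoc.io URL from Maven coordinates. The class and method names are hypothetical, and the version in `main` is only an example value; the only thing taken from this issue is the `https://javadoc.io/static/<groupId>/<artifactId>/<version>` layout.

```java
// Sketch only: JavadocIoLinks and staticUrl are hypothetical names, not part of any library.
final class JavadocIoLinks {
    private JavadocIoLinks() {}

    /** Builds https://javadoc.io/static/{groupId}/{artifactId}/{version}. */
    static String staticUrl(String groupId, String artifactId, String version) {
        return String.format("https://javadoc.io/static/%s/%s/%s", groupId, artifactId, version);
    }

    public static void main(String[] args) {
        // Example coordinates; in a POM the version would come from a property
        // such as ${commons-io.version}.
        System.out.println(staticUrl("commons-io", "commons-io", "2.11.0"));
    }
}
```

Because the version is part of the URL, rebuilding the javadoc later resolves the same link target as long as the dependency version is unchanged.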

How to balance javadoc build times with correct linking

Firstly: we shouldn't unilaterally add javadoc links for the entire component collection. I tried updating pom-scijava to use granular javadoc.io links instead of our fuzzy aggregating javadoc.scijava.org links. This changed the configuration from 37 unstable (javadoc.scijava.org) links to 231 stable ones (javadoc.io). Unfortunately, javadoc build time appears to scale pretty much linearly with this number: 0 links -> 11s; 37 links -> 48s; 231 links -> 215s.
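
To make the scaling concrete, here is a rough back-of-the-envelope calculation using only the three timings quoted above. It treats the 0-link run as a fixed baseline and divides the remaining time by the number of links; the class and method names are hypothetical, and a single run per configuration is of course a thin basis for a measurement.

```java
// Sketch only: a rough per-link cost estimate from the timings quoted in this issue.
final class JavadocLinkCost {
    private JavadocLinkCost() {}

    /** Average extra seconds per link, relative to a zero-link baseline. */
    static double secondsPerLink(double baselineSeconds, int links, double totalSeconds) {
        if (links <= 0) {
            throw new IllegalArgumentException("links must be positive");
        }
        return (totalSeconds - baselineSeconds) / links;
    }

    public static void main(String[] args) {
        double baseline = 11.0; // 0 links -> 11s
        System.out.printf("37 links:  %.2f s/link%n", secondsPerLink(baseline, 37, 48.0));
        System.out.printf("231 links: %.2f s/link%n", secondsPerLink(baseline, 231, 215.0));
    }
}
```

Both configurations come out at a little under or about one second per link (37/37 = 1.0 and 204/231, roughly 0.88), which is consistent with the roughly linear scaling described above.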

Therefore, every project should include only the links in its javadoc configuration that match its direct dependencies. (This assumes the project's dependencies are structured correctly, with mvn dependency:analyze reporting no problems; in that case, transitive dependencies are not part of the project's public API, so that project's javadoc needs no links for them.)

It would be nice if the maven-javadoc-plugin had a feature that did this automatically. The <detectLinks> option is almost what we need, but because there is no way to know what remote javadoc URL you want for each given groupId:artifactId dependency, it uses a simple heuristic, ${project.url}/apidocs, which will fail for the vast majority of components.

Conclusion

We could enhance the maven-javadoc-plugin to support configuration declaring a mapping from groupId:artifactId to javadoc link URL, with the <detectLinks> feature using this mapping preferentially when discerning the links. For performance, we would need to double check that <detectLinks> only includes links for direct dependencies, not transitive ones—or else add a flag controlling this behavior.
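
The lookup logic proposed here could be sketched roughly as follows. This is not maven-javadoc-plugin code and the plugin has no such API today: the `LinkResolver` class, the `groupId:artifactId` key format, and the `${version}` placeholder in the templates are all assumptions made for illustration. The fallback mirrors the ${project.url}/apidocs heuristic described above.

```java
import java.util.Map;
import java.util.Optional;

// Sketch only: hypothetical resolver that prefers an explicit mapping over the URL heuristic.
final class LinkResolver {
    // Key: "groupId:artifactId"; value: URL template with a ${version} placeholder (assumed format).
    private final Map<String, String> linkTemplates;

    LinkResolver(Map<String, String> linkTemplates) {
        this.linkTemplates = Map.copyOf(linkTemplates);
    }

    /** Returns the javadoc link for one direct dependency, if one can be determined. */
    Optional<String> resolve(String groupId, String artifactId, String version, String projectUrl) {
        String template = linkTemplates.get(groupId + ":" + artifactId);
        if (template != null) {
            // Explicit mapping wins.
            return Optional.of(template.replace("${version}", version));
        }
        if (projectUrl == null || projectUrl.isEmpty()) {
            return Optional.empty(); // nothing to guess from
        }
        // Fallback: the ${project.url}/apidocs convention.
        String base = projectUrl.endsWith("/")
                ? projectUrl.substring(0, projectUrl.length() - 1)
                : projectUrl;
        return Optional.of(base + "/apidocs");
    }

    public static void main(String[] args) {
        LinkResolver resolver = new LinkResolver(Map.of(
                "commons-io:commons-io",
                "https://javadoc.io/static/commons-io/commons-io/${version}"));
        // Mapped dependency: uses the template.
        System.out.println(resolver.resolve("commons-io", "commons-io", "2.11.0", null));
        // Unmapped dependency (made-up coordinates): falls back to the heuristic.
        System.out.println(resolver.resolve("org.example", "demo", "1.0", "https://example.org/demo/"));
    }
}
```

In this sketch the caller would invoke `resolve` once per direct dependency only, which is the behavior the performance concern above asks for.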

ctrueden, Jun 15 '20 15:06

Because we proxy a few of the javadoc link targets, any time one of those proxied sites goes offline (which has happened a few times now), it impacts builds and releases. I find that undesirable and counterintuitive. It is also unnecessary, because the steps we take to address this issue could also avoid the problem of link targets going offline.

While I still think it would be most correct, and super slick, to enhance the maven-javadoc-plugin to add <link>s contingently based on which dependencies your project has, it would be a significant development effort. In the meantime, I'm looking at aggregating the javadoc for the entire SciJava component collection into one huge javadoc destination, which would be served from the root of javadoc.scijava.org (probably not using GitHub Pages anymore). If it works, it would greatly speed up javadoc build times (3 javadoc <link>s instead of 40+). It still won't give us versioned javadoc, but the javadoc could be kept in sync easily with the most recent pom-scijava release.

ctrueden, Nov 17 '21 14:11

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/openmicroscopy-org-down/60062/8

imagesc-bot, Nov 17 '21 14:11

There is now a repository called javadoc-wrangler intended to improve this situation. See the README for details. It's not fully tested and working yet, but I anticipate it will solve all four of the problems described above.

ctrueden, Jan 07 '22 22:01