Brittle, Bugged, and Confusing Link Processing Logic
Version: 2.25.5-SNAPSHOT
Issue 1: Logic that attempts to preserve the location of the hash in front of the query string incorrectly moves the hash if it is a substring of the path
assertEquals("https://test.com?categ=cat1#test",
underTest.sanitize("https://test.com?categ=cat1#test", request));
org.opentest4j.AssertionFailedError:
Expected :https://test.com?categ=cat1#test
Actual :https://test.com#test?categ=cat1
Issue 2: url in LinkImpl is sanitized/encoded, but mappedUrl and exernalizedUrl are not. Because we return url for display on the AEM site (e.g. Button component), but mappedUrl for model.json export and JS datalayer, this results in the above bug rendering "https://test.com/#test?categ=cat1" (incorrect) on the website but putting "https://test.com?categ=cat1#test" (correct) in the datalayer and model export
Issue 3: Also, mapped/externalized URLs are not encoded, nor are they consistent with each other (map encodes path, externalize encodes nothing).
// PASS
assertEquals("https://test.com/ab%7Ccd?recipient=me%7Cyou",
underTest.sanitize("https://test.com/ab|cd?recipient=me|you", request));
// FAIL
assertEquals("https://test.com/ab%7Ccd?recipient=me%7Cyou",
underTest.externalize("https://test.com/ab|cd?recipient=me|you", request));
Expected :https://test.com/ab%7Ccd?recipient=me%7Cyou
Actual :https://test.com/ab|cd?recipient=me|you // encodes neither path nor query
// FAIL
assertEquals("https://test.com/ab%7Ccd?recipient=me%7Cyou",
underTest.map("https://test.com/ab|cd?recipient=me|you", request));
Expected :https://test.com/ab%7Ccd?recipient=me%7Cyou
Actual :https://test.com/ab%7Ccd?recipient=me|you // encodes path, but not query
Issue 4: There's also a bug where logic that attempts to preserve the location of the hash in front of the query string does not encode the hash
// PASS
assertEquals("https://test.com/#ab%7Cc", // converts | to %7C
underTest.sanitize("https://test.com/#ab|c", request));
// FAIL
assertEquals("https://test.com/#ab%7Cc?xyz",
underTest.sanitize("https://test.com/#ab|c?xyz", request));
Expected :https://test.com/#ab%7Cc?xyz
Actual :https://test.com/#ab|c?xyz // fails to convert | to %7C
Issue 5: All in all the code is very confusing as well - For URL https://test.com/#/downloads/file.html?name=/content/file.zip LinkUtil.escape is being called with:
- Path:
https://test.com/#/downloads/file.html - Query:
name=/content/file.zip - Fragment:
/downloads/file.htmlReally the Path should be https://test.com/, but I believe it's being passed with the hash in it for the logic that is trying to preserve the hash in front of the query. This also begs a question of why we're supporting query after hash anyway - in the above URL, by HTTP convention isn't the hash actually#/downloads/file.html?name=/content/file.zipwith no query string? Or if we want to assume this was meant to be a hash and query, and we're trying to be nice to authors by "fixing" it for them, then we should fix in the output URL.
To summarize, solutions for Issue 5 are one of the following:
- Treat query after hash as part of the hash - there should be no query string, and the entire hash (with the query part) should be encoded via the hash encoding rules in code
- Treat query after hash as an accident, leaving the query as a true query string, and correct the order of query/hash in the output.
Issue 6: Asset link test at com.adobe.cq.wcm.core.components.internal.link.LinkBuilderImplTest#testLinkToAsset is not running, b/c it is missing the @Test annotation.
If you add @Test to run it, it breaks with the following exception:
java.lang.IllegalArgumentException: width or height <= 0
at com.day.image.Layer.init(Layer.java:3181)
at com.day.image.Layer.<init>(Layer.java:501)
at io.wcm.testing.mock.aem.builder.ContentBuilder.createDummyRasterImage(ContentBuilder.java:405)
at io.wcm.testing.mock.aem.builder.ContentBuilder.createDummyImage(ContentBuilder.java:398)
at io.wcm.testing.mock.aem.builder.ContentBuilder.asset(ContentBuilder.java:308)
at io.wcm.testing.mock.aem.builder.ContentBuilder.asset(ContentBuilder.java:295)
at com.adobe.cq.wcm.core.components.internal.link.LinkBuilderImplTest.testLinkToAsset(LinkBuilderImplTest.java:111)
I believe it can be fixed by simply changing
Asset asset = context.create().asset("/content/dam/asset.jpg", 0, 0, "image/jpeg");
to
Asset asset = context.create().asset("/content/dam/asset.jpg", 1, 1, "image/jpeg");
Issue 7: Hash is duplicated when it is in front of a query string and there is no URL path
// PASS
assertEquals("https://test.com?categ=cat1#top",
underTest.sanitize("https://test.com?categ=cat1#top", request));
assertEquals("https://test.com#top?categ=cat1",
underTest.sanitize("https://test.com#top?categ=cat1", request));
// FAIL
Expected :https://test.com#top?categ=cat1
Actual :https://test.com#top?categ=cat1#top // duplicate hash
Issue 8: (minor issue) A fallback LinkImpl is always constructed in LinkBuilderImpl::buildLink b/c call to orElse()
return pathProcessors.stream()
.filter(pathProcessor -> pathProcessor.accepts(decodedPath, request))
.findFirst()
.map(pathProcessor -> new LinkImpl(
pathProcessor.sanitize(decodedPath, request),
LinkManagerImpl.isExternalLink(decodedPath) ? decodedPath : pathProcessor.map(decodedPath, request),
pathProcessor.externalize(decodedPath, request),
this.reference,
pathProcessor.processHtmlAttributes(decodedPath, htmlAttributes)))
.orElse(new LinkImpl(path, path, path, this.reference, htmlAttributes)); // this is always constructed, though rarely needed