commonmark-java icon indicating copy to clipboard operation
commonmark-java copied to clipboard

Add more information to Link nodes (for round-trip rendering)

Open robinst opened this issue 7 years ago • 6 comments

For #12, we need to preserve link reference definitions. We also need to be able to distinguish between the different link types (inline links, autolinks, reference links).

There is an old PR which might serve as a starting point: https://github.com/atlassian/commonmark-java/pull/20#issuecomment-142782665

I'm not sure yet what the final Node types should look like. Ideally we'd have an interface that all the links can implement for code that doesn't care about what the source representation was.

If someone is interested in working on this, please comment here first so we can figure out what the end result should look like.

robinst avatar Aug 29 '17 02:08 robinst

Hi @robinst,

I've been looking this over since last night and I'll have some time on various Saturdays in the next few months that might serve to implement this. In the mean time: talk is cheap! :)

I've reviewed #20, #12 and the appropriate spec. Regarding your notes in #20

Do you think it would make sense to have

[foo]: /url "title"

be its own Node type, like ReferenceDefinition? Something like this maybe:

[foo](/uri "title")

is a Link, which has getDestination, getTitle (not sure about getLabel)

[foo][foo], [foo][] and [foo]

are ReferenceLink extends Link, which have an additional getDefinition which returns the ReferenceDefinition

plus the need for a generic link interface, I was thinking this:

public interface GenericLink {
	String getLabel();
	String getLinkDestination();
	String getTitle();
}

Autolinks only have linkDestinations, which calls into question a functional need for inheritance to Link.

public class Autolink extends Node implements GenericLink {
	private String linkDestination;
	interface impl just refers to linkDestination...
}

ReferenceDefinitions do break down to exactly the same components as Link, so inheritance is worthwhile from a reduction of code duplication standpoint but I don't know if a ReferenceDefinition is a Link. So straight inheritance, a class just for type purposes:

public class ReferenceDefinition extends Link implements GenericLink {
	not much here...
}

~~or compositional inheritance:~~

~~public class ReferenceDefinition extends Node implements GenericLink { private Link linkReferenced; methods call through to linkReferenced.. }~~

^edit: nope. Don't like that idea after I've thought about it.

or a complete duplicate/copy of Link's functionality. An inheritance decision here might also influence Autolink's implementation.

Then, we'll need something that references the ReferenceDefinition. From the spec:

A link reference definition does not correspond to a structural element of a document. Instead, it defines a label which can be used in reference links...

Emphasis mine, because I've thought of this as a ReferenceLabel

public class ReferenceLabel extends Node {
	private ReferenceDefinition referenceDefinition;
	...
}

It's not a link itself, just a pointer.

Thanks for reading!

jonsampson avatar Sep 16 '17 17:09 jonsampson

Hey! Thanks for picking this up :).

So, a couple of thoughts:

  • I think Link itself should be the name of the interface (not GenericLink)

  • Link should be implemented by all nodes that result in a rendered link. So that means the following:

    Link within text: [foo]
    
    [foo]: /url "title"
    

    Would result in: [foo] is a ReferenceLink ... implements Link. The last line would be a LinkReferenceDefinition which does not implement Link.

  • I'm not sure getLabel belongs on the interface. Link labels are only used for reference links, but not inline links.

Does that make sense?

robinst avatar Sep 20 '17 04:09 robinst

I'm happy to help, bud! Thanks for doing lots of the heavy lifting on this project.

  • Link as name of interface - I have no problem with that. The purpose of GenericLink was solely to disambiguate it from org.commonmark.node.Link.
  • Link is implemented by all nodes that result in a rendered link - Good note; makes sense.
  • getLabel in the interface - Link labels are optional in inline links - Link spec refers to them as link text. Currently in commonmark-java, a Link node has a destination and title and optionally contains a Text child node for link text. A LinkReferenceDefinition as proposed would be implemented similarly.

So, to distill: do you think that the link label of a link reference definition equates to the link text of an inline link?

Inline Link Example 459 - link is link text [link](/uri "title")

yields <p><a href="/uri" title="title">link</a></p>

Ref Def Link Example 159 - foo is link label [foo]: /url "title"

[foo]

yields <p><a href="/url" title="title">foo</a></p>

Proposal: link = foo

getLabel may not be the best choice for the interface method name since we're talking about building a link from this object. getLinkText or getText might be more in line with the nature of the work.

So now, having written this, I've stumbled across http://spec.commonmark.org/0.28/#reference-link, a sub part of the Link section, which could further inform on how I should visualize this. I hadn't realized there are full reference links, collapsed reference links, and shortcut reference links (which appear to be what I was modelling this stuff based upon). I've got some more reading to do!

jonsampson avatar Sep 21 '17 16:09 jonsampson

Yeah, I see what you're saying. I think the ReferenceLink would have foo as child node(s). But it would also be useful to just have a String representation of it, which allows you to look up the corresponding LinkReferenceDefinition from a Map. Calling that getLabel would be fine. But it wouldn't be on the generic Link interface because not all links have a label.

Anyway, I think as soon as someone starts implementing this, it should become clear and we can discuss details on the PR.

robinst avatar Oct 21 '17 00:10 robinst

The first part of this is now done, see cb43ae9.

robinst avatar Jul 12 '19 06:07 robinst

With the MarkdownRenderer now merged:

  • https://github.com/commonmark/commonmark-java/pull/306+

The goal of this issue is now nice and clear: Preserve enough details about links in nodes to be able to render them back in their original format.

robinst avatar Apr 06 '24 05:04 robinst