
Adapt autolink-java to replace rinku in JRuby

Open • headius opened this issue 6 years ago • 1 comment

Hello! I am working on getting the Discourse app to run in JRuby, and need to replace its dependency on rinku.

There are two ways we typically do this:

  • Port the extension. This shouldn't be difficult, but it seems like you may have already done this work?
  • Wrap a JVM library.

The latter would be preferable, since all we'd need to write is a bit of Ruby to wrap your library.

However, there are a few things that would make this integrate better with JRuby:

  • CharSequence is great, but the API produces String eventually. This means JRuby's byte[]-based Ruby strings need an extra conversion step, which will obviously slow down the rendering of a large document.
  • Compatibility with rinku. I'm not sure how to map the features of rinku to autolink-java and will need some tips here.
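To make the conversion concern in the first point concrete, here is a minimal sketch of the round trip a byte[]-based string goes through when the processing step only accepts and returns Java strings. The class name `ByteStringRoundTrip` and the `process` method are hypothetical stand-ins, not part of autolink-java or JRuby; `process` simply returns its input so that only the decode and encode steps are visible.

```java
import java.nio.charset.StandardCharsets;

public class ByteStringRoundTrip {
    // Hypothetical stand-in for any CharSequence-in, String-out processing step.
    static String process(CharSequence input) {
        return input.toString();
    }

    // Takes UTF-8 bytes (as a byte[]-based string would hold them) and returns UTF-8 bytes.
    static byte[] linkify(byte[] utf8) {
        // Extra step 1: decode the bytes into a UTF-16 String (allocates and copies).
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        String result = process(decoded);
        // Extra step 2: encode the result back into UTF-8 bytes (allocates and copies again).
        return result.getBytes(StandardCharsets.UTF_8);
    }
}
```

Both extra steps scale with the size of the document, which is why they matter more for large inputs.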

Here's a quick and dirty rinku-like wrapper based on the example code from your README. It can serve as a starting point for discussion: https://github.com/headius/jruby-autolink

Discourse on JRuby work: https://meta.discourse.org/t/getting-discourse-running-on-jruby/81273/14

Issue to make a JRuby port of rinku: https://github.com/vmg/rinku/issues/75

headius, Feb 26 '18 17:02

Hey Charles! Thanks for getting in contact. I've used JRuby myself before, so I'm eager to help make it work better there if it makes sense.

> CharSequence is great, but the API produces String eventually. This means JRuby's byte[]-based Ruby strings need an extra conversion step, which will obviously slow down the rendering of a large document.

Can you clarify what this means? Would you want to have another API that works on UTF-8 byte[] instead?

> Compatibility with rinku. I'm not sure how to map the features of rinku to autolink-java and will need some tips here.

Yeah. To be clear, this library started as a port but the logic has since been tweaked and does not match rinku's logic 100%. Some other important differences:

  • Rinku has code to detect whether the input is HTML, to prevent double-linking, etc. This library does not support that; the recommended approach is to use a proper HTML parser and pass autolink-java only the text that should get linkified.
  • autolink-java finds links with all kinds of schemes, so if you want to restrict it like rinku does, you have to build something on top (basically, check whether URL links start with https://, etc.).
  • autolink-java does not HTML-escape anything. So if you are producing HTML, you need to take care of that yourself (I saw that your wrapper does not do anything there).
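The last two points (restricting schemes and HTML-escaping) could be layered on top roughly like this. This is only a sketch under stated assumptions: `Span` is a hypothetical stand-in for whatever begin/end offsets the link extractor reports, the allowed-scheme list is an example, and the class and method names are made up for illustration. It deliberately uses only the JDK so it does not depend on any particular autolink-java version.

```java
import java.util.List;
import java.util.Locale;

public class RinkuLikeRenderer {
    // Hypothetical stand-in for a detected link: [begin, end) offsets into the input.
    static final class Span {
        final int begin;
        final int end;
        Span(int begin, int end) { this.begin = begin; this.end = end; }
    }

    // Assumed allowlist; adjust to whatever schemes should become links.
    static final List<String> ALLOWED_SCHEMES = List.of("http://", "https://");

    // Escapes the characters that are significant in HTML text and attribute values.
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                case '\'': sb.append("&#39;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Case-insensitive check that the URL starts with one of the allowed schemes.
    static boolean hasAllowedScheme(String url) {
        String lower = url.toLowerCase(Locale.ROOT);
        for (String scheme : ALLOWED_SCHEMES) {
            if (lower.startsWith(scheme)) return true;
        }
        return false;
    }

    // Renders plain text to HTML; spans must be sorted by offset and non-overlapping.
    static String render(String input, List<Span> links) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Span link : links) {
            // Text between links is escaped but otherwise left alone.
            out.append(escapeHtml(input.substring(pos, link.begin)));
            String url = input.substring(link.begin, link.end);
            String escaped = escapeHtml(url);
            if (hasAllowedScheme(url)) {
                out.append("<a href=\"").append(escaped).append("\">")
                   .append(escaped).append("</a>");
            } else {
                // Links with other schemes stay as escaped plain text.
                out.append(escaped);
            }
            pos = link.end;
        }
        out.append(escapeHtml(input.substring(pos)));
        return out.toString();
    }
}
```

Escaping both the link text and the surrounding text keeps the output safe even when the input contains markup characters; the HTML-detection part of rinku is intentionally not covered here.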

robinst, Feb 27 '18 05:02