unfurl
unfurl copied to clipboard
Make this library compatible with ClojureScript
Feature Request
Description of Problem:
I'm working on a new feature for status-react. This feature is about adding URL unfurling to its chat. The problem is that it's a ClojureScript project, so it's currently not possible to import this library and use it there.
Potential Solutions:
I've researched about this .cljc
extension, which is compatible between Clojure and ClojureScript. I'd like to clone this repo and transform your .clj
into a .cljc
, and then create a Pull Request if you're interested in this.
@pmonks How hard do you think it would be to do it? Do you think it's worth it, or should I better find another solution for my need? I'm fairly new to Clojure and ClojureScript, so I would really appreciate your input here. Thanks!
Wow that would be awesome! I'd love to accept a PR that makes the library cross-platform!
<disclaimer> I'm a ClojureScript / cljc noob myself (in fact I've just started converting one of my other, simpler projects in order to learn this stuff), so take everything I say here with a grain of salt. </disclaimer>
I think the biggest challenge will be finding ClojureScript alternatives to the dependencies. And that may not be that big a challenge anyway.
Hickory is already a cross-platform library, so it should be usable from both Clojure & ClojureScript transparently (though you'll need to remove the exclusions in the project.clj
to add ClojureScript support back in).
There's no cross-platform HTTP client library that I've found, at least not without adding other functionality that we don't necessarily need (e.g. rxhttp). I'm open to a rewrite that uses something like rxhttp, but another, potentially simpler alternative might be to use something like cljs-http then use reader conditionals for the calls to the underlying HTTP client library, depending on platform. I'm not sure which approach is best - what do you think?
I'm also not sure if all those libraries support some of the more advanced HTTP features unfurl
uses. Range requests in particular are a nice optimisation that would be unfortunate to lose. Though now that I look at it, the code doesn't check if the server supports range requests (not all servers do, and some return 416 to indicate that, instead of returning the entire content with a 200), so that area of the code could probably do with some attention anyway.
But I digress...
Regarding the JSoup dependency - I'm not entirely sure why that's still there. I think I originally put it in there simply to force Hickory to use an up-to-date version of JSoup (which Hickory uses under the covers when running on Clojure), in which case it should be trivial to just remove it and let Hickory's own dependencies pull it in transitively.
There will need to be some minor changes to the unit tests so that they can run on either clojure.test or cljs.test - helpful instructions here.
Finally, I think the build related machinery will need some updates; specifically:
- I think we'll have to add lein-cljsbuild as a plugin to the
project.clj
, so that Leiningen can build and test the library on ClojureScript. - The TravisCI config should be updated to run the unit tests on both Clojure and ClojureScript (this should be as simple as adding another invocation of
lein
, with the right tasks to run the unit tests on ClojureScript - that's where lein-cljsbuild comes in).
After looking at all of this, I'm cautiously optimistic that this won't be too difficult at all!
@pmonks Thank you so much for your detailed explanation! I'm gonna give this a try.
Fantastic! Hit me up if you hit any roadblocks.
Oh and because the unit tests use public URLs (which aren't under our control and so don't always cooperate) you may notice unexpected failures when you run the unit tests. Sometimes these are temporary (e.g. the website is down or the page has changed in some way that impacts the tests) and sometimes it's permanent (e.g. the page the unit tests hit got deleted). tl;dr - don't be shy about picking alternative URLs if you find that the ones I originally chose are no longer reliable.
@pmonks I've been following your most useful suggestions:
- Removed Hickory's exclusions
- Added ClojureScript dependencies, opted for
cljs-http
- Used reader conditionals to isolate dialect-specific code both in source and tests
- Added and configured the
cljsbuild
plugin
So far I haven't done any coding; I've focused only in configuration because I'm following a test-driven approach by which I want the ClojureScript tests to fail and then fix what's not working - most likely the HTTP requests which will be done by means of cljs-http
instead of clj-http
. However, in order for the tests to fail, they first need to work... And that's where I'm hitting a roadblock:
[e18r@nietzsche unfurl]$ rm -rf target/*
[e18r@nietzsche unfurl]$ source /usr/share/nvm/init-nvm.sh
[e18r@nietzsche unfurl]$ node -v
v10.15.3
[e18r@nietzsche unfurl]$ lein cljsbuild test
Compiling ClojureScript...
☔️ Running tests on Clojure 1.10.0 / JVM 1.8.0_212
Compiling ["target/unit-tests.js"] from ["src" "test"]...
Successfully compiled ["target/unit-tests.js"] in 6.038 seconds.
Running ClojureScript test: unit-tests
/home/e18r/code/unfurl/target/cljsbuild-compiler-0/hickory/core.js:81
return (Node[[cljs.core.str.cljs$core$IFn$_invoke$arity$1(type),"_NODE"].join('')]);
^
ReferenceError: Node is not defined
at hickory$core$node_type (/home/e18r/code/unfurl/target/cljsbuild-compiler-0/hickory/core.js:81:1)
at Object.<anonymous> (/home/e18r/code/unfurl/target/cljsbuild-compiler-0/hickory/core.js:83:49)
at Module._compile (internal/modules/cjs/loader.js:701:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:712:10)
at Module.load (internal/modules/cjs/loader.js:600:32)
at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
at Function.Module._load (internal/modules/cjs/loader.js:531:3)
at Module.require (internal/modules/cjs/loader.js:637:17)
at require (internal/modules/cjs/helpers.js:22:18)
at global.CLOSURE_IMPORT_SCRIPT (/home/e18r/code/unfurl/target/cljsbuild-compiler-0/goog/bootstrap/nodejs.js:88:13)
Subprocess failed
Do you have any idea what this is? I'm trying to execute the tests with node.js as this article suggests, but I don't think it's the same "Node" that the error is referring to. Also, I'm not even sure that runnig the tests with node.js is the way to go, because the article about testing you mentioned doesn't talk about it. On Hickory's Github page, there's this line about ClojureScript support:
Hickory expects a DOM implementation and thus won't work out of the box on node. On browsers it works for IE9+ (you can find a workaround for IE9 here).
But I don't know if it's even related. Also, another confusing thing is the line "☔️ Running tests on Clojure 1.10.0 / JVM 1.8.0_212", which I explicitly set as a Clojure-only thing, but it keeps appearing.
Anyway, I'll keep working on this. Any help much appreciated.
@pmonks nevermind, I managed to solve it.
@e18r nice! Was there anything you had to do to your environment to get the tests working on node.js? I now see that line in the Hickory docs, and should have realised that that might be a potential problem...
@pmonks I couldn't get it working with node, although I think there's a fork of Hickory that does support it. What I did was to use PhantomJS, and also the doo plugin.
@e18r good to know! I wonder if this would make a good addition to the readme?
@pmonks Definitely. I'll update the readme when I'm done as this is probably not the only caveat, unfortunately. The single-origin policy will limit its use mostly to react-native, I'm afraid.
Emilio Silva Schlenker
On May 12, 2019 4:49:26 AM UTC, Peter Monks [email protected] wrote:
@e18r good to know! I wonder if this would make a good addition to the readme?
-- You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub: https://github.com/clj-commons/unfurl/issues/6#issuecomment-491564397
The single-origin policy will limit its use mostly to react-native, I'm afraid.
hmmmm......yeah. Though there are several other ClojureScript runtime environments that would benefit from this being cross-platform, beyond just react-native e.g. node (if the Hickory problem gets solved) & electron.
@e18r not sure if it's helpful, but I got that other library I mentioned enabled for cross platform builds & testing. No guarantees it's the "right" or "best" way to accomplish this, but thought I'd share what I've learnt so far, in case it's helpful for the work you're doing.
@pmonks very helpful reference indeed. I still haven't been able to successfully deal with the fact that the HTTP clients in ClojureScript are asynchronous, while in Clojure they're not. Do you have any experience with that?
@e18r I don't, no, though this section of the cljs-http docs seems like it might be what you're after. I don't have any experience with core.async
either, but from the docs it appears that the <!
fn will block until the channel has data to be processed.
@pmonks Indeed I was able to handle the request inside the go
block. The problem is, I don't know how to return that value. Every time I try to return it, I end up returning the channel. I have had similar experiences with JavaScript, in which the only possible solution is to handle the asynchrony from the parent function. But in this case, it would become unmanageable and I would need to create a different .cljs file for that. Maybe that's what I'm gonna end up doing...
@e18r keeping in mind I have zero experience with core.async
, an atom might be the way forward here? Basically the atom would be declared outside the go
block (e.g. in a let
), then its value would be set inside the go
block.
The bit I'm unclear on is how the code after the go
block stops and waits for the atom to change value, so that it can dereference the atom and return its value to the caller. There is a way to add a watch to an atom, but I don't see how that would solve the problem (given watchers are also callback based).
I'm 99.7% sure I'm missing something basic here - there must be an easy way to "de-asynchronise" core.async
logic... Might be worth pinging the friendly folks on #beginners on Slack?
@pmonks Today I finally understood something I kinda suspected already. JavaScript never blocks because it is single-threaded. This is true for the browser and for Node.js. That's why Clojure's core.async blocking operators are missing in ClojureScript. See also this reply to my comment on a StackOverflow answer about this very issue. I other words, it isn't possible to get away from the go
block. You have to do whatever you need to do in there.
So what I'm gonna do, is to provide a slightly different interface for the ClojureScript version of this library. It will accept a callback function as one of its arguments. I believe this is the right way to do it. What are your thoughts on this?
...JavaScript how do I hate thee? Let me count the ways...
Anyhoo, I think I'm ok with that @e18r, though I'd like to see the callback interface added to both implementations, so that at least there's one common code path that downstream platform-independent code (.cljc
code) can use.
Some thought might be needed as to whether to implement this in Clojure trivially (i.e. call the existing blocking unfurl
fn, then call the callback fn with the result), or whether to make the Clojure implementation asynchronous as well (so that the semantics of the callback version are equivalent across platforms). Thoughts?
FWIW, I think I'm leaning towards making both versions asynchronous.
I agree that, for the sake of compatibility, it makes sense to provide an asynchronous version on the Clojure side as well. However, I also believe that the synchronous version should be kept for backwards compatibility.
However, I don't think I'm gonna be able to do that as part of this development effort, @pmonks. This is my main source of income and if you knew how many hours I've already spent on this, you would totally understand where I'm coming from... Nonetheless I will gladly do it if, after implementing the ClojureScript version, the change doesn't seem time-consuming.
I agree that, for the sake of compatibility, it makes sense to provide an asynchronous version on the Clojure side as well. However, I also believe that the synchronous version should be kept for backwards compatibility.
Agree 1000%.
However, I don't think I'm gonna be able to do that as part of this development effort, @pmonks. This is my main source of income and if you knew how many hours I've already spent on this, you would totally understand where I'm coming from... Nonetheless I will gladly do it if, after implementing the ClojureScript version, the change doesn't seem time-consuming.
Given that clj-http
already supports asynchronous callbacks directly, I don't think this will be time-consuming - in fact it may be as simple as a two-line reader conditional where the HTTP client is invoked (the signatures for the clj-http
and cljs-http
asynch methods look almost identical).
But no worries if you don't have time to do the Clojure side - either you could submit a PR that allows editing, or I'll merge your work and add on the Clojure implementation separately.
👍 Appreciate you taking the time to contribute to the library! 👍
@pmonks I just finished. I didn't touch the documentation yet. Finally, instead of a callback, I kept the channel. That's the most clojuresque way.
As you will see, my changes aren't really homegeneous with the Clojure version (please don't be too disappointed...). Not only the ClojureScript side is asynchronous, but also it is in a different file and has its own unit tests. Also - and this might be the biggest incompatibility - it handles errors differently. It doesn't throw any exceptions, and the error messages are really poor: that comes from cljs-http
, which by the way is a pretty unmaintained library. It stopped working with phantomjs
three years ago and nobody has fixed that. Since Hickory also doesn't support Node, I had to do the tests with Karma's firefox-headless.
So creating a homegeneous interface for both dialects might be a little bit more complicated that previously thought. Do you really think it's worth the trouble? Because I have the feeling that there might not be so much platform-independent projects downstream. I mean, unless they're developing libraries like us, people have to choose whether to create Java or JavaScript projects, and the difference is pretty big. Is it not?
Anyway, please let me know what you think about my PR and about all this compatibility stuff. I'm willing to follow you in the next steps.