shadow-cljs
shadow-cljs copied to clipboard
Add support for Yarn 2.x
Yarn has made some significant changes in Yarn 2.x. It no longer uses node_modules/ and instead uses a virtual filesystem via PnP (https://yarnpkg.com/features/pnp). Due to this change, shadow-cljs is not compatible with it as it looks for node_modules/ to exist, which it no longer does. It seems like it would be fairly simple to look for node_modules/ as well as use .pnp.js to support npm and Yarn 1.x as well as Yarn 2.x.
It seems like it would be fairly simple
Not really.
I don't know exactly how pnp works and the last time I checked it required executing node to resolve the dependencies since for some reason they aren't available as data. shadow-cljs currently resolves all dependencies on its own without ever executing node.
So yeah this is very far from simple. I can see if I can figure it out when I have some time but for now I'd recommend sticking with regular node_modules.
I am using Yarn 1.x. I tested the current preview of 2.x to test it and relay feedback. 2.x is not released yet so I was just being nice to alert you of the upcoming changes to be able to support it when that happens. .pnp.js contains two maps that tell any software using it where the dependencies are located and what their own dependencies are. Sure, as a Clojure(Script) engineer, I'd love if there was just a data file that did what it's doing. Perhaps that may change in the future, but this is the current trajectory of Yarn and possibly npm.
shadow-cljs may not be calling node directly, but people are required to either use npm or yarn to install and manage the JS dependencies. So this will eventually become an issue and would be unfortunate if we have to use old versions of Yarn or go back to npm just to use shadow-cljs (and I love your work and have promoted it to many people in many places, but I do have issues with being locked in to certain software when it can be avoided. There is time to figure out how to support both).
Per your own requirements:
node.js (v6.0.0+, most recent version preferred)
npm (comes bundled with node.js) or yarn
I suggest taking a look at Yarn's roadmap (https://github.com/yarnpkg/yarn/issues/6953) as it gives an explanation as to what is changing and how it will be easier used. This is why I said it should be trivial to implement in shadow-cljs. Specifically:
- Related to the plugin system, Yarn will become an API as much as a CLI. You can expect to be able to require it and start using its components in your script - no need to parse your package.json anymore, no need to run the resolution .. Yarn will abstract all those tedious tasks away.
Some discussion on cross language Yarn PnP integration here: https://github.com/evanw/esbuild/issues/154#issuecomment-634854150
Typescript has been resistant to adopting Yarn PnP, and going over their position can give insight on why would a build tool not be inclined to support PnP right now:
- https://github.com/yarnpkg/rfcs/pull/101#issuecomment-422515704
- https://github.com/microsoft/TypeScript/pull/35206
- https://github.com/microsoft/TypeScript/issues/28289
Yarn PnP is interesting and solves some shortcomings present in npm and the general node resolution algorithm, but it is at its core an opt-in approach that requires changes in any tool that relies on their own implementation of the node resolution algorithm.
I found a solution to this is simply to use the built-in node-modules plugin by placing a .yarnrc.yml in the root of the project containing:
nodeLinker: node-modules
This causes packages to be copied into node_modules ala npm and Yarn 1 which works well with shadow-cljs.
Therefore I suggest documenting the above as the solution rather than adopting PnP.
Thoughts @thheller ?
@superstructor this is pretty much the same as not using yarn v2 in the first place so probably not what people are after.
Maybe. But for me, Yarn 2 is more interesting as a maintained alternative to NPM than for dropping node_modules. Since at some point, I guess Yarn 1 won't be maintained.
All of Yarn 2's arguments for dropping node_modules appear to amount to 'make it faster' which is really of zero interest compared to reliable dependency handling/repeatable builds etc. NPM/Yarn 1 is already fast enough imo and if its not, buy a faster SSD.
Just to add, one of the big reasons for Yarn 2's pnp system can also be disk space usage. I was using pnpm originally for some of my projects (which uses hard links to save space) but then switched to yarn2 as you can gain disk space and speed.
The current approach is to have yarn v1 installed globally and yarn v2 installed on a per project basis. with pnp enabled (which it is by default on yarn2) all the packages are downloaded as zips I think typically into .yarn/cache somewhere and are not uncompressed. Also there's a global cache option which when enabled just keeps a single store of the zips typically under C:\Users\username\AppData\Local\Yarn\Berry\cache
Then it generates a .pnp.js file in the project which I think is somehow used to access these modules virtually. For tools that don't support pnp just yet (like the typescript compiler tsc) You can prefix / wrap the tool with "pnpify" which makes the tool think it's accessing stuff from node_modues when in actual fact it's just pulling from the stored zip files
Typescript has been resistant to adopting Yarn PnP, and going over their position can give insight on why would a build tool not be inclined to support PnP right now:
* [yarnpkg/rfcs#101 (comment)](https://github.com/yarnpkg/rfcs/pull/101#issuecomment-422515704) * [microsoft/TypeScript#35206](https://github.com/microsoft/TypeScript/pull/35206) * [microsoft/TypeScript#28289](https://github.com/microsoft/TypeScript/issues/28289)Yarn PnP is interesting and solves some shortcomings present in npm and the general node resolution algorithm, but it is at its core an opt-in approach that requires changes in any tool that relies on their own implementation of the node resolution algorithm.
Sorry for the late reply on this. I'm not sure I understand the backlash you're referencing here. Perhaps I'm misunderstanding some behavior of PnP, but from my understanding it is designed to enable custom behavior interop by specifying it as an API. I'm not sure I see the parallel between what TypeScript and shadow-cljs is trying to accomplish. TypeScript needs to add searching for types along with the regular dependencies. On the other hand, shadow-cljs is just pulling in the JS dependencies, no?
One of the comments in the PR is capturing what I was originally suggesting at the start of this thread:
Ore4444 on Jan 26
wait, isn't pnpapi just an implementation detail of the compiler? The logic could all be conditional to presence of 'pnpapi' module. When npm would settle on its approach, supported could be added without breaking changes, no?
So by that reasoning, couldn't we enable support of Yarn 2.x by using the PnP logic conditionally based on the presence of pnpapi?
@thheller I have time available now to assist on this. I'll take a deep dive through the existing code base to gain a better understanding of your current process.
@WhoNeedszZz the "shortest" path to integrate this would be implementing a function like
(defn pnp-resolve [state build-config from require] ...)
from and require would both be strings. The return value should be a string representing a filesystem path. I'd assume that path to be absolute given how pnp works.
require is just the required library name or path. from might be nil in cases where an "entry" is to be resolved. Meaning a direct blank require, eg. from the repl (require '["foo" :as bar]). It is otherwise a string returned by another pnp-resolve call.
(:require ["foo" :as foo])would end up as(pnp-resolve state build-config nil "foo").import Other from "./other.js"would be(pnp-resolve state build-config the-foo-we-are-in "./other.js").
I'd assume the pnp-resolve to require some state so my usual pattern for this would be to have
(pnp-start (:yarn-pnp config))(with config beingshadow-cljs.edn). Should return whatever state it needs. It will be passed as the first argument to(pnp-resolve state build-config from require).(pnp-stop state)called whenever the build or shadow-cljs is stopped. Should clean up all state it created if needed.
There are many examples for this pattern throughout the codebase. I'd be ok for everything to live in shadow.build.pnp. Heck you could even write it as a separate library so I don't have to maintain it. Hooking up the rest I could do on my own. You are welcome to give it a try though by implementing this multimethod.
https://github.com/thheller/shadow-cljs/blob/b06ef3a56554c04431edb5a87ded2db1cfcde110/src/main/shadow/build/resolve.clj#L67-L68
Basically there would be a :js-provider :pnp. You can start by copying the :shadow impl and just replacing everything that looks at files directly and instead goes through pnp. I still have not looked at how that works exactly.
@WhoNeedszZz the "shortest" path to integrate this would be implementing a function like
(defn pnp-resolve [state build-config from require] ...)
fromandrequirewould both be strings. The return value should be a string representing a filesystem path. I'd assume that path to be absolute given how pnp works.
requireis just the required library name or path.frommight benilin cases where an "entry" is to be resolved. Meaning a direct blank require, eg. from the repl(require '["foo" :as bar]). It is otherwise a string returned by anotherpnp-resolvecall.* `(:require ["foo" :as foo])` would end up as `(pnp-resolve state build-config nil "foo")`. * `import Other from "./other.js"` would be `(pnp-resolve state build-config the-foo-we-are-in "./other.js")`.I'd assume the
pnp-resolveto require some state so my usual pattern for this would be to have* `(pnp-start (:yarn-pnp config))` (with config being `shadow-cljs.edn`). Should return whatever state it needs. It will be passed as the first argument to `(pnp-resolve state build-config from require)`. * `(pnp-stop state)` called whenever the build or shadow-cljs is stopped. Should clean up all state it created if needed.There are many examples for this pattern throughout the codebase. I'd be ok for everything to live in
shadow.build.pnp. Heck you could even write it as a separate library so I don't have to maintain it. Hooking up the rest I could do on my own. You are welcome to give it a try though by implementing this multimethod.https://github.com/thheller/shadow-cljs/blob/b06ef3a56554c04431edb5a87ded2db1cfcde110/src/main/shadow/build/resolve.clj#L67-L68
Basically there would be a
:js-provider :pnp. You can start by copying the:shadowimpl and just replacing everything that looks at files directly and instead goes through pnp. I still have not looked at how that works exactly.
given this solution, it could also probably work with pnpm as well. just add a new :js-provider (or even re-utilize the same one, as, from what i understand, Yarn PnP and pnpm have a similar file structure). currently, given that pnpm places the dependencies in symlinks in a flat structure, it errors out whenever i try to run watch on a project