
[HIGH PRI] Generate and publish opam packages to pure npm.

Open jordwalke opened this issue 7 years ago • 11 comments

The current yarn fork is nice because it has some built-in resolving to fetch opam packages even though they don't have package.json files. We can then use that project to create many plain npm packages out of those opam packages, and sync them to npm under something like an @opam-beta scope.

This task will make it so that you can use esy from regular npm packages, and still get the benefits.

This GitHub issue is to write a synchronization script as part of esy which will push all the opam packages to npm (and related things). The way we've done it in the past is to push to forks of all the packages, have special tags/releases for "npm pushes", and then create dummy npm packages that point to those GitHub tags. That's a lot of indirection that can go wrong, and it would be much simpler if we could just push each package directly to npm. However, pushing directly to npm has problems because we can't later update a published version. That's why we push dummy packages to npm once, which then point to git URLs. We should likely continue to do that.

One thing we can do to simplify is to have one git repo with many releases: one release for each pushed package at a specific version. My hunch is that this will make installs faster because npm might already cache that repo somehow in a global git cache. That should be confirmed.

jordwalke avatar Mar 15 '17 09:03 jordwalke

"will push all the opam packages to npm (and related things)"

What are related things?

What about naming? Should everything be scoped? If yes how, @esyopam? If not, how do we handle naming collisions?

Just to reiterate:

Is this issue about reusing opam-packages-conversion (or likely rewriting it because it's Python) to push all converted packages to one repo on GitHub and make them available on npm? In other words, a script for syncing and updating opam packages with our repo.

Unrelated but exciting

I didn't realise this was the plan and I'm super excited. That way we might not need to fork any packages for native because we will have everything in our repo. I'm wondering what the model for this could be. Maybe contributors could add some kind of config for a package that's already been pushed to GitHub for native compilation?

wokalski avatar Mar 20 '17 21:03 wokalski

Also unrelated but exciting: maybe simple packages (as in pure OCaml) could include a .bsconfig for simple JS interop.

wokalski avatar Mar 20 '17 22:03 wokalski

@jordwalke and I talked about creating a repository for the converted opam modules which would also contain corresponding bsconfigs. Although it's not clear whether it's a good (i.e. workable) solution, here are the takeaways and thoughts.

First of all I think it would be valuable to have all packages in one place with corresponding bsconfigs so that it's super easy to integrate with your BS project. It's also easier to maintain vs multiple forks (un)supported by multiple people.

Of course, generating bsconfigs automatically would be optimal but it's unclear how difficult it is. Until somebody figures out a solution, I think it would be very valuable if it can be done manually.

The naive solution

  • Single repo with all packages
  • One esy generated package.json per opam package
  • Optional bsconfig per npm package that can be contributed. After it's contributed we force update the package on npm.

I think it's good enough for the first version but there are some gotchas:

  • If the source of every package has to be in the repo, it would get super heavy. If npm pulls the whole repo to fetch a dependency, it would be slow.
  • It would be impossible to update old package versions with bsconfigs.
  • The bsconfig would be deleted after every update of a corresponding package.

As pointed out by Jordan, an alternative source of bsconfigs could be a single repository every opam package would depend on.

Really, there's a whole spectrum of possible implementations but the premise stays the same. One entry in package.json which brings you the bsconfig if you need it.

wokalski avatar Mar 22 '17 16:03 wokalski

Sorry for the delay!

"What are related things?"

By the time we're done, there may be very little else to push to npm, and the entire workflow would just be a pure sync from opam to npm. Currently, we have two forks of opam packages, and the only reason we needed that was because they write outside of their sandbox IIRC, and a solid cache requires that packages not do that.

"What about naming? Should everything be scoped? If yes how, @esyopam? If not, how do we handle naming collisions?"

I have the @opam scope, so we can use that :D

"Is this issue about reusing opam-packages-conversion (or likely rewriting it because it's python) to push all converted packages to one repo on github and making them available on npm? In other words a script for syncing and updating opam packages with our repo."

Yes, thank you for clarifying. This issue is to build the synchronization script that not only pulls down from opam and converts, but also pushes to npm, and to whatever else it needs to push to in order to make npm install work. That might also involve pushing to GitHub repos, depending on the approach taken. We've previously found it helpful to have dummy npm packages like "@opam/topkg": "1.2.0" which have a single dependency on a package like "topkg-actual": gitUrlHere#npm-release-1.2.0, which is a GitHub tag that we can force push to. Npm disallows force pushing, but this way we can accomplish it via updating that git tag. The package.json of the topkg-actual should probably look like this project but with all the opam converted steps/dependencies of course.
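To make the indirection concrete, here is a minimal sketch of how a sync script might generate the dummy package. It assumes the naming from the example above (`@opam/<name>`, `<name>-actual`, and an `npm-release-<version>` tag); the function names and the `gitUrlHere` placeholder are hypothetical, not part of esy.

```typescript
// Minimal shape of the package.json fields this sketch cares about.
interface PackageJson {
  name: string;
  version: string;
  dependencies: Record<string, string>;
}

// Tag naming follows the example in the comment: npm-release-<version>.
function releaseTag(version: string): string {
  return `npm-release-${version}`;
}

// Build the dummy package that is published to npm once.
// Its only dependency points at a git tag that can be force-pushed later.
// `gitUrl` is a placeholder for wherever the "-actual" repo lives.
function makeDummyPackage(
  opamName: string,
  version: string,
  gitUrl: string
): PackageJson {
  return {
    name: `@opam/${opamName}`,
    version,
    dependencies: {
      [`${opamName}-actual`]: `${gitUrl}#${releaseTag(version)}`,
    },
  };
}

const dummy = makeDummyPackage("topkg", "1.2.0", "gitUrlHere");
console.log(JSON.stringify(dummy, null, 2));
```

Because the npm-side package never changes after the first publish, fixing a conversion mistake would only require moving the git tag.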

Again, this is just a suggested approach and there might be a better way - the end goal is to create the scripts that publish what needs to be published, to wherever it needs to be published to, in order to make npm install work reliably, while allowing us to force push updates to converted packages if we happen to spot a mistake.

Note: More on force pushing. Npm packages not synced from an outside package manager don't have this problem because they can always bump a minor version. We on the other hand have to keep in perfect sync with opam.

Another note about git url indirection. This slows things down. However, you might be able to make a release that wipes all git history from that release tag so that cloning it is super fast.

jordwalke avatar Mar 24 '17 07:03 jordwalke

About bsconfig: I'm open minded, but as always, the best way is to automate. @opam npm seeks to be the totally automated conversion that implements opam package builds perfectly (but sandboxed and faster). We might want to create a separate scope, such as @bs, for the packages that have been converted, but then it's weird because you really want to be able to have one project and just compile it to either JS or native using opam packages.

jordwalke avatar Mar 24 '17 07:03 jordwalke

I might be too positive about bsconfigs. @dorafmon informed me that he had to fork containers because apparently BuckleScript doesn't support most of the ppx attributes. If that's the case for the vast majority of them, it's not a good moment to spend energy on something with not much utility.

wokalski avatar Mar 25 '17 13:03 wokalski

@wokalski BuckleScript supports ppx attributes I believe, but it's bsb that doesn't. bsb is very specialized and doesn't build arbitrary packages. However, I mentioned one idea for how you could make bsc (the underlying BuckleScript compiler) support nearly arbitrary opam packages: by making bsc (or something like it) support the exact same command line arguments as ocamlc, we can temporarily set ocamlc=bsc for the duration of an opam package build. As long as bsc respects the flags and puts artifacts in the right directory (for example, it would output fake bytecode files at supplied/output/location/out.cmo), it would work. That's a larger project, but I believe that would still consume an order of magnitude less time than forking every opam package that has custom build steps (a large portion), and maintaining said forks. Forking of a few select packages that are important for web is sustainable, but if we want full compatibility we need to take a more comprehensive approach like an ocamlc facade. My two cents.
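As a rough illustration of the first step such a facade would need, the sketch below reads an ocamlc-style command line and extracts where the build expects its artifact. It only recognises `-o`, `-I` and `-c`; everything else about it (the `CompileRequest` type, the function name) is hypothetical, and it says nothing about how bsc itself would be invoked.

```typescript
// What the facade needs to know from an ocamlc-style invocation.
interface CompileRequest {
  output: string | undefined; // value of -o, if given
  includeDirs: string[];      // values of -I
  compileOnly: boolean;       // true when -c is present
  inputs: string[];           // arguments that do not start with "-"
}

// Parse a small subset of ocamlc flags. Unknown flags are ignored, so the
// argument of any other flag that takes a value would be misread as an input.
function parseOcamlcArgs(argv: string[]): CompileRequest {
  const req: CompileRequest = {
    output: undefined,
    includeDirs: [],
    compileOnly: false,
    inputs: [],
  };
  const rest = argv.slice();
  let arg = rest.shift();
  while (arg !== undefined) {
    if (arg === "-o") {
      req.output = rest.shift();
    } else if (arg === "-I") {
      const dir = rest.shift();
      if (dir !== undefined) req.includeDirs.push(dir);
    } else if (arg === "-c") {
      req.compileOnly = true;
    } else if (arg.charAt(0) !== "-") {
      req.inputs.push(arg);
    }
    arg = rest.shift();
  }
  return req;
}

const request = parseOcamlcArgs([
  "-c", "-I", "src", "-o", "supplied/output/location/out.cmo", "src/main.ml",
]);
console.log(request);
```

A real facade would then have to write a placeholder file at `request.output` so that the rest of the opam build finds what it expects.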

Still, I think that's out of scope for this github issue. We could open up another one if you like!

jordwalke avatar Mar 25 '17 20:03 jordwalke

Aha, that makes a lot more sense, but don't we still have the issue that BS only supports 4.02?

ghost avatar Mar 25 '17 23:03 ghost

You'll have to ask the BS core developers, but it seems they're open to supporting 4.03+, it's just a matter of priority. Reason itself will work with BS 4.03+ as long as BuckleScript does, and many ppx packages will also support 4.03+.

jordwalke avatar Mar 26 '17 05:03 jordwalke

Description thanks to @andreypopp :

How it works now — there's a bunch of Python scripts which convert opam metadata into npm metadata. Such npm metadata is distributed along with Esy as JSON files. Then custom resolver within a yarn fork uses it to resolve @opam/* packages and fetch tarballs.
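The resolution step described above might look roughly like the following sketch. The shape of the preconverted metadata (`ConvertedEntry`, `Registry`) and the tarball URL are assumptions made for illustration; the real JSON files shipped with Esy may be laid out differently.

```typescript
// Hypothetical shape of one preconverted entry shipped as JSON with Esy.
interface ConvertedEntry {
  version: string;
  tarball: string; // where the source tarball would be fetched from
}

// Preconverted metadata, keyed by opam package name (without the scope).
type Registry = Record<string, ConvertedEntry[]>;

// Resolve "@opam/<name>" at an exact version.
// Returns undefined for names outside the @opam scope or unknown versions.
function resolveOpam(
  registry: Registry,
  request: string,
  version: string
): ConvertedEntry | undefined {
  const prefix = "@opam/";
  if (request.indexOf(prefix) !== 0) return undefined;
  const entries = registry[request.slice(prefix.length)] || [];
  for (const entry of entries) {
    if (entry.version === version) return entry;
  }
  return undefined;
}

// Placeholder data; the URL is not a real location.
const registry: Registry = {
  topkg: [{ version: "1.2.0", tarball: "https://example.invalid/topkg-1.2.0.tbz" }],
};

const hit = resolveOpam(registry, "@opam/topkg", "1.2.0");
const miss = resolveOpam(registry, "lodash", "4.0.0");
console.log(hit, miss);
```

Since the registry here is static data bundled at release time, a newly published opam package stays invisible until the data is regenerated.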

The problem is that a) such scripts are "hacky" (regex-parsing, very hard to maintain and modify) and b) they are in Python (if they were in JS we could convert packages on the fly using the current opam registry, as opposed to the current situation where we ship preconverted metadata with Esy).

The idea is to redo them in JS... Or better, in OCaml (compiled to JS via BuckleScript), because that way we can use the opam-file-format package, which can parse opam metadata. Then modify yarn's custom opam resolver to use that code.

That would be super useful: currently, if we want to use freshly released packages from opam, we need to publish a new version of Esy. With this task completed, we could resolve just-released packages on the fly.

Bonus point for this task: write OCaml, not JS

wokalski avatar May 08 '17 16:05 wokalski

Since #98 was resolved, the next smaller task is implementing a converter from an opam file to an npm file using opam-file-format. The minimum self-contained part of this task would be just the conversion itself, without caring about how we pull those opam files: basically a function which takes an opam file and returns one in npm format.
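The shape of that function could be sketched as below. This is only an illustration in TypeScript: it assumes the opam file has already been parsed into a simplified record (the real task would use opam-file-format for that), and the `OpamFile` fields, the `esy.build` location for build commands, and the example package names are all hypothetical.

```typescript
// Assumed, simplified view of an already-parsed opam file.
interface OpamFile {
  name: string;
  version: string;
  depends: { name: string; constraint?: string }[];
  build: string[][]; // each build step as a command plus its arguments
}

// Assumed output shape; where build commands live is a guess.
interface NpmManifest {
  name: string;
  version: string;
  dependencies: Record<string, string>;
  esy: { build: string[][] };
}

// Convert one parsed opam file into npm-style metadata.
// Constraints are passed through untouched; a real converter would have to
// translate opam version constraints into npm ranges.
function opamToNpm(opam: OpamFile): NpmManifest {
  const dependencies: Record<string, string> = {};
  for (const dep of opam.depends) {
    dependencies[`@opam/${dep.name}`] =
      dep.constraint === undefined ? "*" : dep.constraint;
  }
  return {
    name: `@opam/${opam.name}`,
    version: opam.version,
    dependencies,
    esy: { build: opam.build },
  };
}

// Made-up input, not taken from a real opam file.
const manifest = opamToNpm({
  name: "example-lib",
  version: "0.1.0",
  depends: [{ name: "ocamlfind" }, { name: "some-dep", constraint: ">=1.0.0" }],
  build: [["make", "all"]],
});
console.log(JSON.stringify(manifest, null, 2));
```

Keeping the function pure (record in, record out) matches the idea of separating conversion from however the opam files are fetched.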

wokalski avatar May 14 '17 16:05 wokalski