dotland Allow semver ranges and identifiers in urls

Open jimsimon opened this issue 4 years ago • 12 comments

The website should be able to resolve semver symbols. Being able to use semver to dedupe dependencies is an important step towards facilitating the development of medium to large web apps using Deno. I believe some other github redirectors already support this as well. While we could use those CDNs instead, the deno.land registry is currently the only place that exclusively lists deno modules. It may be possible to build further deduping tools once basic semver support is in place.

Some examples of what URLs would look like:
https://deno.land/std@^0.50.0/path/mods.ts
https://deno.land/std@~0.50.0/path/mods.ts
https://deno.land/std@>=0.50.0/path/mods.ts
https://deno.land/[email protected]||^0.51.0/path/mods.ts
https://deno.land/[email protected]/path/mods.ts

May 19 '20 01:05 jimsimon

This seems like a recipe for breaking. I guess I am biased toward udd-style fragments (where dependency URLs are only updated explicitly by the maintainer, not randomly by a user).

Despite this being relatively standard it has always seemed crazy to me... What is the benefit?

May 19 '20 05:05 hayd

So a lockfile (which deno supports) allows you to fix the versions to a degree. Some way to do resolution overrides also helps (import maps might provide this). So there are ways to do so, and nothing I'm proposing stops you from using fixed versions in your modules if your preference is to avoid them.

The main benefit of using semver is runtime performance of larger web apps (client primarily, but also server apps to a much lesser degree). Without semver and dependency deduping, you start getting unwieldy bundle sizes due to multiple versions of the same dependency being bundled. Take the following dependency tree as a really simple example:

MyApp dependencies
  - [email protected] (1kb)
  - [email protected] (1kb)
    - [email protected] (1kb)

If we use fixed dependency versions, then two copies of dependency A will end up in our bundle. However, if we use semver ranges of ^1.0.0 and ^1.0.1 for dependency A, we can resolve both to 1.0.1 because semver tells us they're backwards compatible. The end result is that only one version of dependency A is included in our bundle, resulting in 1kb less of JavaScript to send over the wire and then parse and execute in the browser.
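To make the resolution step concrete, here is a minimal sketch of picking one shared version for several caret ranges. It is not a full semver implementation: it only understands `^x.y.z` ranges and plain `x.y.z` versions, and the helper names (`parseVersion`, `satisfiesCaret`, `resolveShared`) and the list of available versions are assumptions made for illustration.

```typescript
// Sketch only: handles "^x.y.z" ranges and "x.y.z" versions, nothing else.
type Version = [number, number, number];

function parseVersion(v: string): Version {
  const parts = v.split(".").map(Number);
  if (parts.length !== 3 || parts.some((n) => !Number.isInteger(n) || n < 0)) {
    throw new Error(`invalid version: ${v}`);
  }
  return [parts[0], parts[1], parts[2]];
}

// Negative if a < b, zero if equal, positive if a > b.
function compare(a: Version, b: Version): number {
  for (let i = 0; i < 3; i++) {
    if (a[i] !== b[i]) return a[i] - b[i];
  }
  return 0;
}

// A caret range allows changes that do not modify the leftmost non-zero part.
function satisfiesCaret(version: string, range: string): boolean {
  if (!range.startsWith("^")) throw new Error(`not a caret range: ${range}`);
  const min = parseVersion(range.slice(1));
  const v = parseVersion(version);
  if (compare(v, min) < 0) return false;
  if (min[0] > 0) return v[0] === min[0];
  if (min[1] > 0) return v[0] === 0 && v[1] === min[1];
  return v[0] === 0 && v[1] === 0 && v[2] === min[2];
}

// Highest available version that satisfies every range, if there is one.
function resolveShared(available: string[], ranges: string[]): string | undefined {
  const candidates = available.filter((v) =>
    ranges.every((r) => satisfiesCaret(v, r))
  );
  candidates.sort((a, b) => compare(parseVersion(b), parseVersion(a)));
  return candidates[0];
}

// MyApp asks for ^1.0.0 and its other dependency asks for ^1.0.1.
const shared = resolveShared(["1.0.0", "1.0.1", "2.0.0"], ["^1.0.0", "^1.0.1"]);
console.log(shared); // "1.0.1"
```

With fixed versions there is nothing to intersect, so both copies have to be kept; with ranges, a single version can satisfy both requests.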

Now extrapolate this over hundreds of dependencies, many of which are larger than our 1kb examples. Some common ones that often cause this kind of issue are lodash (and its various subpackages), jQuery, react, moment, and smaller utility libraries like leftpad and is-promise. Larger projects can quickly end up with huge amounts of unintended bundle bloat that becomes noticeably detrimental to performance if deduping isn't done. These same issues can pop up with server-side apps if the amount of code that needs to be parsed and executed becomes quite large. It's generally not too bad for something like an app packaged and deployed with docker because you expect relatively slow startup times. But it can be quite expensive and noticeable when you're dealing with something like lambdas/cloud functions.

The kicker with all of this is that it can be a pain in the butt to manage even with a good dependency manager. For example, yarn is notoriously bad at semantically deduping dependencies. So much so that third-party tools have been created to do additional deduping logic on yarn.lock files after they are generated/updated.

Taking a look at udd, it seems to take an approach where the package author doesn't trust the upstream dependency authors to correctly follow semver constraints. Traditional package managers rely on that trust at their core, but also provide ways to compensate when that trust is violated (either intentionally or by accident). It's an interesting reversal of the paradigm that is probably useful for server apps, but as far as I can tell (and I might be wrong) it does nothing to help with deduplication for client apps.

May 19 '20 06:05 jimsimon

Taking a look at udd, it seems to take an approach where the package author doesn't trust the upstream dependency authors to correctly follow semver constraints. Traditional package managers rely on that trust at their core, but also provide ways to compensate when that trust is violated (either intentionally or by accident).

Wouldn't relying on the deno.land url to handle the semver identifiers be bad whenever trust is broken because the site has no way to compensate on a per-user basis? Or am I missing how this compensation is handled by package managers today?

I think semver handling should be done by package managers that figure out the optimal dependency graph (by recognizing semver in URLs) and then populate $DENO_DIR. Of course, you could try to solve this by making the same URL serve different versions, like you suggest, but deduplication would then only work for dependencies that are on a package registry which handles the identifiers correctly. If some of the dependencies aren't, then you are back to square one and need to handle this on the client side with your package manager anyway.

May 25 '20 06:05 JohanWinther

I am against this for the same reason as @hayd. Also, this is currently not possible due to rate limits on GitHub's API for getting releases for a repository. In addition, a user has no way to lock the semver version they receive, because Deno's lock file does not store the final URL after redirects, only a hash of the content. So every request would have to reevaluate the semver range (and potentially serve a different version).

May 27 '20 10:05 lucacasonato

Maybe in the future someone can build a tool that updates the dependencies by following various rules. I think that would solve the problem.

May 27 '20 13:05 Swap76

I fail to see how external tooling around the CLI can help here. If I download a script, and it specifies the exact versions of other scripts, and if this leads to duplicate code, then the only possible scenario is that I get duplicate code. I'm not sure it is preferable to ignore the exact versions of scripts, and populate $DENO_DIR with dependencies other than those specified, because that creates issues that are just impossible to fix from anyone's side.

I also don't see how I as a user could solve this issue with tools like udd. It cannot go and modify the import URLs of the scripts I'm downloading, correct? So the baseline here is that I have to contact the maintainers of these scripts to release new versions just to import updated dependencies. If they don't, then I'm just screwed. Giving them semver ranges would allow updates without them having to become active for every single patch release of every dependency.

I might not care about having the same files over and over again for server-side applications because the difference in startup time is probably not even measurable, but that certainly excludes me from using deno bundle because I'd then ship hundreds of KiB of redundant JS on my site.

This seems like a recipe for breaking.

But https://deno.land/x does support not specifying a version number at all, which will deliver the most recent version. I would argue that supporting semver ranges is rather a step forward from this point of view, because it allows adding at least some constraints to what code is downloaded. One case where this would be very helpful is when I put out tutorials on the internet where people copy & paste the link for a one-off execution. At the time of writing, I don't know if the next major version will still work with my code examples.

The web supports loading scripts via semver, cf. https://unpkg.com/. Even loading npm packages in Deno supports semver ranges, cf. https://www.skypack.dev/. Why can't we have that for Deno modules?

Ironically, a workaround right now is to circumvent https://deno.land/x and to upload the Deno code to npm, so it can be imported with semver through skypack.

Apr 25 '21 11:04 KnorpelSenf

https://github.com/EdJoPaTo/deno-semver-redirect (written in Rust) solves this problem already by wrapping deno.land/x.

For instance, https://dsr.edjopato.de/grammy/0.x/mod.ts will parse the semver specifier and then redirect you to the correct latest 0.x version on deno.land/x, so you can use these URLs in import specifiers.
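A rough sketch of what such a redirect step could look like, in TypeScript rather than Rust. This is not the actual deno-semver-redirect code: the function names are made up, only x-ranges such as `0.x` or `1.2.x` are handled, prerelease tags are ignored, and the list of published versions is assumed to be passed in by the caller instead of being looked up.

```typescript
// Sketch only: matches x-ranges like "0.x" against versions like "v0.3.1".
function matchesXRange(version: string, range: string): boolean {
  const v = version.replace(/^v/, "").split(".");
  const r = range.split(".");
  return r.every((part, i) => part === "x" || part === v[i]);
}

// Numeric comparison of "major.minor.patch", ignoring an optional "v" prefix.
function compareVersions(a: string, b: string): number {
  const pa = a.replace(/^v/, "").split(".").map(Number);
  const pb = b.replace(/^v/, "").split(".").map(Number);
  for (let i = 0; i < 3; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Maps "/<module>/<range>/<file>" to a pinned deno.land/x URL, or null if
// the path has a different shape or no published version matches the range.
function redirectTarget(path: string, versions: string[]): string | null {
  const m = path.match(/^\/([^/]+)\/([^/]+)\/(.+)$/);
  if (!m) return null;
  const [, name, range, file] = m;
  const best = versions
    .filter((v) => matchesXRange(v, range))
    .sort(compareVersions)
    .pop();
  return best ? `https://deno.land/x/${name}@${best}/${file}` : null;
}

// Hypothetical version list, for illustration only.
console.log(redirectTarget("/grammy/0.x/mod.ts", ["v0.1.0", "v0.3.1", "v1.0.0"]));
// "https://deno.land/x/grammy@v0.3.1/mod.ts"
```

A real service would answer with an HTTP redirect to the computed URL, so the import ends up on a pinned deno.land/x version.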

Apr 25 '21 17:04 KnorpelSenf

I created it with Deno Deploy. https://lib.deno.dev

Aug 15 '21 10:08 tani

@tani are there any differences in functionality?

Aug 15 '21 16:08 KnorpelSenf

My service is friendly to the Deno community.

Migration: The minimum migration step is replacing deno.land with lib.deno.dev. No other modifications are needed.
Compatibility: Both semver syntaxes work, the one used by Deno libraries (v1.x) and the one used by Node libraries (1.x), based on deno-semver.
Network: This service is deployed on the edge network provided by Deno Deploy. The network speed is important for me.
Cloudflare Workers: It seems that deno.land uses Cloudflare Workers. Deno Deploy is compatible with it.
Deno: This service is written in Deno and is running on Deno Deploy for Deno libraries.

Also, I like the domain lib.deno.dev. I would have liked to use pkg.deno.dev, like pkg.go.dev, but it was taken.

Aug 16 '21 01:08 tani

@tani I took a quick look at your project over the weekend, and it looks interesting! I wouldn't be against merging it if you want to open a PR implementing it on this repo. The deno.land website is already running on Deploy, so it should be pretty easy. One thing I would want to see on that PR, though, would be unit tests; your implementation seems to work using regex, and I would want to be sure that it works as expected before merging it.

Aug 16 '21 11:08 wperron

Thank you for pushing me to make a pull request. Now, I added (almost) exhaustive tests. Please see this file. https://github.com/tani/lib.deno.dev/blob/main/mod.test.ts

Aug 16 '21 17:08 tani
