
Handle non-local dependencies

Open thufschmitt opened this issue 4 years ago • 11 comments

Is your feature request related to a problem? Please describe.

The only way in Nickel to load some external code is by hardcoding its path (relative or absolute) in the Nickel file. This is fine for referring to files inside a project, but it doesn't scale as soon as someone wants to refer to an external Nickel “library”.
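For illustration, here is roughly what this looks like today; the file name and the make_config field are made up:

```nickel
# Today's only mechanism: a hard-coded path, relative to this file (or absolute).
# The "library" has to live inside (or be vendored into) the project tree.
let mylib = import "./vendor/mylib/main.ncl" in
mylib.make_config { env = "production" }
```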

Describe the solution you'd like

Have a mechanism to reference “external files” in nickel.

I guess designing the final shape will require a bit of work, but off the top of my head I can think of several approaches:

  1. Dhall-like semantics where some urls can be directly imported
  2. Unix-style (also Jsonnet's) semantics, with a list of search paths (either provided on the CLI with a -I argument, or via a NICKEL_PATH environment variable)
  3. lockfile semantics, with a nickel.lock file (or whatever) that would specify how to map specific inputs to actual files. This lockfile could either be generated by nickel itself (but that seems quite out-of-scope for the language), or could have a public schema so that it could be generated by other tools (Nix, looking at you) or manually.

thufschmitt avatar Mar 18 '21 12:03 thufschmitt

I think the second one can be tackled easily.

About the first one, do you mean to have sugaring for the imports? Something like: import "./path/to/some.ncl" would be equivalent to ./path/to/some.ncl?

I suppose that the 3rd is not in scope for an MVP. Maybe the 1st one isn't either.

francois-caddet avatar Oct 13 '21 07:10 francois-caddet

About the first one, do you mean to have sugaring for the imports? Something like: import "./path/to/some.ncl" would be equivalent to ./path/to/some.ncl?

I had in mind something a bit more involved, allowing for example import "https://some.nickel.file.on/the/internet.ncl". But I’m actually not fond of it as it makes the language notably more complex and creates a whole class of issues.

I suppose that the 3rd is not in scope for an MVP

It might be, as it can be reasonably simple (not really more complex than the 2nd approach), and it's the kind of thing that might be worth settling early enough to make it standard.

thufschmitt avatar Oct 13 '21 11:10 thufschmitt

Preamble

In the following, PM is used for package manager.

We explored the following possibilities:

  1. A lockfile-based system, Nickel-specific, where a dedicated tool would do all the package management itself (like npm, opam, cargo, etc.)
  2. Lockfile-based, but offloading the pinning, updating, and dependency management in general to another package manager. Nickel would just need to understand lockfiles to know how to resolve non-local imports, but wouldn't have to handle the rest.
  3. Use an even more primitive solution, like include paths. A separate tool could set up the right environment from a lockfile when calling nickel from your preferred package manager.
  4. Allow importing URLs, possibly with SHA hashes for basic pinning/security.

This discussion revolved a lot around what the exact role of Nickel should be. We all agreed that an ideal solution would be a composable, lockfile-based one. The problem is that package management is hard: it is a deep rabbit hole that requires figuring out a lot of things upfront. It has been re-implemented so many times that it would be sad to do it yet another time. What's more, one constraint is that we want to keep the closure size reasonable as much as possible (for embedding Nickel in many workflows easily, deploying the WebAssembly playground, etc.).

Although not mentioned explicitly during the call, we effectively based our assessment on the following criteria:

  • Complexity: How complex is the solution to design, implement, and maintain? Does it dramatically increase Nickel's closure size?
  • Featureful/composable: How many practical use cases can we handle? Can we easily compose packages/repos/directories/modules (whatever the chosen notion of a non-local import unit is)? Can we handle transitive dependencies?
  • Fragmentation/evolvability: Is the chosen solution likely to be superseded by a different one at some point? If yes, is the transition manageable (local, versus having to change all dependencies transitively, for example)? Or will it likely cause ecosystem fragmentation?

Option 1: full-fledged package management

Option 1 is complex. It requires a lot of time, design, implementation, and maintenance effort. It could also increase Nickel's closure size, although the package management part would live in a separate tool. It's probably the most featureful and evolvable option, though. However, going for this one would mean not having even a basic mechanism for non-local dependencies for quite some time, until the system is designed and starts being implemented, which can be prohibitive.

Option 2: lockfile-based resolution, offloading package management

Option 2 is appealing, as it is still quite featureful and evolvable while being less complex. A natural choice would be to use Nix/flakes as the PM and have Nickel understand flake lockfiles directly. We could also define a simpler, PM-agnostic, Nickel-specific lockfile format and a small tool, flake2nickel, to convert a flake lockfile to a Nickel one. The issue is that we want to be Windows-compatible, which excludes Nix as the unique solution.
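As a rough illustration of that PM-agnostic idea (the layout, field names, and store path below are made up, not a settled format), such a lockfile would only map dependency names to sources and to the local paths already fetched by the external PM:

```nickel
# Hypothetical PM-agnostic lockfile, e.g. as a flake2nickel tool could emit it from
# flake.lock. Nickel would only need to resolve non-local imports through this map;
# fetching, pinning and updating stay with the external package manager.
{
  version = 1,
  dependencies = {
    mylib = {
      source = "github:example/mylib",
      rev = "0123456789abcdef0123456789abcdef01234567",
      # Local path produced by the package manager (here, a Nix store path):
      path = "/nix/store/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa-mylib"
    }
  }
}
```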

With a PM-agnostic lockfile format, though, users could use whatever PM they want, such as npm, only requiring an npm2nickel tool. But this is not as simple as it first appears: what about transitive dependencies (npm packages inside node_modules having their own dependencies)? Would they also need their own local Nickel lockfile? Even so, having several possible PMs would fragment the ecosystem, as it is then not trivial to compose a Nickel package served as a flake with one served as an npm package.

Option 3: include paths

Option 3 is what e.g. Jsonnet, or even good old C/C++, uses. We think it doesn't really provide any advantage over option 2: it is only marginally less complex, but it makes composability even harder and risks introducing fragmentation if/when we introduce a lockfile-based mechanism at some point.

Option 4: import URLs

Option 4 is what Dhall uses. It has the advantage of being simple, without having to handle dependencies, packages, and so on, although the impact on the closure size may be noticeable because it requires an HTTP(S) stack. It is more bare-bones, as you can't really have fine-grained management of dependencies, nor an easy update path: either you pin things via SHAs, but then updating is painful, or you don't pin, but then things become brittle. We can still imagine an external tool that handles updates, as importing from URLs with SHAs is, in some sense, a lockfile inlined in the source file.
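To make that trade-off concrete, here is a sketch in hypothetical syntax (Nickel has no URL imports at this point; the URL is made up):

```nickel
# Hypothetical syntax: URL imports do not exist in Nickel today.

# Unpinned: convenient, but brittle, since the remote file can change silently.
import "https://example.com/nickel/mylib.ncl"

# Pinned (Dhall attaches a sha256 hash to each remote import): reproducible, and the
# hashes effectively form a lockfile inlined in the sources, but every update means
# editing the hash at each import site, which is where an external tool could help.
```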

All in all, it could still make for a nice alternative for simple workflows while waiting for a better solution like option 1 or option 2. However, such a transition is always risky and may cause a lot of friction. One way to lower this risk is to make it clear in the documentation from the beginning that this is a transitional solution.

Conclusion

In the end, our feeling is that going for option 4 while calmly figuring out the details of option 2 is a reasonable trade-off.

yannham avatar Oct 27 '21 16:10 yannham

I think that only option 4 makes sense today. We don't really have an identified need for anything more complex. When we know more about how people use Nickel, we can revisit the question.

aspiwack avatar Nov 04 '21 15:11 aspiwack

The Dhall experience seems much worse than Nix's today, since it's really unclear what code needs network access. I do not recommend that.

Ericson2314 avatar Nov 12 '21 17:11 Ericson2314

Maybe we want to ping @Gabriel439 on this, since they have the most experience with non-local dependencies in a config language.

Profpatsch avatar Nov 12 '21 18:11 Profpatsch

From what I gathered, you really want semantics where file A can import file H1 on host H, which in turn wants to import file H2 on host H, or even file I1 on host I.

So you need both a more complex notion of what a “relative” import means, and a concept of cross-origin security (CORS).

Profpatsch avatar Nov 12 '21 18:11 Profpatsch

I view the tradeoffs of URL-based imports differently: from my perspective URL imports provide a simpler user experience but they are not necessarily simpler to implement. In fact, they are actually the greatest source of complexity in Dhall and the thing that most Dhall binding authors complain about implementing.

Specifically, URL imports simplify the user experience in the following ways:

  • Purity

    All of the information that you need to reason about the code (including how to fetch dependencies), resides within the code itself. What you see is what you get. For example, if you've ever been bitten by impurities in Nix (e.g. the NIX_PATH) then you will probably understand the importance of this. Having an out-of-band package management process is (in my view) analogous to introducing impurities into your code.

    If I had to pick only one reason for doing "inline" package management within the code this would be the one.

  • Package publication is simpler

    Any web service that can host source code can be used to publish a package (e.g. a gist). Contrast this with a language like Haskell where you need to create a .cabal file and upload the package to a dedicated package repository (Hackage in this case)

  • Package subscription is simpler

    In the simplest case, all you have to do to use an expression is to paste the expression's address in your code directly where you need it. Contrast this with, say, Haskell where you need to add a package to your dependency list (optional: specify a version range), import the package, and reference the imported code

However, URL imports actually complicate the implementation in the following ways:

  • URL imports need to support relative imports correctly

    In other words, if you import https://example.com/A and that contains a relative import of ./B, then that relative import needs to resolve to https://example.com/B, not ${PWD}/B (see the sketch after this list)

  • You need to disallow remote imports from importing "local" imports that are not relative paths

    "Local" imports here include absolute paths and environment variable imports.

    Dhall calls this the "referential sanity check" if you want to search for more details on this.

  • You need to support CORS

    … to protect against server-side request forgery

  • You need to support header-based authentication

    This is actually kind of a mess in Dhall. We have two separate ways of doing this since we had to learn over time what use cases users had in mind. I'm not entirely sure we've totally solved the user experience for this.

  • You need to support integrity checks

    … and ideally they should be semantic integrity checks and not textual integrity checks, for the reasons outlined in this post. You don't necessarily have to interpret code before computing the hash like Dhall does (I think that might have been a mistake in retrospect), but it should definitely be a hash of the AST and not a hash of the source code.

  • You need to implement a cache

    … and it should be content-addressable and use the integrity check as the lookup key
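To make the first two points above more concrete, here is a sketch with made-up file names and hypothetical URL-import syntax:

```nickel
# Suppose this expression was itself fetched from https://example.com/pkg/main.ncl.
{
  # A remote file importing a local absolute path (or an environment variable) must be
  # rejected by the referential sanity check, so a field like the following would be an
  # error in this context:
  #   secrets = import "/etc/nickel/secrets.ncl",

  # A relative import, on the other hand, must resolve against the importing URL:
  # "./util.ncl" means https://example.com/pkg/util.ncl, not "$PWD/util.ncl".
  utils = import "./util.ncl"
}
```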

You will probably also want to read Dhall's Safety Guarantees post, which covers the above topics in more detail.

From the perspective of making your language integrate with an existing package manager (e.g. Nix or Bazel), URL imports with integrity checks can play the same role as lockfiles, meaning that you can generate code for an external package manager from URL imports if they all have integrity checks. See, for example, the Dhall integration for Nixpkgs, which explains this in more detail.

The way I like to think of it is that Dhall technically does have a lockfile, but it's intermingled with the source code (in the form of the URL imports with their associated semantic integrity checks).

Gabriella439 avatar Nov 12 '21 19:11 Gabriella439

Thanks @Gabriel439 !

This makes me think that there is a solution that we haven't been considering yet which is “just use Git submodules”. I'm not terribly fond of Git submodules. But they exist, which makes them quite a bit simpler than what we've been proposing so far.
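For reference, the submodule route needs nothing from Nickel itself; a minimal sketch, with made-up repository URL and paths:

```nickel
# Assuming the library was vendored with something like
#   git submodule add https://github.com/example/mylib vendor/mylib
# the import is a plain relative one, and pinning/updating is handled entirely by Git:
import "./vendor/mylib/main.ncl"
```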

aspiwack avatar Nov 15 '21 08:11 aspiwack

A few more points to consider:

  • Blowup of the dependency tree because of the dependency on an HTTP client and TLS stack
  • Harder to compile the resulting language “to the web”, because you have to stub out the HTTP stack with e.g. browser fetch requests
  • There are effectively two subsets of the language: one where people can afford to do HTTP requests and one where they can't (for example in a sandbox; for Dhall that means a lot of workarounds to pre-cache things in the Nix sandbox and lots of conceptual overhead)

Profpatsch avatar Nov 15 '21 10:11 Profpatsch

Thanks for the insight, @Gabriel439. It sounds like we underestimated the URL route :confused:

This makes me think that there is a solution that we haven't been considering yet which is “just use Git submodules”. I'm not terribly fond of Git submodules. But they exist, which makes them quite a bit simpler than what we've been proposing so far.

I've seldom used Git submodules, but indeed that could be a portable solution in the meantime (and zero-cost, I guess? There's pretty much nothing to do on the Nickel side for us?). In parallel, we could have a flake template / Nix library that makes it easy to distribute Nickel code as flakes, as it is highly likely that most early adopters are using Nix.

yannham avatar Nov 15 '21 14:11 yannham

  • Git submodules don't get us very far: for example, in nickel-nix we want to provide Nix flake templates, and Nix templates don't play well with submodules.

  • I think we should be very explicit about whether we're importing a "local" or a "non-local" dependency. I propose the syntax import "some/path.ncl" from "scheme:package" (a sketch of how this could look in practice follows after this list).

  • I would really like to not introduce yet another locking+caching mechanism for dependencies. Every language ecosystem in the world has one, and Nickel could leverage that. In the syntax example above I use scheme:package as the source definition. Here, scheme could point to which ecosystem to use. For example, nix: could use flakes, cargo: could use Cargo, go: could use Go modules, and so on for npm:, pip:, cobolget:, etc. All of these ecosystems have their own lockfiles and somehow cache the "package" locally. We'd need to query the relevant tools for the path to the package and then append some/path.ncl to it to import it. Note that all querying would happen via tools and their output, so no additional dependencies should be required.

    For example, for the nix: scheme we could support:

    • nix:flake:nixpkgs - the flake:nixpkgs part is passed down to nix flake metadata --json to get .path, pointing to where the flake lives in the local filesystem. This relies on the Nix registry and is not locked in any way. Nix will cache the flake in the local store before returning the result.
    • nix:input:nixpkgs - similarly to the above, looks up the input from the current flake with nix flake metadata --json --inputs-from . nixpkgs. The current flake is defined by the closest flake.nix in the filesystem hierarchy, or however nix flake changes this definition in the future. The input is locked in the relevant flake.lock, and Nix will cache it in the local store.
    • nix:<installable> - for everything else, call nix build --no-link --print-out-paths <installable> to fetch/build the target, and use the result as the base.

    With the cargo: scheme:

    • cargo:package could trigger a cargo metadata --locked call that will fetch and cache all dependencies for the current crate, then output data for each package, including its cached path. It will be locked in Cargo.lock.

    We could work in a similar way with all the other ecosystems:

    • call the package manager to cache the dependency
    • query the path to the cache
    • use the result as the base directory for the import
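A sketch of how this could look from the Nickel side (the from form is not implemented; package and file names are illustrative):

```nickel
# Hypothetical syntax from the proposal above.

# Resolved by querying `nix flake metadata --json --inputs-from . nickel-nix`,
# locked via the current flake's flake.lock:
let nix_lib = import "./nix.ncl" from "nix:input:nickel-nix" in

# Resolved by querying `cargo metadata --locked`, locked via Cargo.lock:
let schema = import "config/schema.ncl" from "cargo:my-config-crate" in

nix_lib & schema
```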

YorikSar avatar Apr 21 '23 14:04 YorikSar

Forgot to mention: this is all in favour of option 2 above: reuse other package managers.

YorikSar avatar Apr 21 '23 14:04 YorikSar

I would really like to not introduce yet another locking+caching mechanism for dependencies.

I wholeheartedly agree.

In the syntax example above I use scheme:package as the source definition. Here, scheme could point to which ecosystem to use. For example, nix: could use flakes, cargo: could use Cargo, go: could use Go modules, and so on for npm:, pip:, cobolget:, etc.

On the other hand, it comes with a risk: fragmentation. Let's say you're a user and you want to use a bunch of different Nickel libraries: it would be very frustrating if you had to install nix, cargo, npm, and yarn just to run your small config. I'm not saying this is a no-go, I just want to raise the point. Maybe the solution is to bless one package manager by default - totally randomly, nix :roll_eyes: - while still allowing (but making harder) the use of a different one? Something as dead simple as having no scheme default to nix could already go some way.

yannham avatar Apr 21 '23 16:04 yannham

On the other hand, it comes with a risk: fragmentation. Let's say you're a user and you want to use a bunch of different Nickel libraries: it would be very frustrating if you had to install nix, cargo, npm, and yarn just to run your small config.

I agree that it would lead to some fragmentation, but I would expect people who mainly use Nix to have Nix dependencies that provide some Nickel libraries (this is the case for nickel-nix), and if some JS library provides some Nickel niceties, it would most likely be published on npm anyway. I doubt there will be a point where a JS app wants to use a Nickel library provided by a Rust library. Of course, it would be nice to have "pure" Nickel libraries, but given that Nickel is not an independent general-purpose language (it's always tied to some project), it might be too early to implement separate packaging infrastructure for it.

I'm not saying this is a no-go, I just want to raise the point. Maybe the solution is to bless one package manager by default - totally randomly, nix 🙄 - while still allowing (but making harder) the use of a different one? Something as dead simple as having no scheme default to nix could already go some way.

Defaulting to Nix has its downsides: not everybody wants to or can use Nix. It would be a shame if Nickel didn't work on Windows just because it requires Nix for packaging, for example. Also, it wouldn't be as hidden as one would prefer: locking would only be ensured if you write a proper flake and use nix:input: or something similar.

YorikSar avatar Apr 21 '23 16:04 YorikSar

Defaulting to Nix has its downsides: not everybody wants to or can use Nix. It would be a shame if Nickel didn't work on Windows just because it requires Nix for packaging, for example. Also, it wouldn't be as hidden as one would prefer: locking would only be ensured if you write a proper flake and use nix:input: or something similar.

Ah, Windows is a very good point (one that I tend to forget about :sweat_smile:). I think my idea of a blessed package manager was to say: please use your own specific package manager for domain-specific libraries, but for pure Nickel libraries we just made an arbitrary choice for you. Even in the future, I would be sad if we had to re-implement a package manager for the thousandth time, so I guess we mostly align on the idea, but I wanted to offload the default package manager for pure libraries to an existing one. Maybe it should just be something other than Nix, then. One that is light and portable, if possible.

yannham avatar Apr 24 '23 17:04 yannham

I'm trying to implement my proposal and I hit a snag: while import is a keyword in Nickel and is parsed specially, it's actually just a builtin function with some extra sauce (imports are resolved ahead of time). That's why, currently, import "file.ncl" from "smth" is equivalent to importing the file and passing the arguments from and "smth" to the result.

Nix treats import as just a builtin function; there is absolutely nothing special about it, and you can even assign new values to it, or do something nasty like scopedImport { import = throw; } ./file.nix to override it completely. I don't think we want this for Nickel, since we do want to typecheck before evaluation, and that's impossible in such a dynamic setting.

What do you think about making import an actual statement? It requires special treatment from Nickel, and it syntactically enforces that the "first" argument is a literal string, so in a way it already is one. Also, I think import <nixpkgs> {} is the prevalent use case for "import as a function" in Nix, and that's not the case for Nickel. I found only a couple of places in the tests where import "smth" args is used. I think (import "smth") args would be clearer for the user, as it separates the place where special rules apply (only a literal string as argument) from the rest of the code. This would also allow us to add from to this statement without making it another reserved identifier.
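A small illustration of the difference (the file name and argument are made up):

```nickel
# Today, `import` behaves like a function, so plain juxtaposition is application:
#
#   import "lib.ncl" { name = "world" }
#
# already means "apply the value imported from lib.ncl to { name = "world" }". The
# parenthesized form below means the same thing, but keeps the special part (the
# literal string argument) visually separate from ordinary application:
(import "lib.ncl") { name = "world" }
```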

Other options in the context of this issue would be:

  • Using a different keyword for importing from an external source (e.g. importFrom "nix:input:nickel-nix" "./nix.ncl"). This adds another keyword, which should generally be avoided if possible.
  • Switching the order of the import arguments, like import from "nix:input:nickel-nix" "./nix.ncl". This is just not pretty, and it's confusing: from might be a variable in scope, but here it's a special marker. Also, we would then have 3 "special" arguments to the import "function".

YorikSar avatar May 02 '23 11:05 YorikSar

Fixed by https://github.com/tweag/nickel/pull/1716

thufschmitt avatar Nov 28 '23 15:11 thufschmitt