jsonnet icon indicating copy to clipboard operation
jsonnet copied to clipboard

Relax computed importstr ?

Open ant31 opened this issue 9 years ago • 34 comments

Do you think it will be possible to use importstr with variable?

e.g:

 {
 files: [
        { 
         file: filepath,
         content: importstr filepath
         } for filepath in ['a.txt', 'b.txt', 'c.txt']
  ],
}
some_function(file):: (
    local content = importstr file;
    #process the content
  )

ant31 avatar May 26 '16 08:05 ant31

Any technical reason for importstr to not allow import variable ? I understand for import, but I think imporstr is a little different, it's equivalent to do an file.open(filepath) in other languages, and they mostly all accept a variable.

ant31 avatar Dec 09 '16 10:12 ant31

Widening it to a variable is just as "bad" as widening it to an arbitrary expression, so if we were going to do anything it would be that.

The property that we gain by having the restriction is that you can do a simple static analysis and determine which files are dependencies, and then determine whether or not the code needs re-executing. This is potentially useful although I'm not sure anyone has made use of it yet. However if we lift the restriction and lose the property, we won't be getting it back again. So it needs careful thought.

There is one difference between Jsonnet and other programming languages, and that's that everybody writes programs that terminate in a relatively short period of time, so one answer is that if you want to know what files affect the evaluation of a config, you should execute the config. Or, in the case of deciding whether or not a cached JSON value needs recomputing, save not just the JSON output but the list of files used to compute it.

sparkprime avatar Dec 10 '16 15:12 sparkprime

Thank you for the detailed explanation .

we won't be getting it back again

What about an explicit keyword or function to express it? like: importFrom(expression) , this way users explicitly chose to lose the static analysis property ?

ant31 avatar Dec 10 '16 22:12 ant31

I'm closing it, I've been able to do it easily via a native extension.

Thank you.

ant31 avatar Dec 17 '16 21:12 ant31

I suppose native extensions broke static analysis already :) Same as they do in Java etc.

sparkprime avatar Dec 19 '16 10:12 sparkprime

@sparkprime The stdlib is so limited and so opinionated that, basically, we're forced to write native extensions, which kind of defeats the purpose of using Jsonnet.

nikolay avatar Dec 15 '17 10:12 nikolay

@nikolay It's not about the stdlib, it's about the language. Hermiticity is a core principle of Jsonnet. The output shouldn't be affected by the environment. Jsonnet code shouldn't jump around the filesystem and read files. Import mechanism is designed for importing Jsonnet libraries etc. and not for arbitrary IO.

This guarantees that it will work the same on every machine and makes it much easier to reason about the code. But of course it restricts the structure of the solution - any input data must be prepared beforehand and passed to Jsonnet.

So it's sort of a "it's not a bug it's a feature" situation. That said, it would always be nice to make passing data to Jsonnet easier (and more obvious).

we're forced to write native extensions

Native extensions are not a proper solution here. It's a way to bypass the restrictions of the language. But there are reasons for these restrictions. Also native functions were supposed to be pure functions - for cases in which Jsonnet is not fast enough or a native library is used. Using them to perform IO is an abuse of the mechanism, IMO. It's good it worked for @ant31, but I don't think it's a recommended way.

(Related #422)

sbarzowski avatar Dec 15 '17 14:12 sbarzowski

If you were forced to write native functions to do things that were 1) pure and 2) relatively simple, like not regex matching or gzip compression or whatever, then there's a case for adding it to the stdlib instead.

Things that are pure, complex, but well-specified with pure implementations available in all languages (like gzip) then we could add them as builtins.

Things that have "writing" side effects should not be added as native functions by anyone. Things that have "reading" side effects could be added, as long as you're OK with it maybe reading them more times than you expect (or not at all). Typically this is OK for things like reading from the same dir (you will want to do an import cache to make it completely robust), but less OK when reading remotely.

Not having computed imports is somewhat orthogonal to all of this. It's essentially about knowing what production systems may be impacted by config refactorings. You can see this in Bazel as well - you have to predeclare what you might read. The idea is then it can always know what to rebuild.

When people ask for computed imports, I wonder what they're trying to do. We could definitely strengthen the way Jsonnet can take "input" data from outside. There was a proposal at one point to make it easier to do pipe-like functionality, e.g.

jsonnet pipe -e 'J + {x: 3}'

could be a syntax sugar for

jsonnet -e 'let J = import "/dev/stdin"; J + {x: 3}'

That gives you a jq-like functionality but it's also not that big a difference from the status quo.

sparkprime avatar Dec 15 '17 16:12 sparkprime

@sparkprime Well, that above is nice, but can't be done with importstr if you have more than external dynamic config file you want to import.

nikolay avatar Dec 18 '17 16:12 nikolay

So you have more than one dynamic config file, and you also need their filenames to be parameterized from outside of Jsonnet?

sparkprime avatar Dec 23 '17 12:12 sparkprime

Yes, @sparkprime. There are many legit use cases for it.

nikolay avatar Dec 23 '17 19:12 nikolay

What if there was a way to give a list of files on the commandline, and have them all accessible internally as a dict keyed by filename? You can do this now using --code-str but it'd be a clunky bash one-liner.

sparkprime avatar Jan 09 '18 15:01 sparkprime

@sparkprime That would definitely be an improvement over the status quo if you want to preserve the immutability at any price.

What I'm trying to accomplish is something along the lines of Kapitan and ksonnet, but using as much Jsonnet as possible and as little Bash as possible, too.

This approach would also allow the Bash script to concatenate/template-expand those configs and pass them to the Jsonnet compiler.

nikolay avatar Jan 09 '18 22:01 nikolay

@ant31 would that have helped for your original issue?

@hausdorff what do you think?

@dgarstang you once said you were continually bitten by this, what do you think?

sparkprime avatar Jan 10 '18 14:01 sparkprime

@sparkprime the approach of giving the files in the command line ? It would not solve my initial usecase.

I'm packaging all together kubernetes resources, import the json/yaml and perfom transformations with jsonnet:

manifest.jsonnet
resources/deployment.json
resources/svc.json
resouces
data/config.ini

I want to abstract as much jsonnet internal as possible to end-users to feel like 'plain' json. inside the manifest I have something like:

expand(
resources: {
  file: 'resources/deployment.yaml',
  name: 'toto-app'
 }, 
 {
   file: 'resources/svc.yaml',
   name: 'toto-app'
 }];
configs: [{
 file: 'data/config.ini'
}])

The structure for the end-user is easy and doesn't requires any knowledge about jsonnet expect the call to the 'expand' function.
the work in done internally:

expand(resources, configs):: 
 [createK8sResource(importstr(x.file)) for x in resources] + 
 [createConfigMap(importstr(config.file) for x in configs]

ant31 avatar Jan 11 '18 11:01 ant31

@ant31 In your case only a complete computed import string would do then.

sparkprime avatar Jan 11 '18 15:01 sparkprime

Unless I'm missing something providing an object with directory contents (possibly recursive), and have imported directories specified in the commandline would also work and be a bit easier to control. It would require handling the paths 100% on the jsonnet side (to get resources/svc.yaml it would require std.imported_dirs["resources"]["svc.yaml"]).

This is as strong as computed imports, but doesn't let jsonnet jump around the whole filesystem, only the specified part, hides the absolute paths and discourages libraries from silently depending on /dev/urandom and doing other un-jsonnety stuff.

Maybe that could even be integrated with the current extVar/tla mechanism.

What do you think?

sbarzowski avatar Jan 11 '18 20:01 sbarzowski

but doesn't let jsonnet jump around the whole filesystem

How this would be different in term of Hermiticity ?

ant31 avatar Jan 12 '18 13:01 ant31

but doesn't let jsonnet jump around the whole filesystem

How this would be different in term of Hermiticity ?

The caller of jsonnet command has control over what data gets passed to jsonnet. This changes a lot.

Consequences somewhat related to hermicity:

  • It provides a "upper bound" on what a jsonnet program depends, without running the program. You can be sure that unless contents of the directory change, you can zip it send somewhere, run jsonnet on a differrent machine with the unpacked zip and the result will be exactly the same.
  • Libraries cannot sneakily depend on global stuff (/etc/, /proc/). It's the easiest for library authors to take any data they need as arguments - which is the right thing. Or at the very least they need to explicitly tell what they depend on.
  • Even if some jsonnet program expects to get the whole / nobody can prevent you from passing something else (e.g. mounted volume or archived contents from some other machine) - it's no longer the environment in which jsonnet runs, it's just some data you pass to it.
  • You can actually see the directories you pass as a fancy way to represent json data.
  • The same data can be provided from sources other than the actual filesystem (e.g. json file) and jsonnet won't be able to tell a difference.

It's fairly safe feature to add:

  • It doesn't require any changes to jsonnet language semantics, only to the way data is loaded.
  • The behavior of jsonnet command stays exactly the same, unless you pass an additional option.

Proof of concept

In the previous section I said that:

  • You can actually see the directories you pass as a fancy way to represent json data.

So let's do just that. I wrote a little Go program https://github.com/sbarzowski/dir2json

You can use it to easily provide directory contents to jsonnet:

dir2json dir_you_want_to_pass_to_jsonnet > /tmp/dir.json # it's likely too big for a commandline argument, so I need a temporary file
jsonnet -e "function(directory) directory["my_file.txt"]" --tla-code-file="/tmp/dir.json"

The downside is that it reads the whole directory eagerly and stores it in a temporary file. And it's not exactly convenient.

More precise proposition

Let's add --ext-dir <var>=<path> and --tla-dir <var>=<path> which create an object out of the directory and pass it as extVar or top level argument respectively. Exactly like dir2json above, but it would load the file contents lazily.

So the example from PoC would become:

jsonnet -e "function(directory) directory["my_file.txt"]" --tla-dir="tmp/dir.json"

EDIT: probably std.parseJson and maybe std.parseYaml would be needed for this feature to reach its full potential.

sbarzowski avatar Jan 13 '18 17:01 sbarzowski

I mentioned a use for this in #479, as automatically noted. Another use that I found recently was importing a set of variables. Consider:

local DmConfig(p) = {
  local defaults = {
    dm_path:: error "must override dm_path",

    y_coeffs: import (self.dm_path + "/y_coeffs.json"),
    u_coeffs: import (self.dm_path + "/u_coeffs.json"),
    v_coeffs: import (self.dm_path + "/v_coeffs.json"),
  },

  { "out": defaults + p }
}

{ ... DmConfig({dm_path: "test1/dm"}) ... }

(The reason I specify defaults + p here, rather than directly overriding things here, is #217, which I forgot about until I picked up this project again... but it is just as illustrative if you strip off the function generation there.)

Without a computed import, we have to repeat the path to the coefficient includes three times.

jwise avatar Mar 28 '18 22:03 jwise

I like the idea of "hermeticity" and "explicitness" of imports too. Still it felt like a waste of resources somewhat to prepare a "directory index" for a list of Grafana dashboards I had to wrap in k8s Objects.

Did you consider a variation of "dir2json" or a "builtin" to make "filesystem snapshots" that are lazy? Somewhat like this, still very explicit:

#!/usr/bin/env jsonnet                                                                                                                            
local fs = {
  // Try s/importstr/import/ here! Hint: it wont terminate.                                                                                       
  "fs.jsonnet": importstr "fs.jsonnet",
  "a.txt": importstr "a.txt",
  "b.txt": importstr "b.txt",
  "sub": {
    "c.txt": importstr "sub/c.txt",
    "d.txt": importstr "sub/d.txt"
  },
};

fs

Maybe there could be a "builtin" to Import a static list or a tree like above to reduce boilerplate? Also like this example shows it is usually the "eval" that is "evil" for analysis. As long as the imported strings are not interpreted (parsed into Jsonnet Code?) it is just data in bytes after all.

alexei-matveev avatar Jun 17 '21 20:06 alexei-matveev

As long as the imported strings are not interpreted (parsed into Jsonnet Code?) it is just data in bytes after all.

I agree. I would be on board with importdir which imports a directory structure with each directory being an object and each file being a string. It would be lazy, at least with regard to loading the actual files.

So you could do something like: importdir "my_directory/" and it would load everything from there.

This would require extending the imports abstractions slightly to allow producing directory listings and it could be a bit annoying to implement. Also opens questions about handling of hidden files.

Did you consider a variation of "dir2json" or a "builtin" to make "filesystem snapshots" that are lazy?

Hmm.... Not sure I understand. What would this tool do exactly? Is it the same as importdir? Or do you suggest creating files like in your example automatically. It would be very easy to create an external tool to do that.

sbarzowski avatar Jun 17 '21 21:06 sbarzowski

suggest creating files like in your example automatically

An easy/standard way to create "data structures" eventually like the one returned in the example should probably be the goal. Files with textual representations are overrated.

Yes, I think you just described with "importdir" the kind of builtin most commenters would be happy with, including me. And yes, without such "importdir" many will likely end up with ad-hoc tools creating files/structures like in the example above. I am undecided which is better when I think of "explicit is better that implicit" though.

Disclaimer: I am a happy user of jsonnet since about just a few months. Thanks you for caring about it, btw!

alexei-matveev avatar Jun 17 '21 23:06 alexei-matveev

importdir could be as simple as producing a map that's just the filename -> content.

importdir './dir/' => {
    "file.txt": "content",
    "config.toml": "value = 2",
}

Jezza avatar Jul 22 '21 09:07 Jezza

@Jezza Yes, that's basically the idea. There are still some complications, though:

  1. There might be further directories, so it should probably be recursive.
  2. We support importing from things which are not traditional filesystems. We have an abstract interface for that. We would need to extend the interface so that listing things from a directory is possible. Right now, we don't even have a concept of a directory. We'll need to figure out the guarantees, e.g. (importdir 'a/b')['foo'] == importstr 'a/b/foo' – but we don't know what separator to use, so....
  3. We need to implement it in all interpreters.
  4. We'll need to support it ~forever, so a stupid design mistake can be very costly.

Because of that, a file generation approach would take at least 10x less work IMO. This can be done with very little commitment and as a single, separate tool. And it doesn't require any knowledge of the internals. Ofc. it will work only with the actual file systems, not with the abstract importers.

sbarzowski avatar Jul 24 '21 20:07 sbarzowski

Any further interest in carrying this issue forward? It seems like a desirable and one without many alternatives other than creating a mapping of all file paths yourself.

jacksongoode avatar Oct 31 '22 20:10 jacksongoode

How do people feel about a separate commandline tool for turning directory contents into jsonnet import manifests? We could provide this like jsonnetfmt -- a separate tool but part of the release and maintained.

sparkprime avatar Nov 03 '22 12:11 sparkprime

It's not about the stdlib, it's about the language.

commandline tool for turning directory contents into jsonnet import manifests?

Another possibility to keep it out of the core language would be a command line flag like

jsonnet --ext-DIR lazyDir=./fs/ ...

or a variation thereof making lazyDir accessible in the code. It is probbaly not much different from this use case of hypothetical tool for generating "import manifests":

jsonnet --ext-code lazyDir="$(jsonnet-codegen-import-manifests ./fs/)" ...

I must add the whole issue does not seem a big deal to me anymore --- far less pressing that it seemed originally.

alexei-matveev avatar Nov 22 '22 17:11 alexei-matveev

@alexei-matveev I'd actually be very fascinated to know why your attitude changed as it would give insight into how important this feature is

sparkprime avatar Nov 22 '22 20:11 sparkprime

Not much to report, actually. The isssue did not prove to be a "cost center" when manually maintaining the catalog of those Grafana dashboards all sitting in the same directory. Adding or removing an entry to such catalog was not a big deal compared to all the rest and we managed to stay with a single catalog so far. We are not quite "web scale" yet :-)

Off-topic: i spent much more time converting YAML with comments to Jsonnet and preserving comments. Maybe jsonnetfmt could learn that trick?

alexei-matveev avatar Nov 22 '22 22:11 alexei-matveev