corral icon indicating copy to clipboard operation
corral copied to clipboard

Use dependency format similar to npm

Open hardliner66 opened this issue 4 years ago • 29 comments

I copied this Issue from https://github.com/ponylang/pony-stable/issues/33 because it is also applicable to corral.

Currently the dependencies are listed in an array. With this its possible to list the same dependency twice, even with different tags.

Dependencies should be unique with a set of properties, so an object would be better.

hardliner66 avatar May 07 '20 07:05 hardliner66

With an object wouldn't it still be possible to list the same dependency twice but one would override the other? I agree dependencies should be unique and the corral add command won't let you add the same dependency twice but for the file format itself, I don't see how object prevents me from having the same key twice in the text of the file.

SeanTAllen avatar May 07 '20 12:05 SeanTAllen

Well, with an array I can have duplicates.

This is valid json, but I don't know which version will be used:

[
	{
		"name": "a",
		"version": "1.2.3"
	},
	{
		"name": "a",
		"version": "2.3.4"
	}
]

~~With an object this can't happen, because the keys need to be unique. If they are not, the file is not a valid json file.~~

I think it would be better to use the proper json structure to make it clearer that dependencies are supposed to be unique.

Edit: Apparently keys in JSON objects do not have to be unique, but it is recommended. I still think an object would better represent how the dependencies are used.

hardliner66 avatar May 07 '20 12:05 hardliner66

Let me understand the suggestion better:

you want a structure like this, where the name of the dependency is the key?

{
  "a": {
     "version": "1.2.3"
  }
}

In corral terms the locator string should be the key here, right?

mfelsche avatar May 07 '20 12:05 mfelsche

I can also do this if I am editing by hand, correct?

{
	"a": {
		"version": "1.2.3"
	},
	"a" : {
		"version": "2.3.4"
	}
}

SeanTAllen avatar May 07 '20 12:05 SeanTAllen

I'm not sure in an object what the key would be. One of the goals of corral is to be able to use multiple versions of the same dependency within a program, but in different modules.

SeanTAllen avatar May 07 '20 12:05 SeanTAllen

@mfelsche Yeah, that seems more correct.

@SeanTAllen Sure, but most tools already handle duplicate keys. VSCode tells you out of the box that you have a duplicate key. grafik

Edit: Also, doesn't the use of different versions of a dependency inside a project lead to more problems than it solves?

hardliner66 avatar May 07 '20 12:05 hardliner66

Also, doesn't the use of different versions of a dependency inside a project lead to more problems than it solves?

One of Pony's stated goals for the module system is to support multiple versions of the same dependency. You can argue about trade-offs you want to make. This is the tradeoff that we as a project want to make. The other side of that trade-off is a situation like Java where you can update to new version of Module A because it uses Module C version 2 and you also are using Module B that needs Module C version 1.

It's all trade-offs. Allowing independent modules to not have to use the same version of a common dependency is something we've decided is desirable to us and will work to make sure our module system allows. It was an early goal with Pony and continues to be.

SeanTAllen avatar May 07 '20 13:05 SeanTAllen

I think that makes sense. But how do you decide which version to import? Wouldn't it be better that each module would have their own corral.json with their own dependencies instead of putting everything into the top level?

hardliner66 avatar May 07 '20 13:05 hardliner66

@hardliner66 that's an excellent point. I think the problem here is there's a gray area that needs to be explored. If a bundle.json is for a project that is a single module, I think its pretty straightforward, module A can only include a single dependency for module B. it could get other versions transitively, but that isn't a worry for bundle.json format.

What I think needs to be looked into is how this would work if the bundle.json is for a project that covers multiple modules. I had a couple conversations about this with @cquinn before he passed away. At the time, what he was telling me went completely over my head.

I'm not opposed to original change. I suspect it would be ok for the key to be the locator for the project. However, I think we need to look through Carl's write ups for Corral (including his pre-corral package manager) write up and make sure that what we are doing fits within the vision.

Carl put a lot of good thought into the vision to make it work with our stated desires and goals. At this point in time, I don't think anyone is up to speed with nuances such as the "single project, multiple modules" possibility.

SeanTAllen avatar May 07 '20 14:05 SeanTAllen

During sync we discussed this and decided a couple of things:

1 - we will making the breaking format change to corral to move from an array of things to an object 1a - the unique key will be the item that is currently in the locator (locator will continue to remain as a field as well) 2- when making decisions, we will assume that corral controls all edits to a corral.json and not worry about people editing and introducing errors, therefore we will be strict in parsing the json error out at anything that isn't "normal corral style".

SeanTAllen avatar Jun 02 '20 19:06 SeanTAllen

Sorry I am late to the discussion, but having the dependency name as the key is generally not a good idea, as schemas and software that operates from a schema will become a lot more "messy".

In this case, the "messiness" is not that much, but if this is a general trend then it quickly becomes a burden.

For instance, I am working on a "ponyhub" search engine, written in Java (because I am so fluent in it), and with the previous format, I can simply do;

public class CorralDescriptor
{
    private List<Dependency> deps = new ArrayList<>();
    private Info info;
 : 
    public static class Dependency
    {
        private String locator;
        private String version;
     : 
    }
    public static class Info
    {
        private String name;
     : 
    }
}

and then use a so called Object Mapper to parse the JSON into type safe objects.

Mapper mapper = new MapperBuilder().build();
CorralDescriptor cd = mapper.readObject( corralFileContent, CorralDescriptor.class );

and all the heavy lifting is take care of.

With this change, it will be a Map<String, VersionDescriptor>. In this particular case, this is not detrimental, but it sets a precedence that is unfortunate. Schemas are valuable, just like type-safe languages.

niclash avatar Jun 03 '20 02:06 niclash

To be clear, the change will be from the following example:

{
  "deps": [
    {
      "locator": "github.com/ponylang/net-ssl.git",
      "version": "1.0.0"
    },
    {
      "locator": "github.com/ponylang/regex.git",
      "version": "1.0.0"
    }
  ]
}

to the new format:

{
  "deps": {
    "github.com/ponylang/net-ssl.git" : {
      "locator": "github.com/ponylang/net-ssl.git",
      "version": "1.0.0"
    },
    "github.com/ponylang/regex.git" : {
      "locator": "github.com/ponylang/regex.git",
      "version": "1.0.0"
    }
  }
}

SeanTAllen avatar Jun 03 '20 21:06 SeanTAllen

Sorry, but isn't this the worst of both? If the change is to aid manual editing, this format doesn't help and the static type is less neat. If the "help" is that corral is reporting a mismatch between name and locator value, then corral could just as simply report that the list contains multiple entries of the same. What am I missing?

niclash avatar Jun 04 '20 00:06 niclash

@niclash

"2- when making decisions, we will assume that corral controls all edits to a corral.json and not worry about people editing and introducing errors, therefore we will be strict in parsing the json error out at anything that isn't "normal corral style"."

SeanTAllen avatar Jun 04 '20 01:06 SeanTAllen

Ok, then what is the actual reason for having a non-static schema format?

niclash avatar Jun 04 '20 01:06 niclash

How is it a "non-schema" format?

SeanTAllen avatar Jun 04 '20 01:06 SeanTAllen

Languages that are statically typed, and have "Reflection/Introspection" allows automatic mapping of json to/from languages types.

niclash avatar Jun 04 '20 01:06 niclash

I don't know what you mean, sorry. It's valid Json.

SeanTAllen avatar Jun 04 '20 01:06 SeanTAllen

Did you read/understand https://github.com/ponylang/corral/issues/105#issuecomment-637911034?

niclash avatar Jun 04 '20 01:06 niclash

Yes.

SeanTAllen avatar Jun 04 '20 01:06 SeanTAllen

Take GitHub's API for instance, and look at https://github.com/niclash/pony-hub/tree/master/src/main/java/io/bali/ponyhub/repositories/github

All the Rest API docs are mapped to Java types, with language access, rather than "Map lookup".

niclash avatar Jun 04 '20 01:06 niclash

ObjectMapper mapper = new ObjectMapper();

GitHubRepository repo = mapper.readValue(json, GitHubRepository.class);

niclash avatar Jun 04 '20 01:06 niclash

I don't see the problem with Map<String, VersionDescriptor>.

Consider the future existance of a package repository. In this case the simplest entry you can have is:

{
    "deps": {
        "dependencyName": "versionString"
    }
}

which would be just a shorter version of:
```json
{
    "deps": {
        "dependencyName": {
            "version": "versionString"
        }
    }
}

And as I said, using a map communicates better that the keys should not be duplicates. This is already integrated in most IDEs or editors.

@SeanTAllen I had something like the following in mind:

{
  "deps": {
    "net-ssl" : {
      "locator": "github.com/ponylang/net-ssl.git",
      "version": "1.0.0"
    },
    "regex" : {
      "locator": "github.com/ponylang/regex.git",
      "version": "1.0.0"
    }
  }
}

hardliner66 avatar Jun 16 '20 07:06 hardliner66

@hardliner66 the problem is that "regex" as a key isn't unique. nor is "net-ssl", thus using the locator as the key.

SeanTAllen avatar Jun 16 '20 18:06 SeanTAllen

I mentioned in the sync call today, that I think this change actually makes things somewhat worse with respect to being able to prevent accidental duplication of the dependencies

That is, if we move from an array format like this:

{
  "deps": [
    {
      "locator": "github.com/ponylang/net-ssl.git",
      "version": "1.0.0"
    },
    {
      "locator": "github.com/ponylang/regex.git",
      "version": "1.0.0"
    }
  ]
}

to a mapping format like this:

{
  "deps": {
    "github.com/ponylang/net-ssl.git" : {
      "locator": "github.com/ponylang/net-ssl.git",
      "version": "1.0.0"
    },
    "github.com/ponylang/regex.git" : {
      "locator": "github.com/ponylang/regex.git",
      "version": "1.0.0"
    }
  }
}

Then duplicate keys will be swallowed by the JSON parser, rather than being exposed to corral's internals. If we care about preventing duplication, the only way to prevent it is in corral's internals because JSON parsers (usually, though this varies from one parser implementation to another) will silently swallow/ignore duplicate keys rather than preventing them.

jemc avatar Jun 16 '20 18:06 jemc

I think there's a bit of confusion as to the problem that package.json tries to solve. The key of the JSON object (called the package name) specifies which package should be used for node_modules (i.e. a package "regex" goes into node_modules/regex). Usually this will match a package name and version (from the value) in a package registry -- by default the public NPM registry --; but you can specify different versions, for example a SCV such as a Git repository and commit hash.

I think the biggest part of the confusion is because usually in NPM we think of local package names and remote package names as the same thing. It doesn't have duplication only because the registry itself enforces that uploaded packages cannot have conflicting names. However, we can actually force duplication of packages:

$ npm init -y
$ npm install lodash
$ npm install lodash2@npm:[email protected]
$ npm install lodash3@npm:[email protected]
$ jq .dependencies package.json
{
  "lodash": "^4.17.15",
  "lodash2": "npm:lodash@^4.17.15",
  "lodash3": "npm:lodash@^4.17.15"
}
$ ls node_modules
lodash
lodash2
lodash3

In the case of Corral, there's no central registry handling package name conflicts, i.e. saying that regex canonically refers to github.com/ponylang/regex.git. (As of now, the closest projects to filling this role of "central package registry" are main.actor and Niclas' ponyhub, both of which are currently WIP)


TL;DR: The way I see it, the heart of the issue lies in not having a central repository/registry for Corral dependencies. Otherwise, potential duplication is unavoidable.

OTOH, if the issue is actually about moving from array-form to object-form, Sean's solution seems to be the best for me. It avoids any unnecessary and/or breakable name-mangling.

EpicEric avatar Jun 16 '20 19:06 EpicEric

@jemc I'm 100% fine with saying that we happen to use json (for now). That bundle.json is a private implementation detail of corral and that all validation is done in corral and leave the format as is.

That fits with our previous statement during sync that we all agreed on that we do not consider bundle.json to be end-user editable or consumable and would not optimize for that.

SeanTAllen avatar Jun 16 '20 20:06 SeanTAllen

If it is not to be consumed by external tools, then that puts additional strain on corral binary to be able to print machine readable information (preferably in a formal format, say yaml or json) for external tools (such as my https://ponyhub.bali.io) to read. And then you are back to square one, as this format could just as well be the file format. Not having an official format (either in file or as output) is worse than any json structure.

niclash avatar Jun 17 '20 02:06 niclash

It's hard to say which way is better, even the design of the Cargo and npm are rated as bad. This is the radical approach of deno's designers:

import { serve } from "https://deno.land/[email protected]/http/server.ts";
const s = serve({ port: 8000 });
console.log("http://localhost:8000/");
for await (const req of s) {
  req.respond({ body: "Hello World\n" });
}

This leads to too many details that need to be discussed.

  • Reference uniqueness
  • corral.json format
  • How to cooperate with corral web front
  • Cross-references and multiple versions of indirect references
  • Current ponyc version adaptation.Some packages may use a higher version of the ponyc feature.
  • How to avoid npm-style dependency hell
  • How to deal with the removal of certain packages by authors? It caused the downstream tool chain to not work. (For corral, the problem may be that the author deleted the git repo)

damon-kwok avatar Jun 17 '20 03:06 damon-kwok