corral
corral copied to clipboard
Use dependency format similar to npm
I copied this Issue from https://github.com/ponylang/pony-stable/issues/33 because it is also applicable to corral.
Currently the dependencies are listed in an array. With this its possible to list the same dependency twice, even with different tags.
Dependencies should be unique with a set of properties, so an object would be better.
With an object wouldn't it still be possible to list the same dependency twice but one would override the other? I agree dependencies should be unique and the corral add
command won't let you add the same dependency twice but for the file format itself, I don't see how object prevents me from having the same key twice in the text of the file.
Well, with an array I can have duplicates.
This is valid json, but I don't know which version will be used:
[
{
"name": "a",
"version": "1.2.3"
},
{
"name": "a",
"version": "2.3.4"
}
]
~~With an object this can't happen, because the keys need to be unique. If they are not, the file is not a valid json file.~~
I think it would be better to use the proper json structure to make it clearer that dependencies are supposed to be unique.
Edit: Apparently keys in JSON objects do not have to be unique, but it is recommended. I still think an object would better represent how the dependencies are used.
Let me understand the suggestion better:
you want a structure like this, where the name of the dependency is the key?
{
"a": {
"version": "1.2.3"
}
}
In corral terms the locator string should be the key here, right?
I can also do this if I am editing by hand, correct?
{
"a": {
"version": "1.2.3"
},
"a" : {
"version": "2.3.4"
}
}
I'm not sure in an object what the key would be. One of the goals of corral is to be able to use multiple versions of the same dependency within a program, but in different modules.
@mfelsche Yeah, that seems more correct.
@SeanTAllen Sure, but most tools already handle duplicate keys. VSCode tells you out of the box that you have a duplicate key.
Edit: Also, doesn't the use of different versions of a dependency inside a project lead to more problems than it solves?
Also, doesn't the use of different versions of a dependency inside a project lead to more problems than it solves?
One of Pony's stated goals for the module system is to support multiple versions of the same dependency. You can argue about trade-offs you want to make. This is the tradeoff that we as a project want to make. The other side of that trade-off is a situation like Java where you can update to new version of Module A because it uses Module C version 2 and you also are using Module B that needs Module C version 1.
It's all trade-offs. Allowing independent modules to not have to use the same version of a common dependency is something we've decided is desirable to us and will work to make sure our module system allows. It was an early goal with Pony and continues to be.
I think that makes sense. But how do you decide which version to import? Wouldn't it be better that each module would have their own corral.json with their own dependencies instead of putting everything into the top level?
@hardliner66 that's an excellent point. I think the problem here is there's a gray area that needs to be explored. If a bundle.json is for a project that is a single module, I think its pretty straightforward, module A can only include a single dependency for module B. it could get other versions transitively, but that isn't a worry for bundle.json format.
What I think needs to be looked into is how this would work if the bundle.json is for a project that covers multiple modules. I had a couple conversations about this with @cquinn before he passed away. At the time, what he was telling me went completely over my head.
I'm not opposed to original change. I suspect it would be ok for the key to be the locator for the project. However, I think we need to look through Carl's write ups for Corral (including his pre-corral package manager) write up and make sure that what we are doing fits within the vision.
Carl put a lot of good thought into the vision to make it work with our stated desires and goals. At this point in time, I don't think anyone is up to speed with nuances such as the "single project, multiple modules" possibility.
During sync we discussed this and decided a couple of things:
1 - we will making the breaking format change to corral to move from an array of things to an object 1a - the unique key will be the item that is currently in the locator (locator will continue to remain as a field as well) 2- when making decisions, we will assume that corral controls all edits to a corral.json and not worry about people editing and introducing errors, therefore we will be strict in parsing the json error out at anything that isn't "normal corral style".
Sorry I am late to the discussion, but having the dependency name as the key is generally not a good idea, as schemas and software that operates from a schema will become a lot more "messy".
In this case, the "messiness" is not that much, but if this is a general trend then it quickly becomes a burden.
For instance, I am working on a "ponyhub" search engine, written in Java (because I am so fluent in it), and with the previous format, I can simply do;
public class CorralDescriptor
{
private List<Dependency> deps = new ArrayList<>();
private Info info;
:
public static class Dependency
{
private String locator;
private String version;
:
}
public static class Info
{
private String name;
:
}
}
and then use a so called Object Mapper to parse the JSON into type safe objects.
Mapper mapper = new MapperBuilder().build();
CorralDescriptor cd = mapper.readObject( corralFileContent, CorralDescriptor.class );
and all the heavy lifting is take care of.
With this change, it will be a Map<String, VersionDescriptor>
. In this particular case, this is not detrimental, but it sets a precedence that is unfortunate. Schemas are valuable, just like type-safe languages.
To be clear, the change will be from the following example:
{
"deps": [
{
"locator": "github.com/ponylang/net-ssl.git",
"version": "1.0.0"
},
{
"locator": "github.com/ponylang/regex.git",
"version": "1.0.0"
}
]
}
to the new format:
{
"deps": {
"github.com/ponylang/net-ssl.git" : {
"locator": "github.com/ponylang/net-ssl.git",
"version": "1.0.0"
},
"github.com/ponylang/regex.git" : {
"locator": "github.com/ponylang/regex.git",
"version": "1.0.0"
}
}
}
Sorry, but isn't this the worst of both? If the change is to aid manual editing, this format doesn't help and the static type is less neat. If the "help" is that corral is reporting a mismatch between name and locator value, then corral could just as simply report that the list contains multiple entries of the same. What am I missing?
@niclash
"2- when making decisions, we will assume that corral controls all edits to a corral.json and not worry about people editing and introducing errors, therefore we will be strict in parsing the json error out at anything that isn't "normal corral style"."
Ok, then what is the actual reason for having a non-static schema format?
How is it a "non-schema" format?
Languages that are statically typed, and have "Reflection/Introspection" allows automatic mapping of json to/from languages types.
I don't know what you mean, sorry. It's valid Json.
Did you read/understand https://github.com/ponylang/corral/issues/105#issuecomment-637911034?
Yes.
Take GitHub's API for instance, and look at https://github.com/niclash/pony-hub/tree/master/src/main/java/io/bali/ponyhub/repositories/github
All the Rest API docs are mapped to Java types, with language access, rather than "Map lookup".
ObjectMapper mapper = new ObjectMapper();
GitHubRepository repo = mapper.readValue(json, GitHubRepository.class);
I don't see the problem with Map<String, VersionDescriptor>
.
Consider the future existance of a package repository. In this case the simplest entry you can have is:
{
"deps": {
"dependencyName": "versionString"
}
}
which would be just a shorter version of:
```json
{
"deps": {
"dependencyName": {
"version": "versionString"
}
}
}
And as I said, using a map communicates better that the keys should not be duplicates. This is already integrated in most IDEs or editors.
@SeanTAllen I had something like the following in mind:
{
"deps": {
"net-ssl" : {
"locator": "github.com/ponylang/net-ssl.git",
"version": "1.0.0"
},
"regex" : {
"locator": "github.com/ponylang/regex.git",
"version": "1.0.0"
}
}
}
@hardliner66 the problem is that "regex" as a key isn't unique. nor is "net-ssl", thus using the locator as the key.
I mentioned in the sync call today, that I think this change actually makes things somewhat worse with respect to being able to prevent accidental duplication of the dependencies
That is, if we move from an array format like this:
{
"deps": [
{
"locator": "github.com/ponylang/net-ssl.git",
"version": "1.0.0"
},
{
"locator": "github.com/ponylang/regex.git",
"version": "1.0.0"
}
]
}
to a mapping format like this:
{
"deps": {
"github.com/ponylang/net-ssl.git" : {
"locator": "github.com/ponylang/net-ssl.git",
"version": "1.0.0"
},
"github.com/ponylang/regex.git" : {
"locator": "github.com/ponylang/regex.git",
"version": "1.0.0"
}
}
}
Then duplicate keys will be swallowed by the JSON parser, rather than being exposed to corral's internals. If we care about preventing duplication, the only way to prevent it is in corral's internals because JSON parsers (usually, though this varies from one parser implementation to another) will silently swallow/ignore duplicate keys rather than preventing them.
I think there's a bit of confusion as to the problem that package.json
tries to solve. The key of the JSON object (called the package name) specifies which package should be used for node_modules
(i.e. a package "regex"
goes into node_modules/regex
). Usually this will match a package name and version (from the value) in a package registry -- by default the public NPM registry --; but you can specify different versions, for example a SCV such as a Git repository and commit hash.
I think the biggest part of the confusion is because usually in NPM we think of local package names and remote package names as the same thing. It doesn't have duplication only because the registry itself enforces that uploaded packages cannot have conflicting names. However, we can actually force duplication of packages:
$ npm init -y
$ npm install lodash
$ npm install lodash2@npm:[email protected]
$ npm install lodash3@npm:[email protected]
$ jq .dependencies package.json
{
"lodash": "^4.17.15",
"lodash2": "npm:lodash@^4.17.15",
"lodash3": "npm:lodash@^4.17.15"
}
$ ls node_modules
lodash
lodash2
lodash3
In the case of Corral, there's no central registry handling package name conflicts, i.e. saying that regex
canonically refers to github.com/ponylang/regex.git
. (As of now, the closest projects to filling this role of "central package registry" are main.actor and Niclas' ponyhub, both of which are currently WIP)
TL;DR: The way I see it, the heart of the issue lies in not having a central repository/registry for Corral dependencies. Otherwise, potential duplication is unavoidable.
OTOH, if the issue is actually about moving from array-form to object-form, Sean's solution seems to be the best for me. It avoids any unnecessary and/or breakable name-mangling.
@jemc I'm 100% fine with saying that we happen to use json (for now). That bundle.json is a private implementation detail of corral and that all validation is done in corral and leave the format as is.
That fits with our previous statement during sync that we all agreed on that we do not consider bundle.json to be end-user editable or consumable and would not optimize for that.
If it is not to be consumed by external tools, then that puts additional strain on corral binary to be able to print machine readable information (preferably in a formal format, say yaml or json) for external tools (such as my https://ponyhub.bali.io) to read. And then you are back to square one, as this format could just as well be the file format. Not having an official format (either in file or as output) is worse than any json structure.
It's hard to say which way is better, even the design of the Cargo
and npm
are rated as bad. This is the radical approach of deno's designers:
import { serve } from "https://deno.land/[email protected]/http/server.ts";
const s = serve({ port: 8000 });
console.log("http://localhost:8000/");
for await (const req of s) {
req.respond({ body: "Hello World\n" });
}
This leads to too many details that need to be discussed.
- Reference uniqueness
- corral.json format
- How to cooperate with corral web front
- Cross-references and multiple versions of indirect references
- Current ponyc version adaptation.Some packages may use a higher version of the ponyc feature.
- How to avoid npm-style dependency hell
- How to deal with the removal of certain packages by authors? It caused the downstream tool chain to not work. (For corral, the problem may be that the author deleted the git repo)