Decentralised package discovery and fetching for C3

Open joshring opened this issue 8 months ago • 29 comments

We could create a simple API and database to aggregate c3 projects.

A C3 project would be identified by having a topic assigned to its repo, e.g. c3-module.

A similar idea already exists for Zig.

https://zigistry.dev/ shows the output, and https://zigistry.dev/help/ explains how the topics are assigned to a repo.

Todo: searching via the GitHub API. Initial search: https://stackoverflow.com/questions/51912862/how-to-list-all-repositories-for-a-given-topic-with-rest-api-in-github

Decentralised in the sense that we don't have to manage those projects: registration, emails, password resets, who gets updated, etc. But we could add a filter for malicious projects that we do not wish to allow.
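
As a rough illustration of the topic search mentioned above, here is a minimal Python sketch. It assumes GitHub's REST repository search endpoint (`/search/repositories` with a `topic:` qualifier) and the `c3-module` topic from this post; the function names are hypothetical, and the sketch only builds the request URL and summarizes an already-fetched JSON response rather than performing any network call.

```python
from urllib.parse import urlencode

# GitHub's REST endpoint for repository search.
GITHUB_SEARCH_URL = "https://api.github.com/search/repositories"


def build_topic_search_url(topic: str, page: int = 1, per_page: int = 50) -> str:
    """Build a search URL listing repositories tagged with the given topic."""
    query = {"q": f"topic:{topic}", "page": page, "per_page": per_page}
    return f"{GITHUB_SEARCH_URL}?{urlencode(query)}"


def summarize_results(response: dict) -> list[dict]:
    """Reduce a decoded search response to the fields an index page would show."""
    return [
        {
            "name": item["full_name"],
            "url": item["html_url"],
            "stars": item.get("stargazers_count", 0),
        }
        for item in response.get("items", [])
    ]


if __name__ == "__main__":
    print(build_topic_search_url("c3-module"))
```

An aggregator would fetch each page of that URL, pass the decoded JSON to `summarize_results`, and apply whatever filter for unwanted projects it maintains.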

joshring avatar Apr 06 '25 13:04 joshring

I'm confused about how the GitHub API is decentralized, but regardless, there's an idea I'd love to see made manifest in C3's package manager: Unison's content addressing. To pull the point directly from the article: "No dependency conflicts".

devin12422 avatar Apr 07 '25 03:04 devin12422

Decentralised in the sense that we don't have to manage those projects: registration, emails, password resets, who gets updated, etc. But we could add a filter for malicious projects that we do not wish to allow.

Thanks for the link, I'll have a look later on.

joshring avatar Apr 07 '25 12:04 joshring

Since it is decentralised, I hope you guys won't use GitHub for this (as it is not decentralised). Git is, GitHub is not.

As for the rest, I hope it will not end up like Rust's crates, or Node.js's NPMs. That is an utter mess. People will abuse it and create unnecessary "crates" of <10 LOC just for the sake of stars and all that. Gotta think of the incentives. People do it for the stars and all that. I think this will require some serious brainstorming. Get the incentives wrong, and you end up with a mess that npm or even Rust's crates have.

Do not get me wrong, these websites look neat, but we do not need yet another npm nightmare. Best to think this through thoroughly early.

Ccocconut avatar Jul 03 '25 00:07 Ccocconut

Something like Sourcegraph's code search is better because it indexes code repositories from multiple sites, like Github, Gitlab, etc. I would recommend curating the software packages, and allow bringing them in from any git repo hosted on the web. A few issues can arise from relying on decentralized Git hosting services, like the possibility of them shutting down unexpectedly, or lack of content moderation. The most decentralized software forge I know of is Radicle. If you're going to make a centralized package manager/registry, I recommend namespaces to solve the naming issues.

Also, could you elaborate on what the biggest problems are with crates.io/npm and how to solve them? Crates.io is better than npm... Rust supports making your own registry, too. Just gotta figure out a way to reduce the number of dependencies per package, and handle malicious packages and package removal problems. C3 doesn't have a slow compilation time like Rust, so maybe you wouldn't have to distribute binaries for packages.

Another thing I would like to add is that making a package manager right now isn't a super high priority because C3 still has breaking changes. The closest thing C3 has to one right now is vendor-fetch, but it doesn't work on my machine. Forcing people to clone other people's git repositories ensures that they actually test and read the code that they use, especially in the early stages of C3's ecosystem.

waveproc avatar Jul 27 '25 02:07 waveproc

Unfortunately I do not have a solution, but I think the biggest issue is people publishing junk (and/or malicious code) that people blindly depend on. I do not think there is an easy solution to this. Not incentivizing publishing packages sounds good to me as it may deter people from creating packages for one-liners, which I think is silly. It does not deter malicious people, of course.

Forcing people to clone other people's git repositories ensures that they actually test and read the code that they use, especially in the early stages of C3's ecosystem.

I think this is the right way, especially for now.

C3 doesn't have a slow compilation time like Rust

Thank goodness. Every Rust project I "cargo build" ends up pulling >100 dependencies and it takes ages to build.

Ccocconut avatar Jul 28 '25 10:07 Ccocconut

I propose introducing support for a reqs.toml or reqs.json file to define and manage external dependencies in a structured and standardized format.

This dedicated file would serve as a version-controlled manifest of required external packages, similar to other ecosystems but without a central registry.

This enables some interesting behaviors:

  • Dependencies are simply arbitrary Git repositories or tarballs;
  • Not having a registry would reduce the incentive for low-effort packages;
  • Dependency declarations can adopt flexible schema formats, similar to those supported by Bun for non-npm sources;
  • A commit hash could be required to avoid ambiguity and potential semver-related issues;
  • Another possible feature could be the ability to target a specific branch and use Git submodules to check out a source tree directly, making it easier to maintain and work with forks.
  • Projects could also override metadata from unmaintained packages (e.g., fields in package.json), or specify a custom entry point to treat the dependency as a CLI or tool.

For context, here are some examples from the Bun documentation:

{
  "dependencies": {
    "dayjs": "git+https://github.com/iamkun/dayjs.git",
    "lodash": "git+ssh://github.com/lodash/lodash.git#4.17.21",
    "moment": "git@github.com:moment/moment.git",
    "zod": "github:colinhacks/zod",
    "react": "https://registry.npmjs.org/react/-/react-18.2.0.tgz",
    "bun-types": "npm:@types/bun"
  }
}

A potential reqs.toml could look like:

[math-utils]
repo = "https://github.com/example/math-utils.git"
override-opt = "O3"

[json-parser]
repo = "https://git.example.org/libs/json-parser.git"
entry-file = "parser.c3"
checkout-branch = "dev"

wfzyx avatar Jul 29 '25 02:07 wfzyx

Is it a good idea to make it NOT VCS-agnostic? Genuine question. I would prefer it to be VCS-agnostic somehow, or maybe "repo" should accept any supported VCS? We may need a format to differentiate between SVN, Git, Mercurial, Darcs, Pijul, etc.

However, I am all up for "reduce the incentive for low-effort packages", and "commit hash could be required to avoid ambiguity and potential semver-related issues", for example. Some of the rest is heavily git-specific, and I wonder if we could make it less dependent on Git. I think Git is great and I use Git only, but yeah...
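
One way to differentiate VCSs in a "repo" value is a prefix on the URL scheme, as in the `git+https://` entries of the Bun example earlier in the thread (pip uses the same convention with `git+`, `hg+`, `svn+` and `bzr+`). The following Python sketch is only an illustration of that idea; the set of accepted prefixes and the function name are assumptions.

```python
# VCS prefixes this sketch accepts; the exact set is an assumption.
KNOWN_VCS = {"git", "hg", "svn", "darcs", "pijul"}


def split_vcs_url(spec: str) -> tuple:
    """Split 'vcs+transport://rest' into (vcs, 'transport://rest').

    A plain URL without a prefix is returned as (None, spec), which a
    fetcher could treat as a direct archive download.
    """
    scheme, sep, rest = spec.partition("://")
    if not sep:
        raise ValueError(f"not a URL: {spec!r}")
    vcs, plus, transport = scheme.partition("+")
    if not plus:
        return None, spec
    if vcs not in KNOWN_VCS:
        raise ValueError(f"unsupported VCS prefix: {vcs!r}")
    return vcs, f"{transport}://{rest}"
```

With this shape, the fetching tool only needs a table mapping each prefix to an external command, and Git-specific options stay optional per entry.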

Ccocconut avatar Jul 29 '25 22:07 Ccocconut

I propose a website listing curated C3 packages, similar to blessed.rs. If you need a package manager, I would recommend Nixpkgs. Both the curated packages website and Nixpkgs could be cloned from a git repository.

@wfzyx

Dependencies are simply arbitrary Git repositories or tarballs

You mean like cmake's FetchContent or bazel's git_repository?

A commit hash could be required to avoid ambiguity and potential semver-related issues

Bazel and buck2 are capable of that https://bazel.build/remote/caching. I am not sure if it is necessary to reinvent the wheel. You could have a .toml or .json for convenience, but I think the correct solutions have already been created. All you would need to do is agree on which build system to use for the "standardization" part.

@Ccocconut

I wonder if we could make it less dependent on Git

How would that work? More than 90% of developers use Git. Maybe the content addressable hashing from Unison would work... Bazel and buck2 have http_archive support.

waveproc avatar Jul 30 '25 02:07 waveproc

Yes, I know that such solutions do exist. I'm just not a fan of making these huge solutions mandatory dependencies for such a trivial thing as downloading text in an organized manner.

wfzyx avatar Jul 30 '25 02:07 wfzyx

These solutions would be a good fit for C3, because they already have all of the functionality you might need, including polyglot builds. I think people really want to make a "simpler" solution but it would end up being more fragile than a complex build system.

Maybe making a wrapper over the top of nixpkgs, etc., would help people use these solutions.

Edit: Less complex solutions like Bob exist, too:

When to consider using Bob?

  • You want a pipeline which runs locally and on CI.
  • You want remote caching and never having to do the same build twice.
  • You want to get rid of "Works on My Machine".
  • You like Bazel and its features but think it's too complex.
  • You want a build system which keeps frontend tooling functional.

waveproc avatar Jul 30 '25 02:07 waveproc

@wfzyx At first glance, Devbox supports the functionality you described.

Quickstart: Fast, Deterministic Shell

In this quickstart we'll create a development shell with specific tools installed. These tools will only be available when using this Devbox shell, ensuring we don't pollute your machine.

  1. Open a terminal in a new empty folder.

  2. Initialize Devbox:

    devbox init
    

    This creates a devbox.json file in the current directory. You should commit it to source control.

  3. Add command-line tools from Nix. For example, to add Python 3.10:

    devbox add python@3.10
    

    Search for more packages on Nixhub.io

  4. Your devbox.json file keeps track of the packages you've added, it should now look like this:

    {
      "packages": [
        "python@3.10"
      ]
    }
    
  5. Start a new shell that has these tools installed:

    devbox shell
    

    You can tell you're in a Devbox shell (and not your regular terminal) because the shell prompt changed.

  6. Use your favorite tools.

    In this example we installed Python 3.10, so let's use it.

    python --version
    
  7. Your regular tools are also available including environment variables and config settings.

    git config --get user.name
    
  8. To exit the Devbox shell and return to your regular shell:

    exit
    

Read more on the Devbox docs Quickstart.

waveproc avatar Jul 30 '25 05:07 waveproc

@Ccocconut How would we get more people to adopt alternative version control systems? I still haven't fully adopted jj yet.

waveproc avatar Jul 30 '25 06:07 waveproc

I am not asking for adoption of alternative version control systems, just that whatever is being implemented should not entirely depend on git, or leave some room for alternative ways of downloading besides git.

Ccocconut avatar Jul 30 '25 06:07 Ccocconut

Yeah, it sounds like a good idea, because people using the libraries wouldn't need knowledge of how the different VCSs worked. Personally I have only used git repositories, but anyways, buck2 supports git (and maybe Sapling too?), and the functionality is extendable using the starlark DSL and python scripts: https://github.com/facebook/buck2/tree/main/prelude/git. So it probably wouldn't be that difficult to modify the logic for git_fetch.py to make it work with other VCSs.

waveproc avatar Jul 30 '25 09:07 waveproc

Since it calls to an external program, it is very easy to make it work with other VCSs. It would be a different story if it used libgit2. I hope it would support direct links to .tar.* archives, too and whatnot, with sha{256,512} checksum.
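
For the archive case, a checksum check could look roughly like the Python sketch below. It uses only the standard `hashlib` and `hmac` modules; the function name and the idea of storing the expected digest next to the archive URL are assumptions for illustration.

```python
import hashlib
import hmac

SUPPORTED_ALGORITHMS = {"sha256", "sha512"}


def verify_archive(data: bytes, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True if the downloaded bytes match the expected hex digest."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported checksum algorithm: {algorithm}")
    actual = hashlib.new(algorithm, data).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A fetcher would call this on the downloaded `.tar.*` bytes before unpacking and refuse to continue on a mismatch.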

Ccocconut avatar Jul 30 '25 10:07 Ccocconut

Anyways, @joshring is on the right track. You would need to search multiple websites/repositories for c3 packages: github, gitlab, codeberg, sourcehut, etc., and index them on a centralized website like Zigistry. I don't think the talk of package managers is fully relevant to @joshring's original post. But anyways, I would like for such a centralized registry to have a positive feedback loop where developers can see how many times their packages are downloaded, and a centralized, bot-proof feedback mechanism (stars, upvotes/downvotes, comments from developers vouching for the code). Github has the Insights tab, but only the repository authors have access to the "Traffic" data. So why not just set up a Gitea/Forgejo forge on a cloud-based server, track all the packages from various sources there, and point the users of the packages to the centralized registry, so that you can get that data from the git clones (or at the very least through a tracking mechanism without needing a full-blown forge), and thus the package authors feel proud of their work.

waveproc avatar Jul 30 '25 10:07 waveproc

I am deeply suspicious of automatically pulling in additional dependencies. That's what's making development go to ****. If there is no friction to pulling in a billion dependencies, then people will do that, because there is no incentive not to.

So here's a simple setup:

  1. The service allows you to download a library, including older versions of a library.
  2. This library then ends up in your default library (just like today with vendor fetch)
  3. However, you have to manually request missing libraries that it might depend on.

So basically like this:

  1. You get "Foo" lib using the tool.
  2. You try to compile, the compiler says it depends on Baz and LeftPad
  3. You now need to manually install Baz and LeftPad in the same way as Foo.

So when you have a lot of dependencies, this is super annoying. Which is exactly what we want.

The friction is needed because we know people misuse dependencies. This will always happen.
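
The workflow above can be sketched in a few lines of Python: the tool never fetches anything transitively, it only reports what is missing so the user has to install each library by hand. The data shapes (a set of installed library names and a mapping from library to its declared dependencies) and the function name are assumptions for illustration, not how the compiler or vendor-fetch actually work.

```python
def missing_dependencies(installed: set, declared: dict) -> dict:
    """Report, per installed library, the dependencies that are not installed.

    Nothing is downloaded here: the caller is expected to print the report
    and let the user fetch each missing library explicitly.
    """
    report = {}
    for lib in sorted(installed):
        missing = [dep for dep in declared.get(lib, []) if dep not in installed]
        if missing:
            report[lib] = missing
    return report


if __name__ == "__main__":
    declared = {"Foo": ["Baz", "LeftPad"]}
    # After fetching only Foo, both of its dependencies are reported.
    print(missing_dependencies({"Foo"}, declared))
    # After manually fetching Baz and LeftPad, nothing is missing.
    print(missing_dependencies({"Foo", "Baz", "LeftPad"}, declared))
```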

lerno avatar Jul 31 '25 16:07 lerno

I agree. Lower the barrier to entry, make sure there is friction. We do not want to end up like NPM.

Ccocconut avatar Aug 03 '25 10:08 Ccocconut

I agree. A low barrier to entry makes things worse. Look at npm or cargo. They both suffer from the same kind of supply chain attacks.

What are the ways to avoid this?

Ccocconut avatar Oct 01 '25 18:10 Ccocconut

NPM and Cargo are both major successes because of their low barrier to entry.

  • To avoid a "supply chain" attack (malicious code) 100% of the time, you must manually read and test the code each time it is uploaded by a contributor. That requires trusting the people who are testing the code. You could try using a language model or another automated system to do some of the work for you. You could also prove the correctness of the program with formal mathematical methods.
  • Another amazing thing that is missing from C3 is a code formatter like rustfmt/clippy/sonarlint that would alert you to bad code patterns, clean up dead code, unnecessary semicolons, unnecessary whitespace (the code could be hidden somewhere on a random line if you don't have wrapping enabled), obfuscated code.
  • You could also force people to vendor all packages to prevent a left-pad situation, or even prevent people from removing packages from the registry/index. But still with this approach you would need a security team tasked with identifying and removing malicious packages, which would also notify people when a package needs a security update.
  • Another way to reduce the attack surface is by requiring that all C3 code is memory safe. Maybe even put a memory/CPU usage limit on the programs. Maybe require that the programs are run inside of a virtual machine, or on an airgapped computer.
  • Another way is by enforcing capability based security; each package will list out its capabilities (ex.: camera, microphone, sound, IO, internet, read/write disk, read/write clipboard, make notifications, schedule processes, modify OS settings/rules, startup/shutdown/sleep, etc).
  • Another way @lerno suggested is to enforce manual installation of missing dependencies. If there is a popular package that 10,000 people know and trust, they will have to pull in the dependencies manually of course, but this does not require them to read the source code of those dependencies like it's a Microsoft Windows license agreement. So they will probably end up installing all of its dependencies regardless of the nudge, unless of course there's a reason to not install some of the dependencies (size, complexity, architecture of the code, sketchy origin, maybe embedded software). But then they would have to modify the source code.
  • You could offer an official "C3 Language Course" or book to provide prerequisite knowledge of software development in general as well as indoctrinating people to use certain patterns/styles/packages/stdlib modules. Or, you could offer free (optional) code reviews to package uploaders.
  • Another way is to provide visualizations of the program/library's dependency tree and the control flow. You could also add more functionality to the C3 LSP and editor extensions. You could also force users of the program/library to run their program in debug mode the first time around to show them exactly what the code did. Or you could create a list of each line of code executed in the order it was executed like python's trace module. Maybe also upload runtime telemetry to the internet which would include crashes and fatal errors. Maybe enforce the use of Wireshark to analyze network use. Maybe enforce the use of address sanitizers if C3 isn't made to be fully memory safe.
  • If you are allowing installation of binary packages instead of forcing everyone to compile from source, then you could hash the binaries and ensure they are fully reproducible. I believe Unison lang does this. You could also enforce that only C3 code is being used, if C3 is considered trustworthy.
  • Use namespaces to identify packages, with limits on what characters can be used in the names (no unicode), with a limit on how few/many characters can be used, and reserve certain organization names, and prevent names too similar to one another from being used (measured by Levenshtein distance). Also moderate organization/package names for offensiveness and legality. Also ensure the package names are readable by humans without too many random characters.
  • You could enforce NASA's ten rules.
  • You could encourage people to sell their software or encourage users of the software to give money to the developers.
  • You could also just have no package manager at all.
  • I am sure I am missing a bunch of things.
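
The name-similarity check from the namespaces point above could be sketched as follows. The Levenshtein distance itself is the standard dynamic-programming algorithm; the threshold of 2 and the `similar_names` helper are assumptions chosen for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similar_names(candidate: str, existing: list, max_distance: int = 2) -> list:
    """Return existing package names within max_distance edits of candidate."""
    return [name for name in existing if levenshtein(candidate, name) <= max_distance]
```

A registry could reject or flag a new name whenever `similar_names` returns a non-empty list, which would catch typosquatting attempts such as `leftpad` against an existing `left-pad`.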

waveproc avatar Oct 03 '25 02:10 waveproc

NPM and Cargo are both major successes because of their low barrier to entry.

No thank you.

If someone truly believes that to be the case, it tells me everything I need to know about their fundamental lack of knowledge regarding history and security.

"Low barrier to entry" = lots of useless (and often wrong) code, wasted time, supply chain attacks, and the list goes on and on and is quite lengthy.

"Low barrier to entry" does more harm than good. I thought we would have figured that out by now, but apparently that is not the case.

Ccocconut avatar Oct 06 '25 00:10 Ccocconut