Implement a text-based lockfile format

Open Jarred-Sumner opened this issue 1 year ago • 16 comments

This is a tracking issue for implementing a text-based lockfile format for bun install and making it the default format going forward. There will be a smooth migration path from bun.lockb to bun.lock.

Why?

When first working on the package manager, the flamegraph showed that parsing JSON was the single biggest cost. So instead of a JSON lockfile, we designed two efficient binary formats: one for the registry manifest cache and one for the lockfile.

The binary lockfile format has served us well, but it isn't worth the cost in developer experience, particularly for larger teams working together. We suggest workarounds today for various things, but none of them are great.

  • Merge conflicts are hard. Do you pick bun.lockb.1 or bun.lockb.2? What about both?
  • How do you inspect the lockfile from a PR on GitHub/GitLab/etc?
  • How do you diff the lockfile? You can print as a yarn.lock and configure git to diff that way, but that's a whole lot more complicated than it not being a problem in the first place
  • When there are lockfile changes, it's not clear enough why. Sometimes, you run bun install and the lockfile has changes due to hashes changing, package.json scripts changing, or other reasons and this is hard to understand right now because it's hard to read in a text editor.
  • People too frequently add bun.lockb to the .gitignore file

Will bun install still be fast?

Yes.

Will it make bun install slower?

Based on what we've seen, about 1-20 milliseconds.

What will the new format be?

Probably JSON with Trailing Commas, like tsconfig.json.

Bun already supports this schema format for the runtime (and package.json). TOML is another option and it's what we use for bunfig.toml, though I kind of think that was a mistake and it should've been JSON with Trailing Commas. TOML's editor tooling support is not as mature as JSON with Trailing Commas.

Why JSON with Trailing Commas instead of JSON?

Too many collective human lifetimes have been spent fixing merge conflicts from diffs caused by adding or removing trailing commas at the end of lists in JSON

-    }
+    },
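
A minimal TypeScript sketch of the point above (an illustration only, not code from Bun; render and changedLines are hypothetical helpers). It renders a list one item per line, with and without trailing commas, and counts how many lines a diff would touch when one item is appended.

```typescript
// Render a list one item per line, optionally with a comma after the last item.
function render(items: string[], trailingComma: boolean): string[] {
  const lines = items.map((item, i) => {
    const isLast = i === items.length - 1;
    const comma = trailingComma || !isLast ? "," : "";
    return `  ${JSON.stringify(item)}${comma}`;
  });
  return ["[", ...lines, "]"];
}

// Count removed plus added lines. This set-based count is only a rough
// stand-in for a real diff and assumes every line is unique.
function changedLines(before: string[], after: string[]): number {
  const beforeSet = new Set(before);
  const afterSet = new Set(after);
  const removed = before.filter((line) => !afterSet.has(line)).length;
  const added = after.filter((line) => !beforeSet.has(line)).length;
  return removed + added;
}

const oldItems = ["a", "b"];
const newItems = ["a", "b", "c"];

// Strict JSON: the old last line gains a comma, so it shows up as changed.
const strictCost = changedLines(render(oldItems, false), render(newItems, false));
// Trailing commas: only the new line is added.
const trailingCost = changedLines(render(oldItems, true), render(newItems, true));

console.log({ strictCost, trailingCost }); // { strictCost: 3, trailingCost: 1 }
```

With strict JSON, two branches that each append a different item both rewrite the same old last line, which is where the merge conflict comes from.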

Why JSON with Trailing Commas instead of YAML?

YAML is fine. We don't have a YAML parser in Bun yet, and the indentation gets really confusing sometimes. Also, YAML parsers tend to be slower than JSON parsers (and YAML parsers are also JSON parsers). My favorite YAML fact is that the two-letter country code for Norway (NO) is parsed as false in YAML, though this isn't relevant to a lockfile.

Why JSON with Trailing Commas instead of JSON5?

We don't have a JSON5 parser in Bun, and we would like to avoid formats that are slower to parse than JSON.

When will bun.lock be released?

Q3.

What will the migration plan be?

bun install will support both bun.lockb and bun.lock for a while, but once bun.lock is released, new features will only be supported via bun.lock.

Do people really edit lockfiles manually?

Yes, for small tweaks it can be important

Why not fix merge conflicts via bun install instead?

We will support that as well. But, it shouldn't be impossible to do manually.

Jarred-Sumner avatar Jun 14 '24 06:06 Jarred-Sumner

Feels like json5 could also be a good choice, that has a real spec at least. I realise there is no parser currently but surely it's better than having jsons that are not valid with most parsers

whyman avatar Jun 14 '24 07:06 whyman

Regarding json5 vs jsonc, there is a related issue with some discussion about tsconfig.json from a few years ago:

  • https://github.com/microsoft/vscode/issues/100688

styfle avatar Jun 14 '24 10:06 styfle

Good design goals from yarn:

  1. Must be easy to read by a human
  2. Must be easy to diff
  3. Must be fast to parse

To achieve 3, it is useful to have a dedicated format and parser. Common formats that support many options and features require additional logic, branches in the code, etc., which negatively affects parsing speed.

The diffability of yarn.lock is simply top notch. Yes, trailing commas in JSON improve the situation from catastrophic to acceptable, but yarn.lock files are always more compact and win on readability over JSON files.

The downside of a custom format is that if it is not compatible or is not a subset of some other, more common format, it will be harder to work with it at the user or infrastructure code level.

In any case, this undertaking is very welcome.

anru avatar Jun 17 '24 11:06 anru

This would really unblock bun for use in Storybook, as parseable output is important for it to adopt bun as a supported package manager. https://github.com/storybookjs/storybook/issues/28164 JSON with trailing commas is good. But it's also important to have a --json parameter on the package manager commands so scripts can JSON.parse the output.

rhuanbarreto avatar Jun 17 '24 20:06 rhuanbarreto

I'm not as familiar with the technical implications, but as a user I have to say that YAML makes a lot of sense in a long, auto-generated file like the lock file.

GermanJablo avatar Jun 29 '24 03:06 GermanJablo

Wondering why just re-using package-lock.json is not an option?

nurulhudaapon avatar Jul 15 '24 20:07 nurulhudaapon

I think it's a good idea to keep a way to support a backwards-compatible format (package-lock.json) in addition to the current (binary) one. Then people could choose in bunfig.toml whether they are willing to pay the performance penalty in order to be more compatible with the node ecosystem.

rhuanbarreto avatar Jul 16 '24 06:07 rhuanbarreto

Two bun.lockb files with the same lockfile content, generated by different versions of Bun, may have different MD5 hashes.

This is super annoying with Git.

LitoMore avatar Aug 19 '24 04:08 LitoMore

Re-using package-lock.json or yarn.lock will cause tooling problems. Lots of things rely on those to detect the package manager (for better or for worse). You'd end up with scripts installing things with yarn or npm if bun were to use the same name and/or format.

evelant avatar Aug 19 '24 19:08 evelant

If anyone thinks YAML is a good idea, read this: https://noyaml.com/ Then read all articles the above page links to.

Sure, the basic concept of syntax is great, but there are too many pitfalls. That's why there are some simpler configuration languages with similar syntax which retain the same level of readability and diffability, while being simpler and way less error-prone. https://gura.netlify.app/ https://nestedtext.org/en/latest/ I would bet these are also faster to parse than YAML.

notramo avatar Aug 26 '24 10:08 notramo

Maybe not as relevant for a lockfile, but there is some great innovation in better configuration files: https://nickel-lang.org/

evelant avatar Aug 26 '24 12:08 evelant

If anyone thinks YAML is a good idea, read this: https://noyaml.com/ Then read all articles the above page links to.

Sure, the basic concept of syntax is great, but there are too many pitfalls. That's why there are some simpler configuration languages with similar syntax which retain the same level of readability and diffability, while being simpler and way less error-prone. https://gura.netlify.app/ https://nestedtext.org/en/latest/ I would bet these are also faster to parse than YAML.

I totally agree with you. YAML and its counterparts like TOML are totally harmful and bad. Either go JSON or .js if you want full power.

Barzi-Ahmed avatar Sep 07 '24 18:09 Barzi-Ahmed

If you decide to move to jsonc, does this mean the current binary lock file is going to change from yarn lock to the new jsonc? Also, is bun going to provide a parser for the new lock file like yarn does, or at least the types like pnpm?

Jordan-Hall avatar Sep 26 '24 03:09 Jordan-Hall

Is there any progress on the matter?

It could be a game changer💪🏻

hadarziv-army avatar Sep 27 '24 20:09 hadarziv-army

One additional benefit from a text-based format is the ability for a tool like Nix to parse it and pull dependencies for reproducible builds. Bun's largest pain point for me is the lack of a way to easily provide package lock information to Nix so that packages can be fetched for use in the sandbox. An awkward conversion of bun.lockb > yarn.lock > package-lock.json is possible, but there are some significant issues in the translation which results in some packages missing hash information.

jakehamilton avatar Oct 18 '24 15:10 jakehamilton

@Jarred-Sumner, a text-based format is crucial for security, as dependency scanners cannot scan or read binary files. Please prioritize this feature.

Barzi-Ahmed avatar Oct 18 '24 17:10 Barzi-Ahmed

Another advantage of a text-based approach would be fewer merge conflicts.

Tools like renovate can update Bun's binary lock files, but merge conflicts arise immediately. Effectively, only one update can be merged.

jase88 avatar Nov 12 '24 05:11 jase88

One additional benefit from a text-based format is the ability for a tool like Nix to parse it and pull dependencies for reproducible builds. Bun's largest pain point for me is the lack of a way to easily provide package lock information to Nix so that packages can be fetched for use in the sandbox.

This is still not possible with the current text based lock format because it cannot be parsed by a standard JSON parser. I'd like to suggest dropping trailing commas from the format and use standard JSON instead so the lock file can be used with standard JSON parsers.
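
As a stopgap for tools that only ship a strict JSON parser, a preprocessing step can drop the trailing commas first. Below is a minimal TypeScript sketch of that idea, not anything Bun or Nix provides: stripTrailingCommas is a hypothetical helper, the sample input is made up, and it assumes the file contains trailing commas but no comments.

```typescript
// Remove commas that are followed (after optional whitespace) by } or ],
// while leaving anything inside string literals untouched.
function stripTrailingCommas(text: string): string {
  let out = "";
  let inString = false;
  let escaped = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);

    if (inString) {
      out += ch;
      if (escaped) escaped = false;            // character after a backslash
      else if (ch === "\\") escaped = true;    // start of an escape sequence
      else if (ch === '"') inString = false;   // closing quote
      continue;
    }

    if (ch === '"') {
      inString = true;
      out += ch;
      continue;
    }

    if (ch === ",") {
      // Look ahead past whitespace to see what follows the comma.
      let j = i + 1;
      while (j < text.length && /\s/.test(text.charAt(j))) j++;
      const next = text.charAt(j);
      if (next === "}" || next === "]") continue; // trailing comma: skip it
    }

    out += ch;
  }
  return out;
}

// Hypothetical input: trailing commas plus a string that contains ",]".
const sample = `{
  "items": ["a", "b,]",],
}`;

const parsed = JSON.parse(stripTrailingCommas(sample)) as { items: string[] };
console.log(parsed.items);
```

This does not help a tool that cannot run a preprocessing step at all, which is the situation described for Nix's built-in parser.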

adisbladis avatar Jan 25 '25 07:01 adisbladis

One additional benefit from a text-based format is the ability for a tool like Nix to parse it and pull dependencies for reproducible builds. Bun's largest pain point for me is the lack of a way to easily provide package lock information to Nix so that packages can be fetched for use in the sandbox.

This is still not possible with the current text based lock format because it cannot be parsed by a standard JSON parser. I'd like to suggest dropping trailing commas from the format and use standard JSON instead so the lock file can be used with standard JSON parsers.

Is it a problem to use a jsonc parser in Nix?

kravetsone avatar Jan 25 '25 08:01 kravetsone

Is it a problem to use a jsonc parser in Nix?

Yes, Nix only includes a standard JSON parser builtin.

jakehamilton avatar Jan 25 '25 11:01 jakehamilton

Is it a problem to use a jsonc parser in Nix?

Yes, Nix only includes a standard JSON parser builtin.

They should include a JSONC parser too.

Trailing commas are important to prevent useless merge conflicts.

Also, configs should be in JSONC.

I'm waiting for the adoption of package.jsonc.

kravetsone avatar Jan 25 '25 12:01 kravetsone

They should include a JSONC parser too.

Trailing commas are important to prevent useless merge conflicts.

Also, configs should be in JSONC.

I'm waiting for the adoption of package.jsonc.

Don't get me wrong, supporting comments and dangling commas is great. But other programs exist that do not support that and it is quite hard to argue that they should all change just to support Bun. Nix builds lots of software (literally everything) and Bun packages currently aren't easily doable because of the lock file format. Previously the binary lock file was an issue for other tools to parse and use, now jsonc is. It is the same issue. It's not really clear to me how comments in a lock file are useful anyway considering the whole thing is auto generated. Dangling commas don't seem entirely necessary for an auto generated file either.

I would like for Nix to have built-in support for jsonc parsing, but you can only pack so many things into one tool before it becomes unmanageable.

jakehamilton avatar Jan 25 '25 16:01 jakehamilton

Or they could've used just yaml. ¯_(ツ)_/¯

shinebayar-g avatar Jan 25 '25 22:01 shinebayar-g

Bun as a runtime supports parsing the file directly as a module. It also gives code completion hints by using @types/bun.

Example:

// As a module import
import lock from "./bun.lock";
console.log(Object.keys(lock.packages));

// As a dynamic import
const lockDynamic = await import("./bun.lock");
console.log(Object.keys(lockDynamic.packages));

This code will output the packages used in the whole repo (single project or monorepo).
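
Going one step further, a script could turn those entries into name and version pairs. The TypeScript sketch below is hedged: it assumes each value under packages is an array whose first element is a "name@version" string, which is an assumption about the format and not something confirmed in this thread. LockfileLike, listResolved, and the sample data are hypothetical, and an inline object is used so the sketch runs without a real bun.lock.

```typescript
// Assumed shape: the first array element is "name@version"; the rest is ignored here.
type LockEntry = [id: string, ...rest: unknown[]];

interface LockfileLike {
  packages: Record<string, LockEntry>;
}

interface Resolved {
  name: string;
  version: string;
}

function listResolved(lock: LockfileLike): Resolved[] {
  return Object.values(lock.packages).map(([id]) => {
    // Search from index 1 so the leading "@" of a scoped name is skipped.
    const at = id.indexOf("@", 1);
    return at === -1
      ? { name: id, version: "" }
      : { name: id.slice(0, at), version: id.slice(at + 1) };
  });
}

// Hypothetical sample standing in for an imported lockfile.
const sampleLock: LockfileLike = {
  packages: {
    "@types/bun": ["@types/bun@1.1.14", "", {}, "sha512-placeholder"],
    "left-pad": ["left-pad@1.3.0", "", {}, "sha512-placeholder"],
  },
};

const resolved = listResolved(sampleLock);
console.log(resolved);
```

If the assumption holds, passing the imported lock object in place of sampleLock would give the same kind of list for a real project.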

rhuanbarreto avatar Jan 27 '25 00:01 rhuanbarreto

@Jarred-Sumner I don't remember where this was written but this is worth referencing at the docs page https://bun.sh/docs/install/lockfile#text-based-lockfile

rhuanbarreto avatar Jan 27 '25 00:01 rhuanbarreto