toml icon indicating copy to clipboard operation
toml copied to clipboard

The decision to preserve order should probably not be a Cargo feature

Open cbiffle opened this issue 1 year ago • 5 comments

(This is a copy-paste of a bug I sent you in 2022, which was closed rather than copied to the new repo. I note that you still have a preserves-order feature, which suggests that the bug is still present.)

Currently, toml::Value's Table variant discards the order of the input data by default. This behavior can be changed by setting the preserves_order feature, which switches the implementation to an IndexMap.

Using a Cargo feature for this means that the decision is made across the entire dependency graph, such that a single crate requesting ordered tables will change the behavior for all other crates. Since the alphabetical ordering of Table is exposed through iterators by default, it seems likely that crates will depend on it (and I have personally written at least one crate that depended on it by accident), but there's no way for them to express that requirement at the type level or in the build system -- so it can be silently broken by changes in distant dependencies.

It seems to me that preserves_order violates the "purely additive" notion of Cargo features.

I'm not entirely sure what the right alternative is, though, the way the crate is structured. For now, we're probably going to fork the crate into a variant that has predictable order without the risk of subtly breaking our dependencies, but that's obviously not great.

cbiffle avatar Jun 23 '23 18:06 cbiffle

See also toml-rs/toml-rs#454

epage avatar Jun 23 '23 18:06 epage

Going through the linked issue

ehuss:

Ah, apologies, I forgot that it was using a btree instead of a hashmap.

I was in a similar boat, I had assumed HashMap was used. The transition to BTreeMap (TreeMap at the time) was done in https://github.com/toml-rs/toml-rs/commit/b4a4ed72d7b13fb8012878a191c7ca15de995030 with the reason given being "Migrate to a TreeMap for determinism"

I am of two minds on this:

  • Switch to HashMap so no ordering is guaranteed
  • And/or remove preserves_order, telling people they need to use toml_edit for that (see below)
  • Keep using BTreeMap as that is what serde_json does

cbiffle:

I would argue that throwing away information from the input files (i.e. automatically sorting tables) is a poor default, and IndexMap should be the default-and-only. Others might not agree. :-) But either way, I think this should probably not be under control of a Cargo feature, and suggest that removing the feature would be good.

The information being thrown away is not guaranteed by the TOML spec and relying on it ties you to a parser's implementation. I've amended my list of options to include removing preserves_order since it is providing information that should not be relied upon.

epage avatar Jun 23 '23 18:06 epage

While I think preserving the information is a useful default, removing preserves_order entirely would address my concern. AFAIK we don't rely on order, but we were relying on round-tripped versions of the TOML being equivalent, for equivalent input, from one build for another, for checking things in a build system -- it turned out we had been implicitly relying on order because a distant dependency had set preserves_order. When they removed the feature in a patch-level update, our behavior changed for no obvious reason, causing me to file this bug. (Separately a different thing we had was relying on btreemap order. If y'all move away from btreemap, we will obviously need to stop doing that.)

So, any predictable behavior works for me and our system.

I'd discourage HashMap for this reason -- round tripping files through toml-rs is a useful use case, and while it doesn't preserve everything (that's what tomledit is for) it's nice if it's not random. (that being said, hashmap's randomness is at least predictable, which would reveal such bugs early.)

cbiffle avatar Jun 23 '23 21:06 cbiffle

I'm having trouble understanding the status of this. The feature appears to have been removed. Is preserves_order always in effect? Seems so here https://github.com/toml-rs/toml/commit/a7e1daf8df7e11a999ae22f577cdf2446c2abd3f but the PR and/or commit did not link to this issue.

webern avatar Sep 28 '23 17:09 webern

#148 was about the behavior and API of toml_edit. This issue is about the map type used in toml::Table.

epage avatar Sep 28 '23 17:09 epage