Consistent design for unstable APIs (including provider modules)
See also: https://github.com/unicode-org/icu4x/issues/4193, https://github.com/unicode-org/icu4x/pull/6630, https://github.com/unicode-org/icu4x/pull/6629
@robertbastian came up with this plan earlier. The idea is that every ICU4X crate has an unstable feature, for exposing unstable/experimental APIs.
Provider modules (and try_new_unstable) of course must always be publicly available, but we mark them as cfg(not(feature = "unstable"), doc(hidden)).
This means:
- docs.rs still shows them (we run that with all-features). There will be stability banners (https://github.com/unicode-org/icu4x/pull/6630)
- Local docs builds will not show them for most people
-
cargo-semver-checkswill ignore these APIs.
Crates may still choose to have doc(hidden) things that are not exposed with the feature: the line is whether we expect users to use this API at all. For example, the properties open enum newtype field should remain unconditionally doc(hidden); we do not wish for ICU4X users to see that field. We continue to be conservative about using doc(hidden).
cc @sffc @robertbastian
I think we mostly have agreement on this plan, @sffc noted that he couldn't find notes on it, filing an issue to document things and to get formal approval.
A related question is: how do we make cargo-semver-checks happy for when we make such a transition? It does let one override the baseline, but I'm not sure if it works the way we want: @robertbastian can you experiment with that?
An explicit ignorelist would be a nice feature addition.
Another related question is whether we want to continue maintaining two different classes of unstable and internal APIs, or whether we should consolidate this down to just one class? Manish and I discussed some pros and cons.
State of doc(hidden) in ICU4X:
$ find . -name '*.rs' -exec grep -F '#[doc(hidden)]' {} ';' | awk '{$1=$1};1' | sort | uniq -c
92 #[doc(hidden)]
1 #[doc(hidden)] // #2417
2 #[doc(hidden)] // appears in public bound
1 #[doc(hidden)] // canonical location in super
1 #[doc(hidden)] // compiled constructors look for the baked provider here
3 #[doc(hidden)] // databake
7 #[doc(hidden)] // databake internal
2 #[doc(hidden)] // draft API in ICU4C
4 #[doc(hidden)] // for databake, but there are no invariants
1 #[doc(hidden)] // for testing and macros
1 #[doc(hidden)] // for testing only
1 #[doc(hidden)] // for testing purposes only: checks if it is using the Bigger format
4 #[doc(hidden)] // internal
13 #[doc(hidden)] // macro
3 #[doc(hidden)] // macro use
2 #[doc(hidden)] // `ScriptWithExt` not intended as public-facing but for `ScriptWithExtensionsProperty` constructor
3 #[doc(hidden)] // sealed trait
2 #[doc(hidden)] // singletons might be used cross-crate
1 #[doc(hidden)] // subject to change
1 #[doc(hidden)] // testing
1 #[doc(hidden)] // TODO(#4467): establish this as an internal API
7 #[doc(hidden)] // TODO(#4467): Should be internal
10 #[doc(hidden)] // unstable
4 #[doc(hidden)] // used by FFI code
1 #[doc(hidden)] // used in collator
Another related question is whether we want to continue maintaining two different classes of unstable and internal APIs, or whether we should consolidate this down to just one class? Manish and I discussed some pros and cons.
My position is that these are two different things: we have "unstable but exposed to power users" and we have "internal to be used cross-crate/in macros/for testing". I am very strongly against some of our internal-to-be-used-cross-crate-or-macros-or-testing stuff showing up on our public docs. However, I do not want unstable-but-exposed-to-power-users to require power users to sift through --doc-hidden-items. I don't really see an upside of consolidating these two aside from conceptual simplicity, but it's largely internal conceptual simplicity (it's simpler for us as maintainers, but doesn't really simplify things for users).
- @robertbastian: cargo-semver-checks treats any API or feature starting with
unstableas unstable. - @manishearth: does cargo-semver-checks figure out that it needs to explicitly disable the unstable feature?
- @robertbastian: I think so.
- @manishearth another benefit: we can now temporarily unstableize new APIs that we still want to iterate on, with ease, because the feature exists
- @robertbastian: metacrate should expose icu_experimental behind the unstable feature
- @sffc So does this mean that we will have an
unstablefeature but not anexperimentalfeature? - @manishearth CSC treats unstable as a prefix. We could have
unstable_experimental. - @robertbastian what's the diff bw unstable and hidden
- @manishearth My position is that these are two different things: we have "unstable but exposed to experimentors and power users" and we have "internal to be used cross-crate/in macros/for testing".
- @robertbastian: I don't see a distinction between "unstable" and "experimental"
- @manishearth I don't really either. Bit of a distinction between perma-unstable and experimental.
- @manishearth We can use banners if we want to talk about this. Exp APIs should be marked "experimental and unstable" in docs.
- @nekevss Has the experimental keyword been brought up to CSC before
- @manishearth no, we could.
- @nekevss Can see many libraries needing that
- @manishearth would be cool if CSC had configs for this
- @robertbastian: unstable features are also pretty widespread so it's just good
Conclusion 1: All component crates get an unstable feature (other crates as needed).
Conclusion 2: All of our non-provider experimental/unstable APIs that are intended to be used by power users or experimentors should be cfg(feature = "unstable"). We may make a distinction between experimental and perma-unstable in doc banners.
Conclusion 3: Provider code gets marked cfg(not(feature = "unstable"), doc(hidden)). The feature continues to be enabled on docs.rs. We continue to have stability banners.
Addendum to 3: also doc(cfg(feature = "unstable"))
More discussion:
- @sffc Objection to conclusion 2 is that we already have an experimental feature (plurals, datetime). Don't want to break it.
- @robertbastian It's unstable , we can break it
- @manishearth We can leave the feature around empty. But overall I don't think we have
- @robertbastian We have it on capi, Moritz was using it for something
- @sffc We can have two features: unstable_provider and experimental.
- @robertbastian: it should be unstable_experimental (not a fan)
- @manishearth the feature name needs to change for CSC anyway, unless we decide to permanently stick with the pinned baseline model
- @robertbastian Using these feature names would be better for the ecosystem.
- @manishearth: Shane, do you care more about there being two feature names (I don't see a strong need for it) or changing the "experimental" feature name?
- @sffc Don't care abotu two feature names. Care about changing it. Would rather not be driven by the tool conventions.
- @manishearth it's not just a tool convention thing, it makes some sense from a user perspective to be explicit
- @zbraniecki: What's wrong with "experimental", why can't CSC treat it specially
- @manishearth: chance it could be used for non-unstable things. we could ask for an explicit opt out
- @manishearth options on the table:
- We have one unstable feature for everything
- We continue to have an experimental feature, and we also have unstable_provider. a. We teach CSC about opt outs b. We use the scheme proposed in https://github.com/unicode-org/icu4x/pull/6663 to allow for acknowledged CI breakages. (Downside: that's a bit of a custom workflow and is slow)
- We have unstable and experimental. New things get added to the unstable feature and we phase out the experimental feature.
- @manishearth Why do we care about the experimental feature continuing to work
- @sffc We do have code that uses the manual plural operands. And now we have the serde builder in datetime. This is a human timescale thing as opposed to a code timescale. The plurals experimental code we don't want to bless as stable, but it's basically stable.
- @robertbastian: Sounds like provider modules, "more unstable than experimental"
- @manishearth Overall the main reason I'd be hesitant about a "breakage" in unstable code is a change you could make that would be hard to mitigate. I don't think that's the case here; it's easy to notice.
- @robertbastian: the compiler just tells you to use the new feature
- @robertbastian re: human timescales; we had the same discussion about providers.
- @sffc Was okay with changing the collator struct thing but it had only been out by two weeks
- @sffc convinced by rustc being helpful about this.
- @robertbastian We should keep the cargo feature around and have it do nothing so we can have rustc being helpful about it.
- @sffc Think it's cleaner if we're going to replace the feature for people to get the error "the feature is missing", they edit Cargo.toml, and then rustc tells them what to use instead
- @sffc We should discuss the degree to which we make things pub/crate and doc(hidden) in provider.
- @robertbastian We need a single cfg in the pub mod provider. Don't see why we need the markers to be stable.
- @robertbastian datagen can enable the unstable feature
- @manishearth would rather not have datagen use the unstable feature unless it is using non-provider code. Provider code is hidden-but-pub anyway.
Conclusion 1: All component crates get an unstable feature (other crates as needed).
Agreed: @hsivonen @manishearth @nekevss @sffc @robertbastian @zbraniecki
Conclusion 2: All of our non-provider experimental/unstable APIs that are intended to be used by power users or experimentors should be cfg(feature = "unstable") (or something with pub reexports if it is needed internally, but not doc hidden). We may make a distinction between experimental and perma-unstable in doc banners.
Agreed: @manishearth @hsivonen @nekevss @sffc @robertbastian @zbraniecki
Conclusion 3: Provider code gets marked cfg(not(feature = "unstable"), doc(hidden)). The feature continues to be enabled on docs.rs. We continue to have stability banners (or doc(cfg(feature = "unstable")) when we have it). We don't necessarily consider this the long term solution and may refine it, for example, to expose markers without the unstable feature, or make parts of the provider module be pub(crate).
Agreed: @manishearth @robertbastian @sffc @zbraniecki (@nekevss @hsivonen abstain)
All of our non-provider experimental/unstable APIs that are intended to be used by power users or experimentors should be
cfg(feature = "unstable")(or something with pub reexports if it is needed internally, but not doc hidden). We may make a distinction between experimental and perma-unstable in doc banners.
We should have a caveat on this that we can use doc(hidden) instead of completely disappearing the trait if the trait needs to be used in other code.
This came up in https://github.com/unicode-org/icu4x/pull/7092
We also need to iterate on this wrt traits. In particular: new trait methods cannot be gated without having feature resolution hazards, they should have a default impl and be hidden.
Traits themselves can probably be gated unless they need to be hidden due to cross-crate usage.
It's worth going through https://github.com/unicode-org/icu4x/pull/6629 and making sure our policy fills those gaps. I may also just trial-by-fire it by making PRs for individual traits in calendar/datetime.
When it comes to trait stability we have four levels of what can be stable
- using their implementors (gate completely, or doc(hidden) if used cross-crate)
- using them in bounds (do not gate, make methods hidden)
- calling their methods (make UnstableSealed or Sealed, and the UnstableSealed trait is gated)
- implementing them (no gate, fully stable)
A lot of the datetime things live at level 2.
There's some discussion about whether we should put things in an unstable module. Overall I prefer having things live where they will ultimately live. I also don't see much value in having people know from the import path that an item is unstable if they can only get to it by doing the crate-level feature toggle.
So to me the only utility of this is if we make things hidden. The main time we will make things hidden is if an unstable trait is used cross-crate, and having a mod unstable_internal sounds great for that.
In that case, the plan for traits becomes:
(1) If the trait shows up in a bound but we only allow stable users to use concrete type implementors, we gate the trait entirely. Rules is one such trait.
(2) If we want users to stably use a trait in bounds, we do not gate it, but we hide methods. We can potentially prefix them if we want, especially when we expect people might stumble across those methods by default.
Rules could be one such trait, personally I do not find a strong reason to do this right now. A lot of the datetime unstable scaffolding traits are like this.
(3) If calling the methods of the trait is stable but implementing it is not, we use UnstableSealed or Sealed, with UnstableSealed being gated
(4) If implementing them is stable, well, then, we do nothing.
Then, separately, we have an addendum to all of this, where if a gated item or impl item needs to be used cross-crate, we should doc(hidden) it instead of gating it, and ideally have it exported behind an unstable_internal module or something. This way the normal users do not stumble upon it. I've implemented that plan in https://github.com/unicode-org/icu4x/pull/7094 for Rules.
Then, separately, we have an addendum to all of this, where if a gated item or impl item needs to be used cross-crate, we should
doc(hidden)it instead of gating it, and ideally have it exported behind anunstable_internalmodule or something. This way the normal users do not stumble upon it. I've implemented that plan in #7094 for Rules.
My justification for this route is that a conditional doc(hidden) will still show up on docs.rs, which means people may use it without enabling the unstable feature.
Such a module means that if people wish to use the thing on docs.rs they must explicitly opt in to the unstable feature, and internally we use a different, secret path that is very clearly unstable.
I think this is the only situation where we should be using mod unstable or fn unstable_foo.
Marking as discuss to settle on the above.
I find it a little hard to follow the categorization of traits.
Here's my attempt:
- Is the trait intended to be implementable?
- If No:
Sealed - If Yes, but not graduated yet:
UnstableSealed - If Yes, and graduated: (nothing)
- If No:
- Is the trait intended to be used in bounds when it graduates?
- If No: well, if the answer here is no, then the answer to the other two must also be no
- If Yes, but not graduated yet: export in
unstablemodule - If Yes, and graduated: export in a stable spot
- Is the trait intended to be callable when it graduates?
- If No: all methods should be prefixed with
unstable_ - If Yes, but not graduated yet: export in
unstablemodule, because it needs to be imported in order for the methods to be callable, and then the wordunstableappears in the userland code file - If Yes, and graduated: export in a stable spot
- If No: all methods should be prefixed with
Please note: these things can proceed at different rates!
Let's apply this to some traits.
- Datetime scaffolding traits
- Implementable: Yes, but not graduated. Should be bound by
UnstableSealed - Used in bounds: Yes, and graduated. Should be in a stable module.
- Callable: No. Methods should be prefixed with
unstable_. (they aren't, which was a mistake)
- Implementable: Yes, but not graduated. Should be bound by
- Calendar
- Implementable: Yes, but not graduated. Should be bound by
UnstableSealed - Used in bounds: Yes, and graduated. Should be in a stable module.
- Callable: Yes. Methods have normal names.
- Implementable: Yes, but not graduated. Should be bound by
- Rules
- Implementable: Yes, but not graduated. Should be bound by
UnstableSealed - Used in bounds: Yes, and graduated (as far as I am concerned). Should be in a stable module.
- Callable: Yes, but not graduated yet. Should be in an unstable module. (oops, conflict!)
- Implementable: Yes, but not graduated. Should be bound by
So, my framework generally works, but there's a conflict if a trait is graduated for use in bounds but is not yet graduated for callability. In such a case, I think the unstable module should win, but the trait can be pub(crate) at its intended destination in order to reduce churn within the crate, where most 1st-party implementers tend to live.
@sffc Why have you chosen to use an unstable module in your solutions instead of feature = unstable? We already have agreement on using feature = unstable if possible, an unstable module is reverting the prior decision. Is that intentional?
There are specifically, two cases where feature = unstable is not necessarily sufficient:
- Items that need to be hidden for internal ICU4X use: you risk people seeing them in docs and using them without opting in via the feature.
- The methods of traits that are stable for bounds but unstable for calls.
Both of these cases are ones where the "best we can do" with the current model is cfg_attr(not(feature = "unstable"), doc(hidden)), where the main hazard is that people might notice the item in docs, try to use it, and not be forced to opt in to an "unstable" feature. Having an unstable module/prefix helps make this more overt.
Outside of that, if we want to reverse the decision on feature = unstable, I would like to see arguments for why a fine-grained per-API opt-in is better than a per-crate Cargo.toml opt in.
but there's a conflict if a trait is graduated for use in bounds but is not yet graduated for callability. In such a case, I think the unstable module should win, but the trait can be pub(crate) at its intended destination in order to reduce churn within the crate, where most 1st-party implementers tend to live.
The unstable module feels like the wrong solution here? If it's stable for bounds I should be able to name the trait without needing to type unstable, it's misleading.
feature = unstable is not sufficient if the trait is used across crate boundaries, as you noted. The feature is only sufficient for me if the trait can be fully gated.
Here's a table form with an additional condition about crate boundaries. Read the table column-first.
| Intended State | Implementable? | Usable in Bounds? | Callable? |
|---|---|---|---|
| No | Sealed | Private | Unstable Methods |
| No, except within ICU4X crates | UnstableSealed | Unstable Module | Unstable Methods |
| Yes, not graduated or for power users only Not used in any stable API |
Unstable Feature | Unstable Feature | Unstable Feature |
| Yes, not graduated or for power users only Used in a public stable API |
UnstableSealed | Unstable Module | Unstable Module |
| Yes, graduated for general users | no constraints | Stable Module | Stable Module |
Again, the goal is to prevent clients from using something that isn't graduated without having the word "unstable" somewhere in their project, either in their Cargo.toml or in their code file.
feature = unstable is not sufficient if the trait is used across crate boundaries, as you noted. The feature is only sufficient for me if the trait can be fully gated.
This is the exception to the norm, though. I do not want all of our unstable-for-bounds traits to need to be in an unstable module just because some of them need to be. We have a policy that mostly gives us the right answer already agreed upon, there are just a couple edge cases around cross-crate internal uses.
Here's a table form with an additional condition about crate boundaries. Read the table column-first.
This works for me, with the caveat that it should be "Not used cross-crate in any stable API".
Again, the goal is to prevent clients from using something that isn't graduated without having the word "unstable" somewhere in their project, either in their Cargo.toml or in their code file.
Yes, I agree.
This works for me, with the caveat that it should be "Not used cross-crate in any stable API".
Not sure if I understand? If any stable API uses it as a bound, then my table implies that it need to be in an unstable module. If all APIs using the trait are themselves gated by an unstable feature, even cross-crate, then the trait need not be in an unstable module.
This distinction is why I had the four-way distinction in my comment. I do not think something being used as a stable bound implies it needs to be in an unstable module, it can still be hidden behind an unstable feature.
(We should try and discuss this)
Shane hadn't realized this was one of the options on the table:
pub struct LunarChinese<X: inner::Rules>(..);
#[cfg(feature = "unstable")]
pub use inner::Rules;
mod inner {
pub trait Rules {}
}
We are agreed on the general principle that people should be unable to use unstable stuff without typing "unstable" anywhere. Furthermore we seem to prefer using unstable feautres where possible, and unstable modules/method names can be used where the feature does not suffice. The above trick, which is what we currently do for LunarChinese, works for most cases where a trait is used in bounds for a public API but we do not want to allow people to use it in their own bounds (they should use concrete types only).
The strategy in https://github.com/unicode-org/icu4x/pull/7094 works nicely and is illustrative.
In https://github.com/unicode-org/icu4x/pull/7094, LunarChinese (which is not used cross crate) has its Rules be "private on stable, exposed on unstable".
On the other hand, Hijri's Rules is exported twice: it is exported as hijri::Rules only with "unstable" (which is how it will show up in the docs), and people who are looking at the all-features docs attempting to use hijri::Rules will be forced to use the feature. The internal usage happens via doc(hidden) hijri::unstable_internal::Rules. If you discover that internal path somehow (maybe via diagnostics), it will be immediately obvious that you are doing something weird and unstable.