icu4x Have a better API changelog

It would be nice if the changes to public API were presented in a readable form for approval. cargo public-api exists, however it is noisy. Turning that into a readable API log requires a lot of manual effort.

We have diplomat-coverage which already does this manual allowlisting, but the FFI allowlist is not the same as the public allowlist. We could potentially build off of a shared allowlist, though.

I'm not yet sure if building a useful API changelog tool is an easy endeavor. Worth investigating.

An MVP for this might just be a better-organized changelog, where public API changes are listed separately.

cc @sffc @robertbastian

Oct 23 '25 16:10 Manishearth

I think the main tricky thing here is that this needs a proper design for how it filters and presents API changes in a way that does not end up with a huge changelog. Once we have that, we can see if existing tooling can be massaged into the right shape.

Oct 23 '25 21:10 Manishearth

#7197 brings up an interesting facet: here I've been talking about this document as a TC-facing one. But it would be nice to include it in the changelog, too. We might want to further winnow it somehow, or figure out a process for integrating it:

Generally, how this integrates into CHANGELOG.md is a tricky question, too, and one not yet covered by https://github.com/unicode-org/icu4x/issues/7153. Some potential options:

Maybe the API changelog is succinct enough that we can just copy paste the whole thing into

Maybe we have some rules as to which bullet points are included in the main changelog

Maybe we copy paste all API changelog entries as sub-bullet points of "normal" changelog entries, e.g. the entry would be "Add arithmetic for Date (#nnn)" and it would have a sub-bullet-point saying "New APIs: Date::add, Date::until, Calendar::whatever, DateDifferenceOptions, ...".

Oct 29 '25 17:10 Manishearth

My mental model has been something like the third bullet above, where we put the entries directly in CHANGELOG.md if they are succinct enough, and we link to a supplementary doc if they are too big and not useful to clients.

Oct 29 '25 20:10 sffc

Alright, I promised in #7197 that I would write out the challenges / design decisions to be made.

I do think I have answers to most of these questions. I think we will need to discuss some of those answers, I can post my full thoughts on a solution later. But this should be sufficient to demonstrate that there is a fair amount of work to be done here to produce a useful (for TC) API changelog.

What do we need to display?

So, firstly, the fully expansive list of things that constitute "public API" in Rust are as follows. Note that many of these have an obvious answer whether they should be included or not, but this helps give an idea of what we get when we run various tools like cargo public-api:

Modules and reexports
Items (structs, enums, traits, type defs, constants)
Enum variants
Struct fields
Methods (incl trait methods)
Associated types and consts
implicit implementations of auto traits (Send, Sync)
direct implementation of derive traits (Copy, Clone, Debug, PartialEq, ...)
direct implementation of other traits
indirect implementation of other traits, via blanket impls (this includes stuff like auto-Into impls)
The variance of the type over its type parameters
Impl items (methods, assoc types, etc) inherited on a type because of a new trait impl (for each of the cases above).
The actual type parameters (we might add new defaults!)
For a new reexport, every method, field, impl, (....) found on that item
Actually this list isn't fully expansive there's a whole bunch of little things that I'm probably forgetting.

This rapidly explodes for even a small API surface. Here's the 1500-entry public-api for icu_calendar filtered for unstable items, etc.

Almost certainly we don't care about auto traits, indirect impls, and recursively following reexports. Some of the rest, I'm not sure.

We probably don't need to list associated items if we are listing that a type now implements a trait. That would immediately greatly winnow down the output of a tool like public-api. From manual inspection that's still not enough, though.

Do we filter new APIs differently from modifications to existing APIs?

Secondly, I think the answer here is different for new APIs vs existing APIs. For example, if we add a new options bag, maybe it sufficies to say "new struct: DateFromFieldsOptions (docs link)". Maybe we don't need to list its fields (maybe! I don't know the best answer here). Does that apply to its methods? Its initial trait set?

Whereas if we add a new field or method to an existing type in a new release, of course we should list that.

How do we organize it?

Even after heavy filtering, I'm not sure the information will be useful or reviewable as just a list. Probably we'd want to organize it somewhat. How is this organized?

What's the process?

Finally, once we know what we want, we should figure out how it fits in the process. It depends on the precise requirements, but my current impression of a useful API changelog is that it is a fair amount of work to produce. Maybe that ends up not being the case, but I wouldn't be convinced until we have something concrete. If the API changelog is a lot of work, we have three routes forward there:

Make it the release drivers' problem. This feels like it would explode the amount of work in the release and make release drivering a much harder job to take on. In general we have been trying to make it easier to do releases; so that we can serve client needs better when releases are needed.
Make it the PR author's problem (or the PR reviewer's problem). This might be similar to some of the things we are talking about in https://github.com/unicode-org/icu4x/issues/7154, where the PR author produces an API changelog.
Write a tool to do this. cargo public-api uses the public-api crate, which we could presumably build on. This is my preferred route.

Oct 30 '25 01:10 Manishearth

An example of a case that a process must catch, or else the process is insufficient: valid APIs that smell internal but are not internal being changed to doc(hidden), such as https://github.com/unicode-org/icu4x/pull/7147#discussion_r2500972062. I'm highlighting this because it seems like something likely to fall through the cracks if it relies on only human review.

Nov 07 '25 06:11 sffc

This is already caught by semver CI

Nov 07 '25 15:11 robertbastian

Our current process around semver CI does not notice that, though, or it did not work in #7147.

Nov 07 '25 16:11 Manishearth

Yes it did: https://github.com/unicode-org/icu4x/pull/7147/checks?check_run_id=54741206689

Nov 07 '25 16:11 robertbastian

@robertbastian No, I mean humans in the process did not notice that change. If you noticed it, you should have highlighted it in the PR, which is a potential process change we need. Or the reviewers should be on the hook for checking. IMO everyone should be, preventing breakages is everyone's responsibility.

Part of the problem is that CSC has a lot of false positives around new APIs so personally I didn't click through to that failure because I expected it to be red there. Which is a dangerous expectation to get into and I should probably stop doing that.

Nov 07 '25 16:11 Manishearth

Looking at CI, and in particular the semver check, is part of the reviewer's responsibility.

Nov 07 '25 16:11 robertbastian

Yes, and, the PR author also has a stake in giving the reviewer the information they need to successfully review, readily available.

We caught this eventually; and when I first looked at the PR I was doing a partial-but-sufficient review. So perhaps this wasn't a process failure. I would probably have looked at the CSC eventually. I don't know.

But I'm not sure as it stands I trust the current process to catch this at the PR layer. Having a prerelease or mid-release-cycle process that produces such an API changelog would definitely be useful here, which is the extent of the discussion on this here.

Nov 07 '25 16:11 Manishearth

I don't think we should over dissect whether we had a process failure in #7147. I think the core point is that that's the type of thing that an API changelog ought to surface, regardless of whether this is "handled" in the PR process.

Nov 07 '25 16:11 Manishearth

If this would have been merged, it would have been listed in the semver CI check on main, checking that is part of the release checklist.

Nov 07 '25 16:11 robertbastian

Agreed. So the process would have probably eventually caught this before a release.

As noted elsewhere, the semver CI check is both too noisy for some things and not noisy enough for others, though. It's fine for its primary purpose which is preventing accidental breakages.

This thing is caught by it, but the semver CI check is insufficient in other ways for an "API changelog". So we should make sure this case is caught by the API changelog being discussed here.

Nov 07 '25 16:11 Manishearth

icu4x icu4x copied to clipboard

Have a better API changelog

What do we need to display?

Do we filter new APIs differently from modifications to existing APIs?

How do we organize it?

What's the process?

icu4x
icu4x copied to clipboard