rushstack icon indicating copy to clipboard operation
rushstack copied to clipboard

[api-extractor] API report shows spurious diffs because compiler emits inferred types inconsistently

Open dmichon-msft opened this issue 5 years ago • 12 comments

Is this a feature or a bug?

  • [ ] Feature
  • [x] Bug

Please describe the actual behavior. TypeScript does not define the output order of the constituents of a union type. As a result of this, the output from API-extractor is inconsistent between fresh builds and incremental builds. From discussion with the TypeScript team, this is related to the order of loading of types and whether or not certain types need to be instantiated to type check the set of files that were built during the compilation.

The default sort order for union type constituents is based on the internal debug id, which is assigned when the type is instantiated.

If the issue is a bug, how can we reproduce it? Please provide detailed steps and include a GitHub branch if applicable. Your issue will get resolved faster if you can make it easy to investigate. https://github.com/dmichon-msft/union-type-order-demo Repro steps:

  1. Increment value of incrementStep1 in index.ts. Run npm run build. Observe order of export const test declaration.
  2. Increment value of incrementStep2 in B.ts. Run npm run build. Observe order of export const test declaration.
  3. Repeat (1) if necessary.

The order will change depending on which files needed recompilation. The order directly matches that which is output in lib/index.d.ts, which is what changes.

What is the expected behavior? Since API extractor is concerned with substantive changes, it should define a standard (potentially configurable, but not required) sort order for the constituents of union types, regardless of the order output by TypeScript. This should be similar to the sorting of interface members that api-extractor already performs.

If this is a bug, please provide the tool version, Node.js version, and OS.

  • Tool: api-extractor
  • Tool Version: 7.8.12, with TypeScript 3.7.5
  • Node Version: 10.19.0
    • Is this a LTS version? Yes
    • Have you tested on a LTS version? Yes
  • OS: Windows 10

Edited 6/23@1:29 PM: simplified repro.

dmichon-msft avatar Jun 23 '20 20:06 dmichon-msft

@dmichon-msft This repro doesn't seem to work.

1. Increment value of `incrementStep1` in `index.ts`

index.ts looks like this -- I don't see incrementStep1 in there:

export * from './A';
export * from './B';
export * from './C';

The README.md in the repo instead says:

Make a change to A.ts (e.g. add an optional property to IKeyed). Run npm run build. Observe order of export const test declaration.

Add what property? What should I expect to observe exactly?

Please give explicit instructions.

octogonz avatar Jun 23 '20 22:06 octogonz

Oops, forgot to push when I simplified the repro. The states are:

export declare const test: ObjectWithProps<Pick<ITest, "a" | "b" | "d" | "c">>;

and

export declare const test: ObjectWithProps<Pick<ITest, "d" | "c" | "b" | "a">>;

Further investigation has concluded that this repro is specifically dependent on inferring a type that contains a union. Providing an explicit typedef solves the underlying instability issue.

dmichon-msft avatar Jun 23 '20 22:06 dmichon-msft

Providing an explicit typedef solves the underlying instability issue.

(BTW I thought your ESLint rules require explicit type declarations?)

That said, if the compiler's output depends non-deterministically on the order in which an incremental build occurs -- that seems like a design flaw in the incremental build feature. Implementing a deterministic algorithm might be difficult work. But that doesn't make it acceptable to remove the guarantee of determinism.

If we remove that guarantee, then essentially ANY type expression can shuffle around arbitrarily during compilation (not just unions), which would imply that API Extractor has the burden of normalizing every detail of every type expression. This seems very much like the compiler's responsibility.

octogonz avatar Jun 23 '20 22:06 octogonz

To clarify a bit: Since there are many equivalent ways to infer a type, it's unsurprising that a type expression might shuffle around a bit due to modifications of its inputs. That part doesn't bother me, or seem like much of a problem for API Extractor's reporting.

The problematic part of this repro is:

  1. npm run build
  2. The compiler outputs "d" | "c" | "b" | "a"
  3. rd /s lib
  4. npm run build
  5. The compiler outputs "a" | "b" | "d" | "c"

Are we really considering this to be a correctly functioning incremental build implementation?

CC @RyanCavanaugh @DanielRosenwasser for thoughts

octogonz avatar Jun 24 '20 00:06 octogonz

Yes, this is functioning correctly. The types are equivalent and we have never instructed anyone to take a dependency on the text representation of a union type, nor would we.

RyanCavanaugh avatar Jun 25 '20 16:06 RyanCavanaugh

The types are equivalent and we have never instructed anyone to take a dependency on the text representation of a union type, nor would we.

@RyanCavanaugh Then we need to normalize the expression afterwards. (This scenario is not about type equivalence, but rather about a text file being processed for reporting and documentation purposes.)

Question: If the compiler now produces random outputs when invoked on the same input file, how random is it? Is it merely union type members shuffling around? Can entire expressions be replaced with a semantically equivalent expression? Can functions be reordered? Can line numbers be added/removed that will shift the source maps?

Basically we need to know what kind of normalization is needed. Today API Extractor makes efforts to faithfully preserve stylistic aspects of its input as much as possible.

octogonz avatar Jun 25 '20 17:06 octogonz

Relevant things that could happen to my knowledge:

  • Union types are printed in an arbitrary order
  • Type aliases may or may not be used to represent equivalent types that were produced from inference. If multiple type aliases for the same type exist, which alias is used (if any) is arbitrary
  • Properties in an inferred object type may be printed in an arbitrary order

RyanCavanaugh avatar Jun 25 '20 17:06 RyanCavanaugh

Does this only happen with inferred types? Will explicitly declared types always get emitted consistently?

octogonz avatar Jun 25 '20 17:06 octogonz

Yeah, if we're able to re-use the existing text of a node, we always will

RyanCavanaugh avatar Jun 25 '20 17:06 RyanCavanaugh

  • Union types are printed in an arbitrary order
  • Properties in an inferred object type may be printed in an arbitrary order

For this, it sounds pretty straightforward to normalize by sorting them alphabetically. (Might even be cheap and useful for the compiler to do that.)

Type aliases may or may not be used to represent equivalent types that were produced from inference. If multiple type aliases for the same type exist, which alias is used (if any) is arbitrary

🤔 Does a normal form even exist for this problem?

Or suppose we wanted to keep the previously reported expression if nothing has changed. Is there even a well defined test to determine if two type expressions are exactly equivalent?

octogonz avatar Jun 25 '20 17:06 octogonz

Possible solutions:

  • The simplest complete solution is for @rushstack/eslint-config to warn about usage of inferred types. We already do that.

  • It would be nice for API Extractor itself to warn about inferred types, but this is not easy because API Extractor analyses .d.ts files. We would need to use .d.ts maps and inspect the corresponding .ts files.

  • Another idea is for API Extractor to sort subexpressions that are a union of string types ("a" | "b" | "d" | "c") but NOT try to normalize any other inferred type subexpression. This would technically resolve @dmichon-msft's original problem, but it implies more general support. I wouldn't want to implement this without evidence that string unions are somehow an important/common special case.

  • Lastly, we could simply update the website to document that API reporting doesn't work reliably with inferred types, and leave it at that. Based on what @RyanCavanaugh said, I'm coming to think emitted inferred types are simply (1) not easy to track in an API report and thus (2) not a best practice for a public API.

@iclanton do you have thoughts?

octogonz avatar Jul 02 '20 22:07 octogonz

We are currently using "incremental": false in tsconfig.json to handle this problem

maximzavadskiy avatar Jun 14 '24 12:06 maximzavadskiy

Using TS project references and composite builds makes this issue for more pronounced. Union ordering and object properties shuffle ordering depending on build order leaning to false positive diffs

relm923 avatar Jul 23 '24 23:07 relm923