TypeScript icon indicating copy to clipboard operation
TypeScript copied to clipboard

Annotate immediately-invoked functions for inlining flow control analysis

Open RyanCavanaugh opened this issue 9 years ago • 69 comments

Per #9998 and its many offshoots, we have problems analyzing code like this

let x: string | number = "OK";
mystery(() => {
  x = 10;
});
if (x === 10) { // Error, we believe this to be impossible  
}

The problem is that we don't know if mystery invokes its argument "now" or "later" (or possibly even never!).

The converse problem appears during reads:

let x: string | number = ...;
if (typeof x === 'string') {
  // Error, toLowerCase doesn't exist on string | number
  let arr = [1, 2, 3].map(n => x.toLowerCase());
}

Proposal is to allow some form of indicating that the function is invoked immediately, e.g.

declare function mystery(arg: immediate () => void): void;

This would "inline" the function body for the purposes of flow control analysis

RyanCavanaugh avatar Oct 10 '16 19:10 RyanCavanaugh

I suggested something similar in https://github.com/Microsoft/TypeScript/issues/7770#issuecomment-216024556.

Though after thinking about it, I thought it was better to annotate callbacks as non-immediate https://github.com/Microsoft/TypeScript/issues/10180#issuecomment-238063683:

Immediate case:

function doSomething(callback: () => any) {
     callback();
}

function fn(x: number|string) {
  if (typeof x === 'number') {
    doSomething(() => x.toFixed()); // No error.
  }
}

Non-immediate case:

function doSomething(callback: async () => any) {
     setTimeout(callback, 0);
}

function fn(x: number|string) {
  if (typeof x === 'number') {
    doSomething(() => x.toFixed()); // Error x is  'number|string'
  }
}

Sync callbacks is only assignable to sync callbacks and vice versa for Async callbacks:

function doSomething(callback: () => any) {
     setTimeout(callback, 0); // Error a sync callback is not assignable to an async one.
}
function doSomething1(callback: () => any) {
     callback();
}
function doSomething2(callback: () => any) {
     doSomething1(callback); // No error
}

Though re-using async keyword as a type annotation might conflict with the meaning of async declarations in async/await which makes a function returning a promise. An alternative is to use nonimmediate, though I think that keyword is too long. .

tinganho avatar Oct 11 '16 02:10 tinganho

For reference, there was also some discussion of deferred and immediate modifiers in #9757.

yortus avatar Oct 11 '16 03:10 yortus

@RyanCavanaugh you commented here that implementing this kind of type modifier would require a massive architectural change. Is that still the case?

yortus avatar Oct 11 '16 03:10 yortus

For completeness #11463 proposed an alternative solution for this problem. Since that was just an "escape hatch" and this is an instruction to CFA to better understand the code, I would prefer this.

The use case where we have run into this is converting a Promise into a deferral and having to generate a noop function to get around strict null checks:

const noop = () => { };

function createDeferral() {
    let complete = noop;
    let cancel = noop;
    const promise = new Promise<void>((resolve, reject) => {
        complete = resolve;
        cancel = reject;
    });

    return { complete, cancel, promise };
}

const deferral = createDeferral();

Though after thinking about it, I thought it was better to annotate callbacks as non-immediate

While it is more "burden" on the developer, in the sense of "strictness" I still think it is safer to assume the worst (that it might not be immediate). Like with strictNullChecks overall, so I would be only in favour of an annotation that indicates that it is immediate.

kitsonk avatar Oct 11 '16 06:10 kitsonk

Also 🚲 🏠 while it might be a tad bit confusing, why not sync to be a counter to async:

interface PromiseConstructor {
    new <T>(executor: sync (resolve: (value?: T | PromiseLike<T>) => void, reject: (reason?: any) => void) => void): Promise<T>;
}

kitsonk avatar Oct 11 '16 06:10 kitsonk

I forgot to mention my points on why I think annotating it as async rather sync.

  1. Annotating it as immediate or sync means nearly all callbacks needs to be annotated, assuming most callbacks is sync. Async is the exception, sync is the rule.
  2. We already annotate async functions(as in async/await) as async. Sync functions has no annotations. So to keep consistent with the current design we should not annotate with immediate.

Maybe to minimize confusion that async only accept a Promise returning callback, a deferred keyword might be more desirable.

tinganho avatar Oct 11 '16 06:10 tinganho

I think to cover all cases of "used before assigned" checks you would need both a deferred and an immediate modifier.

The deferred modifier would be needed in cases like @kitsonk reported here where a non-immediately invoked callback references a variable that is initialized later in the calling scope (i.e. lexically later but temporally earlier). immediate wouldn't help with that (unless the compiler assumes that an un-annotated callback must be deferred).

yortus avatar Oct 11 '16 07:10 yortus

What about the interplay between immediate vs deferred invocation and synchronous vs asynchronous functions, which can occur in any combination?

function invokeImmediately<R>(cb: () => R) {
    return cb();
}

let p = invokeImmediately(async () => {
    console.log(1);
    await null;
    console.log(2);
});

console.log(3);

p.then(() => {
    console.log(4);
});

// Output:
// 1
// 3
// 2
// 4

Here we have immediate invocation of an asynchronous function (fairly common in real code). As shown by the output, the async function executes synchronously until it either awaits or returns. Then the rest of the function is deferred to another tick.

It would be pretty hard to do accurate CFA in this case with types and assignments involved, since some of the callback happens 'in line' with the surrounding code and some execution is deferred until later.

yortus avatar Oct 11 '16 07:10 yortus

@tinganho,

Annotating it as immediate or sync means nearly all callbacks needs to be annotated, assuming most callbacks is sync. Async is the exception, sync is the rule.

In the new Promise() example, the executor function implements the Revealing Constructor Pattern. I wouldn't classify it as a callback — the constructor guarantees it's invoked immediately. Callbacks are often invoked in a future turn.

@yortus,

Here we have immediate invocation of an asynchronous function (fairly common in real code). As shown by the output, the async function executes synchronously until it either awaits or returns. Then the rest of the function is deferred to another tick.

It would be pretty hard to do accurate CFA in this case with types and assignments involved, since some of the callback happens 'in line' with the surrounding code and some execution is deferred until later.

Not knowing anything about how CFA is actually implemented, presumably it could still make a prediction up until the first await statement?

For both your points it would seem that sync doesn't quite cover the nuance of immediate invocation, so immediate may be a better keyword.

novemberborn avatar Oct 11 '16 08:10 novemberborn

@novemberborn I'm assuming though no annotation is immediate?

tinganho avatar Oct 11 '16 09:10 tinganho

@tinganho,

I'm assuming though no annotation is immediate?

How do you mean?

novemberborn avatar Oct 11 '16 09:10 novemberborn

Still think it would be dangerous to have implicit immediate. There are many callbacks where it doesn't matter and if it does matter, the developer should be explicit in order for CFA not to make assumptions.

kitsonk avatar Oct 11 '16 09:10 kitsonk

@novemberborn

In the new Promise() example, the executor function implements the Revealing Constructor Pattern. I wouldn't classify it as a callback — the constructor guarantees it's invoked immediately. Callbacks are often invoked in a future turn

In below, my arguments so far have been that no annotation means sync:

interface PromiseConstructor {
    new <T>(executor: (resolve: (value?: T | PromiseLike<T>) => void /* is sync*/, reject: (reason?: any) => void) => void): Promise<T>;
}

tinganho avatar Oct 11 '16 09:10 tinganho

Still think it would be dangerous to have implicit immediate. There are many callbacks where it doesn't matter and if it does matter, the developer should be explicit in order for CFA not to make assumptions.

@kitsonk the CFA need to assume either implicit deferred or implicit immediate in case of no annotation.

Or are you arguing for that either immediate or deferred must be annotated in all callbacks? Or just implicit deferred is more preferable? In case of the latter, CFA still makes assumption.

tinganho avatar Oct 11 '16 09:10 tinganho

@tinganho I don't think this discussion is about how the callback behaves, it's about whether it's called before or after the function it's passed to has returned. PromiseConstructor guarantees it always invokes the `executor callback before returning.

novemberborn avatar Oct 11 '16 09:10 novemberborn

the CFA need to assume either implicit deferred or implicit immediate in case of no annotation

It could be conservative and make neither assumption, consistent with present compiler behaviour. Otherwise all the existing code out there with no annotation would suddenly compile differently (i.e. assuming immediate invocation). Being explicit with immediate (and/or deferred) opts in to extra CFA assumptions that the compiler would not otherwise make.

yortus avatar Oct 11 '16 09:10 yortus

the CFA need to assume either implicit deferred or implicit immediate in case of no annotation.

No, it needs to assume what it assumes today, which is it could be either, therefore values could have changed, so reset the types.

And picking an example of a Promise resolve is exactly the sort of situation where you can run into problems assuming too much, since promises are always resolved at the end of the turn:

let foo = { foo: 'bar' };

new Promise((resolve) => {
        console.log('executor -->', foo.foo);
        resolve(foo);
    })
    .then((result) => {
        console.log('result -->', result.foo);
    });

foo.foo = 'qat';
console.log('post -->', foo.foo);

/* logs
executor --> bar
post --> qat
result --> qat
*/

kitsonk avatar Oct 11 '16 09:10 kitsonk

What's the difference between neither, both and async? Isn't neither or both assuming worst case which is the async case?

tinganho avatar Oct 11 '16 10:10 tinganho

@tinganho currently this example has a "used before assigned" error under conservative (i.e. 'neither') assumptions. But CFA could compile this successfully if it knew that the callback would be invoked later (e.g. with a deferred annotation). So there is a difference between no annotation and a deferred annotation (and an immediate annotation).

yortus avatar Oct 11 '16 10:10 yortus

@yortus that was filed as a bug and fixed, the neither assumption is still in my opinion an async assumption no?

If you thought the bug should be there, then you are assuming an implicit immediate callback and not neither?

I mean let say a callback can be both. Wouldn't it in all cases assume the async case since it is the worst case?

Just for clarity I think when you say neither, you actually mean either? You must choose between async or sync or both. I don't think a callback can be classified as none of them.

tinganho avatar Oct 11 '16 10:10 tinganho

@tinganho yes you are right, I suppose the 'fix' there means the compiler now does assume deferred execution on the basis of pragmatism. But to be clear, it assumes deferred invocation, not whether or not the callback itself is asynchronous.

yortus avatar Oct 11 '16 11:10 yortus

I'm sure you have considered this at length but given the original example

declare function mystery(f: () => void);
let x: string | number = "OK";
mystery(() => {
    x = 10;
});
if (x === 10) { // Error, we believe this to be impossible  
}

I think the most intuitive behavior would be for a closure which rebinds x to cause the type of x to revert to string | number.

As @kitsonk remarked

While it is more "burden" on the developer, in the sense of "strictness" I still think it is safer to assume the worst (that it might not be immediate). Like with strictNullChecks overall, so I would be only in favour of an annotation that indicates that it is immediate.

While I am in favor of what CFA brings especially in catching logic errors and redundant (superstitious) checks, I think the example above should not be an error because the caller is justified in checking x against 10 simply because mystery may have invoked the closure but it may not have invoked the closure. Consider a case where mystery invokes the closure immediately, but conditionally based on some other arbitrary state. consider:

function conditionalBuilder() {
  return {
    when: (predicate: () => boolean) => ({
      then: (action: () => void) => {
        if (predicate()) {
          action();
        }
      }
    });
  };
}

conditionalBuilder()
  .when(() => Math.random() * 100) % 2 > 1)
  .then(() => {
    x = 10;
  });

EDIT fixed typo, formatting

aluanhaddad avatar Oct 11 '16 11:10 aluanhaddad

The useful distinction isn't between sync vs async. There are many synchronous higher-order functions (let's not call them callbacks, because that sounds async) which are not always immediately invoked.

// Invoke N times when N may be 0
function f() {
    let x: number
    ([] as number[]).map(n => {
        x = n
    })
    console.log(x)
}

function provideDefault<T>(x: T | undefined, getDefault: () => T) {
    return x === undefined ? getDefault() : x
}

// Invoke 0 or 1 times
function g(n: number | undefined) {
    let x: number
    const y = provideDefault(n, () => {
        x = 0
        return 0
    })
    console.log(x)
}

function createIterator<T>(next: () => IteratorResult<T>) {
    return { next }
}

// Invoke later, but *not* asynchronously
function h() {
    let x: number
    const iter = createIterator(() => {
        x = 0
        return { value: undefined, done: true }
    })
    console.log(x)
}

We correctly flag all of these currently.

ghost avatar Oct 11 '16 13:10 ghost

We (I) need to categorize all the different use cases here - things invoked immediately, things sometimes invoked immediately, etc, and come up with a taxonomy of what callbacks do so we can think about what a holistic solution might look like.

RyanCavanaugh avatar Oct 31 '16 22:10 RyanCavanaugh

Dealing with a use case where we frigged some types and were caught by the new weak type detection has highlighted that narrowed types work great, until that is referenced in a immediately invoked callback. It is great that CFA tracks perfectly, until the callback...

type FooObject = { values: string[] };
const map: { [key: string]: FooObject | string[] } = { a: { values: ['b'] }, b: ['c'] };

function bar(callback: () => void) { }

Object.keys(map).forEach((key) => {
    let value = map[key];
    if (Array.isArray(value)) {
        value = {
            values: value
        };
    }
    value; // FooObject - awesome!!!!
    bar(() => {
        value; // string[] | FooObject - sad face emoji
    });
});

kitsonk avatar Jun 23 '17 09:06 kitsonk

There is a way to work around this problem -- and still keep types -- for the case where the compiler mistakenly assumes the value must be undefined or null.

The following reports, "Property 'toLowerCase' does not exist on type 'never'":

function firstOverToLower(above: string, among: string[]): string {
    let found: string|null = null;
    among.forEach(s => {
        if (s > above)
            found = s;
    });
    if (found === null)
        return "NONE";
    return found.toLowerCase();
}

A postfix bang on the last line eliminates the problem:

function firstOverToLower(above: string, among: string[]): string {
    let found: string|null = null;
    among.forEach(s => {
        if (s > above)
            found = s;
    });
    if (found === null)
        return "NONE";
    return found!.toLowerCase();
}

I realize this doesn't address the general case. Is the only solution to the general case to declare the variable in question as any?

jtlapp avatar Sep 29 '17 22:09 jtlapp

There's another case to include here, that none of the examples above seem to cover. This triggers in React Components that initialize a variable to null in the component rather than in a constructor for the component.

A minimal contrived example (you wouldn't normally use null as an option here, but it serves to demonstrate what I am trying to describe):

interface MyComponentState {
  name: string | null;
}
export class MyComponent extends React.Component<{},MyComponentState> {
  state = { name: null };
  
  setName = (event: React.FormEvent<HTMLInputElement>) => {
    const name = event.currentTarget.value;
    this.setState({ name });
  }

  getNameLength = (): number => {
    const { name } = this.state;
    return name ? name.length : 0;
  }

  render{
    return (
      <div>
        { this.state.name === null && <p>No name set</p> }
        <input type="text" onChange={this.setName} value={this.state.name || ''} />
        <p>Length of name: {this.getNameLength()}</p>
      <div>
    );
  }
}

In the example above, this.state.name is considered to be permanently null - the getNameLength function will complain that it cannot get length from type never. The work-around is to change the initialization for state to be done inside a constructor function:

export class MyComponent extends React.Component<{},MyComponentState> {
  constructor(props: {}) {
    super(props);
    this.state = { name: null };
  }

In reality, that shouldn't be needed when there were no props to be passed in and the initial state did not depend on the input.

dawnmist avatar Dec 04 '17 04:12 dawnmist

That is an unrelated (but also annoying) problem.

You are declaring that your class has a state prop that is of type { name: null }, which happens to be assignable to MyComponentState. But the type of this.state is not MyComponentState.

MyComponentState happens to be the type of super.state (even though it doesn't exist in reality), which is the type that this.state inherits if you don't declare it in the subclass body, like in your second example.

To get the correct type of state while initializing it in the class body, you need to declare it as readonly state: Readonly<MyComponentState> = { ... }.

Jessidhia avatar Dec 04 '17 06:12 Jessidhia

@Kovensky Thank you for your explanation. I would never have expected to need to declare state as readonly - but it does make some sense given that all further access to change it is via a function instead of directly.

dawnmist avatar Dec 04 '17 11:12 dawnmist

@dawnmist see also #10570

RyanCavanaugh avatar Dec 04 '17 20:12 RyanCavanaugh