TypeScript
TypeScript copied to clipboard
Annotate immediately-invoked functions for inlining flow control analysis
Per #9998 and its many offshoots, we have problems analyzing code like this
let x: string | number = "OK";
mystery(() => {
x = 10;
});
if (x === 10) { // Error, we believe this to be impossible
}
The problem is that we don't know if mystery invokes its argument "now" or "later" (or possibly even never!).
The converse problem appears during reads:
let x: string | number = ...;
if (typeof x === 'string') {
// Error, toLowerCase doesn't exist on string | number
let arr = [1, 2, 3].map(n => x.toLowerCase());
}
Proposal is to allow some form of indicating that the function is invoked immediately, e.g.
declare function mystery(arg: immediate () => void): void;
This would "inline" the function body for the purposes of flow control analysis
I suggested something similar in https://github.com/Microsoft/TypeScript/issues/7770#issuecomment-216024556.
Though after thinking about it, I thought it was better to annotate callbacks as non-immediate https://github.com/Microsoft/TypeScript/issues/10180#issuecomment-238063683:
Immediate case:
function doSomething(callback: () => any) {
callback();
}
function fn(x: number|string) {
if (typeof x === 'number') {
doSomething(() => x.toFixed()); // No error.
}
}
Non-immediate case:
function doSomething(callback: async () => any) {
setTimeout(callback, 0);
}
function fn(x: number|string) {
if (typeof x === 'number') {
doSomething(() => x.toFixed()); // Error x is 'number|string'
}
}
Sync callbacks is only assignable to sync callbacks and vice versa for Async callbacks:
function doSomething(callback: () => any) {
setTimeout(callback, 0); // Error a sync callback is not assignable to an async one.
}
function doSomething1(callback: () => any) {
callback();
}
function doSomething2(callback: () => any) {
doSomething1(callback); // No error
}
Though re-using async keyword as a type annotation might conflict with the meaning of async declarations in async/await which makes a function returning a promise. An alternative is to use nonimmediate, though I think that keyword is too long.
.
For reference, there was also some discussion of deferred and immediate modifiers in #9757.
@RyanCavanaugh you commented here that implementing this kind of type modifier would require a massive architectural change. Is that still the case?
For completeness #11463 proposed an alternative solution for this problem. Since that was just an "escape hatch" and this is an instruction to CFA to better understand the code, I would prefer this.
The use case where we have run into this is converting a Promise into a deferral and having to generate a noop function to get around strict null checks:
const noop = () => { };
function createDeferral() {
let complete = noop;
let cancel = noop;
const promise = new Promise<void>((resolve, reject) => {
complete = resolve;
cancel = reject;
});
return { complete, cancel, promise };
}
const deferral = createDeferral();
Though after thinking about it, I thought it was better to annotate callbacks as non-immediate
While it is more "burden" on the developer, in the sense of "strictness" I still think it is safer to assume the worst (that it might not be immediate). Like with strictNullChecks overall, so I would be only in favour of an annotation that indicates that it is immediate.
Also 🚲 🏠 while it might be a tad bit confusing, why not sync to be a counter to async:
interface PromiseConstructor {
new <T>(executor: sync (resolve: (value?: T | PromiseLike<T>) => void, reject: (reason?: any) => void) => void): Promise<T>;
}
I forgot to mention my points on why I think annotating it as async rather sync.
- Annotating it as
immediateorsyncmeans nearly all callbacks needs to be annotated, assuming most callbacks is sync. Async is the exception, sync is the rule. - We already annotate async functions(as in async/await) as async. Sync functions has no annotations. So to keep consistent with the current design we should not annotate with immediate.
Maybe to minimize confusion that async only accept a Promise returning callback, a deferred keyword might be more desirable.
I think to cover all cases of "used before assigned" checks you would need both a deferred and an immediate modifier.
The deferred modifier would be needed in cases like @kitsonk reported here where a non-immediately invoked callback references a variable that is initialized later in the calling scope (i.e. lexically later but temporally earlier). immediate wouldn't help with that (unless the compiler assumes that an un-annotated callback must be deferred).
What about the interplay between immediate vs deferred invocation and synchronous vs asynchronous functions, which can occur in any combination?
function invokeImmediately<R>(cb: () => R) {
return cb();
}
let p = invokeImmediately(async () => {
console.log(1);
await null;
console.log(2);
});
console.log(3);
p.then(() => {
console.log(4);
});
// Output:
// 1
// 3
// 2
// 4
Here we have immediate invocation of an asynchronous function (fairly common in real code). As shown by the output, the async function executes synchronously until it either awaits or returns. Then the rest of the function is deferred to another tick.
It would be pretty hard to do accurate CFA in this case with types and assignments involved, since some of the callback happens 'in line' with the surrounding code and some execution is deferred until later.
@tinganho,
Annotating it as immediate or sync means nearly all callbacks needs to be annotated, assuming most callbacks is sync. Async is the exception, sync is the rule.
In the new Promise() example, the executor function implements the Revealing Constructor Pattern. I wouldn't classify it as a callback — the constructor guarantees it's invoked immediately. Callbacks are often invoked in a future turn.
@yortus,
Here we have immediate invocation of an asynchronous function (fairly common in real code). As shown by the output, the async function executes synchronously until it either awaits or returns. Then the rest of the function is deferred to another tick.
It would be pretty hard to do accurate CFA in this case with types and assignments involved, since some of the callback happens 'in line' with the surrounding code and some execution is deferred until later.
Not knowing anything about how CFA is actually implemented, presumably it could still make a prediction up until the first await statement?
For both your points it would seem that sync doesn't quite cover the nuance of immediate invocation, so immediate may be a better keyword.
@novemberborn I'm assuming though no annotation is immediate?
@tinganho,
I'm assuming though no annotation is immediate?
How do you mean?
Still think it would be dangerous to have implicit immediate. There are many callbacks where it doesn't matter and if it does matter, the developer should be explicit in order for CFA not to make assumptions.
@novemberborn
In the new Promise() example, the executor function implements the Revealing Constructor Pattern. I wouldn't classify it as a callback — the constructor guarantees it's invoked immediately. Callbacks are often invoked in a future turn
In below, my arguments so far have been that no annotation means sync:
interface PromiseConstructor {
new <T>(executor: (resolve: (value?: T | PromiseLike<T>) => void /* is sync*/, reject: (reason?: any) => void) => void): Promise<T>;
}
Still think it would be dangerous to have implicit immediate. There are many callbacks where it doesn't matter and if it does matter, the developer should be explicit in order for CFA not to make assumptions.
@kitsonk the CFA need to assume either implicit deferred or implicit immediate in case of no annotation.
Or are you arguing for that either immediate or deferred must be annotated in all callbacks? Or just implicit deferred is more preferable? In case of the latter, CFA still makes assumption.
@tinganho I don't think this discussion is about how the callback behaves, it's about whether it's called before or after the function it's passed to has returned. PromiseConstructor guarantees it always invokes the `executor callback before returning.
the CFA need to assume either implicit deferred or implicit immediate in case of no annotation
It could be conservative and make neither assumption, consistent with present compiler behaviour. Otherwise all the existing code out there with no annotation would suddenly compile differently (i.e. assuming immediate invocation). Being explicit with immediate (and/or deferred) opts in to extra CFA assumptions that the compiler would not otherwise make.
the CFA need to assume either implicit deferred or implicit immediate in case of no annotation.
No, it needs to assume what it assumes today, which is it could be either, therefore values could have changed, so reset the types.
And picking an example of a Promise resolve is exactly the sort of situation where you can run into problems assuming too much, since promises are always resolved at the end of the turn:
let foo = { foo: 'bar' };
new Promise((resolve) => {
console.log('executor -->', foo.foo);
resolve(foo);
})
.then((result) => {
console.log('result -->', result.foo);
});
foo.foo = 'qat';
console.log('post -->', foo.foo);
/* logs
executor --> bar
post --> qat
result --> qat
*/
What's the difference between neither, both and async? Isn't neither or both assuming worst case which is the async case?
@tinganho currently this example has a "used before assigned" error under conservative (i.e. 'neither') assumptions. But CFA could compile this successfully if it knew that the callback would be invoked later (e.g. with a deferred annotation). So there is a difference between no annotation and a deferred annotation (and an immediate annotation).
@yortus that was filed as a bug and fixed, the neither assumption is still in my opinion an async assumption no?
If you thought the bug should be there, then you are assuming an implicit immediate callback and not neither?
I mean let say a callback can be both. Wouldn't it in all cases assume the async case since it is the worst case?
Just for clarity I think when you say neither, you actually mean either? You must choose between async or sync or both. I don't think a callback can be classified as none of them.
@tinganho yes you are right, I suppose the 'fix' there means the compiler now does assume deferred execution on the basis of pragmatism. But to be clear, it assumes deferred invocation, not whether or not the callback itself is asynchronous.
I'm sure you have considered this at length but given the original example
declare function mystery(f: () => void);
let x: string | number = "OK";
mystery(() => {
x = 10;
});
if (x === 10) { // Error, we believe this to be impossible
}
I think the most intuitive behavior would be for a closure which rebinds x to cause the type of x to revert to string | number.
As @kitsonk remarked
While it is more "burden" on the developer, in the sense of "strictness" I still think it is safer to assume the worst (that it might not be immediate). Like with strictNullChecks overall, so I would be only in favour of an annotation that indicates that it is immediate.
While I am in favor of what CFA brings especially in catching logic errors and redundant (superstitious) checks, I think the example above should not be an error because the caller is justified in checking x against 10 simply because mystery may have invoked the closure but it may not have invoked the closure.
Consider a case where mystery invokes the closure immediately, but conditionally based on some other arbitrary state.
consider:
function conditionalBuilder() {
return {
when: (predicate: () => boolean) => ({
then: (action: () => void) => {
if (predicate()) {
action();
}
}
});
};
}
conditionalBuilder()
.when(() => Math.random() * 100) % 2 > 1)
.then(() => {
x = 10;
});
EDIT fixed typo, formatting
The useful distinction isn't between sync vs async.
There are many synchronous higher-order functions (let's not call them callbacks, because that sounds async) which are not always immediately invoked.
// Invoke N times when N may be 0
function f() {
let x: number
([] as number[]).map(n => {
x = n
})
console.log(x)
}
function provideDefault<T>(x: T | undefined, getDefault: () => T) {
return x === undefined ? getDefault() : x
}
// Invoke 0 or 1 times
function g(n: number | undefined) {
let x: number
const y = provideDefault(n, () => {
x = 0
return 0
})
console.log(x)
}
function createIterator<T>(next: () => IteratorResult<T>) {
return { next }
}
// Invoke later, but *not* asynchronously
function h() {
let x: number
const iter = createIterator(() => {
x = 0
return { value: undefined, done: true }
})
console.log(x)
}
We correctly flag all of these currently.
We (I) need to categorize all the different use cases here - things invoked immediately, things sometimes invoked immediately, etc, and come up with a taxonomy of what callbacks do so we can think about what a holistic solution might look like.
Dealing with a use case where we frigged some types and were caught by the new weak type detection has highlighted that narrowed types work great, until that is referenced in a immediately invoked callback. It is great that CFA tracks perfectly, until the callback...
type FooObject = { values: string[] };
const map: { [key: string]: FooObject | string[] } = { a: { values: ['b'] }, b: ['c'] };
function bar(callback: () => void) { }
Object.keys(map).forEach((key) => {
let value = map[key];
if (Array.isArray(value)) {
value = {
values: value
};
}
value; // FooObject - awesome!!!!
bar(() => {
value; // string[] | FooObject - sad face emoji
});
});
There is a way to work around this problem -- and still keep types -- for the case where the compiler mistakenly assumes the value must be undefined or null.
The following reports, "Property 'toLowerCase' does not exist on type 'never'":
function firstOverToLower(above: string, among: string[]): string {
let found: string|null = null;
among.forEach(s => {
if (s > above)
found = s;
});
if (found === null)
return "NONE";
return found.toLowerCase();
}
A postfix bang on the last line eliminates the problem:
function firstOverToLower(above: string, among: string[]): string {
let found: string|null = null;
among.forEach(s => {
if (s > above)
found = s;
});
if (found === null)
return "NONE";
return found!.toLowerCase();
}
I realize this doesn't address the general case. Is the only solution to the general case to declare the variable in question as any?
There's another case to include here, that none of the examples above seem to cover. This triggers in React Components that initialize a variable to null in the component rather than in a constructor for the component.
A minimal contrived example (you wouldn't normally use null as an option here, but it serves to demonstrate what I am trying to describe):
interface MyComponentState {
name: string | null;
}
export class MyComponent extends React.Component<{},MyComponentState> {
state = { name: null };
setName = (event: React.FormEvent<HTMLInputElement>) => {
const name = event.currentTarget.value;
this.setState({ name });
}
getNameLength = (): number => {
const { name } = this.state;
return name ? name.length : 0;
}
render{
return (
<div>
{ this.state.name === null && <p>No name set</p> }
<input type="text" onChange={this.setName} value={this.state.name || ''} />
<p>Length of name: {this.getNameLength()}</p>
<div>
);
}
}
In the example above, this.state.name is considered to be permanently null - the getNameLength function will complain that it cannot get length from type never. The work-around is to change the initialization for state to be done inside a constructor function:
export class MyComponent extends React.Component<{},MyComponentState> {
constructor(props: {}) {
super(props);
this.state = { name: null };
}
In reality, that shouldn't be needed when there were no props to be passed in and the initial state did not depend on the input.
That is an unrelated (but also annoying) problem.
You are declaring that your class has a state prop that is of type { name: null }, which happens to be assignable to MyComponentState. But the type of this.state is not MyComponentState.
MyComponentState happens to be the type of super.state (even though it doesn't exist in reality), which is the type that this.state inherits if you don't declare it in the subclass body, like in your second example.
To get the correct type of state while initializing it in the class body, you need to declare it as readonly state: Readonly<MyComponentState> = { ... }.
@Kovensky Thank you for your explanation. I would never have expected to need to declare state as readonly - but it does make some sense given that all further access to change it is via a function instead of directly.
@dawnmist see also #10570