squiggle icon indicating copy to clipboard operation
squiggle copied to clipboard

Semantic analysis (typed ASTs, type checking, etc.)

Open berekuk opened this issue 1 year ago • 2 comments

berekuk avatar Aug 10 '24 17:08 berekuk

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
quri-hub ✅ Ready (Inspect) Visit Preview Aug 30, 2024 7:23pm
squiggle-components ✅ Ready (Inspect) Visit Preview Aug 30, 2024 7:23pm
squiggle-website ✅ Ready (Inspect) Visit Preview Aug 30, 2024 7:23pm
1 Skipped Deployment
Name Status Preview Updated (UTC)
quri-ui ⬜️ Ignored (Inspect) Visit Preview Aug 30, 2024 7:23pm

vercel[bot] avatar Aug 10 '24 17:08 vercel[bot]

⚠️ No Changeset found

Latest commit: 3ac3c85bcd2d1449910435099aafd9b62e00a4da

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

changeset-bot[bot] avatar Aug 10 '24 17:08 changeset-bot[bot]

Some parts of this work decently now, and some need more work.

Things that do work:

  • call inference for builtins:
    • 1 + 1 is Number
    • lognormal(1, 10) is `SampleSetDist
  • call inference for user-defined functions:
    • f() = 1; f() is Number
  • ternary inference with unions:
    • true ? 1 : "2" is Number|String
    • true ? 1 : 2 is Number
  • lookups:
    • { foo: 5 }.foo is Number
    • { foo: 5 }.foo2 will fail at compile time
    • { foo: 5 }["foo"] is Number
    • [1,2,3][1] is Number etc.

More examples in __tests__/analysis/*_test.ts.

berekuk avatar Aug 22 '24 17:08 berekuk

Overview of remaining subtasks that I want to implement:

  • analysis stage only returns the single error, it should return the list of all errors

  • analyzed AST is produced only when the code compiles, so right now you can't debug types in the playground when types are incorrect

  • type.toString() is not wrapped, so produced type strings are often hard to read: Screenshot 2024-08-22 at 14 17 49

  • produced types can sometimes blow up in size, especially with unions (I don't have a good example for this yet but pretty sure it's going to be a problem)

  • complex union types can cause performance issues, there's a quadratic complexity when I merge two union types

  • annotation types are not used; f(x: [3,5]) = x is any => any, not number => any

  • no generics yet

  • no inference for inline lambdas

(I'll explain the last two points below)

berekuk avatar Aug 22 '24 17:08 berekuk

On generics and inline lambdas.

Example: List.map([1,2,3], {|x| x}) infers to List(any).

There are two missing pieces that are necessary for it to become List(Number).

First, generics: during a call like this, [1, 2, 3] (List(Number)) should be matched against List('A), so 'A should become Number, and then it should be used for inferring ('A, index?: Number) => 'B output type.

Second, inferring {|x| x} signature. Right now I'm always inferring things in the bottom-up fashion; e.g. for 1 + 2 I infer that 1 is Number and 1 is Number, and only then I infer that Number + Number is a Number.

In case of inline lambdas, the flow goes in the opposite way.

This entire problem is pretty complicated; I'm not sure yet if I should look into bidirectional typechecking to approach it in a principled way, or try to hack together some good enough solution.

But Squiggle code is often so list-heavy that type inference without map and reduce would feel very incomplete.

berekuk avatar Aug 22 '24 17:08 berekuk

Btw, basic type inference examples story is here: https://squiggle-components-git-semantic-ab63a5-quantified-uncertainty.vercel.app/?path=/story/squiggle-squiggleplayground-typeinference--basic-examples

berekuk avatar Aug 22 '24 17:08 berekuk

Another issue: types and packing/unpacking are tangled in a somewhat weird way; I had to write more type classes than I'd prefer, e.g. TOr and TDistOrNumber and TUnion are kind of the same.

This complicates the type-checking code, and quite possibly can lead to errors in the type-checking implementation.

Packing/unpacking is only relevant to the function registry; user-defined functions do type.check(value): boolean, while builtin functions do type.unpack(value): T | undefined.

For now, this is the only use of unpack, which makes me want to re-introduce "fr types"; objects that would combine:

  • unpack function
  • reference to the underlying type to use at the compile time

This would require more code in the function registry, but less code in types/ and the type checker, which is probably a win, because type checking is harder to get right, so moving any unnecessary complexity out of type checker is good.

The annoying thing is that I'll have to update all of fr/*.ts, that's a lot of code, and it's not trivial to automate. Also, constructing nested "fr types" that match the nested "compile-time types" is a pain.

berekuk avatar Aug 22 '24 19:08 berekuk

Does this get rid of units, or is there a plan there? image

OAGr avatar Aug 24 '24 03:08 OAGr

It's neat that it types all the in-between variables (ones not in the outer scope)

image

OAGr avatar Aug 24 '24 03:08 OAGr

Some of these are registering as strings for some reason.

image

OAGr avatar Aug 24 '24 03:08 OAGr

Does this get rid of units, or is there a plan there?

No, they're just not implemented yet.

There are two parts to this:

  1. there's a bug in AST handling that I missed
  2. for now, units are handled separately from the new type system, I'm not sure if I'll implement this in this PR or not, this depends on if I'll manage to make the support for generics flexible and robust enough

berekuk avatar Aug 24 '24 17:08 berekuk