langium icon indicating copy to clipboard operation
langium copied to clipboard

add an option to generate a partial ast definition

Open ydaveluy opened this issue 8 months ago • 7 comments

~~This PR adds an option in the langium-cli configuration to generate the ast with optional properties except for boolean and array types which are respectively initialized with false and [].~~

~~When optionalProperties is undefined or false the behavior is unchanged: the generated ast interfaces are as it would be expected when the document is successfully parsed (no recovered token).~~

~~When optionalProperties is true the generated ast interfaces match with the output generated by the parser (including invalid inputs which are parsed with recovered tokens generated by the parser): all properties are optional except for boolean and arrays.~~

edit: The generated/ast.ts file is relevant when the input document is valid (i.e., no missing tokens) and can be safely used for code generation or any other tasks that require a valid document.

For scoping, validation, and most LSP services, relying on generated/ast.ts is very error-prone because the input document may be incomplete during editing. The parser automatically generates some recovered tokens to parse the document as much as possible. As a result, some mandatory properties in the AST may be undefined.

This PR introduces a generated/ast-partial.ts type definition that can be used as a drop-in replacement for generated/ast.ts. The same types are exported, except that properties that may be undefined after parsing are marked as optional.

As an example the arithmentic language is buggy due to false assumptions on the parsed AST:

Module basicMath
def b: 3;
b % ; 

=> An error occurred during validation: Cannot read properties of undefined (reading '$document')

ydaveluy avatar Mar 18 '25 19:03 ydaveluy

Hi @msujew , I have some doubts about the usability of this feature. It will generate an AST that will be good for scoping and validation but annoying for code generation or when it is assumed that the AST is valid.

So why not generating 2 ast.ts files : keeping the current ast.ts as is and generating an ast-parsial.ts with the same types but with optional properties where required.

With such design it would be possible to import types from ast-partial.ts for scoping and validation and to use ast.ts for generation.

ydaveluy avatar Mar 22 '25 09:03 ydaveluy

I have some doubts about the usability of this feature. It will generate an AST that will be good for scoping and validation but annoying for code generation or when it is assumed that the AST is valid.

👍 yes, that's exactly why we decided not to make properties optional.

I assume it's possible to achieve what you want (all properties optional except for array and boolean) using a TypeScript wrapper type (a more sophisticated version of Partial). Have you tried that?

spoenemann avatar Mar 26 '25 13:03 spoenemann

@spoenemann See the discussion over in https://github.com/eclipse-langium/langium/pull/1848#issuecomment-2729378402.

@ydaveluy I don't think we should generate 2 ASTs, as that has no benefit compared to the "normal" AST + an optional wrapper type.

msujew avatar Mar 26 '25 13:03 msujew

@spoenemann Yes you're right, a more sophisticated version of DeppPartial works well;

it makes properties optional except if it starts with $ or is a boolean or an array:

export type DeepOptional<T> = T extends Reference<infer U>
  ? Reference<DeepOptional<U>>
  : T extends AstNode
  ? {
      [K in keyof T as K extends `$${string}`
        ? K
        : T[K] extends boolean
        ? K
        : never]: T[K];
    } & {
      [K in keyof T as K extends `$${string}`
        ? never
        : T[K] extends boolean
        ? never
        : K]: T[K] extends Array<infer U>
        ? Array<DeepOptional<U>>
        : DeepOptional<T[K]> | undefined;
    }
  : T;

ydaveluy avatar Mar 26 '25 14:03 ydaveluy

Nice! We could add that to our code base instead of this PR, or WDYT?

However, we'd need a name that better reflects its intent – DeepOptional is almost the same as DeepPartial.

spoenemann avatar Mar 26 '25 15:03 spoenemann

@spoenemann Yes I think it could be usefull to add it to Langium. For the new name, no idea :-)

@msujew According to me it is still very error prone to rely only on DeepOptional in the validator or scoping. It is very easy to forget to add DeepOptional everywhere when the ast contains lots of types. Moreover, when using ast.isMyType(element) the element is inferred as MyType which does not provides the optional properties as DeepOptional<MyType>.

I wrote this small script that generate ast-partial.ts from ast.ts, each type is mapped with DeepOptional, functions like isMyType are redefined to return a DeepOptional type and constants are re-exported. This way the ast-partial.ts can be used like the ast.ts:


import * as fs from 'fs'

const INPUT_FILE = "./src/language/generated/ast.ts";
const OUTPUT_FILE = "./src/language/generated/ast-partial.ts";

const content = fs.readFileSync(INPUT_FILE, "utf-8");

const typeRegex = /export\s+(?:interface|type)\s+(\w+)/g;
const functionRegex = /export\s+function\s+(is\w+)\s*\(/g;
const constRegex = /export\s+const\s+(\w+)\s*=/g;

const types = [...content.matchAll(typeRegex)].map(match => match[1]);

const functions = [...content.matchAll(functionRegex)].map(match => match[1]);

const constants = [...content.matchAll(constRegex)].map(match => match[1]);

const output = `import * as base from './ast.js';
import type { AstNode } from 'langium';

type DeepOptional<T> = T extends AstNode
  ? {
      [K in keyof T as K extends \`$\${string}\`
        ? K
        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        : T[K] extends boolean | any[]
        ? K
        : never]: T[K];
    } & {
      [K in keyof T as K extends \`$\${string}\`
        ? never
        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        : T[K] extends boolean | any[]
        ? never
        : K]?: T[K] extends object ? DeepOptional<T[K]> : T[K];
    }
  : T;

${types.map(type => `export type ${type} = DeepOptional<base.${type}>;`).join("\n")}

${functions.map(fn => `export function ${fn}(item: unknown): item is ${fn.slice(2)} {\n    return base.${fn}(item);\n}`).join("\n")}

${constants.map(constName => `export const ${constName} = base.${constName};`).join("\n")}

`;

fs.writeFileSync(OUTPUT_FILE, output, "utf-8");

Using ast-partial.ts instead of ast.ts helped me to detect many access on potentially undefined properties in my validators.

ydaveluy avatar Mar 26 '25 18:03 ydaveluy

In my experience working with TypeScript, there are often situations where we need to make a compromise between full type safety and complexity of the solution. I would not always regard the most type-safe solution as the best one – it depends on the effort required to get there and the amount of additional complexity introduced.

In this case, I lean towards keeping things simpler rather than generating even more type definitions.

spoenemann avatar Mar 28 '25 09:03 spoenemann