Pass custom grammars

Open JaKXz opened this issue 3 years ago • 11 comments

Reference: https://shiki.matsu.io/

JaKXz avatar Oct 31 '22 00:10 JaKXz

I'd also be interested in either passing a custom grammar or some guidance on how to go about adding languages "officially". I'd love to be able to use Code Hike for PromQL educational stuff, and there's an old PromQL TextMate grammar at https://github.com/prometheus-community/vscode-promql/blob/master/syntaxes/promql.tmlanguage.yml that I could revive. Is Shiki the right place to contribute, or would https://github.com/code-hike/lighter be (if there is interest in adding new languages at all)?

juliusv avatar Aug 12 '23 14:08 juliusv

Any thoughts on implementing this? I'm happy to assist, but didn't want to work my way toward a solution if maintainers have a specific implementation in mind.

lachieh avatar Nov 18 '23 21:11 lachieh

This is something I want to add. I'm working on v1.0, so it will have to wait until I release that, which may take some time.

pomber avatar Nov 20 '23 11:11 pomber

As in, work shouldn't start until v1 is released because of churn? Or you'd like to consider this for v1?

lachieh avatar Nov 20 '23 15:11 lachieh

> As in, work shouldn't start until v1 is released because of churn? Or you'd like to consider this for v1?

As in: I want to include it in v1.0

pomber avatar Nov 22 '23 13:11 pomber

@lachieh @juliusv can you share your custom grammars and sample code so I can test this with real world scenarios?

pomber avatar Dec 08 '23 13:12 pomber
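
Before sharing a grammar for testing like this, one cheap sanity check is to confirm that every local `include` (the `"#name"` form) points at a key that exists in the grammar's `repository`. The sketch below is only an illustration: `findUnresolvedIncludes` and the `Rule`/`Grammar` types are hypothetical names made up here, not part of Code Hike, Lighter, or Shiki, and the types cover only the fields the check needs.

```typescript
// Minimal, assumed shape of a TextMate grammar rule; real grammars carry more fields.
interface Rule {
  include?: string;
  patterns?: Rule[];
  captures?: Record<string, Rule>;
  beginCaptures?: Record<string, Rule>;
  endCaptures?: Record<string, Rule>;
  [key: string]: unknown;
}

interface Grammar {
  patterns?: Rule[];
  repository?: Record<string, Rule>;
}

// Hypothetical helper: lists every local "#name" include that has no matching
// repository key. Includes such as "$self" or "source.js" are ignored.
function findUnresolvedIncludes(grammar: Grammar): string[] {
  const repository = grammar.repository || {};
  const known = new Set<string>(Object.keys(repository));
  const unresolved = new Set<string>();

  const visit = (rule: Rule): void => {
    const inc = rule.include;
    if (typeof inc === "string" && inc.charAt(0) === "#" && !known.has(inc.slice(1))) {
      unresolved.add(inc);
    }
    // Nested rule lists.
    (rule.patterns || []).forEach(visit);
    // Capture groups can carry their own nested patterns.
    const groups = [rule.captures, rule.beginCaptures, rule.endCaptures];
    for (const group of groups) {
      if (group) {
        Object.keys(group).forEach((k) => visit(group[k]));
      }
    }
  };

  (grammar.patterns || []).forEach(visit);
  Object.keys(repository).forEach((k) => visit(repository[k]));
  return Array.from(unresolved).sort();
}
```

Calling `findUnresolvedIncludes(JSON.parse(grammarText))` on a complete grammar file returns an empty array when every local reference resolves; anything it returns is a rule name that is referenced but never defined, which a highlighter would typically skip without complaint.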

Sure thing! Here's a grammar for the WIT (WebAssembly Interface Types) language, and a code sample to match:

WIT Grammar
{
  "$schema": "https://raw.githubusercontent.com/martinring/tmlanguage/master/tmlanguage.json",
  "name": "WIT",
  "fileTypes": [
    "wit"
  ],
  "uuid": "73554272-ff1a-4515-879e-39a6dcec955d",
  "foldingStartMarker": "(\\{|\\[)\\s*",
  "foldingStopMarker": "\\s*(\\}|\\])",
  "scopeName": "source.wit",
  "patterns": [
    {
      "include": "#comment"
    },
    {
      "include": "#package"
    },
    {
      "include": "#toplevel-use"
    },
    {
      "include": "#world"
    },
    {
      "include": "#interface"
    },
    {
      "include": "#whitespace"
    }
  ],
  "repository": {
    "whitespace": {
      "name": "meta.whitespace.wit",
      "comment": "whitespace token",
      "match": "\\s+"
    },
    "comment": {
      "patterns": [
        {
          "include": "#block-comments"
        },
        {
          "include": "#doc-comment"
        },
        {
          "include": "#line-comment"
        }
      ]
    },
    "doc-comment": {
      "name": "comment.line.documentation.wit",
      "comment": "documentation comments",
      "begin": "^\\s*///",
      "end": "$",
      "patterns": [
        {
          "include": "#markdown"
        }
      ]
    },
    "line-comment": {
      "name": "comment.line.double-slash.wit",
      "comment": "line comments",
      "match": "\\s*//.*"
    },
    "block-comments": {
      "patterns": [
        {
          "name": "comment.block.empty.wit",
          "comment": "empty block comments",
          "match": "/\\*\\*/"
        },
        {
          "name": "comment.block.documentation.wit",
          "comment": "block documentation comments",
          "begin": "/\\*\\*",
          "end": "\\*/",
          "applyEndPatternLast": 1,
          "patterns": [
            {
              "include": "#block-comments"
            },
            {
              "include": "#markdown"
            },
            {
              "include": "#whitespace"
            }
          ]
        },
        {
          "name": "comment.block.wit",
          "comment": "block comments",
          "begin": "/\\*(?!\\*)",
          "end": "\\*/",
          "applyEndPatternLast": 1,
          "patterns": [
            {
              "include": "#block-comments"
            },
            {
              "include": "#whitespace"
            }
          ]
        }
      ]
    },
    "markdown": {
      "patterns": [
        {
          "match": "\\G\\s*(#+.*)$",
          "captures": {
            "1": {
              "name": "markup.heading.markdown"
            }
          }
        },
        {
          "match": "\\G\\s*((\\>)\\s+)+",
          "captures": {
            "2": {
              "name": "punctuation.definition.quote.begin.markdown"
            }
          }
        },
        {
          "match": "\\G\\s*(\\-)\\s+",
          "captures": {
            "1": {
              "name": "punctuation.definition.list.begin.markdown"
            }
          }
        },
        {
          "match": "\\G\\s*(([0-9]+\\.)\\s+)",
          "captures": {
            "1": {
              "name": "markup.list.numbered.markdown"
            },
            "2": {
              "name": "punctuation.definition.list.begin.markdown"
            }
          }
        },
        {
          "match": "(`.*?`)",
          "captures": {
            "1": {
              "name": "markup.italic.markdown"
            }
          }
        },
        {
          "match": "\\b(__.*?__)",
          "captures": {
            "1": {
              "name": "markup.bold.markdown"
            }
          }
        },
        {
          "match": "\\b(_.*?_)",
          "captures": {
            "1": {
              "name": "markup.italic.markdown"
            }
          }
        },
        {
          "match": "(\\*\\*.*?\\*\\*)",
          "captures": {
            "1": {
              "name": "markup.bold.markdown"
            }
          }
        },
        {
          "match": "(\\*.*?\\*)",
          "captures": {
            "1": {
              "name": "markup.italic.markdown"
            }
          }
        }
      ]
    },
    "operator": {
      "patterns": [
        {
          "name": "punctuation.equal.wit",
          "match": "\\="
        },
        {
          "name": "punctuation.comma.wit",
          "match": "\\,"
        },
        {
          "name": "keyword.operator.key-value.wit",
          "match": "\\:"
        },
        {
          "name": "punctuation.semicolon.wit",
          "match": "\\;"
        },
        {
          "name": "punctuation.brackets.round.begin.wit",
          "match": "\\("
        },
        {
          "name": "punctuation.brackets.round.end.wit",
          "match": "\\)"
        },
        {
          "name": "punctuation.brackets.curly.begin.wit",
          "match": "\\{"
        },
        {
          "name": "punctuation.brackets.curly.end.wit",
          "match": "\\}"
        },
        {
          "name": "punctuation.brackets.angle.begin.wit",
          "match": "\\<"
        },
        {
          "name": "punctuation.brackets.angle.end.wit",
          "match": "\\>"
        },
        {
          "name": "keyword.operator.star.wit",
          "match": "\\*"
        },
        {
          "name": "keyword.operator.arrow.skinny.wit",
          "match": "\\-\\>"
        }
      ]
    },
    "package": {
      "name": "meta.package-decl.wit",
      "match": "^(package)\\s+([^\\s]+)\\s*",
      "captures": {
        "1": {
          "name": "storage.modifier.package-decl.wit"
        },
        "2": {
          "name": "meta.id.package-decl.wit",
          "patterns": [
            {
              "name": "meta.package-identifier.wit",
              "match": "([^\\:]+)(\\:)([^\\@]+)((\\@)([^\\s]+))?",
              "captures": {
                "1": {
                  "name": "entity.name.namespace.package-identifier.wit",
                  "patterns": [
                    {
                      "include": "#identifier"
                    }
                  ]
                },
                "2": {
                  "name": "keyword.operator.namespace.package-identifier.wit"
                },
                "3": {
                  "name": "entity.name.type.package-identifier.wit",
                  "patterns": [
                    {
                      "include": "#identifier"
                    }
                  ]
                },
                "5": {
                  "name": "keyword.operator.versioning.package-identifier.wit"
                },
                "6": {
                  "name": "constant.numeric.versioning.package-identifier.wit"
                }
              }
            }
          ]
        }
      }
    },
    "toplevel-use": {
      "name": "meta.toplevel-use-item.wit",
      "match": "^(use)\\s+([^\\s]+)(\\s+(as)\\s+([^\\s]+))?\\s*",
      "captures": {
        "1": {
          "name": "keyword.other.use.toplevel-use-item.wit"
        },
        "2": {
          "name": "meta.interface.toplevel-use-item.wit",
          "patterns": [
            {
              "name": "entity.name.type.declaration.interface.toplevel-use-item.wit",
              "match": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b"
            },
            {
              "name": "meta.versioning.interface.toplevel-use-item.wit",
              "match": "(\\@)((0|[1-9]\\d*)\\.(0|[1-9]\\d*)\\.(0|[1-9]\\d*)(?:-((?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\\.(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\\+([0-9a-zA-Z-]+(?:\\.[0-9a-zA-Z-]+)*))?)",
              "captures": {
                "1": {
                  "name": "keyword.operator.versioning.interface.toplevel-use-item.wit"
                },
                "2": {
                  "name": "constant.numeric.versioning.interface.toplevel-use-item.wit"
                }
              }
            }
          ]
        },
        "4": {
          "name": "keyword.control.as.toplevel-use-item.wit"
        },
        "5": {
          "name": "entity.name.type.toplevel-use-item.wit"
        }
      }
    },
    "world": {
      "name": "meta.world-item.wit",
      "comment": "Syntax for WIT like `world \"id\" {`",
      "begin": "^\\b(default\\s+)?(world)\\s+%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\{)",
      "beginCaptures": {
        "1": {
          "name": "storage.modifier.default.world-item.wit"
        },
        "2": {
          "name": "keyword.declaration.world.world-item.wit storage.type.wit"
        },
        "3": {
          "name": "entity.name.type.id.world-item.wit"
        },
        "8": {
          "name": "punctuation.brackets.curly.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "meta.export-item.wit",
          "comment": "Syntax for WIT like `export \"id\"`",
          "begin": "\\b(export)\\b\\s+([^\\s]+)",
          "beginCaptures": {
            "1": {
              "name": "keyword.control.export.export-item.wit"
            },
            "2": {
              "name": "meta.id.export-item.wit",
              "patterns": [
                {
                  "name": "variable.other.constant.id.export-item.wit",
                  "match": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b"
                },
                {
                  "name": "meta.versioning.id.export-item.wit",
                  "match": "(\\@)((0|[1-9]\\d*)\\.(0|[1-9]\\d*)\\.(0|[1-9]\\d*)(?:-((?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\\.(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\\+([0-9a-zA-Z-]+(?:\\.[0-9a-zA-Z-]+)*))?)",
                  "captures": {
                    "1": {
                      "name": "keyword.operator.versioning.id.export-item.wit"
                    },
                    "2": {
                      "name": "constant.numeric.versioning.id.export-item.wit"
                    }
                  }
                }
              ]
            }
          },
          "patterns": [
            {
              "include": "#extern"
            },
            {
              "include": "#whitespace"
            }
          ],
          "end": "((?<=\\n)|(?=\\}))",
          "applyEndPatternLast": 1
        },
        {
          "name": "meta.import-item.wit",
          "comment": "Syntax for WIT like `import \"id\"`",
          "begin": "\\b(import)\\s+([^\\s]+)",
          "beginCaptures": {
            "1": {
              "name": "keyword.control.import.import-item.wit"
            },
            "2": {
              "name": "meta.id.import-item.wit",
              "patterns": [
                {
                  "name": "variable.other.id.import-item.wit",
                  "match": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b"
                },
                {
                  "name": "meta.versioning.id.import-item.wit",
                  "match": "(\\@)((0|[1-9]\\d*)\\.(0|[1-9]\\d*)\\.(0|[1-9]\\d*)(?:-((?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\\.(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\\+([0-9a-zA-Z-]+(?:\\.[0-9a-zA-Z-]+)*))?)",
                  "captures": {
                    "1": {
                      "name": "keyword.operator.versioning.id.import-item.wit"
                    },
                    "2": {
                      "name": "constant.numeric.versioning.id.import-item.wit"
                    }
                  }
                }
              ]
            }
          },
          "patterns": [
            {
              "include": "#extern"
            },
            {
              "include": "#whitespace"
            }
          ],
          "end": "((?<=\\n)|(?=\\}))",
          "applyEndPatternLast": 1
        },
        {
          "name": "meta.include-item.wit",
          "comment": "Syntax for WIT like `include \"use-path\"`",
          "begin": "\\b(include)\\s+([^\\s]+)\\s*",
          "beginCaptures": {
            "1": {
              "name": "keyword.control.include.include-item.wit"
            },
            "2": {
              "name": "meta.use-path.include-item.wit",
              "patterns": [
                {
                  "include": "#use-path"
                }
              ]
            }
          },
          "patterns": [
            {
              "name": "meta.with.include-item.wit",
              "begin": "\\b(with)\\b\\s+(\\{)",
              "beginCaptures": {
                "1": {
                  "name": "keyword.control.with.include-item.wit"
                },
                "2": {
                  "name": "punctuation.brackets.curly.begin.wit"
                }
              },
              "patterns": [
                {
                  "include": "#comment"
                },
                {
                  "name": "meta.include-names-item.wit",
                  "match": "([^\\s]+)\\s+(as)\\s+([^\\s\\,]+)",
                  "captures": {
                    "1": {
                      "name": "variable.other.id.include-names-item.wit"
                    },
                    "2": {
                      "name": "keyword.control.as.include-names-item.wit"
                    },
                    "3": {
                      "name": "entity.name.type.include-names-item.wit"
                    }
                  }
                },
                {
                  "name": "punctuation.comma.wit",
                  "match": "(\\,)"
                },
                {
                  "include": "#whitespace"
                }
              ],
              "end": "(\\})",
              "applyEndPatternLast": 1,
              "endCaptures": {
                "1": {
                  "name": "punctuation.brackets.curly.end.wit"
                }
              }
            }
          ],
          "end": "(?<=\\n)",
          "applyEndPatternLast": 1
        },
        {
          "include": "#use"
        },
        {
          "include": "#typedef-item"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\})",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.end.wit"
        }
      }
    },
    "interface": {
      "name": "meta.interface-item.wit",
      "comment": "Syntax for WIT like `interface \"id\" {`",
      "begin": "^\\b(default\\s+)?(interface)\\s+%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\{)",
      "beginCaptures": {
        "1": {
          "name": "storage.modifier.default.interface-item.wit"
        },
        "2": {
          "name": "keyword.declaration.interface.interface-item.wit storage.type.wit"
        },
        "3": {
          "name": "entity.name.type.id.interface-item.wit"
        },
        "8": {
          "name": "punctuation.brackets.curly.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#interface-items"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\})",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.end.wit"
        }
      }
    },
    "extern": {
      "name": "meta.extern-type.wit",
      "patterns": [
        {
          "name": "meta.interface-type.wit",
          "patterns": [
            {
              "begin": "\\b(interface)\\b\\s*(\\{)",
              "beginCaptures": {
                "1": {
                  "name": "keyword.other.interface.interface-type.wit"
                },
                "2": {
                  "name": "punctuation.brackets.curly.begin.wit"
                }
              },
              "patterns": [
                {
                  "include": "#comment"
                },
                {
                  "include": "#interface-items"
                },
                {
                  "include": "#whitespace"
                }
              ],
              "end": "(\\})",
              "applyEndPatternLast": 1,
              "endCaptures": {
                "1": {
                  "name": "punctuation.brackets.curly.end.wit"
                }
              }
            }
          ]
        },
        {
          "include": "#function-definition"
        },
        {
          "include": "#use-path"
        }
      ]
    },
    "interface-items": {
      "name": "meta.interface-items.wit",
      "patterns": [
        {
          "include": "#typedef-item"
        },
        {
          "include": "#use"
        },
        {
          "include": "#function"
        }
      ]
    },
    "typedef-item": {
      "name": "meta.typedef-item.wit",
      "patterns": [
        {
          "include": "#resource"
        },
        {
          "include": "#variant"
        },
        {
          "include": "#record"
        },
        {
          "include": "#flags"
        },
        {
          "include": "#enum"
        },
        {
          "include": "#type-definition"
        }
      ]
    },
    "use": {
      "name": "meta.use-item.wit",
      "comment": "Syntax for WIT like `use \"id\".`",
      "begin": "\\b(use)\\b\\s+([^\\s]+)(\\.)(\\{)",
      "beginCaptures": {
        "1": {
          "name": "keyword.other.use.use-item.wit"
        },
        "2": {
          "patterns": [
            {
              "include": "#use-path"
            },
            {
              "include": "#whitespace"
            }
          ]
        },
        "3": {
          "name": "keyword.operator.namespace-separator.use-item.wit"
        },
        "4": {
          "name": "punctuation.brackets.curly.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "entity.name.type.declaration.use-names-item.use-item.wit",
          "match": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b"
        },
        {
          "name": "punctuation.comma.wit",
          "match": "(\\,)"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\})",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.end.wit"
        }
      }
    },
    "use-path": {
      "name": "meta.use-path.wit",
      "patterns": [
        {
          "name": "entity.name.namespace.id.use-path.wit",
          "match": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b"
        },
        {
          "name": "meta.versioning.id.use-path.wit",
          "match": "(\\@)((0|[1-9]\\d*)\\.(0|[1-9]\\d*)\\.(0|[1-9]\\d*)(?:-((?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*)(?:\\.(?:0|[1-9]\\d*|\\d*[a-zA-Z-][0-9a-zA-Z-]*))*))?(?:\\+([0-9a-zA-Z-]+(?:\\.[0-9a-zA-Z-]+)*))?)",
          "captures": {
            "1": {
              "name": "keyword.operator.versioning.id.use-path.wit"
            },
            "2": {
              "name": "constant.numeric.versioning.id.use-path.wit"
            }
          }
        },
        {
          "name": "keyword.operator.namespace-separator.use-path.wit",
          "match": "\\."
        }
      ]
    },
    "type-definition": {
      "name": "meta.type-item.wit",
      "comment": "Syntax for WIT like `type \"id\" =`",
      "begin": "\\b(type)\\b\\s+%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\=)",
      "beginCaptures": {
        "1": {
          "name": "keyword.declaration.type.type-item.wit storage.type.wit"
        },
        "2": {
          "name": "entity.name.type.id.type-item.wit"
        },
        "7": {
          "name": "punctuation.equal.wit"
        }
      },
      "patterns": [
        {
          "name": "meta.types.type-item.wit",
          "include": "#types"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(?<=\\n)",
      "applyEndPatternLast": 1
    },
    "record": {
      "name": "meta.record-item.wit",
      "comment": "Syntax for WIT like `record \"id\" {`",
      "begin": "\\b(record)\\b\\s+%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\{)",
      "beginCaptures": {
        "1": {
          "name": "keyword.declaration.record.record-item.wit"
        },
        "2": {
          "name": "entity.name.type.id.record-item.wit"
        },
        "7": {
          "name": "punctuation.brackets.curly.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#record-fields"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\})",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.end.wit"
        }
      }
    },
    "record-fields": {
      "name": "meta.record-fields.wit",
      "begin": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b\\s*(\\:)",
      "beginCaptures": {
        "1": {
          "name": "variable.declaration.id.record-fields.wit"
        },
        "6": {
          "name": "keyword.operator.key-value.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "meta.types.record-fields.wit",
          "include": "#types"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "((\\,)|(?=\\})|(?=\\n))",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "2": {
          "name": "punctuation.comma.wit"
        }
      }
    },
    "flags": {
      "name": "meta.flags-items.wit",
      "comment": "Syntax for WIT like `flags \"id\" {`",
      "begin": "\\b(flags)\\s+%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\{)",
      "beginCaptures": {
        "1": {
          "name": "keyword.other.flags.flags-items.wit"
        },
        "2": {
          "name": "entity.name.type.id.flags-items.wit"
        },
        "7": {
          "name": "punctuation.brackets.curly.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#flags-fields"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\})",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.end.wit"
        }
      }
    },
    "flags-fields": {
      "name": "meta.flags-fields.wit",
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "variable.other.enummember.id.flags-fields.wit",
          "match": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b"
        },
        {
          "name": "punctuation.comma.wit",
          "match": "(\\,)"
        },
        {
          "include": "#whitespace"
        }
      ]
    },
    "variant": {
      "name": "meta.variant.wit",
      "comment": "Syntax for WIT like `variant \"id\" {`",
      "begin": "\\b(variant)\\s+%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\{)",
      "beginCaptures": {
        "1": {
          "name": "keyword.other.variant.wit"
        },
        "2": {
          "name": "entity.name.type.id.variant.wit"
        },
        "7": {
          "name": "punctuation.brackets.curly.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#variant-cases"
        },
        {
          "include": "#enum-cases"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\})",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.end.wit"
        }
      }
    },
    "variant-cases": {
      "name": "meta.variant-cases.wit",
      "begin": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b\\s*(\\()",
      "beginCaptures": {
        "1": {
          "name": "variable.other.enummember.id.variant-cases.wit"
        },
        "6": {
          "name": "punctuation.brackets.round.begin.wit"
        }
      },
      "patterns": [
        {
          "name": "meta.types.variant-cases.wit",
          "include": "#types"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\))\\s*(\\,)?",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.round.end.wit"
        },
        "2": {
          "name": "punctuation.comma.wit"
        }
      }
    },
    "enum": {
      "name": "meta.enum-items.wit",
      "comment": "Syntax for WIT like `enum \"id\" {`",
      "begin": "\\b(enum)\\b\\s+%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\{)",
      "beginCaptures": {
        "1": {
          "name": "keyword.other.enum.enum-items.wit"
        },
        "2": {
          "name": "entity.name.type.id.enum-items.wit"
        },
        "7": {
          "name": "punctuation.brackets.curly.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#enum-cases"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\})",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.end.wit"
        }
      }
    },
    "enum-cases": {
      "name": "meta.enum-cases.wit",
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "variable.other.enummember.id.enum-cases.wit",
          "match": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b"
        },
        {
          "name": "punctuation.comma.wit",
          "match": "(\\,)"
        },
        {
          "include": "#whitespace"
        }
      ]
    },
    "types": {
      "name": "meta.ty.wit",
      "comment": "Syntax for WIT types corresponding to the interface types specification",
      "patterns": [
        {
          "include": "#primitive"
        },
        {
          "include": "#container"
        },
        {
          "include": "#identifier"
        }
      ]
    },
    "primitive": {
      "name": "meta.primitive.ty.wit",
      "comment": "Syntax for WIT primitives like `'u8' | 'bool' | 'string'` and more",
      "patterns": [
        {
          "include": "#numeric"
        },
        {
          "include": "#boolean"
        },
        {
          "include": "#string"
        }
      ]
    },
    "numeric": {
      "name": "entity.name.type.numeric.wit",
      "comment": "Syntax for numeric types identifiers such as signed and unsigned integers and floating point identifiers",
      "match": "\\b(u8|u16|u32|u64|s8|s16|s32|s64|float32|float64)\\b"
    },
    "boolean": {
      "name": "entity.name.type.boolean.wit",
      "comment": "Syntax for boolean types such as bool",
      "match": "\\b(bool)\\b"
    },
    "string": {
      "name": "entity.name.type.string.wit",
      "comment": "Syntax for primitive types such as string and char",
      "match": "\\b(string|char)\\b"
    },
    "container": {
      "name": "meta.container.ty.wit",
      "comment": "Syntax for WIT containers like `tuple | list | result | handle`",
      "patterns": [
        {
          "include": "#tuple"
        },
        {
          "include": "#list"
        },
        {
          "include": "#option"
        },
        {
          "include": "#result"
        },
        {
          "include": "#handle"
        }
      ]
    },
    "tuple": {
      "name": "meta.tuple.ty.wit",
      "comment": "Syntax for WIT types such as tuple",
      "begin": "\\b(tuple)\\b(\\<)",
      "beginCaptures": {
        "1": {
          "name": "entity.name.type.tuple.wit"
        },
        "2": {
          "name": "punctuation.brackets.angle.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "meta.types.tuple.wit",
          "include": "#types"
        },
        {
          "name": "punctuation.comma.wit",
          "match": "(\\,)"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\>)",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.angle.end.wit"
        }
      }
    },
    "list": {
      "name": "meta.list.ty.wit",
      "comment": "Syntax for WIT types such as list",
      "begin": "\\b(list)\\b(\\<)",
      "beginCaptures": {
        "1": {
          "name": "entity.name.type.list.wit"
        },
        "2": {
          "name": "punctuation.brackets.angle.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "meta.types.list.wit",
          "include": "#types"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\>)",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.angle.end.wit"
        }
      }
    },
    "option": {
      "name": "meta.option.ty.wit",
      "comment": "Syntax for WIT types such as option",
      "begin": "\\b(option)\\b(\\<)",
      "beginCaptures": {
        "1": {
          "name": "entity.name.type.option.wit"
        },
        "2": {
          "name": "punctuation.brackets.angle.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "meta.types.option.wit",
          "include": "#types"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\>)",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.angle.end.wit"
        }
      }
    },
    "result": {
      "name": "meta.result.ty.wit",
      "comment": "Syntax for WIT types such as result",
      "begin": "\\b(result)\\b",
      "beginCaptures": {
        "1": {
          "name": "entity.name.type.result.wit"
        },
        "2": {
          "name": "punctuation.brackets.angle.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "meta.inner.result.wit",
          "begin": "(\\<)",
          "beginCaptures": {
            "1": {
              "name": "punctuation.brackets.angle.begin.wit"
            }
          },
          "patterns": [
            {
              "include": "#comment"
            },
            {
              "name": "variable.other.inferred-type.result.wit",
              "match": "(?<!\\w)(\\_)(?!\\w)"
            },
            {
              "name": "meta.types.result.wit",
              "include": "#types"
            },
            {
              "name": "punctuation.comma.wit",
              "match": "(?<!result)\\s*(\\,)"
            },
            {
              "include": "#whitespace"
            }
          ],
          "end": "(\\>)",
          "applyEndPatternLast": 1,
          "endCaptures": {
            "1": {
              "name": "punctuation.brackets.angle.end.wit"
            }
          } 
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "((?<=\\n)|(?=\\,)|(?=\\}))",
      "applyEndPatternLast": 1
    },
    "handle": {
      "name": "meta.handle.ty.wit",
      "comment": "Syntax for WIT types such as handle",
      "match": "\\b(borrow)\\b(\\<)\\s*%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\>)",
      "captures": {
        "1": {
          "name": "entity.name.type.borrow.handle.wit"
        },
        "2": {
          "name": "punctuation.brackets.angle.begin.wit"
        },
        "3": {
          "name": "entity.name.type.id.handle.wit"
        },
        "8": {
          "name": "punctuation.brackets.angle.end.wit"
        }
      }
    },
    "identifier": {
      "name": "entity.name.type.id.wit",
      "comment": "Syntax for WIT types based on its identifier",
      "match": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b"
    },
    "resource": {
      "name": "meta.resource-item.wit",
      "comment": "Syntax for WIT like `resource \"id\" {`",
      "begin": "\\b(resource)\\b\\s+%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)",
      "beginCaptures": {
        "1": {
          "name": "keyword.other.resource.wit"
        },
        "2": {
          "name": "entity.name.type.id.resource.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#resource-methods"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "((?<=\\n)|(?=\\}))",
      "applyEndPatternLast": 1
    },
    "resource-methods": {
      "name": "meta.resource-methods.wit",
      "begin": "(\\{)",
      "beginCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "name": "meta.constructor-type.wit",
          "begin": "\\b(constructor)\\b",
          "beginCaptures": {
            "1": {
              "name": "keyword.other.constructor.constructor-type.wit"
            },
            "2": {
              "name": "punctuation.brackets.round.begin.wit"
            }
          },
          "patterns": [
            {
              "include": "#comment"
            },
            {
              "include": "#parameter-list"
            },
            {
              "include": "#whitespace"
            }
          ],
          "end": "((?<=\\n)|(?=\\}))",
          "applyEndPatternLast": 1
        },
        {
          "include": "#function"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\})",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.curly.end.wit"
        }
      }
    },
    "function": {
      "name": "meta.func-item.wit",
      "comment": "This is a function item that includes its identifier. This starts with a variable name, succeeded by a `func` keyword and ends with `new line`",
      "begin": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\:)",
      "beginCaptures": {
        "1": {
          "name": "entity.name.function.id.func-item.wit"
        },
        "2": {
          "name": "meta.word.wit"
        },
        "4": {
          "name": "meta.word-separator.wit"
        },
        "5": {
          "name": "meta.word.wit"
        },
        "6": {
          "name": "keyword.operator.key-value.wit"
        }
      },
      "patterns": [
        {
          "include": "#function-definition"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "((?<=\\n)|(?=\\}))",
      "applyEndPatternLast": 1
    },
    "function-definition": {
      "name": "meta.func-type.wit",
      "comment": "This is a function definition. This starts with a `func` keyword and ends with `new line`",
      "patterns": [
        {
          "name": "meta.function.wit",
          "begin": "\\b(static\\s+)?(func)\\b",
          "beginCaptures": {
            "1": {
              "name": "storage.modifier.static.func-item.wit"
            },
            "2": {
              "name": "keyword.other.func.func-type.wit"
            }
          },
          "patterns": [
            {
              "include": "#comment"
            },
            {
              "include": "#parameter-list"
            },
            {
              "include": "#result-list"
            },
            {
              "include": "#whitespace"
            }
          ],
          "end": "((?<=\\n)|(?=\\}))",
          "applyEndPatternLast": 1
        }
      ]
    },
    "parameter-list": {
      "name": "meta.param-list.wit",
      "begin": "(\\()",
      "beginCaptures": {
        "1": {
          "name": "punctuation.brackets.round.begin.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#named-type-list"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "(\\))",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "1": {
          "name": "punctuation.brackets.round.end.wit"
        }
      }
    },
    "result-list": {
      "name": "meta.result-list.wit",
      "begin": "(\\-\\>)",
      "beginCaptures": {
        "1": {
          "name": "keyword.operator.arrow.skinny.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#types"
        },
        {
          "include": "#parameter-list"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "((?<=\\n)|(?=\\}))",
      "applyEndPatternLast": 1
    },
    "named-type-list": {
      "name": "meta.named-type-list.wit",
      "begin": "\\b%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\b\\s*(\\:)",
      "beginCaptures": {
        "1": {
          "name": "variable.parameter.id.named-type.wit"
        },
        "6": {
          "name": "keyword.operator.key-value.wit"
        }
      },
      "patterns": [
        {
          "include": "#comment"
        },
        {
          "include": "#types"
        },
        {
          "include": "#whitespace"
        }
      ],
      "end": "((\\,)|(?=\\))|(?=\\n))",
      "applyEndPatternLast": 1,
      "endCaptures": {
        "2": {
          "name": "punctuation.comma.wit"
        }
      }
    }
  }
}
Language Sample
package acme:[email protected];

interface space-station {
  type astronaut-id = u64;
 
  variant pods {
    none,
    list<u32>,
  }

  flags locations {
    bridge,
    nacelle,
    jefferies-tubes,
  }

  enum error-code {
    access,
    deadlock,
  }
  
  record astronaut {
    id: astronaut-id,
    name: string,
    ship-access: locations,
    manager: option<astronaut>,
    addresses: pods,
  }

  record inventory {
    name: string,
    stock: u32,
    tags: list<string>,
  }

  resource planetary-scanner {
    read-via-stream: func() -> result<input-stream, error-code>;
  }
  
  /// Initiate scan on planet surface
  run-scan: func(in: borrow<planetary-scanner>);
}

interface directory {
  use space-station.{astronaut-id, astronaut};
  
  get-astronaut: func(id: astronaut-id) -> result<astronaut, u32>;
  update-astronaut: func(id: astronaut-id, changes: droid) -> result<astronaut, u32>;
}

world astronauts {
  import wasi:logging;

  export directory.{get-astronaut, update-astronaut};
}

world reporting {
    include astronauts;
    use types.{inventory};

    export get-inventory: func(item: option<string>) -> list<inventory>;
}

The sample is my own but the grammar comes from the vscode-wit extension here: https://github.com/bytecodealliance/vscode-wit/blob/main/syntaxes/wit.tmLanguage.json

They also have a boatload of test wit files in their repo if you need more: https://github.com/bytecodealliance/vscode-wit/tree/main/tests/grammar/integration
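
To see how one rule from the grammar lines up with the sample, the `handle` rule can be tried against the `run-scan` line. This is only a rough sketch: it assumes JavaScript's built-in `RegExp` treats this particular pattern the same way as the Oniguruma engine that TextMate grammars target, and it runs a single rule in isolation rather than the whole grammar. The `scoped` variable is my own name for illustration.

```javascript
// The `match` regex from the grammar's "handle" rule, copied as a JavaScript string.
const handlePattern =
  "\\b(borrow)\\b(\\<)\\s*%?((?<![\\-\\w])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*)(([\\-])([a-z][0-9a-z]*|[A-Z][0-9A-Z]*))*)\\s*(\\>)";

// Scope names keyed by capture group, as listed in the rule's "captures" block.
const handleCaptures = {
  1: "entity.name.type.borrow.handle.wit",
  2: "punctuation.brackets.angle.begin.wit",
  3: "entity.name.type.id.handle.wit",
  8: "punctuation.brackets.angle.end.wit",
};

// A line taken from the language sample.
const line = "  run-scan: func(in: borrow<planetary-scanner>);";
const match = new RegExp(handlePattern).exec(line);
if (!match) {
  throw new Error("handle rule did not match the sample line");
}

// Pair each captured piece of text with the scope the grammar assigns to it.
const scoped = Object.entries(handleCaptures).map(([group, scope]) => ({
  text: match[Number(group)],
  scope,
}));
console.log(scoped);
```

Group 3 covers the whole identifier (`planetary-scanner`), which is why the rule skips from capture 3 straight to capture 8 for the closing bracket.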

lachieh avatar Dec 08 '23 14:12 lachieh

@pomber I am also very interested in custom grammars. The current R grammar is rather incomplete. I have also played around with the theme editor, which is very helpful, but before I continue with a custom theme I would first like to work on a custom grammar. Generally speaking, is there any tool or process you know of for creating a custom grammar? It seems to be a rather manual process of trial and error.

FlippieCoetser avatar Dec 08 '23 14:12 FlippieCoetser

@FlippieCoetser the grammar for R comes from the Shiki source, which is copied from the R VSCode extension source. I'd suggest opening an issue and/or contributing a PR to the REditorSupport/vscode-R repo if you want to improve the R grammar.

As for writing your own custom grammars, I have found the VSCode Syntax Highlight Guide to have a lot of helpful resources. Notably, the scope inspector tool is great for debugging.
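
For anyone starting from scratch, here is a rough sketch of the overall shape of a TextMate-style grammar, written as a JavaScript object so it can be poked at in Node. The "toy" language, its scope names, and the `matchingScopes` helper are all made up for illustration. The helper is only a crude stand-in for the scope inspector: real engines use Oniguruma regexes and a stateful tokenizer, while this just tests each `match` rule with JavaScript's `RegExp`.

```javascript
// Hypothetical minimal grammar for a made-up "toy" language.
const toyGrammar = {
  name: "Toy",
  scopeName: "source.toy",
  // Top-level patterns usually just point into the repository.
  patterns: [{ include: "#comment" }, { include: "#keyword" }],
  repository: {
    comment: { name: "comment.line.number-sign.toy", match: "#.*$" },
    keyword: { name: "keyword.control.toy", match: "\\b(let|fn)\\b" },
  },
};

// List the scope names of every top-level rule whose `match` hits the line.
function matchingScopes(grammar, line) {
  return grammar.patterns
    .map((pattern) => grammar.repository[pattern.include.slice(1)])
    .filter((rule) => new RegExp(rule.match).test(line))
    .map((rule) => rule.name);
}

console.log(matchingScopes(toyGrammar, "let x = 1 # counter"));
```

The same file would normally be saved as JSON (for example via `JSON.stringify(toyGrammar, null, 2)`) before being handed to an editor or highlighter.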

lachieh avatar Dec 08 '23 15:12 lachieh

@lachieh thanks for the help and direction! I will do as suggested.

FlippieCoetser avatar Dec 08 '23 15:12 FlippieCoetser

I managed to get this working using the following steps, which I added to the postinstall script in my package.json:

  1. Copy my custom grammar JSON file into node_modules/@code-hike/lighter/grammars.
  2. Add an entry to LANG_NAMES in my next.config.js: require("@code-hike/lighter").LANG_NAMES.push("polar");
  3. Run the following script, which modifies @code-hike/lighter's source code to reference the new grammar. A couple of internal objects that list all supported languages need to be modified:
const fs = require("fs");

const sourceFile = "./node_modules/@code-hike/lighter/dist/index.cjs.js";
const lighterCode = fs.readFileSync(sourceFile, "utf8");
const lines = lighterCode.split("\n");

// Add polar to `aliasOrIdToScope` entry, which isn't exported
const aliasOrIdToScopeIndex = lines.findIndex(line => line.trim() === 'const aliasOrIdToScope = {');
if (aliasOrIdToScopeIndex === -1) {
  throw new Error("Could not find the aliasOrIdToScope declaration in " + sourceFile);
}
lines.splice(aliasOrIdToScopeIndex + 1, 0, "'polar': 'source.polar',");

// Add polar to `scopeToLanguageData`, which isn't exported
const scopeToLanguageDataIndex = lines.findIndex(line => line.trim() === 'const scopeToLanguageData = {');
if (scopeToLanguageDataIndex === -1) {
  throw new Error("Could not find the scopeToLanguageData declaration in " + sourceFile);
}
lines.splice(scopeToLanguageDataIndex + 1, 0, "'source.polar':{id:'polar',path:'polar.tmLanguage.json',embeddedScopes: []},");

fs.writeFileSync(sourceFile, lines.join("\n"));
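
One caveat with running this from postinstall: if the script runs again without node_modules being reinstalled, it will insert the same entries a second time. A possible way around that, sketched below, is to skip the insert when the entry is already there. `insertAfterMarker` is a hypothetical helper of my own, and the `sample` string is just a stand-in for the real file contents, not lighter's actual source.

```javascript
// Insert `entry` on the line after the first line equal to `marker`
// (ignoring surrounding whitespace), unless `entry` is already present.
function insertAfterMarker(source, marker, entry) {
  const lines = source.split("\n");
  if (lines.some((line) => line.trim() === entry)) {
    return source; // already patched, leave the text untouched
  }
  const index = lines.findIndex((line) => line.trim() === marker);
  if (index === -1) {
    throw new Error(`Marker not found: ${marker}`);
  }
  lines.splice(index + 1, 0, entry);
  return lines.join("\n");
}

// Stand-in for the file contents; the 'abap' entry is illustrative only.
const sample = "const aliasOrIdToScope = {\n  'abap': 'source.abap',\n};";
const marker = "const aliasOrIdToScope = {";
const entry = "'polar': 'source.polar',";

const once = insertAfterMarker(sample, marker, entry);
const twice = insertAfterMarker(once, marker, entry);
console.log(once === twice); // the second run changes nothing
```

Because the helper works on a string, it could wrap both splice calls in the script above: read the file once, apply the helper twice with the two markers, and write the result back.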

vkarpov15 avatar Jul 11 '24 14:07 vkarpov15