Evan Jones

46 comments by Evan Jones

Thanks all! Just added support for grammar files (with examples) and updated the grammar syntax to add shell-style comments and allow empty lines between rules, as well as newlines inside...

> There is also https://github.com/1rgs/jsonformer where the input is a json schema

Was planning to tackle this next. I've got it more or less working locally in a branch off...

> * Yes, this can become part of a `llama.cpp` or `ggml` sampling API, but I guess for now we can keep it as example and see what are the...

@SlyEcho

> The parser doesn't understand UTF-8 so it will create rules that don't match as the user expects.

Yeah, my rough plan for Unicode support was to store UTF-16...

> Is there some performance benefit for it?
> Would it not be easier to parse into and use some kind of graph data structures using C/C++ structs?

Yes, I...

Update on this: working on refactoring this to store the parsed grammar as structs. Also trying to think through the Unicode handling a bit more.

@SlyEcho updated to add a bit more structure to the in-memory grammar. Kept it somewhat flat still (each rule is an array of atomic elements) for memory locality and simplicity...
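
To illustrate what "each rule is an array of atomic elements" could look like, here is a minimal Python sketch. The names `ElementKind`, `Element`, and `split_alternatives` are hypothetical and chosen only for illustration; they are not taken from the actual implementation, which may encode rules differently.

```python
from enum import Enum
from typing import List, NamedTuple


class ElementKind(Enum):
    """Hypothetical element kinds for a flat rule encoding."""
    END = 0       # marks the end of a rule
    ALT = 1       # separates two alternatives of the same rule
    RULE_REF = 2  # refers to another rule by index
    CHAR = 3      # matches one code point


class Element(NamedTuple):
    kind: ElementKind
    value: int = 0  # code point for CHAR, rule index for RULE_REF


def split_alternatives(rule: List[Element]) -> List[List[Element]]:
    """Split one flat rule into its alternatives, stopping at END."""
    alternatives: List[List[Element]] = [[]]
    for element in rule:
        if element.kind is ElementKind.END:
            break
        if element.kind is ElementKind.ALT:
            alternatives.append([])
        else:
            alternatives[-1].append(element)
    return alternatives


# Toy rule: root ::= "a" item | "b", where `item` is assumed to be rule index 1.
root = [
    Element(ElementKind.CHAR, ord("a")),
    Element(ElementKind.RULE_REF, 1),
    Element(ElementKind.ALT),
    Element(ElementKind.CHAR, ord("b")),
    Element(ElementKind.END),
]
```

Because a rule is one contiguous list rather than a graph of linked nodes, walking it is a simple linear scan, which is the locality and simplicity trade-off the comment alludes to.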

@mattpulver I reproduced the segfault and it appears the problem in this case is [left-recursive rules](https://en.wikipedia.org/wiki/Left_recursion) like `query-expression` -> `non-join-query-expression` -> `query-expression` or `query-term` -> `non-join-query-term` -> `query-term`. Since the...
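
As a rough illustration of the left-recursion problem, the sketch below flags rules that can reach themselves through the first symbol of an alternative. The grammar encoding (a dict of rule names to lists of alternatives), the function name `left_recursive_rules`, and the toy grammar are all assumptions made for this example; the toy grammar only mimics the rule names mentioned above and is not the actual SQL grammar. The check also ignores rules that can match the empty string, which a complete detector would need to handle.

```python
from typing import Dict, List

# Hypothetical encoding: rule name -> alternatives, each a list of symbols.
# A symbol that is a key of the dict is a rule reference; anything else is a terminal.
Grammar = Dict[str, List[List[str]]]


def left_recursive_rules(grammar: Grammar) -> List[str]:
    """Return the rules that can reach themselves via leftmost symbols."""
    # For each rule, the rules that can appear first in one of its alternatives.
    first_edges = {
        name: {alt[0] for alt in alts if alt and alt[0] in grammar}
        for name, alts in grammar.items()
    }
    found = []
    for start in grammar:
        seen = set()
        stack = list(first_edges[start])
        while stack:
            rule = stack.pop()
            if rule == start:
                found.append(start)
                break
            if rule not in seen:
                seen.add(rule)
                stack.extend(first_edges[rule])
    return found


# Toy grammar for illustration only.
toy_grammar: Grammar = {
    "query-expression": [["non-join-query-expression"]],
    "non-join-query-expression": [
        ["query-expression", "UNION", "query-term"],
        ["query-term"],
    ],
    "query-term": [["non-join-query-term"]],
    "non-join-query-term": [
        ["query-term", "INTERSECT", "query-primary"],
        ["query-primary"],
    ],
    "query-primary": [["SELECT"]],
}

print(left_recursive_rules(toy_grammar))
```

A naive top-down matcher that expands the first symbol of such a rule re-enters the same rule without consuming any input, so the expansion never terminates; unbounded recursion of that kind is one plausible route to a crash like the one reported.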

@tucnak at this point the grammars are defined over code points, with no specific handling or recognition of grapheme clusters. A particular grammar could recognize grapheme clusters (and in effect...
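
To make the code point versus grapheme cluster distinction concrete, here is a small Python sketch using only the standard library. The helper `code_points` is a hypothetical name introduced for this example. It shows the same user-perceived character written as either one or two code points, which is why a grammar rule that matches a single code point may cover only part of what a reader sees as one character.

```python
import unicodedata
from typing import List


def code_points(text: str) -> List[str]:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The same visible character "é", written two ways.
decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single precomposed code point

print(code_points(decomposed))  # two code points, one grapheme cluster
print(code_points(composed))    # one code point, one grapheme cluster
```

A grammar defined over code points treats these two spellings as different inputs unless its rules list both forms explicitly or the text is normalized beforehand.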