FsLexYacc

Not an issue more like an optimization

Open kam1986 opened this issue 3 years ago • 2 comments

In the code generated by the parser generator there is a function "tagOfToken". It could be deleted and replaced with GetHashCode(), which, as far as I can see, for all "empty" discriminated union cases without attribute values just gives back the position (byte/integer) encoding, just like an enum.

Haven't tested it, but I'm quite sure that a reference to some data is in general more efficient than a jump table/branching.

kam1986 avatar Apr 09 '21 22:04 kam1986

Hello, GetHashCode() does not exactly do what we want. Take the following type as an example:

type MyToken =
    | STRING of string
    | INT of int
    | NULL

We want the tag of STRING "foo" to be zero, just like the tag of STRING "bar"; in other words, we only care about the kind of the token, not its content. GetHashCode() does not ignore the content. Additionally, there is no guarantee that it would return zero for STRINGs, one for INTs and two for NULLs. Nor is there a guarantee that there would be no collisions. That's why we need a specialized function to get the tag of a token.
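To make the difference concrete, here is a minimal sketch using the MyToken type from above. The tagOfToken shown here is hand-written for illustration and only assumed to resemble what the generator emits; it is not copied from generated code.

```fsharp
type MyToken =
    | STRING of string
    | INT of int
    | NULL

// Illustrative tag function (assumption: similar in spirit to the generated one).
// It inspects only the union case and ignores any payload.
let tagOfToken (t: MyToken) : int =
    match t with
    | STRING _ -> 0
    | INT _ -> 1
    | NULL -> 2

// Two STRING tokens with different contents share the same tag.
let sameTag = tagOfToken (STRING "foo") = tagOfToken (STRING "bar")

// GetHashCode() is structural, so the payload feeds into the result.
// The two values below will typically differ, and nothing ties them to 0.
let hashFoo = (STRING "foo").GetHashCode()
let hashBar = (STRING "bar").GetHashCode()

printfn "same tag: %b, hashes: %d vs %d" sameTag hashFoo hashBar
```

The tag is stable across payloads, while the hash values depend on the contents and on the runtime's hashing, so they cannot serve as indices into the parser tables.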

What we can do, however, is use reflection to implement tagOfToken more efficiently, specifically the functions of the FSharp.Reflection.FSharpValue class. If you want to try it, feel free to submit a PR, or I will do it myself.
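A rough sketch of what that could look like, assuming FSharpValue.PreComputeUnionTagReader is the function to use (the thread does not say which one). The tagReader name is made up for this example, and whether this beats the generated match would need to be measured.

```fsharp
open FSharp.Reflection

type MyToken =
    | STRING of string
    | INT of int
    | NULL

// Build the reader once up front; reflection runs here, not on every token.
// tagReader is a hypothetical name, not part of FsLexYacc.
let tagReader : obj -> int =
    FSharpValue.PreComputeUnionTagReader(typeof<MyToken>)

// Tags follow the declaration order of the union cases:
// STRING = 0, INT = 1, NULL = 2.
let tagOfToken (t: MyToken) : int = tagReader (box t)

printfn "%d %d %d" (tagOfToken (STRING "foo")) (tagOfToken (INT 7)) (tagOfToken NULL)
```

Note that the token has to be boxed on each call, which is one reason the cost is worth benchmarking before replacing the generated function.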

teo-tsirpanis avatar Apr 10 '21 10:04 teo-tsirpanis

I see now. I did not take into account that they could carry data. I forgot about %token token(s).

kam1986 avatar Apr 10 '21 21:04 kam1986