antlr4cs Do not use string concatenation on _serializedATN in generated lexer file

Open PatrickHofman opened this issue 3 years ago • 8 comments

For a very large token file, the generated lexer file builds the _serializedATN string from a long chain of string concatenations. This causes Visual Studio and MSBuild to crash with a StackOverflowException (reported, and to be fixed, here: https://github.com/dotnet/runtime/issues/76953).

Is it possible to use one long string instead of a long list of string concatenations? Any other options or ideas?

Repro Visual Studio solution containing the generated token file can be found here: https://github.com/dotnet/runtime/issues/76953#issuecomment-1276662286.
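As a rough illustration of why the shape of the expression matters, here is a minimal sketch in Python, not C#. It assumes the crash comes from a compiler recursing through a deeply nested expression tree, which this thread does not confirm. The `chunk0`, `chunk1`, ... pieces are hypothetical stand-ins for the real serialized ATN segments, and Python's `ast` module only stands in for a parser that builds one binary node per `+`.

```python
import ast


def concat_depth(expr_source: str) -> int:
    """Count the BinOp nodes nested along the left spine of an expression."""
    node = ast.parse(expr_source, mode="eval").body
    depth = 0
    # Walk iteratively so the measurement itself cannot overflow the stack.
    while isinstance(node, ast.BinOp):
        depth += 1
        node = node.left
    return depth


# Hypothetical stand-ins for the segments of a serialized ATN string.
segments = [f"chunk{i}" for i in range(50)]

# Form 1: many literals joined with "+", like the generated lexer file.
chained = " + ".join(f'"{s}"' for s in segments)

# Form 2: the same content as one long literal.
single = '"' + "".join(segments) + '"'

print(concat_depth(chained))  # 49: one nested node per "+"
print(concat_depth(single))   # 0: a single constant, no nesting
```

Both forms evaluate to the same string, but the chained form nests one level deeper for every additional segment, so any recursive pass over it needs stack space proportional to the number of segments.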

Oct 13 '22 08:10 PatrickHofman

This issue no longer applies as of ANTLR 4.10: the C# runtime now uses an int[] instead of one big string. Note also that this repository and its target are outdated.

Is it possible to use one long string instead of a long list of string concatenations?

It's not friendly to text editors, since the generated string is very long.

Oct 13 '22 18:10 KvanTTT

It's not friendly to text editors, since the generated string is very long.

It's a binary encoding in string form. I'm not sure it's meant to be friendly :)

Oct 13 '22 18:10 CyrusNajmabadi

But if you turn on "word wrap", the file will look weird.

Oct 13 '22 18:10 KvanTTT

But if you turn on "word wrap", the file will look weird.

how so?

Oct 13 '22 19:10 CyrusNajmabadi

@KvanTTT I only see 4.6.6 on NuGet. Is there another way to get 4.10+?

Oct 17 '22 08:10 PatrickHofman

how so?

Like the following. Too many different line heights:

[Image: Screenshot from 2022-10-18 15-05-47]

@KvanTTT I only see 4.6.6 on NuGet. Is there another way to get 4.10+?

Unfortunately not. This is a fork of the official repository. I can only advise you to either use the official JavaScript target or wait for official TypeScript support. You can also try asking @sharwell to update the fork.

Oct 18 '22 13:10 KvanTTT

Like the following. Too many different line heights:

I don't see any issue with that.

Oct 18 '22 14:10 CyrusNajmabadi

You should migrate to the new packages and newer ANTLR versions, as described in #381.

Sep 09 '23 19:09 lextm
