EditorSyntax Long line length string breaks highlighting in Atom and Github


Environment

macOS

Atom 1.34.0 (https://github.com/jrsconfitto/language-powershell)
VS Code 1.30.2

Issue Description

The included snippet of code highlights correctly in VS Code. It is not highlighted at all on GitHub and is highlighted incorrectly in Atom, where the string highlighting does not terminate at the end of the string.

Screenshots

Expected Behavior

The snippet is highlighted correctly on GitHub. In Atom, the code that follows the string, ),[IO.Compression.CompressionMode]::Decompress)),[Text.Encoding]::ASCII)).ReadToEnd(), is highlighted as code rather than as part of the string.

Code Samples

sal a New-Object;iex(a IO.StreamReader((a IO.Compression.DeflateStream([IO.MemoryStream][Convert]::FromBase64String('aGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8uIGhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8uICBoZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvLiAgIGhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8uICBoZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCgpoZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsbw=='),[IO.Compression.CompressionMode]::Decompress)),[Text.Encoding]::ASCII)).ReadToEnd()

Jan 23 '19 21:01 wesinator

Linguist (used for GitHub's highlighting) is at the current revision of this project, and so is VS Code, so that's weird.

I'll figure out what's going on with Atom, but the GitHub issue may be specific to Linguist.

Jan 23 '19 21:01 omniomi

While VS Code shares Atom's TextMate regex engine, I'd be willing to bet (but do not actually know) that Atom and Linguist share some common TextMate code, or at least share the same problem handling TextMate grammars (maybe a line-length buffer issue?), since they are both GitHub projects. VS Code limits how long a line it will tokenize, for performance reasons, but I believe that limit is something like 20,000 characters.

@wesinator, does it help to break up the string (since line breaks have no effect on FromBase64String anyway)?
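As a minimal sketch of why breaking up the string should be harmless: .NET's Convert.FromBase64String is documented to ignore white-space characters, so a wrapped payload should decode to the same bytes as the one-line form. The short payload and the variable names below are illustrative, not taken from the original snippet.

```powershell
# Illustrative payload: Base64 for the ASCII text "hellohello".
$oneLine = 'aGVsbG9oZWxsbw=='

# The same payload wrapped onto two lines. A single-quoted PowerShell
# string may contain a literal line break, which becomes part of the string.
$wrapped = 'aGVsbG9o
ZWxsbw=='

# Decode both forms; FromBase64String skips the embedded white space.
$a = [Text.Encoding]::ASCII.GetString([Convert]::FromBase64String($oneLine))
$b = [Text.Encoding]::ASCII.GetString([Convert]::FromBase64String($wrapped))

# Compare the decoded text of the two forms.
$a -eq $b
```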

Jan 24 '19 00:01 msftrncs

I think I answered my own question … for GitHub it appears to be a line-length issue: once a line is 1025 characters or longer, GitHub ignores it. (Note that I resolved the aliases.)

Invoke-Expression ([IO.StreamReader]::new([IO.Compression.DeflateStream]::new([IO.MemoryStream][Convert]::FromBase64String(
'aGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaG
VsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsb
G9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9
oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZ
Wxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWx
sb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2
hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlb
GxvaGVsbG8uIGhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8uICBoZWxsb2hlbGxva
GVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvLiAgIGhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxs
b2hlbGxvaGVsbG8uICBoZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCmhlbGxvaGVs
bG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZ
Wxsb2hlbGxvaGVsbG8KCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KaGVsbG9
oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZ
Wxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCgpoZWxsb2hl
bGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGx
vaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaG
VsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlb
GxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxv
aGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsbw=='
),[IO.Compression.CompressionMode]::Decompress),[Text.Encoding]::ASCII)).ReadToEnd()
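To check whether a script has lines past that threshold, something like the following could be used. This is only a sketch: the function name Find-LongLine is hypothetical, and the default limit of 1024 is the value observed in this thread, not a documented setting.

```powershell
# Hypothetical helper: report lines longer than a given limit.
function Find-LongLine {
    param(
        [string[]] $Line,      # the lines to inspect
        [int] $Limit = 1024    # assumption: threshold observed in this thread
    )
    for ($i = 0; $i -lt $Line.Count; $i++) {
        if ($Line[$i].Length -gt $Limit) {
            # Emit the 1-based line number and the line's length.
            [pscustomobject]@{ Number = $i + 1; Length = $Line[$i].Length }
        }
    }
}

# Three sample lines: short, exactly at the limit, and one character over.
$sample = @('short line', ('x' * 1024), ('x' * 1025))
$found = @(Find-LongLine -Line $sample)
$found
```

For a real file, the lines could be supplied with Get-Content, for example Find-LongLine -Line (Get-Content -LiteralPath .\script.ps1), where the path is a placeholder.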

Jan 24 '19 00:01 msftrncs

It seems like a similar issue in Atom: your example works until one of the lines within the string gets longer.

Jan 24 '19 04:01 wesinator

May be related to https://github.com/atom/atom/issues/1667

Jan 27 '19 21:01 wesinator

Is this related to the atom-grammer-token-length plugin? Does anybody know the fix for this plugin?

May 13 '19 23:05 DA6IjY6jgHT8Z

@DA6IjY6jgHT8Z, I don't think this limit can be exceeded. Atom and its Linguist engine have a line-length limitation, and that limitation appears to be strict at 1024 characters. I can see how that might be of some use for a web application, but for a code editor with word wrapping, any such limit is pointless in practice.

The reason for limits on processing a line, and for TextMate grammars processing only one line at a time, is that the regular expressions used to break the language down into its parts could be poorly constructed and thus run very slowly, possibly consuming tens of thousands of characters and hours of CPU time while incorrectly trying to break down a single keyword or construct.

If I had a say in the matter, I would use a processing-step constraint: if a single regular expression exceeds a certain number of processing steps, that match is aborted. This would need to be integrated directly into the regex engine, since only it knows how many steps it has consumed. There would then be no need to restrict processing to one line at a time, or to impose a line-length limit. I would also let the grammar hint at an appropriate limit when the need arises, while the application would still enforce an upper bound, since the app better knows its external constraints (such as CPU resources).

I personally feel that syntax highlighting is too important to have limits applied arbitrarily, and programming languages are too diverse for any single line-length limit to be reasonable. It is sad to see that Atom's accepted solution to its self-imposed limit is to use a different grammar engine.
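A rough illustration of the idea, not of any editor's actual implementation: .NET's Regex class accepts a match timeout, which bounds elapsed time rather than counting steps, but is likewise enforced inside the engine. The pattern, input, and 200 ms budget below are made up for the example.

```powershell
# A deliberately poorly constructed pattern: nested quantifiers can
# backtrack heavily on input that almost matches.
$pattern = '^(a+)+$'
$text = ('a' * 40) + '!'

# Assumption for illustration: a 200 ms budget per match attempt.
$budget = [TimeSpan]::FromMilliseconds(200)
$regex = [regex]::new($pattern, [Text.RegularExpressions.RegexOptions]::None, $budget)

try {
    $matched = $regex.IsMatch($text)
    $result = "finished within budget (matched: $matched)"
}
catch [System.Text.RegularExpressions.RegexMatchTimeoutException] {
    # The engine gave up on this one match; a tokenizer could skip the
    # offending rule and carry on with the rest of the line.
    $result = 'aborted after exceeding the budget'
}
$result
```

A time budget is a stand-in here because it is what the .NET engine exposes; a true step counter, as described above, would make the cutoff independent of machine speed.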

May 14 '19 03:05 msftrncs

OK, thanks for clearing that up. I guess I'll just switch back to UltraEdit, which has been doing this just fine for years. I am using this for web design, and I need long-paragraph support for content. I'm new to open source (and to programming in general) but was hoping there would be more flexibility. I may come back to this later when I have more training.

May 15 '19 17:05 DA6IjY6jgHT8Z
