EditorSyntax
Long line length string breaks highlighting in Atom and GitHub
Environment
macOS
- Atom 1.34.0 (https://github.com/jrsconfitto/language-powershell)
- VS Code 1.30.2
Issue Description
The included snippet of code highlights correctly in VS Code. It is not highlighted at all on GitHub, and it is highlighted incorrectly in Atom: the string highlighting does not terminate at the end of the string.
Screenshots
Expected Behavior
Highlighted correctly on GitHub.
),[IO.Compression.CompressionMode]::Decompress)),[Text.Encoding]::ASCII)).ReadToEnd() highlighted as code in Atom.
Code Samples
sal a New-Object;iex(a IO.StreamReader((a IO.Compression.DeflateStream([IO.MemoryStream][Convert]::FromBase64String('aGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8uIGhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8uICBoZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvLiAgIGhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8uICBoZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCgpoZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsbw=='),[IO.Compression.CompressionMode]::Decompress)),[Text.Encoding]::ASCII)).ReadToEnd()
Linguist (used for GitHub's highlighting) is at the current revision of this project, and so is VS Code, so that's weird.
I'll figure out what's going on with Atom, but the GitHub issue may be related to Linguist specifically.
While VS Code shares Atom's TextMate regex engine, I'd be willing to bet (but do not actually know) that Atom and Linguist share some common TextMate portions, or at least share the same issue with handling TextMate grammars (maybe a line length buffer issue?), since they are both GitHub projects. VS Code also has a limit on how long a line it will scope, for performance reasons, but it's around 20,000 characters.
@wesinator, does it help to break up the string (since line breaks have no effect on FromBase64String anyway)?
I think I answered my own question: for GitHub it appears to be a line length issue. Once a line is 1025 characters or longer, GitHub ignores it. (Note that I resolved the aliases in the example.)
Invoke-Expression ([IO.StreamReader]::new([IO.Compression.DeflateStream]::new([IO.MemoryStream][Convert]::FromBase64String(
'aGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaG
VsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsb
G9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9
oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZ
Wxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWx
sb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2
hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlb
GxvaGVsbG8uIGhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8uICBoZWxsb2hlbGxva
GVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvLiAgIGhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxs
b2hlbGxvaGVsbG8uICBoZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCmhlbGxvaGVs
bG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZ
Wxsb2hlbGxvaGVsbG8KCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG8KaGVsbG9
oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZ
Wxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCgpoZWxsb2hl
bGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGx
vaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaG
VsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvCmhlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlb
GxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsb2hlbGxv
aGVsbG9oZWxsb2hlbGxvaGVsbG9oZWxsbw=='
),[IO.Compression.CompressionMode]::Decompress),[Text.Encoding]::ASCII)).ReadToEnd()
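For anyone who wants to check their own scripts, here is a rough sketch of the workaround above. `Get-LongLine` and `Split-Base64` are hypothetical helper names made up for this example, and the 1024 default is only the limit observed in this thread, not a documented constant. It relies on `[Convert]::FromBase64String` skipping white space, so the wrapped string decodes to the same bytes.

```powershell
# Hypothetical helper: report lines longer than a given limit.
function Get-LongLine {
    param(
        [string[]]$Line,
        [int]$MaxLength = 1024   # assumption: limit observed in this thread
    )
    for ($i = 0; $i -lt $Line.Count; $i++) {
        if ($Line[$i].Length -gt $MaxLength) {
            [pscustomobject]@{ LineNumber = $i + 1; Length = $Line[$i].Length }
        }
    }
}

# Hypothetical helper: split a long Base64 string into fixed-width chunks.
function Split-Base64 {
    param(
        [string]$Text,
        [int]$Width = 80
    )
    for ($i = 0; $i -lt $Text.Length; $i += $Width) {
        $Text.Substring($i, [Math]::Min($Width, $Text.Length - $i))
    }
}

# Build a sample payload that is well over the limit when kept on one line.
$b64 = [Convert]::ToBase64String([Text.Encoding]::ASCII.GetBytes('hello' * 400))

# Wrap it across several lines, as in the multi-line literal above.
$wrapped = (Split-Base64 -Text $b64 -Width 80) -join "`n"

# FromBase64String skips white space, so the decoded bytes are unchanged.
$same = [Convert]::ToBase64String([Convert]::FromBase64String($wrapped)) -eq $b64

# The single-line form is flagged; the wrapped form is not.
$before = @(Get-LongLine -Line $b64)
$after  = @(Get-LongLine -Line ($wrapped -split "`n"))
```

To scan a real file, something like `Get-LongLine -Line (Get-Content .\script.ps1)` should work, with the path being whatever script you want to check.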
Seems like a similar issue in Atom. Using your example, it works until you create a longer line inside the string.
May be related to https://github.com/atom/atom/issues/1667
Is this related to the atom-grammer-token-length plugin? Does anybody know the fix for this plugin?
@DA6IjY6jgHT8Z, I don't think this limit can be exceeded. Atom and its Linguist engine have a line length limitation, and that limitation appears to be strict at 1024 characters. I can see how that might be of some use for a web application, but in a code editor that word-wraps, any such limit is pointless in practice.
The reason for limits on the processing of a line, and for the fact that TextMate grammars only process one line at a time, is that the regular expressions used to break the language down into its parts could be poorly constructed and thus run very slowly, possibly consuming tens of thousands of characters and hours of CPU time while incorrectly trying to break down one keyword or construct.
If I had a say in the matter, I would use a processing-step constraint: if a single regular expression sequence exceeds a certain number of processing steps, that step is aborted. This would need to be integrated directly into the regex engine, since only the engine knows how many steps it has consumed. I would then have no restriction to processing only one line at a time, nor a line length limit. I would also allow the grammar to hint at an appropriate limit when the need arises, while the application would still enforce an upper bound, since the application knows the external constraints (CPU resources) better.
I personally feel that syntax highlighting is too important to have limits applied arbitrarily, and programming languages are too diverse for any one line length limit to be reasonable. It is sad to see that Atom's accepted solution to its self-imposed limit is to use a different grammar engine.
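As a rough illustration of the idea, and not of how Atom, Linguist, or VS Code actually work: the .NET regex engine has no step counter that I know of, but it does accept a per-match timeout, which is a time budget standing in for the step budget described above. The pattern, the 200 ms budget, and the variable names below are made up for this sketch.

```powershell
# Assumption: a time budget approximates the step budget described above.
$pattern = '^(a+)+$'                          # pattern prone to heavy backtracking
$timeout = [timespan]::FromMilliseconds(200)  # hypothetical per-match budget
$regex   = [regex]::new(
    $pattern,
    [System.Text.RegularExpressions.RegexOptions]::None,
    $timeout
)

# Input that cannot match, so a backtracking engine may try many combinations.
$line = ('a' * 40) + '!'

$aborted = $false
try {
    $matched = $regex.IsMatch($line)
}
catch [System.Text.RegularExpressions.RegexMatchTimeoutException] {
    # Budget exceeded: give up on this rule instead of stalling the highlighter.
    $matched = $false
    $aborted = $true
}
```

A highlighter built this way could skip only the rule that ran out of budget rather than refusing the whole line, and the budget could be raised by the grammar up to a cap set by the application.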
Ok, thanks for clearing that up, I guess I'll just switch back to UltraEdit which has been doing this just fine for years. I am using this for web design and I need long paragraph support for content. I'm new to open source (and to programming in general) but was hoping there would be more flexibility. I may come back to this later when I have more training.