Raw string format seems to not have grammar support
When I used r'...' in my code, the colours were displayed incorrectly, as if the grammar didn't recognise the raw string format.
It's ok here. With or without the 'r' the string is highlighted in the same way.
Sorry, I left out a key piece of information: it's the escape sequences that are highlighted incorrectly. In a raw string, escape sequences should be displayed like the rest of the string, but they're not.
Thanks
On Thu, May 15, 2014 at 1:26 PM, Marco Rougeth [email protected] wrote:
It's ok here. With or without the 'r' the string is highlighted in the same way.
Reply to this email directly or view it on GitHub: https://github.com/atom/language-python/issues/24#issuecomment-43165014
Could you show an example?
Sure, like below.
Sorry, the image might not have attached correctly... here it is.
I cannot see it.
I can't see the image either, @lichaoir would you mind trying to attach it again to a comment in this issue? Thanks
Hi Kevin & Marco,
Don't know why you guys cannot see the attached image. Here is how to reproduce it: type r'\nabc' in Atom with the default theme. You will see that the escape sequence \n is darkened while abc is displayed as a plain string. Since raw string content is treated as-is in Python, the \n sequence should also be displayed as plain string content, i.e. not darkened. Does this make sense?
Thanks
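For anyone reproducing this, here is a small sketch of what Python itself does with the r'\nabc' literal compared with its non-raw counterpart. It is plain Python 3 and says nothing about Atom's grammar; the variable names are only for illustration.

```python
# Compare a raw literal with the same characters in a regular literal.
raw = r'\nabc'      # backslash, 'n', 'a', 'b', 'c'
cooked = '\nabc'    # newline, 'a', 'b', 'c'

# The raw string keeps the backslash, so it is one character longer.
print(len(raw), len(cooked))
print(repr(raw), repr(cooked))
```

So in the raw string there is no escape sequence for a highlighter to mark, at least as far as the Python parser is concerned.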
Just so that everyone is clear, I think this is what he is talking about

The first line has \n in purple, which instead should be yellow. In a raw string in python, special/escape characters should be treated as any other character i.e., in yellow.
I think I agree with @lichaoir
As the \n is a special character, I believe it is better to highlight it inside the string.
@rougeth In this case \n is not special!
From the docs.
>>> # Unless an 'r' or 'R' prefix is present, escape sequences in strings are interpreted
>>> r'\n' == '\\n' and r'\\' == '\\\\'
True
>>> # string quotes can be escaped with a backslash, but the backslash remains in the string
>>> r"\'" == r'\'' == "\\'"
True
As far as I can tell, the package assumes all raw strings are regexes (other regex syntax like []+* etc inside raw strings gets highlighted).
@lichaoir, @warunsl and @ThinkChaos are absolutely correct. The whole point of a raw string is to ignore the "specialness" of escape sequences. For example, I often use raw strings to set Windows-style paths (e.g. r'C:\Users\nmpeterson'), and I do not want the \n within that to be incorrectly highlighted as a "special" character.
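A minimal sketch of the Windows path case, in plain Python 3 with illustrative variable names. The non-raw version is written with doubled backslashes because a bare \U in a regular Python 3 string literal is a syntax error.

```python
# The same Windows path written as a raw string and as an escaped regular string.
raw_path = r'C:\Users\nmpeterson'
escaped_path = 'C:\\Users\\nmpeterson'

# Both literals produce the same string.
print(raw_path == escaped_path)

# The "\n" in the raw literal is a backslash followed by 'n', not a newline.
print('\n' in raw_path)
print(raw_path.split('\\'))
```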
Yes, this is a feature of the language grammar. It's trying to be helpful by highlighting raw strings as if they were regexes, since that is typically what they are used for.
@aroben That is a good feature for "regular" strings (i.e. u"..." & "...") but it is not appropriate to highlight \n in a raw byte string as a newline.
Note that these two are equivalent single-byte[1] strings:
> ord("""
""")
10
> ord("\n")
10
However, this is clearly NOT the same thing:
> ord(r"\n")
TypeError: ord() expected a character, but string of length 2 found
We can see that it is actually a 2-byte[1] string:
> [ord(i) for i in r'\n']
[92, 110]
So I think this makes it pretty clear that highlighting character sequences in raw strings which would normally be considered "escape sequences" is an error that should be corrected.
[1] They're actually more bytes than this (sys.getsizeof(r'\n') vs. sys.getsizeof('\n')), but you get my point.
Also, knowing nothing about the grammar mechanisms in Atom (though I'm trying to learn), what's this:
https://github.com/atom/language-python/blob/master/grammars/python.cson#L869-L870
Can this be exposed as a user preference?
@mattdeboard You're right that r"\n" is a two-byte (and two-character) string. But I do think it is useful for the grammar to highlight \n specially in this case. In regular expressions, the two-character sequence \n means "a newline character", just like \* means "an asterisk character", etc. Since raw strings are so commonly used for defining regular expressions, the grammar highlights them as such.
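To make the regex point concrete, here is a minimal sketch using only the standard re module. The sample strings and variable names are made up for illustration, and this shows how the re module reads the pattern, not how the Atom grammar is implemented.

```python
import re

# The raw string holds two characters: a backslash and 'n'.
pattern = r"\n"
print(len(pattern))

# The regex engine reads that two-character sequence as "match a newline".
newline_match = re.search(pattern, "a\nb")
print(newline_match is not None)

# It does not match a literal backslash followed by 'n' in the text.
literal_match = re.search(pattern, r"a\nb")
print(literal_match is None)

# Likewise, \* in a pattern means a literal asterisk.
star_match = re.search(r"\*", "a*b")
print(star_match.group())
```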
Ultimately it doesn't even matter, since the broken indentation of this language mode makes Atom unusable for Python. This renders any debate about syntax highlighting moot.
Doesn't it make more sense to have this package highlight according to the grammar of Python instead of guessing what people are using raw strings for? I would vote for changing the syntax highlighting in accordance with this bug report and breaking out the regex highlighting into a separate package which can override the highlighting for language-python if the user so desires.
@mattdeboard Have you seen https://atom.io/packages/python-indent?
I see that the reasoning behind this broken highlight is the belief that raw strings are only (or mainly) used for regular expressions.
I'd like to offer evidence to the contrary. The codebase I'm currently working on has lots of strings of the form r"c:\Program Files\Some Program\Bin\Program.exe". As you might guess, these are not regular expressions, but Windows file paths. Currently Atom highlights \P, \S and \B (with different colors too); this leads to confusion, because at first glance it looks like an indication that Python would recognize the \ symbol as an escape, and that a double backslash \\ is needed instead. But double backslashes, although Atom colors them as if they were correct, are wrong and lead to errors. Errors which are easy to overlook: never a good thing.
Please correct the highlighting by removing it from raw strings.
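A short sketch of why doubling the backslashes inside a raw string is a real bug and not just a style issue. It is plain Python 3 and reuses the example path from above; the variable names are illustrative.

```python
# Raw string with single backslashes: the intended Windows path.
single = r"c:\Program Files\Some Program\Bin\Program.exe"

# Raw string with doubled backslashes: every backslash is kept, so the path changes.
doubled = r"c:\\Program Files\\Some Program\\Bin\\Program.exe"

# Regular (non-raw) string: here doubling is required and yields the intended path.
regular = "c:\\Program Files\\Some Program\\Bin\\Program.exe"

print(single.count("\\"), doubled.count("\\"))
print(single == regular)
print(single == doubled)
```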
r''' doc string '''
breaks the grammar in the whole file...
any fix?
Would it be these lines that are causing the incorrect highlighting?
What complicates the matter further is that \" is special in a double-quoted raw string when it appears at the end. In fact, it appears that raw strings have a lot of intricacies:
For example (quite normally):
print r"hello\"wo\rld"
# hello\"wo\rld
But this gives:
print r"Hello\"
# SyntaxError
And this:
print r"Hello\\", r"Hello\""
# Hello\\ Hello\"
So how does one make a raw string with a single backslash at the end?
P.S. Funnily, even GitHub's syntax highlighter gets it wrong!
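On the question of a raw string ending in a single backslash: as far as I know a raw literal cannot end in an odd number of backslashes, so the usual workaround is to add the trailing backslash from a regular string. A minimal sketch in Python 3, with illustrative variable names:

```python
# Option 1: concatenate a raw string with a regular string holding one backslash.
concatenated = r"Hello" + "\\"

# Option 2: adjacent string literals are joined at compile time.
adjacent = r"Hello" "\\"

# Option 3: skip the raw prefix and escape the backslash.
escaped = "Hello\\"

print(concatenated, len(concatenated))
print(concatenated == adjacent == escaped)
```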