Highlighting of SQL inside strings
The stock Python package performs syntax highlighting inside strings containing SQL commands. That's rather useful, even though the implementation might need some more work. In this example:
create = 'CREATE TABLE test (id INT, name TEXT);'
insert = 'INSERT INTO test VALUES (1, "Bob")'
in the second line, words recognized as SQL are highlighted, but not in the first one, where "CREATE TABLE" has the same color as the non-reserved names.
(Actually, as you can see, even GitHub's Markdown does it, with exactly the same behavior, so perhaps something is wrong with the first line, although it does execute properly.)
Do you consider this a useful addition?
Could you specify whether you use Sublime Text or Atom? The philosophy of MagicPython has been to actually highlight various Python features (e.g. %- and {}-style formatting, regular expressions and docstrings), rather than trying to merge and blend Python with other languages. There are really just too many possibilities there: Python strings can easily be SQL, HTML, XML, JavaScript, not to mention Django or Jinja templates as well as TeX. So usually the best we can do is to try not to highlight highly questionable things (see #28, #23, #12). There is also a fundamental problem with highlighting other languages inside strings: it's very easy to break the highlighting of an entire source file with an incomplete expression inside a string. Try this in the stock package; I believe it should break the highlighter after the regexp string:
foo = spam(a=1)
bar = r'''regexp['''
baz = spam(a=1)
So far, the above issue seems to have some kind of a generic solution in Sublime Text, but not in Atom. We have some plans for making user-configurable options for what gets actually highlighted inside the strings. As a configurable option it might be useful to highlight SQL, although at the moment I'm not certain how reliable we could make it, so it wouldn't be a priority for now.
In terms of knowing what sort of string it is, one thing I've seen work decently is to only highlight SQL when it's fully uppercased. Just the keywords and phrases, like CREATE TABLE, INSERT INTO and such, since they tend to not come up much outside of specifically SQL statements.
But handle it however you want. It would be very nice to have, if it's possible at some point.
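The uppercase-only heuristic described above can be sketched in plain Python. This is only an illustration of the idea, not how any editor grammar implements it; the `SQL_PHRASES` list and the `uppercase_sql_keywords` helper are hypothetical names made up for this example.

```python
import re

# Hypothetical, deliberately short keyword list for illustration only.
SQL_PHRASES = [
    "CREATE TABLE", "INSERT INTO", "SELECT", "FROM", "WHERE",
    "VALUES", "UPDATE", "DELETE FROM",
]

# Case-sensitive on purpose: only fully uppercased keywords count.
# Spaces inside a phrase are allowed to match any run of whitespace.
SQL_UPPER = re.compile(
    r"\b(?:" + "|".join(p.replace(" ", r"\s+") for p in SQL_PHRASES) + r")\b"
)


def uppercase_sql_keywords(text):
    """Return the uppercase SQL keywords and phrases found in text, in order."""
    return SQL_UPPER.findall(text)


print(uppercase_sql_keywords('INSERT INTO test VALUES (1, "Bob")'))
print(uppercase_sql_keywords("select the item from the list"))
```

The first call finds `INSERT INTO` and `VALUES`; the second finds nothing because the words are lowercase. Short keywords such as `FROM` can still appear uppercased in non-SQL strings, so a heuristic like this reduces false positives rather than eliminating them.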
Could you specify whether you use Sublime Text or Atom?
Sublime Text, and you are correct about that code fragment breaking the syntax highlighter there. Since both examples we've posted show the same irregularities in Sublime and here on GitHub, it's probably safe to assume that they both delegate highlighting to the same third party, maybe Pygments, and that there is no special handling built into Sublime for these languages embedded in strings.
And I see that this is a tricky thing to implement, but it also does add a lot to the readability of the code if you use SQL (or any of those other languages). So it comes with costs and benefits, and it's your call whether the benefits are worth it.
Hmm... I've looked at the default SQL grammar that both Sublime Text and Atom seem to be using. The good news is that this particular grammar would not cause problems from within Python strings (or, indeed, strings in any other language that are delimited with quotes). So including SQL highlighting as a user-configurable option could be possible. Stay tuned.
@elprans GitHub uses TextMate-style grammars for highlighting, see github/linguist. For Python they use this grammar, MagicPython.
@vpetrovykh One issue with matching SQL inside a string is that begin/end matches can extend beyond the string boundaries, for example:
'SELECT * FROM "table" -- comment'
I'm looking at solving this in TextMate by first matching either the string content (for single-line strings) or each line, then passing it off to the SQL grammar in a patterns array in the capture. Unfortunately I believe this feature is only available in Atom, not Sublime Text. I'm unsure which form of grammars Visual Studio Code handles.
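The boundary problem and the "match the string content first" idea can be illustrated outside any editor with Python's `re` module. This is a rough sketch under simplifying assumptions: `SQL_COMMENT` and `PY_STRING` are hypothetical stand-ins for grammar rules, and the string pattern ignores prefixes and triple quotes.

```python
import re

# One source line containing the example string from above.
line = """query = 'SELECT * FROM "table" -- comment'"""

# A naive SQL comment rule: "--" up to the end of the line.
SQL_COMMENT = re.compile(r"--.*$")

# Applied to the whole source line, the rule runs past the closing quote.
naive = SQL_COMMENT.search(line).group()

# Matching the string body first confines the SQL rule to that body.
# Simplified pattern for a single-quoted, single-line Python string.
PY_STRING = re.compile(r"'((?:[^'\\]|\\.)*)'")
body = PY_STRING.search(line).group(1)
clipped = SQL_COMMENT.search(body).group()

print(repr(naive))
print(repr(clipped))
```

Here `naive` swallows the closing quote of the Python string, while `clipped` stops at the end of the string body, which is the behavior the capture-based approach is after.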
@infininight I understand that generally falling back onto an external grammar can cause issues with string boundaries. It seems that at the moment the only reliable way to use the same basic grammar across Atom, Sublime Text and Visual Studio Code is to include an SQL grammar that is safe to use inside strings (much like we do with regular expressions). This would have to be optional, though, as detecting SQL may cause annoying false positives for all users.
Sublime's default Python syntax seems to do a good job of knowing when to highlight SQL keywords.
[Screenshots: Default Python Tests, Default Python, Magic Python]
Sublime uses a not very sophisticated SQL indicator for triple-quote highlighting:
\s*(?:SELECT|INSERT|UPDATE|DELETE|CREATE|REPLACE|ALTER|WITH)\b
Is it possible to implement this in MagicPython? It seems like a good feature, and it is used natively in both Atom and Sublime (it does not work in VSCode).
# Triple-quoted raw string, unicode or not, will detect SQL, otherwise regex
- match: '([uU]?r)(""")'
  captures:
    1: storage.type.string.python
    2: meta.string.python string.quoted.double.block.python punctuation.definition.string.begin.python
  push:
    - meta_content_scope: meta.string.python string.quoted.double.block.python
    - match: '(?={{sql_indicator}})'
      set:
        - meta_scope: meta.string.python string.quoted.double.block.python
        - match: '"""'
          scope: punctuation.definition.string.end.python
          set: after-expression
        - match: ''
          push: scope:source.sql
          with_prototype:
            - match: '(?=""")'
              pop: true
            - include: escaped-unicode-char
            - include: constant-placeholder
    - match: '(?=\S)'
      set:
        - meta_scope: meta.string.python string.quoted.double.block.python
        - match: '"""'
          scope: punctuation.definition.string.end.python
          set: after-expression
        - match: ''
          push: scope:source.regexp.python
          with_prototype:
            - match: '(?=""")'
              pop: true
            - include: escaped-unicode-char
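To get a feel for what the indicator pattern quoted above accepts, it can be tried in plain Python. This only approximates the grammar: it assumes the pattern is tested right at the start of the string body, which `re.match` mimics, and `looks_like_sql` is a hypothetical helper name.

```python
import re

# The indicator pattern quoted above; re.match anchors it at the
# start of the string body, roughly like the lookahead in the grammar.
SQL_INDICATOR = re.compile(
    r"\s*(?:SELECT|INSERT|UPDATE|DELETE|CREATE|REPLACE|ALTER|WITH)\b"
)


def looks_like_sql(string_body):
    """Return True if the string body starts with an uppercase SQL keyword."""
    return SQL_INDICATOR.match(string_body) is not None


print(looks_like_sql("\n    SELECT id FROM test\n"))
print(looks_like_sql("select id from test"))
print(looks_like_sql("WITH great power comes great responsibility"))
```

The first call returns `True` (leading whitespace is allowed), the second returns `False` because the pattern is case-sensitive, and the third returns `True`, which shows the kind of false positive discussed earlier in the thread.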
This would help my daily work a lot, since I have to use SQL in PySpark. My text editor of choice is VSCode.
Is this issue dead?
Well, we have no plans for adding SQL highlighting. There are many pros and cons to this feature; see the above discussion.
@1st1 That's one of the first features I missed when trying out VSCode. I'm pretty happy with the way it works in Sublime Text.
@1st1 is it possible to add a directive comment that will change the syntax highlighting for a multiline string? I'm in a weird situation where I need to write Python code within a Python script and I wanted to use syntax highlighting.
@task
def get_cookies():
    import textwrap
    # some directive to make the below docstring use python syntax highlighting (or SQL as above)
    script = textwrap.dedent(f"""
        import browser_cookie3
        cj = browser_cookie3.chrome()
        cookies1 = cj._cookies['domain1']['/']
        cookies2 = cj._cookies['.domain2']['/']
        print('VALUE1', cookies1['value1'].value)
        print('VALUE2', cookies1['value2'].value)
        print('VALUE3', cookies2['value3'].value)
    """).strip()
    shell(f"""
        python3 -m venv .venv &&
        .venv/bin/pip install browser_cookie3 &&
        .venv/bin/python -c "{script}"
    """)
I have the same problem in VSCode. Can anyone fix the issue?
@1st1 is it possible to add a directive comment that will change the syntax highlighting for a multiline string? I'm in a weird situation where I need to write Python code within a Python script and I wanted to use syntax highlighting.
Unfortunately not: highlighters are very limited in what they see and how they can act :( They are basically big regular expressions without any means to do any code analysis (even a rudimentary one).
FWIW I'm currently using the python-string-sql extension, but it's a bit annoying that I have to litter my code with --sql & --end-sql to get highlighting. The issue with this extension is that I have to add --end-sql; otherwise the highlighting of all the Python code after the end of the string is broken.
https://marketplace.visualstudio.com/items?itemName=ptweir.python-string-sql
Also, as another option, PyCharm was nice because I could explicitly set the highlighting with # language=SQL before my string, and it would highlight the code inside the string appropriately.
I don't mind explicitly setting the syntax highlighting with a comment (though auto-detection would be nice); my main gripe is having to add --end-sql at the end with python-string-sql.
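For comparison, a tool that can run a real tokenizer (unlike a regex-based grammar) can pair a directive comment with the string that follows it. The sketch below uses Python's standard `tokenize` module; the `# language=SQL` form is the one mentioned above, but the `tagged_strings` helper is hypothetical and this is not a description of how PyCharm implements the feature. It only handles plain string literals, since f-strings are tokenized differently on newer Python versions.

```python
import io
import re
import tokenize

# Matches a directive comment such as "# language=SQL".
DIRECTIVE = re.compile(r"#\s*language\s*=\s*(\w+)")


def tagged_strings(source):
    """Return (language, string_literal) pairs for strings after a directive."""
    pending = None  # language named by the most recent directive, if any
    results = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            m = DIRECTIVE.match(tok.string)
            pending = m.group(1) if m else None
        elif tok.type == tokenize.STRING:
            if pending:
                results.append((pending, tok.string))
            pending = None  # a directive applies to one string only
        elif tok.type == tokenize.NEWLINE:
            pending = None  # directive expires at the end of the statement
    return results


source = '# language=SQL\nquery = "SELECT 1"\nother = "plain"\n'
print(tagged_strings(source))
```

Because the tokenizer knows exactly where the string literal ends, no closing marker comparable to `--end-sql` is needed in this approach.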