riff
Multiline strings
In C, you can juxtapose string literals, which the compiler concatenates into a single literal (translation phase 6, after preprocessing):
const char *s =
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
    "sed do eiusmod tempor incididunt ut labore et dolore magna "
    "aliqua. Ut enim ad minim veniam, quis nostrud exercitation "
    "ullamco laboris nisi ut aliquip ex ea commodo consequat.";
Or in Python, you can use triple quotes:
"""
Lorem ipsum dolor sit amet, consectetur adipiscing elit,
sed do eiusmod tempor incididunt ut labore et dolore magna
aliqua. Ut enim ad minim veniam, quis nostrud exercitation
ullamco laboris nisi ut aliquip ex ea commodo consequat.
"""
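Didn't realize before, but Python's multiline string syntax is basically worthless for indented code. Leading whitespace isn't trimmed, so any indentation used to line the literal up with the surrounding code ends up in the string itself.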
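A quick way to see the problem, as a minimal sketch: the function name `indented` is only for illustration, and `textwrap.dedent` is the standard-library workaround rather than anything the syntax does for you.

```python
import textwrap


def indented():
    # The literal is indented to match the function body, so each
    # content line keeps its four leading spaces in the string.
    return """
    Lorem ipsum
    dolor sit amet
    """


raw = indented()
trimmed = textwrap.dedent(raw)  # strips the common leading whitespace

print(repr(raw))
print(repr(trimmed))
```

The raw value still carries the indentation on every line; only the explicit `dedent` call removes it.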
Current idea would be for the lexer to:
- Eat the ''' token
- Count number of whitespace bytes ( $n$ ) (not concerned with tabs vs spaces) until first non-whitespace character is encountered
  - If on the same line as the opening ''' and no non-whitespace characters are encountered before a newline, discard count and repeat step 2
- For each subsequent line, unconditionally eat $n$ whitespace characters (unless closing ''' is encountered) before saving string data
  - Whitespace is still significant if a line contains only the closing ''' after $n$ bytes of whitespace is skipped
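The steps above can be sketched roughly as follows. This is a Python model of the idea, not riff's actual lexer: the function name `lex_multiline` and its `(value, end_pos)` return shape are hypothetical. It also assumes that skipping stops early at a newline or any non-whitespace character, so blank lines and under-indented lines survive; the notes don't specify that case.

```python
WS = " \t"      # tabs and spaces each count as one byte of whitespace
QUOTE = "'''"


def lex_multiline(src, pos=0):
    """Lex a '''-delimited string at src[pos]; return (value, end_pos)."""
    if not src.startswith(QUOTE, pos):
        raise ValueError("expected opening '''")
    pos += len(QUOTE)  # step 1: eat the opening token

    # Step 2: count whitespace bytes (n) up to the first
    # non-whitespace character.
    n = 0
    on_opening_line = True
    while pos < len(src):
        ch = src[pos]
        if ch in WS:
            n += 1
            pos += 1
        elif ch == "\n" and on_opening_line:
            # Only whitespace followed the opening token on its own
            # line: discard the count and start over on the next line.
            n = 0
            on_opening_line = False
            pos += 1
        else:
            break

    # Step 3: save string data, eating up to n whitespace bytes at the
    # start of each subsequent line.
    out = []
    while pos < len(src):
        if src.startswith(QUOTE, pos):
            return "".join(out), pos + len(QUOTE)
        ch = src[pos]
        out.append(ch)
        pos += 1
        if ch == "\n":
            skipped = 0
            # Assumption: stop early at anything that is not a space
            # or tab (including a newline or the closing token).
            while skipped < n and pos < len(src) and src[pos] in WS:
                pos += 1
                skipped += 1
    raise ValueError("unterminated multiline string")


value, end = lex_multiline("'''\n    Lorem ipsum\n      dolor sit\n    '''")
print(repr(value))
```

Whitespace beyond the first $n$ bytes is kept, which is what makes a closing ''' indented deeper than the body contribute trailing spaces to the string.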
Interpolated expressions should be simple to support; just need to add a new lexer mode value to the enum.
Note that lexer-concatenated juxtaposed string literals are also possible.
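As a rough illustration of what lexer-level concatenation could look like, here is a sketch that folds runs of adjacent string tokens into one. The `(kind, value)` token shape, the `"STR"` kind name, and the function `merge_adjacent_strings` are all hypothetical; a real lexer would more likely do this inline while scanning than as a separate pass.

```python
def merge_adjacent_strings(tokens):
    """Fold runs of adjacent string tokens into a single string token."""
    merged = []
    for kind, value in tokens:
        if kind == "STR" and merged and merged[-1][0] == "STR":
            # Previous token was also a string: append to it instead
            # of emitting a new token.
            merged[-1] = ("STR", merged[-1][1] + value)
        else:
            merged.append((kind, value))
    return merged


tokens = [("STR", "Lorem "), ("STR", "ipsum"), ("OP", "+"), ("STR", "x")]
print(merge_adjacent_strings(tokens))
```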