Multiline strings

Open darrylabbate opened this issue 2 years ago • 1 comments

In C, you can juxtapose string literals, which the compiler concatenates into a single string (in translation phase 6, after preprocessing):

    const char *lorem =
        "Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
        "sed do eiusmod tempor incididunt ut labore et dolore magna "
        "aliqua. Ut enim ad minim veniam, quis nostrud exercitation "
        "ullamco laboris nisi ut aliquip ex ea commodo consequat.";

Or in Python, you can use triple quotes:

    """
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, 
    sed do eiusmod tempor incididunt ut labore et dolore magna 
    aliqua. Ut enim ad minim veniam, quis nostrud exercitation 
    ullamco laboris nisi ut aliquip ex ea commodo consequat.
    """

darrylabbate, Nov 01 '22 03:11

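Didn't realize before, but Python's multiline string syntax is of little use here: leading whitespace on each line isn't trimmed, so a literal indented to match the surrounding code keeps that indentation unless it's post-processed (e.g. with textwrap.dedent).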
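To make the complaint concrete, here is a small Python sketch. The function name indented_literal is made up for illustration; textwrap.dedent is the standard-library helper usually used to work around the issue.

```python
import textwrap


def indented_literal():
    # The literal is indented to match the function body, so every line
    # keeps its four leading spaces, and the string starts with the
    # newline that follows the opening quotes.
    return """
    Lorem ipsum
    dolor sit amet
    """


raw = indented_literal()

# textwrap.dedent strips the common leading whitespace after the fact;
# whitespace-only lines are reduced to empty lines.
dedented = textwrap.dedent(raw)

print(repr(raw))
print(repr(dedented))
```

The indentation only goes away at runtime through an extra call, which is the gap the lexer-level approach below is meant to close.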

Current idea would be for the lexer to:

  1. Eat the opening ''' token
  2. Count the number of whitespace bytes ($n$), without distinguishing tabs from spaces, until the first non-whitespace character is encountered
    1. If still on the same line as the opening ''' and a newline is reached before any non-whitespace character, discard the count and repeat step 2
  3. For each subsequent line, unconditionally eat $n$ whitespace characters (unless the closing ''' is encountered) before saving string data
    1. Whitespace is still significant on a line that contains only the closing ''' once $n$ bytes of whitespace have been skipped
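The steps above can be sketched roughly as follows. This is an illustration in Python, not riff's actual lexer: the function name lex_multiline and its return shape are hypothetical, only spaces and tabs count as whitespace, and "eat $n$ whitespace characters" is read as "eat at most $n$". Step 3.1 is interpreted as: any whitespace left over after the $n$ skipped bytes is kept, even right before the closing '''.

```python
def lex_multiline(src, pos=0):
    """Lex one '''-delimited string starting at src[pos].

    Returns (value, end_pos), where end_pos is just past the closing '''.
    """
    if not src.startswith("'''", pos):
        raise ValueError("expected opening '''")
    i = pos + 3  # step 1: eat the opening '''

    def skip_ws(i, limit=None):
        # Skip spaces/tabs starting at i, at most `limit` of them if given.
        start = i
        while (i < len(src) and src[i] in " \t"
               and (limit is None or i - start < limit)):
            i += 1
        return i

    # Step 2: count whitespace up to the first non-whitespace character.
    j = skip_ws(i)
    if j < len(src) and src[j] == "\n":
        # Step 2.1: nothing but whitespace on the opening line, so
        # discard that count and measure the next line instead.
        i = j + 1
        j = skip_ws(i)
    n = j - i
    i = j

    out = []
    while True:
        if i >= len(src):
            raise ValueError("unterminated multiline string")
        if src.startswith("'''", i):
            return "".join(out), i + 3
        ch = src[i]
        out.append(ch)
        i += 1
        if ch == "\n":
            # Step 3: eat up to n whitespace bytes on the new line.
            # Anything beyond n is kept as string data (step 3.1).
            i = skip_ws(i, n)


src = "'''\n    Lorem ipsum\n      dolor sit amet\n    consectetur\n    '''"
value, end = lex_multiline(src)
print(repr(value))
```

Under this reading, indentation deeper than the first content line survives (the two extra spaces before "dolor"), while the baseline indentation disappears without any runtime post-processing.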

Interpolated expressions should be simple to support; it should only require adding a new lexer mode value to the enum.

Note that juxtaposed string literals, concatenated by the lexer as in C, are also possible.
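As a rough sketch of what lexer-side concatenation could look like, the snippet below merges runs of adjacent string tokens. The (kind, value) tuple representation and the "STR" kind are assumptions made for illustration; a real lexer would more likely do this while emitting tokens rather than as a separate pass.

```python
def merge_adjacent_strings(tokens):
    """Merge consecutive ("STR", text) tokens into a single token."""
    merged = []
    for kind, value in tokens:
        if kind == "STR" and merged and merged[-1][0] == "STR":
            # Previous token is also a string: append to it in place.
            merged[-1] = ("STR", merged[-1][1] + value)
        else:
            merged.append((kind, value))
    return merged


tokens = [
    ("ID", "s"), ("OP", "="),
    ("STR", "Lorem ipsum dolor sit amet, "),
    ("STR", "consectetur adipiscing elit"),
]
merged = merge_adjacent_strings(tokens)
print(merged)
```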

darrylabbate, Dec 22 '22 22:12