Multiline strings WIP/POC

Open rtldg opened this issue 9 months ago • 3 comments

Multiline strings that don't require me to escape double quotes inside them are something I really wanted. (I am hardcoding JSON objects in SourceMod plugins...)

Relevant issue: https://github.com/alliedmodders/sourcepawn/issues/785

I went with a similar string style to Python and C# with this PR (with character-escapes allowed).

@asherkin had suggested PHP-esque heredoc-style strings. I'm not a fan of that, though.

@assyrianic had suggested something similar to Golang's backtick quotes. I also wasn't a fan of that, but for SQL reasons.

My multiline string implementation rules: (ripped and edited from C#'s 'Raw string literals')

  • The opening quotes must be the last token on their line, and the closing quotes must be the first token on their line.
  • Any whitespace to the left of the closing quotes is removed from all lines of a multiline string literal.
    • If whitespace precedes the end delimiter on the same line, the exact number and kind of whitespace characters (e.g. spaces vs. tabs) must exist at the beginning of each content line. Specifically, a space does not match a horizontal tab, and vice versa.
  • The newline before the closing quotes isn't included in the literal string.
  • You can have an empty line -- zero whitespace and zero characters (other than a newline).
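The indentation rules above can be modeled outside the compiler. The following is a minimal Python sketch, not the PR's lexer code: `dedent_multiline` is a hypothetical helper, and it assumes `body` is the source text between the newline after the opening quotes and the closing quotes. It does not try to reproduce every edge case; in particular, blank lines directly before the closing quotes may behave differently in the actual implementation.

```python
def dedent_multiline(body: str) -> str:
    """Model of the whitespace rules (hypothetical helper, not the PR's code).

    `body` is everything after the newline that follows the opening quotes,
    up to but not including the closing quotes.
    """
    # The last piece is the whitespace sitting to the left of the closing quotes.
    *content, indent = body.split("\n")
    if indent.strip(" \t"):
        raise ValueError("closing quotes must be the first token on their line")

    out = []
    for number, line in enumerate(content, start=1):
        if line == "":
            # A truly empty line (zero whitespace, zero characters) is allowed.
            out.append("")
        elif line.startswith(indent):
            # Exact match required: a space never matches a tab.
            out.append(line[len(indent):])
        else:
            raise ValueError(
                f"content line {number} does not start with the closing indentation"
            )
    # Joining (rather than appending "\n" to every line) drops the newline
    # that sits directly before the closing quotes.
    return "\n".join(out)


print(repr(dedent_multiline("\t\tHello\n\t\tThere\n\t\t")))  # 'Hello\nThere'
print(repr(dedent_multiline("\t  Hello\n\t  There\n\t")))    # '  Hello\n  There'
```

A content line indented with spaces where the closing line uses a tab raises `ValueError` in this model, which mirrors the "a space does not match a horizontal tab" rule.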

Some pros:

  • It's similar to Python's multiline/longstring syntax, which everyone knows!
  • It's similar to C#'s raw string literals, which many people know!
  • Sarrus has to suffer through updating the VSCode extension lexer.

Some cons:

  • I had to write lexer code for this. (it's trash 😇)
  • My syntax highlighting is ruined in Notepad++.

Some examples of allowed syntax:

	PrintToServer("""
		Hello
		There
		""");
	// -> "Hello\nThere"

	PrintToServer("""
	  Hello
	  There
	""");
	// -> "  Hello\n  There"

	PrintToServer(
		"""""
		"""ASDF""" %d
		""""",
		5
	);
	// -> "\"\"\"ASDF\"\"\" %d" (which prints `"""ASDF""" 5`)

	PrintToServer("""test hello "world", how are you?""");
	// -> "test hello \"world\", how are you?"

	PrintToServer(""" test \
	wow!\
	""");
	// -> " test \twow!\t"

	// This one is funky!
	PrintToServer("""abc\
		"WOAH"\
		""");
	// -> "abc\t\t\"WOAH\"\t\t"

	PrintToServer("""
	""");
	// -> ""

	PrintToServer("""
	test

	""");
	// -> "test\n\n"

	PrintToServer("""
	
	a
	""");
	// -> "\na"

	PrintToServer("""
	
	""");
	// -> "\n"

	PrintToServer("""
	
	
	""");
	// -> "\n\n"

Some examples of incompatible things:

	// anything but a newline after the opening-quotes of a multiline string indicates that it's a
	//  singleline version, which doesn't work here (without a line-continuation '\\')
	//---------------v
	PrintToServer(""" // abc
	test
	""");

	// I don't know how to nicely handle empty multiline strings, plus they're stupid and useless!
	PrintToServer("""""");

TODO:

  • Better error reporting. Currently this calls report(37) (error 037: invalid string (possibly non-terminated string)), which isn't very useful for multiline strings.
    • We also break error reporting positions I think! Might need to check some current-token position somewhere or something...
  • Add tests. (a bit dependent on better error reporting from above...)
  • Handle line-continuations not skipping whitespace?
  • ~~Take match_char()s out of asserts probably...~~
  • keep track of closing-quotes so we don't have to double-parse. (and just use end more...)
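The line-continuation item can be illustrated with a small Python model based on the two line-continuation examples above, where the tab after each `\` + newline ends up in the resulting string. This is only a sketch: `join_continuations` and its `skip_whitespace` flag are hypothetical names, and the "skipping" variant is one possible reading of the TODO, not a decided behavior.

```python
import re


def join_continuations(text: str, skip_whitespace: bool = False) -> str:
    """Model of backslash-newline handling (hypothetical, not the PR's code)."""
    if skip_whitespace:
        # Possible alternative: also drop the next line's leading spaces/tabs.
        return re.sub(r"\\\n[ \t]*", "", text)
    # Behavior shown in the examples: only the backslash and newline vanish,
    # so the next line's indentation becomes part of the string.
    return text.replace("\\\n", "")


source = " test \\\n\twow!\\\n\t"
print(repr(join_continuations(source)))                        # ' test \twow!\t'
print(repr(join_continuations(source, skip_whitespace=True)))  # ' test wow!'
```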

rtldg, Feb 18 '25 23:02

x64 cstrike came out sooner than I expected so I'll push this out before it's ready.

I'm sure some would like it to be more Python-like, but I haven't gotten around to investigating that yet.

So we'll just see how the responses are on this currently...

rtldg, Feb 18 '25 23:02

I like the syntax choices here (I kind of have to, it's exactly what I suggested in the issue 😄)

asherkin, Feb 19 '25 00:02

> Sarrus has to suffer through updating the VSCode extension lexer.

The good news is that the changes are backward compatible!

There shouldn't be too much of a headache getting this into the textmate-grammar, lexer, parser and LSP 🤞

I really look forward to seeing this in SP!

Sarrus1, Feb 19 '25 16:02

Sorry for the delay on this. The current patch looks like it addresses my concerns from the previous one, so I'll go ahead and merge it.

dvander, Sep 13 '25 20:09