
Handle multiline strings

Open CohenArthur opened this issue 3 years ago • 3 comments

Multiline strings are allowed in Rust (playground link); however, we currently do not handle them correctly:

test.rs:2:26: error: unended string literal
    2 |     let a = "whaaaaaat up
      |                          ^
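For reference, here is a minimal sketch of the two ways a regular Rust string literal can span lines. The function name `multiline_examples` is only illustrative and does not come from the issue or from gccrs.

```rust
/// Hypothetical helper, for illustration only.
fn multiline_examples() -> (String, String) {
    // A plain string literal may contain a raw line break; the newline
    // (and any leading whitespace on the next line) is part of the value.
    let a = "whaaaaaat up
doc";

    // A backslash at the end of a line skips the newline and the
    // leading whitespace of the following line.
    let b = "whaaaaaat up \
             doc";

    (a.to_string(), b.to_string())
}
```

Both forms are accepted by rustc, so the lexer has to tolerate a `\n` inside a string literal in either case.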

This is the beginning of a patch to fix that, basically commenting out the checks for a \n character:

diff --git a/gcc/rust/lex/rust-lex.cc b/gcc/rust/lex/rust-lex.cc
index ecf151dc778..c51b00fb5fe 100644
--- a/gcc/rust/lex/rust-lex.cc
+++ b/gcc/rust/lex/rust-lex.cc
@@ -1917,7 +1917,7 @@ Lexer::parse_string (Location loc)
   int length = 1;
   current_char32 = peek_codepoint_input ();
 
-  while (current_char32.value != '\n' && current_char32.value != '"')
+  while (/* current_char32.value != '\n' && */ current_char32.value != '"')
     {
       if (current_char32.value == '\\')
 	{
@@ -1949,14 +1949,15 @@ Lexer::parse_string (Location loc)
 
   current_column += length;
 
-  if (current_char32.value == '\n')
-    {
-      rust_error_at (get_current_location (), "unended string literal");
-      // by this point, the parser will stuck at this position due to
-      // undetermined string termination. we now need to unstuck the parser
-      skip_broken_string_input (current_char32.value);
-    }
-  else if (current_char32.value == '"')
+  // if (current_char32.value == '\n')
+  //   {
+  //     rust_error_at (get_current_location (), "unended string literal");
+  //     // by this point, the parser will stuck at this position due to
+  //     // undetermined string termination. we now need to unstuck the parser
+  //     skip_broken_string_input (current_char32.value);
+  //   }
+  if (current_char32.value == '"')
+    // else if (current_char32.value == '"')
     {
       current_column++;
 

However, that code is necessary for properly handling some documentation attributes, as pointed out by various test cases in our testsuite.

rustc does this in a different pass rather than in the lexer, which is what I think we should do as well. We could, for example, add that check after parsing a doc_attr.
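To make the lexer side of that split concrete, here is a rough sketch written in Rust for brevity (the gccrs lexer itself is C++). The names `LexError` and `scan_string_body` are hypothetical and do not exist in gccrs or rustc. The idea is that a newline is treated as ordinary string content, and an unterminated literal is only reported once the input runs out.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum LexError {
    UnterminatedString,
}

/// Scans the body of a string literal. `input` is assumed to start just
/// after the opening quote. On success, returns the raw contents and the
/// number of bytes consumed, including the closing quote.
fn scan_string_body(input: &str) -> Result<(String, usize), LexError> {
    let mut value = String::new();
    let mut chars = input.char_indices();

    while let Some((idx, c)) = chars.next() {
        match c {
            // Closing quote: the literal ends here ('"' is one byte long).
            '"' => return Ok((value, idx + 1)),
            '\\' => {
                // Keep the escape sequence raw; a real lexer would decode it.
                value.push(c);
                match chars.next() {
                    Some((_, escaped)) => value.push(escaped),
                    None => return Err(LexError::UnterminatedString),
                }
            }
            // Newlines are ordinary content, so multiline strings are accepted.
            _ => value.push(c),
        }
    }

    // Only reaching the end of the input means the string was never closed.
    Err(LexError::UnterminatedString)
}
```

With this shape, any restriction on which characters a doc attribute may contain has to be enforced by a later check rather than by the lexer.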

Here is the relevant rustc code which checks for certain characters:

                        if let Some(c) = doc_alias
                            .chars()
                            .find(|&c| c == '"' || c == '\'' || (c.is_whitespace() && c != ' '))
                        {
                            self.tcx
                                .sess
                                .struct_span_err(
                                    meta.span(),
                                    &format!(
                                        "{:?} character isn't allowed in `#[doc(alias = \"...\")]`",
                                        c,
                                    ),
                                )
                                .emit();
                            return false;
                        }
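The predicate in that excerpt can be isolated as a small standalone function, which is roughly the check we would want to run after parsing a doc attribute. The name `first_bad_doc_alias_char` is made up for this sketch; only the closure body is taken from the rustc code above.

```rust
/// Returns the first character that the rustc check above would reject in
/// a `#[doc(alias = "...")]` value, or `None` if the alias is acceptable.
/// Hypothetical helper name; the predicate mirrors the rustc excerpt.
fn first_bad_doc_alias_char(doc_alias: &str) -> Option<char> {
    doc_alias
        .chars()
        // Quotes are rejected, as is any whitespace other than a plain space
        // (which covers '\n', '\t', and similar characters).
        .find(|&c| c == '"' || c == '\'' || (c.is_whitespace() && c != ' '))
}
```

A caller would emit the diagnostic when this returns `Some(c)`, so a newline inside a doc alias is still reported even though the lexer no longer rejects it.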

Fixing this issue is necessary for compiling certain versions of libcore, which contain multiline strings.

CohenArthur, Jul 19 '22 14:07

As a side note, I haven't been able to understand the new system that can emit errors based on locale. I'll have to ask on the Rust Zulip for an explanation or a PR link, as I couldn't figure out where that error was emitted without checking out the 1.49 release.

CohenArthur, Jul 19 '22 14:07

The error is emitted as tcx.sess.emit_err(errors::DocAliasBadChar { span, attr_str, char_: c }); where DocAliasBadChar is defined in compiler/rustc_passes/src/errors.rs as

#[derive(SessionDiagnostic)]
#[error(passes::doc_alias_bad_char)]
pub struct DocAliasBadChar<'a> {
    #[primary_span]
    pub span: Span,
    pub attr_str: &'a str,
    pub char_: char,
}

The actual error message is declared in compiler/rustc_error_messages/locales/en-US/passes.ftl as passes-doc-alias-bad-char = {$char_} character isn't allowed in {$attr_str}. The PR implementing this is https://github.com/rust-lang/rust/pull/95512.

bjorn3, Jul 19 '22 14:07

I found the error message but couldn't figure out the Diagnostic or how it was emitted. Thanks a lot @bjorn3 :DD

CohenArthur, Jul 19 '22 15:07