kcl [LFX] [Track] Provide Quick Fix for Parse Errors

Enhancement

Add Quick fixes for Parser errors.

pub enum ErrorKind {
    // Syntax Errors
    InvalidSyntax,
    TabError,
    IndentationError,
    IllegalArgumentSyntax,
}

Framework:

Changes:

Modify the ParseError enum to include a field for suggestions in both variants UnexpectedToken and Message.
The struct_span_error function is modified to include generate_suggestion to generate the appropriate suggestion.

fn struct_span_error(&self, msg: &str, span: Span) {
    let (suggestion, error_kind) = generate_suggestion(msg, span);
    self.add_parse_err(ParseError::Message {
        message: msg.to_string(),
        span,
        suggestion,
    }, error_kind);
}

analyse each function call to struct_span_error and struct_token_error to provide suggestion for quick fix.
The generate_suggestion function takes msg and span as parameters and uses a match statement to generate a specific suggestion based on the message string.

fn generate_suggestion(error_message: &str, code_span: Span) -> Option<(String, ErrorKind)> {
    match error_message {
        msg if msg.contains("unexpected token") => Some((
            // quick fix suggestion generation logic based on span,
            ErrorKind::InvalidSyntax,
        )),
        msg if msg.contains("inconsistent use of tabs and spaces in indentation") => Some((
            // quick fix suggestion generation logic based on span,
            ErrorKind::TabError,
        )),
        msg if msg.contains("mismatched indent level") => Some((
             // quick fix suggestion generation logic based on span,
            ErrorKind::IndentationError,
        )),
        // Add more patterns for other error messages as needed
        _ => None,
    }
}

Modify the add_parse_err function to include error_kind parameter and convert ParseError into a diagnostic based on the ErrorKind:

        // Convert the ParseError into a Diagnostic based on the ErrorKind
        let diag = match error_kind {
            ErrorKind::InvalidSyntax => err.into_invalid_syntax_diag(&self.0)?,
            ErrorKind::TabError => err.into_tab_error_diag(&self.0)?,
            ErrorKind::IndentationError => err.into_indentation_error_diag(&self.0)?,
            ErrorKind::IllegalArgumentSyntax => err.into_illegal_argument_syntax_diag(&self.0)?,
        };

Modify impl ParseError to have functions for converting ParseError into a diagnostic based on the ErrorKind.
Add function that extract the first line of code from a CodeSnippet and returns it as a String.
suggestion is passed as a Option<Vec<String>> to diagnostic which is then stored in the data field of LSP diagnostic in to_lsp.rs
Finally, in quick_fix.rs handle quick_fix for each error kind.

Mar 11 '24 16:03 shashank-iitbhu

please assign @Peefy

Mar 11 '24 16:03 shashank-iitbhu

I have some doubts on how to implement this:

Can we modify both the variants of ParseError enum Unexpectedtoken and Message to have a field of suggestions?
The general idea of generating a quick fix would be to match the message string or look for a substring in the message string.
Modifying the struct_span_error and struct_token_error functions to call a new function generate_suggestions which takes message string to generate specific fixes based on error_kind and message, this function returns suggestion along with the error_kind which are then passed to add_parse_err along with error_kind.
Based on the error_kind , error specific functions are called defined in impl ParseError are called for converting ParseError into diag.

cc @He1pa

Mar 15 '24 08:03 shashank-iitbhu

I'm not sure if all fixes can be expressed just in suggestions. and whether there is enough information to generate suggestions when the error is generated. Like I said before, you may be able to determine the fix solution after parsing the entire expr or stmt. Of course you can add a new field in ParseError, but I think it should not be a suggestions, but an abstract structure representing all the information used to analyze for fix, such as source code, more detailed error kind, or other information. This also easy to expand. And, analyse and generate suggestions when ParseError is converted to Diag

impl ParseError {
     pub fn into_diag(self, sess: &Session) -> Result<Diagnostic> {
          ...
         Ok(Diagnostic::new_with_code(
             Level::Error,
             &self.to_string(),
             None,
             (pos.clone(), pos),
             Some(DiagnosticId::Error(ErrorKind::InvalidSyntax)),
             -- None,
             ++ generate_suggestions(...)
         ))
}
}

Mar 15 '24 09:03 He1pa

Thanks for clarifying. I'm working on refining this also I've already classified some fixes. Once I refine this approach further, I plan to open a PR with the refactoring changes and then start working on each fix individually. I'll update here with any further developments.

Mar 17 '24 23:03 shashank-iitbhu

I'm not sure if all fixes can be expressed just in suggestions. and whether there is enough information to generate suggestions when the error is generated. Like I said before, you may be able to determine the fix solution after parsing the entire expr or stmt. Of course you can add a new field in ParseError, but I think it should not be a suggestions, but an abstract structure representing all the information used to analyze for fix, such as source code, more detailed error kind, or other information. This also easy to expand. And, analyse and generate suggestions when ParseError is converted to Diag
impl ParseError {
     pub fn into_diag(self, sess: &Session) -> Result<Diagnostic> {
          ...
         Ok(Diagnostic::new_with_code(
             Level::Error,
             &self.to_string(),
             None,
             (pos.clone(), pos),
             Some(DiagnosticId::Error(ErrorKind::InvalidSyntax)),
             -- None,
             ++ generate_suggestions(...)
         ))
}
}

Hey @He1pa I am refactoring Message like this:

#[derive(Debug, Clone)]
pub enum ParseError {
    UnexpectedToken {
        expected: Vec<String>,
        got: String,
        span: Span,
    },
    Message {
        message: String,
        span: Span,
        fix_info: Option<FixInfo>,
    },
}

#[derive(Debug, Clone)]
pub struct FixInfo {
    pub error_kind: ErrorKind, // or something specific to particular error.
    pub code_snippet: String,
}

Is this approach fine? or just simply this:

    Message {
        message: String,
        span: Span,
        error_kind: ErrorKind, // or something specific to particular error.
        code_snippet: String,
    },

Apr 08 '24 01:04 shashank-iitbhu

Need some help with this on how to implement quick fixes for these kind of errors: cc @Peefy @He1pa

https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/mod.rs#L327-L330 Reproduced by:

schema Person:
    name: str
    age: int

person = Person {
    name = "Alice"
    age = 18
)

https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/mod.rs#L364-L367 Reproduced by:

print("a is zero"}

https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/mod.rs#L403-L406 Reproduced by:

c = {"key1" = "value1", "key2" = None]

Apr 18 '24 23:04 shashank-iitbhu

Here I am listing a few Parse errors for which quick fixes can be made with simple text replacements

Quick fixes that can be implemented by simple text replacements

https://github.com/kcl-lang/kcl/blob/d6ba71a3118b8756bf85d29f8ac221052f893ded/kclvm/parser/src/lexer/mod.rs#L261-L264

Reprduced by:

b=1
a = !b

Quick fix: Replace ! by not. Split the code snippet by ! and then replace it by not.

https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/parser/expr.rs#L52-L58 Reprduced by:

schema PersonA:
   name: str
 age: int

Quick fix: Replace whitespaces with \t character.

https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/indent.rs#L142 Reproduced by: Quick Fix: Replace whitespaces with \t character.
https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/mod.rs#L433-L436 Reproduced by:

b=1
longString = "Too long expression " + \
             "Too long expression " + \b
             "Too long expression "

Quick Fix: Remove unexpected character after line continuation character.

https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/mod.rs#L441-L444 Reproduced by:

b=1;

Quick Fix: Remove semi colon from the suggested replacement.

https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/parser/stmt.rs#L602-L605 Reproduced by:

a = 10
if a == 0:
    print("a is zero")
else if a < 100:
    print("a < 100")
    print("maybe a is negative")
else:
    print("a >= 100")

Quick fix: Replace else if with elif in the text replacement

Apr 18 '24 23:04 shashank-iitbhu

Need some help with this on how to implement quick fixes for these kind of errors: cc @Peefy @He1pa

https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/mod.rs#L327-L330

Reproduced by:
schema Person:
    name: str
    age: int

person = Person {
    name = "Alice"
    age = 18
)
https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/mod.rs#L364-L367

Reproduced by:
print("a is zero"}
https://github.com/kcl-lang/kcl/blob/26b1b17c6a94d2239153a3a59256a8c5b2ebdff7/kclvm/parser/src/lexer/mod.rs#L403-L406

Reproduced by:
c = {"key1" = "value1", "key2" = None]

I don't understand your question. Here, only need to replace the mismatched parentheses, brackets and braces, and self.indent_cxt.delims record delim stack

Apr 19 '24 09:04 He1pa