kcl icon indicating copy to clipboard operation
kcl copied to clipboard

[WIP] Provide quick fix for invalid syntax error

Open shashank-iitbhu opened this issue 1 year ago • 9 comments

1. Does this PR affect any open issues?(Y/N) and add issue references:

  • [ ] N
  • [x] Y

fix #1125

2. What is the scope of this PR (e.g. component or file name):

kclvm/parser/src/session/mod.rs kclvm/tools/src/LSP/src/quick_fix.rs kclvm/tools/src/LSP/src/to_lsp.rs kclvm/error/src/lib.rs

3. Provide a description of the PR(e.g. more details, effects, motivations or doc link):

https://github.com/kcl-lang/kcl/assets/134289475/ba7cf8e3-0fcc-4aee-a58a-e3b2de2dc2be

  • This PR is not complete yet.
  • The quick fix is working fine. The approach was to get the line content where error was occurring and then provide appropriate quick fix.
  • Right now, i have stored the line_content obtained by get_line_content_from_span in kclvm/parser/src/session/mod.rs into the note field of Message struct.
  • This is experimental approach as introducing new field in the Message Struct would require extensive changes in the codebase.

4. Are there any breaking changes?(Y/N) and describe the breaking changes(e.g. more details, motivations or doc link):

  • [ ] N
  • [x] Y

I haven't handled the failing tests yet.

5. Are there test cases for these changes?(Y/N) select and add more details, references or doc links:

  • [ ] Unit test
  • [ ] Integration test
  • [ ] Benchmark (add benchmark stats below)
  • [ ] Manual test (add detailed scripts or steps below)
  • [ ] Other

shashank-iitbhu avatar Mar 13 '24 19:03 shashank-iitbhu

  • Currently, I have stored the line_content obtained by get_line_content_from_span in kclvm/parser/src/session/mod.rs into the note field of the Message struct, as seen in the into_diag function in kclvm/error/src/lib.rs.
  • This approach is causing several tests to fail, but I anticipate these issues will be resolved once I implement the suggested changes.
  • This method is experimental, as introducing a new field in the Message struct would require extensive changes throughout the codebase.

The quick fix functionality is working fine with this approach.

@Peefy, how should I proceed with this? Should I introduce a new field line_content in the Message struct and make the required changes to the codebase?

cc @He1pa

shashank-iitbhu avatar Mar 13 '24 19:03 shashank-iitbhu

I think this should be done through parser specific error generation expression instead of passing code snippets to LSP and then handing them over to LSP for judgment. cc @He1pa Can you help review this PR?

Besides, I think we need to first clarify the list of syntax error recovery and the specific syntax recovery framework. cc @He1pa @shashank-iitbhu

Peefy avatar Mar 14 '24 02:03 Peefy

some advices:

  • I think you should sort out all the syntax errors first (you can search for struct_span_error) and do some classification. For example, what can be fixed, code snippets before and after fix, and what cannot be fixed. You can put it at https://github.com/kcl-lang/kcl/issues/1125. This also makes it easier for us to track progress.
  • quick fix should be a basic capability of the compiler for use by other tools (LSP, Lint). Therefore, error handling and fix solutions should be generated during the compilation phase, that is, when struct_span_error() generates an error or when some node (such as a complete stmt) is parsed, a fix solution should be generated, instead of in LSP.
  • Because of the previous point, line_content is also unnecessary. You just need to store the fix results into suggested_replacement
  • fn add_syntax_error() seems to be unused, consider deleting it

He1pa avatar Mar 14 '24 07:03 He1pa

some advices:

  • I think you should sort out all the syntax errors first (you can search for struct_span_error) and do some classification. For example, what can be fixed, code snippets before and after fix, and what cannot be fixed. You can put it at [LFX] Provide Quick Fix for InvalidSyntaxError #1125. This also makes it easier for us to track progress.
  • quick fix should be a basic capability of the compiler for use by other tools (LSP, Lint). Therefore, error handling and fix solutions should be generated during the compilation phase, that is, when struct_span_error() generates an error or when some node (such as a complete stmt) is parsed, a fix solution should be generated, instead of in LSP.
  • Because of the previous point, line_content is also unnecessary. You just need to store the fix results into suggested_replacement
  • fn add_syntax_error() seems to be unused, consider deleting it

Thank you for the feedback! @He1pa I was already working on documenting the errors and had a draft ready. I just opened this PR to get a hands-on feel for handling one specific error.

  • Regarding your suggestion to sort out all the syntax errors and do some classification, I'm working on it. Besides struct_span_error, I've also noticed struct_token_error for parse errors. Are there any other specific functions or areas I should look into to ensure I don't miss anything for Parse errors?
  • I also understand that quick fix should be a basic capability of the compiler for use by other tools (LSP, Lint).
  • I'll work on generating error handling and fix solutions during the compilation phase and store the fix results into suggested_replacement as you suggested.

shashank-iitbhu avatar Mar 14 '24 20:03 shashank-iitbhu

I think this should be done through parser specific error generation expression instead of passing code snippets to LSP and then handing them over to LSP for judgment. cc @He1pa Can you help review this PR?

Besides, I think we need to first clarify the list of syntax error recovery and the specific syntax recovery framework. cc @He1pa @shashank-iitbhu

Regarding this: I was able to generate the line content for this specific error using let code_snippet = CodeSnippet::new(span, Arc::clone(&sess.sm));:

                let mut sb = StyledBuffer::<DiagnosticStyle>::new();
                let mut errs = vec![];
                let code_snippet = CodeSnippet::new(span, Arc::clone(&sess.sm));
                code_snippet.format(&mut sb, &mut errs);
                let code_line = extract_code_line(&sb);

had to write a separate function to extract the code line from styled buffer:

// Extract code line from styled buffer
fn extract_code_line(sb: &StyledBuffer<DiagnosticStyle>) -> String {
    let rendered_lines = sb.render();
    if let Some(line) = rendered_lines.get(0) {
        if let Some(code_snippet) = line.get(1) {
            code_snippet.text.clone()
        } else {
            String::new()
        }
    } else {
        String::new()
    }
}

This was the output of sb.render i.e rendered_lines (for debugging purposes):

[
    [
        StyledString {
            text: " --> /Users/shashankmittal/Documents/Developer/lfx/testkcl/test.k:5:6\n  | \n5 | ",
            style: Some(Url)
        },
        StyledString {
            text: "a,b=1,2\n",
            style: None
        },
        StyledString {
            text: "  | ",
            style: Some(Url)
        },
        StyledString {
            text: "     ",
            style: None
        },
        StyledString {
            text: "^",
            style: Some(NeedFix)
        }
    ]
]

shashank-iitbhu avatar Mar 14 '24 20:03 shashank-iitbhu

some advices:

  • I think you should sort out all the syntax errors first (you can search for struct_span_error) and do some classification. For example, what can be fixed, code snippets before and after fix, and what cannot be fixed. You can put it at [LFX] Provide Quick Fix for InvalidSyntaxError #1125. This also makes it easier for us to track progress.
  • quick fix should be a basic capability of the compiler for use by other tools (LSP, Lint). Therefore, error handling and fix solutions should be generated during the compilation phase, that is, when struct_span_error() generates an error or when some node (such as a complete stmt) is parsed, a fix solution should be generated, instead of in LSP.
  • Because of the previous point, line_content is also unnecessary. You just need to store the fix results into suggested_replacement
  • fn add_syntax_error() seems to be unused, consider deleting it

Thank you for the feedback! @He1pa I was already working on documenting the errors and had a draft ready. I just opened this PR to get a hands-on feel for handling one specific error.

  • Regarding your suggestion to sort out all the syntax errors and do some classification, I'm working on it. Besides struct_span_error, I've also noticed struct_token_error for parse errors. Are there any other specific functions or areas I should look into to ensure I don't miss anything for Parse errors?
  • I also understand that quick fix should be a basic capability of the compiler for use by other tools (LSP, Lint).
  • I'll work on generating error handling and fix solutions during the compilation phase and store the fix results into suggested_replacement as you suggested.

sorry,I just checked it, it should be this two functions: struct_token_error() and struct_span_error()

He1pa avatar Mar 15 '24 02:03 He1pa

I think this should be done through parser specific error generation expression instead of passing code snippets to LSP and then handing them over to LSP for judgment. cc @He1pa Can you help review this PR? Besides, I think we need to first clarify the list of syntax error recovery and the specific syntax recovery framework. cc @He1pa @shashank-iitbhu

Regarding this: I was able to generate the line content for this specific error using let code_snippet = CodeSnippet::new(span, Arc::clone(&sess.sm));:

                let mut sb = StyledBuffer::<DiagnosticStyle>::new();
                let mut errs = vec![];
                let code_snippet = CodeSnippet::new(span, Arc::clone(&sess.sm));
                code_snippet.format(&mut sb, &mut errs);
                let code_line = extract_code_line(&sb);

had to write a separate function to extract the code line from styled buffer:

// Extract code line from styled buffer
fn extract_code_line(sb: &StyledBuffer<DiagnosticStyle>) -> String {
    let rendered_lines = sb.render();
    if let Some(line) = rendered_lines.get(0) {
        if let Some(code_snippet) = line.get(1) {
            code_snippet.text.clone()
        } else {
            String::new()
        }
    } else {
        String::new()
    }
}

This was the output of sb.render i.e rendered_lines (for debugging purposes):

[
    [
        StyledString {
            text: " --> /Users/shashankmittal/Documents/Developer/lfx/testkcl/test.k:5:6\n  | \n5 | ",
            style: Some(Url)
        },
        StyledString {
            text: "a,b=1,2\n",
            style: None
        },
        StyledString {
            text: "  | ",
            style: Some(Url)
        },
        StyledString {
            text: "     ",
            style: None
        },
        StyledString {
            text: "^",
            style: Some(NeedFix)
        }
    ]
]

The rendered_lines is not source code in kcl files. It has been rendered as an error report, e.g. image so you can see --> , | and ^ If you just want to get source code for analyze, maybe you can refer to this image

He1pa avatar Mar 15 '24 03:03 He1pa

This is how i was getting the line_content at the first place here:

fn get_line_content_from_span(&self, span: Span) -> String {
        let source_file = self.0.sm.lookup_source_file(span.lo());
        let line_index = source_file.lookup_line(span.lo()).unwrap();

But later @Peefy mentioned that we can use CodeSnippet, that's why looked into it.

shashank-iitbhu avatar Mar 15 '24 09:03 shashank-iitbhu

https://github.com/kcl-lang/kcl/issues/1125#issue-2179525939 need some reviews here.

shashank-iitbhu avatar Mar 15 '24 09:03 shashank-iitbhu

Hello @shashank-iitbhu

Kind reminder, LXF 1 Term is one week away, are you still working on this PR?

Peefy avatar May 16 '24 07:05 Peefy

Hello @shashank-iitbhu

Kind reminder, LXF 1 Term is one week away, are you still working on this PR?

Yes, I am still working on this. Will complete the work before final evaluation.

shashank-iitbhu avatar May 16 '24 07:05 shashank-iitbhu

Made some changes in the latest commit, and will let you know when it is ready for review.

shashank-iitbhu avatar May 20 '24 21:05 shashank-iitbhu

https://github.com/kcl-lang/kcl/assets/134289475/995554b7-2d77-4a6f-9274-49dc1053b677

https://github.com/kcl-lang/kcl/assets/134289475/85b1faf0-fcd1-49cb-b0e4-b1295c32e11f

@Peefy need some reviews whenever you get time.

shashank-iitbhu avatar May 22 '24 15:05 shashank-iitbhu

Pull Request Test Coverage Report for Build 9194381538

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 139 of 205 (67.8%) changed or added relevant lines in 6 files are covered.
  • 974 unchanged lines in 31 files lost coverage.
  • Overall coverage decreased (-0.01%) to 71.052%

Changes Missing Coverage Covered Lines Changed/Added Lines %
kclvm/parser/src/session/mod.rs 68 75 90.67%
kclvm/parser/src/lexer/mod.rs 43 52 82.69%
kclvm/error/src/lib.rs 12 22 54.55%
kclvm/tools/src/LSP/src/quick_fix.rs 0 40 0.0%
<!-- Total: 139 205
Files with Coverage Reduction New Missed Lines %
kclvm/parser/src/file_graph.rs 1 97.06%
kclvm/tools/src/LSP/src/quick_fix.rs 1 37.67%
kclvm/ast/src/token.rs 1 58.9%
kclvm/error/src/lib.rs 1 64.71%
kclvm/parser/src/parser/stmt.rs 2 95.29%
kclvm/ast/src/ast.rs 2 83.51%
kclvm/query/src/lib.rs 2 92.86%
kclvm/runtime/src/value/api.rs 2 0.31%
kclvm/evaluator/src/context.rs 3 81.46%
kclvm/parser/src/parser/module.rs 3 93.48%
<!-- Total: 974
Totals Coverage Status
Change from base Build 9124126156: -0.01%
Covered Lines: 54004
Relevant Lines: 76006

💛 - Coveralls

coveralls avatar May 24 '24 04:05 coveralls

@shashank-iitbhu Hello, I've added some comments. @He1pa Could you please add more comments.

Peefy avatar May 24 '24 05:05 Peefy

https://github.com/kcl-lang/kcl/assets/134289475/3c65be78-0ff5-44d5-bdda-22a65a052b53

Will be adding more to this PR.

shashank-iitbhu avatar May 24 '24 14:05 shashank-iitbhu

https://github.com/kcl-lang/kcl/assets/134289475/3c65be78-0ff5-44d5-bdda-22a65a052b53

Will be adding more to this PR.

Thank you. Besides, I think the position information in the quick fix msg can be ommitted.

Peefy avatar May 24 '24 14:05 Peefy

@shashank-iitbhu I've merged this PR, there are some issues cc @He1pa

  1. 'a = 1;' will be modified as a = 1 instead of a = 1, it may have a space.
  2. 'a = !1' will be modified as a = not !1 instead of a = not 1
  3. Please add constant definitions for these error messages.

Peefy avatar May 26 '24 06:05 Peefy