feat: support better error reporting
Motivation
See the issue and the comment.
Currently, when parsing fails, the error messages are sometimes not very informative (see the issue comment for examples). This PR mostly changes parser_state.rs and error.rs so that they can:
- Store information about expected tokens
- Store rule call stacks (from which users can later build custom error messages)
Main changes
Currently, when a parent rule contains a sequence of rules, chars, or strings (case-sensitive and case-insensitive), we only track the farthest rule. Pest stores it and its position in pos_attempts and attempt_pos, but ignores whether strings and chars were parsed successfully.
This MR mostly adds logic inside the rule, match_string and match_insensitive function calls to support tracking of expected tokens.
Moreover, logic for tracking rule calls is added (I'd call it fake tracking). Imagine parsing has failed at position x of the input string, e.g. rule_1, rule_2 and rule_3 all failed at the same farthest position. This MR also tracks the top-most parent of those rules. The idea is that with such a (parent, deepest_failed_rule) pair we could generate more informative messages.
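To make the idea concrete, here is a minimal sketch of farthest-failure tracking. It is not the code from this MR and not pest's actual ParserState: the `AttemptTracker` type, its fields, and its method names are all hypothetical, and rules and tokens are plain strings for brevity.

```rust
/// Hypothetical, simplified tracker (an assumption for illustration,
/// not pest's real `ParserState`).
#[derive(Debug, Default)]
pub struct AttemptTracker {
    /// Farthest input position at which any attempt has failed so far.
    farthest_pos: usize,
    /// Tokens (strings/chars) that were expected at `farthest_pos`.
    expected_tokens: Vec<String>,
    /// (top-most parent, deepest failed rule) pairs seen at `farthest_pos`.
    call_stacks: Vec<(String, String)>,
}

impl AttemptTracker {
    /// Moves the farthest position forward if needed, discarding older
    /// attempts, and reports whether `pos` is the farthest position.
    fn reaches_farthest(&mut self, pos: usize) -> bool {
        if pos > self.farthest_pos {
            self.farthest_pos = pos;
            self.expected_tokens.clear();
            self.call_stacks.clear();
        }
        pos == self.farthest_pos
    }

    /// Records a string/char that failed to match at `pos`.
    pub fn token_failed(&mut self, pos: usize, token: &str) {
        if self.reaches_farthest(pos)
            && !self.expected_tokens.iter().any(|t| t.as_str() == token)
        {
            self.expected_tokens.push(token.to_string());
        }
    }

    /// Records a rule that failed at `pos`, together with its top-most parent.
    pub fn rule_failed(&mut self, pos: usize, parent: &str, rule: &str) {
        if self.reaches_farthest(pos) {
            self.call_stacks.push((parent.to_string(), rule.to_string()));
        }
    }

    /// Builds a message from the attempts recorded at the farthest position.
    pub fn message(&self) -> String {
        let rules: Vec<String> = self
            .call_stacks
            .iter()
            .map(|(parent, rule)| format!("{} -> {}", parent, rule))
            .collect();
        format!(
            "position {}: expected one of [{}] while parsing [{}]",
            self.farthest_pos,
            self.expected_tokens.join(", "),
            rules.join(", ")
        )
    }
}
```

Failures reported at earlier positions are ignored, and a failure at a later position discards everything collected so far, so the message only ever describes the farthest point the parser reached.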
Summary by CodeRabbit
- New Features
  - Introduced a comprehensive SQL grammar parser for enhanced SQL command parsing capabilities.
  - Added detailed error reporting to assist with SQL syntax and parsing errors.
- Enhancements
  - Improved internal parsing state management for better tracking of parsing attempts and errors.
- Documentation
  - Updated test cases to reflect new error handling features.
Walkthrough
The changes encompass the introduction of an SQL parsing module with advanced error handling capabilities. These improvements enhance error diagnostics for SQL parsing, boosting the reliability and resilience of the parser.
Changes
| File Path | Change Summary |
|---|---|
| .../sql.pest, .../src/lib.rs | Added SQL parsing module with error handling functions and updated test functions. |
| pest/src/error.rs | Enhanced error tracking with parse_attempts in ParsingError and updated test cases. |
| pest/src/parser_state.rs | Expanded types for tracking parsing attempts; ParserState updated with new fields and methods. |
@coderabbitai review
Update: it's been a long time since my last comment, and I'm sorry. I'm currently trying to optimize SQL expression parsing using a Pratt parser. After finishing that task, I hope to return to this MR.
P.S. Thanks for the comments!
the SQL grammar looks ok, maybe it can be added in a separate PR?
the better error reporting change is good, but it'd be a semver-breaking change... not sure if there's a way to add it in a semver-compatible way? if not, maybe we can put it under a feature-guard?
@tomtau, I am currently reworking the changes I previously suggested in this MR. The main stumbling block was that they broke semver. Could you please advise whether the solution below is more suitable and whether it is semver-compatible?
Previously semver was broken because I added a new parse_attempts field to the public ErrorVariant::ParsingError enum variant. If people were matching it with ErrorVariant::ParsingError { positives, negatives }, their code would break after the update because of the new field.
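For readers who want to see the breakage, here is a small sketch. The enum below is a simplified stand-in for pest's ErrorVariant (the real one is generic over the rule type), and `describe` is a hypothetical piece of downstream code.

```rust
/// Simplified stand-in for pest's `ErrorVariant`; strings replace the
/// generic rule type to keep the sketch self-contained.
pub enum ErrorVariant {
    ParsingError {
        positives: Vec<String>,
        negatives: Vec<String>,
    },
    CustomError {
        message: String,
    },
}

/// Hypothetical downstream code that names every field of the variant.
pub fn describe(variant: &ErrorVariant) -> String {
    match variant {
        // If a third field (e.g. `parse_attempts`) were added to this
        // variant, this pattern would stop compiling with E0027
        // ("pattern does not mention field"), because it has no `..`.
        ErrorVariant::ParsingError { positives, negatives } => format!(
            "expected {} positive(s), {} negative(s)",
            positives.len(),
            negatives.len()
        ),
        ErrorVariant::CustomError { message } => message.clone(),
    }
}
```

Code that constructs the variant with a struct-like literal would break in the same way, since the new field would be missing from the literal.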
What if I add PRIVATE parse_attempts: Option<ParseAttempts<R>> field into the Error struct and add new public method pub fn get_parse_attempts(&self) -> Option<ParseAttempts<R>> to it? It seems that any users' code wouldn't be broken after this patch. Seems like the only thing I'd have to fix is internal tests (e.g. in meta/src).
I've taken into account your comment about not making mod parser_state public and tested the changes proposed above; everything works well.
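A minimal sketch of the proposed shape, under stated assumptions: the `Error` and `ParseAttempts` types below are simplified stand-ins (non-generic, with a single `message` field replacing the existing public fields), and `new` and `with_parse_attempts` are hypothetical helpers. Only the private `parse_attempts` field and the `get_parse_attempts` getter come from the proposal above.

```rust
/// Hypothetical, simplified shape of the collected attempts.
#[derive(Debug, Clone, PartialEq)]
pub struct ParseAttempts {
    pub expected_tokens: Vec<String>,
}

/// Simplified stand-in for the error struct.
pub struct Error {
    /// Stand-in for the struct's existing public fields.
    pub message: String,
    /// New private field: invisible to downstream patterns and literals.
    parse_attempts: Option<ParseAttempts>,
}

impl Error {
    /// Hypothetical constructor; errors start without recorded attempts.
    pub fn new(message: impl Into<String>) -> Self {
        Error {
            message: message.into(),
            parse_attempts: None,
        }
    }

    /// Crate-internal setter (hypothetical), so only the parser can attach attempts.
    pub(crate) fn with_parse_attempts(mut self, attempts: ParseAttempts) -> Self {
        self.parse_attempts = Some(attempts);
        self
    }

    /// Public accessor from the proposal; returns a clone of the stored attempts.
    pub fn get_parse_attempts(&self) -> Option<ParseAttempts> {
        self.parse_attempts.clone()
    }
}
```

One caveat: adding a private field is only non-breaking if downstream crates cannot already build the struct with a struct literal, for example because it already has private fields. Otherwise existing literals would stop compiling.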
> What if I add PRIVATE parse_attempts: Option<ParseAttempts<R>> field into the Error struct and add new public method pub fn get_parse_attempts(&self) -> Option<ParseAttempts<R>> to it? It seems that any users' code wouldn't be broken after this patch. Seems like the only thing I'd have to fix is internal tests (e.g. in meta/src).
yes, I guess that could work
Thanks for the reply! I'll update this PR with the latest changes in the coming days.