parser
Feature Request: recover
Hey, I'd like to propose adding the ability to recover from a failed parser. This would be similar to oneOf, but it would let you capture the context/problem of the parser that failed.
TLDR: Add a function to recover from a parser that looks like:
    recover :
        (context -> problem -> value)
        -> Parser context problem value
        -> Parser context problem value
Use case:
My particular use case for such a feature is writing an error-tolerant Elm parser. For example, say we want to parse import MyModule exposing (hello, $myInvalidValue$, World(..)). In this case, we want to capture the context/problem of the invalid value while still capturing the other valid values (like the fact that it's importing from MyModule and exposing hello and World(..)). To achieve this, there are two options that I see: either capture this data as state in the parser (like how column, row, and indent are stored, kind of like a warning in the Elm compiler) and keep parsing, or store the context/problem as successfully parsed data.
The former is tricky, because extending Parser to hold that state would require exposing its constructor, which makes changing the internals of this library more likely to be a breaking change. I also can't re-implement this parser and add this feature outside of the elm GitHub organization, because it uses infix operators and a kernel module to make it faster. For the infix operators, I could use named functions and pipelines, but I see no solution for the kernel module.
The second option, which is the feature proposed here, would be to add a recover function. We'll use the string import MyModule exposing (hello, $myInvalidValue$, World(..)) as an example.
If we structured our data for the import statement like this:
    -- Assumes Parser.Advanced is imported under the name Parser
    import Parser.Advanced as Parser exposing ((|.), (|=))

    type alias Parser value =
        Parser.Parser Context Problem value

    type Problem = ...

    type Context = ...

    type ModuleImport
        = ModuleImport ModuleName ExposingList

    type ModuleName = ...

    type ExposingList
        = ExposingExplicit (List ExposedValue)
        | ExposingAll

    type ExposedValue
        = ExposedValue ...
        | ExposedConstructor ...
        | ...
We can parse the import statement like this:
    moduleImport : Parser ModuleImport
    moduleImport =
        Parser.succeed ModuleImport
            |. ...
            |= moduleName
            |. ...
            |= exposingList

    moduleName : Parser ModuleName
    moduleName = ...

    exposingList : Parser ExposingList
    exposingList =
        Parser.map ExposingExplicit
            (Parser.sequence
                { start = Parser.token "("
                , end = Parser.token ")"
                , item = exposingValue
                , spaces = ...
                , trailing = Parser.Optional
                }
            )

    exposingValue : Parser ExposedValue
    exposingValue = ...
exposingList will parse the exposed items in a list, but if one item fails then the whole parser fails. We could make exposingValue optional like so:
    exposingList : Parser ExposingList
    exposingList =
        Parser.map ExposingExplicit
            (Parser.sequence
                { ...
                , item =
                    Parser.oneOf
                        [ Parser.map Ok exposingValue
                        , Parser.succeed (Err "Invalid list item")
                            -- Skip ahead to the next list item
                            |. Parser.chompWhile (\c -> c /= ',' && c /= ')')
                        ]
                }
            )
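For reference, here is a minimal, self-contained sketch of this workaround. It uses the simple Parser module rather than Parser.Advanced, and it parses plain lowercase names instead of full exposed values. The names Item, lowerName, item and exposedItems are hypothetical and only for illustration. The fallback branch chomps at least one character first, so an empty list does not produce a spurious error item.

```elm
module ExposingSketch exposing (Item, exposedItems)

import Parser exposing ((|.), Parser)
import Set


-- Each list item is either a parsed name or a marker for skipped input.
type alias Item =
    Result String String


-- Hypothetical stand-in for exposingValue: a lowercase identifier.
lowerName : Parser String
lowerName =
    Parser.variable
        { start = Char.isLower
        , inner = Char.isAlphaNum
        , reserved = Set.empty
        }


item : Parser Item
item =
    Parser.oneOf
        [ Parser.map Ok lowerName
        , Parser.succeed (Err "Invalid list item")
            -- Require at least one character, then skip to the next ',' or ')'.
            |. Parser.chompIf (\c -> c /= ',' && c /= ')')
            |. Parser.chompWhile (\c -> c /= ',' && c /= ')')
        ]


exposedItems : Parser (List Item)
exposedItems =
    Parser.sequence
        { start = "("
        , separator = ","
        , end = ")"
        , spaces = Parser.spaces
        , item = item
        , trailing = Parser.Forbidden
        }
```

Note that oneOf only tries the fallback branch when the first branch fails without consuming input. If the item parser can fail partway through an item, it would likely need to be wrapped in Parser.backtrackable for the fallback to apply.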
And this works: if there's an invalid value, we skip it and move on to the next one. However, this loses the context/problem of the failed parser. The function I'm proposing would have the type signature:
    recover :
        (context -> problem -> value)
        -> Parser context problem value
        -> Parser context problem value
With this, we could rewrite exposingList and extend ExposedValue to recover from the failure and transform the context/problem into a successfully parsed value.
    type ExposedValue
        = ...
        | ExposedValueProblem Context Problem

    exposingList : Parser ExposingList
    exposingList =
        Parser.map ExposingExplicit
            (Parser.sequence
                { ...
                , item =
                    Parser.recover (\context problem -> ExposedValueProblem context problem)
                        exposingValue
                }
            )
Now we capture the reason that the value failed while continuing to parse the other values! I understand that this use case is pretty specific; however, I think that the ability to recover from a parser could be helpful in other cases beyond this one.
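To make the intended semantics a bit more concrete, here is a toy model of recover. It is not how elm/parser is implemented. It assumes a simplified Parser type (a function from the remaining input to either a failure or a value plus the rest of the input) and a failure that carries a single context and problem, whereas the real library tracks a context stack and a list of dead ends. This sketch also does not decide how much input to skip after a failure, which the proposal would still need to settle.

```elm
module RecoverSketch exposing (Parser(..), recover, run)

-- Toy parser: takes the remaining input and returns either a failure
-- (context, problem) or a value together with the unconsumed input.
type Parser context problem value
    = Parser (String -> Result ( context, problem ) ( value, String ))


run : Parser context problem value -> String -> Result ( context, problem ) ( value, String )
run (Parser parse) input =
    parse input


recover :
    (context -> problem -> value)
    -> Parser context problem value
    -> Parser context problem value
recover toValue (Parser parse) =
    Parser
        (\input ->
            case parse input of
                Ok success ->
                    Ok success

                Err ( context, problem ) ->
                    -- Turn the failure into a value. In this toy model
                    -- no input is consumed when recovering.
                    Ok ( toValue context problem, input )
        )
```

In this model a recovered parser never fails, so the surrounding sequence keeps going and the failure shows up as an ordinary value, such as ExposedValueProblem in the example above.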
I'm sorry if this issue is a bit wordy; I thought it would be best to lay out a clear and specific example of this feature and how it would be helpful. If there is a different way to solve this problem that I'm not seeing, please let me know! I'm super willing to PR this feature, but wanted to get feedback before doing so!
Some discussion on recovery here:
https://discourse.elm-lang.org/t/parsers-with-error-recovery/6262/15
I'm making a new package for it here:
https://github.com/the-sett/parser-recoverable