Earley
Capture result as well as matched tokens
It would be nice to be able to capture the result of a production as well as the text matched by it, for example:
match :: Prod r e t a -> Prod r e t ([t], a)
Ignoring the NonTerminal case, it could look like this:
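{-# LANGUAGE GADTs, LambdaCase, TupleSections, ViewPatterns #-}
import Data.Biapplicative (biliftA2) -- from the bifunctors package
import Text.Earley.Grammar (Prod (..))

match :: Prod r e t a -> Prod r e t ([t], a)
match = \case
  Pure a -> Pure ([], a)
  -- The continuation c is parsed after the terminal, so t goes in front.
  Terminal p c ->
    Terminal (\t -> (t, ) <$> p t) (biliftA2 (flip (:)) id <$> match c)
  -- NonTerminal :: !(r e t a) -> !(Prod r e t (a -> b)) -> Prod r e t b
  NonTerminal _ _ -> error "match: NonTerminal"
  Alts as c ->
    -- Alts (match <$> as) (((\(ts, f) (t, x) -> (t ++ ts, f x))) <$> match c)
    Alts (match <$> as) (biliftA2 (flip (++)) id <$> match c)
  -- ts: tokens of each repetition; t: tokens matched by the continuation.
  Many a c -> Many
    (match a)
    ((\(t, f) (unzip -> (ts, xs)) -> (concat (ts ++ [t]), f xs)) <$> match c)
  Named p n -> Named (match p) n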
Edit: corrected implementation
That sounds like a useful feature. A PR would be welcome.
An alternative approach might be to add a primitive
position :: Prod r e t Int
which always succeeds, giving you the current position in the input string.
You could then get the position before and after a production and use that to get the parsed slice of the input string.
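To make the position idea concrete, here is a minimal sketch using a toy list parser rather than Earley itself. Everything in it is an assumption for illustration: `Parser`, `position`, `satisfy`, `manyP`, `spanned`, `slice` and `digitCount` are hypothetical names, and the library has no `position` primitive today. It only shows how recording the position before and after a production lets you recover the matched slice from the original input.

```haskell
import Data.Char (isDigit)

-- Toy parser (NOT Earley's Prod): threads the number of tokens consumed
-- so far alongside the remaining input.
newtype Parser t a = Parser
  { runParser :: Int -> [t] -> Maybe (a, Int, [t]) }

instance Functor (Parser t) where
  fmap f (Parser p) = Parser $ \pos ts -> case p pos ts of
    Nothing             -> Nothing
    Just (a, pos', ts') -> Just (f a, pos', ts')

instance Applicative (Parser t) where
  pure a = Parser $ \pos ts -> Just (a, pos, ts)
  Parser pf <*> Parser pa = Parser $ \pos ts -> case pf pos ts of
    Nothing             -> Nothing
    Just (f, pos', ts') -> case pa pos' ts' of
      Nothing               -> Nothing
      Just (a, pos'', ts'') -> Just (f a, pos'', ts'')

-- Hypothetical primitive: always succeeds and consumes nothing,
-- returning the current position in the input.
position :: Parser t Int
position = Parser $ \pos ts -> Just (pos, pos, ts)

-- Consume one token if it satisfies the predicate.
satisfy :: (t -> Bool) -> Parser t t
satisfy ok = Parser $ \pos ts -> case ts of
  (t : rest) | ok t -> Just (t, pos + 1, rest)
  _                 -> Nothing

-- Greedy repetition; stops when p fails or consumes nothing.
manyP :: Parser t a -> Parser t [a]
manyP p = Parser go
  where
    go pos ts = case runParser p pos ts of
      Just (a, pos', ts') | pos' > pos ->
        fmap (\(as, end, rest) -> (a : as, end, rest)) (go pos' ts')
      _ -> Just ([], pos, ts)

-- Record the position before and after running p.
spanned :: Parser t a -> Parser t ((Int, Int), a)
spanned p =
  (\start a end -> ((start, end), a)) <$> position <*> p <*> position

-- Recover the matched tokens from the original input.
slice :: (Int, Int) -> [t] -> [t]
slice (start, end) = take (end - start) . drop start

-- Example production: skip a leading 'a', then count the digits that
-- follow while recording the span they cover.
digitCount :: Parser Char ((Int, Int), Int)
digitCount = satisfy (== 'a') *> spanned (length <$> manyP (satisfy isDigit))
```

Running `runParser digitCount 0 "a123x"` yields the span `(1, 4)` together with the result `3`, and `slice (1, 4) "a123x"` gives back `"123"`. The trade-off compared with `match` is that the caller has to keep the original input around to do the slicing.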