parsec
identifiers in Text.Parsec.Language not polymorphic enough
We have:
emptyDef :: LanguageDef st
type LanguageDef st = GenLanguageDef String st Identity
makeTokenParser :: Stream s m Char => GenLanguageDef s u m -> GenTokenParser s u m
This means that these definitions can only be used for String parsers, and the following is ill-typed (whatever the contents of { ... }):
lexer :: GenTokenParser ByteString () Identity
lexer = makeTokenParser $ emptyDef { ... }
I was also hit by this when updating an old library using String to use Text. Would be great to have this.
Why not open a PR fixing it?
Well, there are some design choices, e.g.:
- should LanguageDef get another type parameter? (this breaks code)
- should there be some LanguageDef' with the more general type? (but then what should it be called?)
- should emptyDef (etc.) get the more general type, or should it be emptyDef'?
- or should the generalized things keep their names, but go into another module (which?)
Here's one way to fix this:
- Keep the current LanguageDef and TokenParser definitions (or deprecate them eventually if they appear not to be useful anymore).
- Generalize the exports of Text.Parsec.Language to GenLanguageDef s u m or GenTokenParser s u m.
This should allow at least some users to upgrade without needing to update their code.
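A minimal sketch of what such a generalization could look like, written against the existing Text.Parsec.Token record. The names genEmptyDef, stringDef and textLexer are hypothetical and not part of parsec; the field values are meant to mirror emptyDef and should be checked against the parsec source before relying on them.

```haskell
{-# LANGUAGE FlexibleContexts #-}

import Data.Functor.Identity (Identity)
import Data.Text (Text)
import qualified Data.Text as T
import Text.Parsec
import Text.Parsec.Token

-- Hypothetical stream-polymorphic counterpart of emptyDef.
-- It works for any input type that has a Stream instance with Char tokens.
genEmptyDef :: Stream s m Char => GenLanguageDef s u m
genEmptyDef = LanguageDef
  { commentStart    = ""
  , commentEnd      = ""
  , commentLine     = ""
  , nestedComments  = True
  , identStart      = letter <|> char '_'
  , identLetter     = alphaNum <|> oneOf "_'"
  , opStart         = oneOf ":!#$%&*+./<=>?@\\^|-~"
  , opLetter        = oneOf ":!#$%&*+./<=>?@\\^|-~"
  , reservedOpNames = []
  , reservedNames   = []
  , caseSensitive   = True
  }

-- Existing String-based code keeps type-checking: the general definition
-- specializes to the current LanguageDef synonym.
stringDef :: LanguageDef st
stringDef = genEmptyDef

-- The same definition can now drive a lexer over Text input.
textLexer :: GenTokenParser Text () Identity
textLexer = makeTokenParser genEmptyDef
```

With this in scope, something like `parse (identifier textLexer) "" (T.pack "foo bar")` runs directly on Text input, and String-based callers can keep using the specialized definition unchanged.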
OTOH, considering that there are several other problems with Text.Parsec.Language (#89, #93), it might make more sense to develop the current contents of Text.Parsec.Language in a separate package to avoid friction with parsec's otherwise slowish churn rate.
What do you think, @hvr?
Also, @mrkkrp, is there any megaparsec-based code that corresponds to (some of) Text.Parsec.Language?
See the lexer modules, that’s the closest you get. But I think they are written from scratch; I didn’t like Parsec’s take on this.
Adding a specific use case where this caused problems for me. I ran into this issue trying to use the Language module to parse strings with s-cargot. cc @aisamanra I'm not sure if there is a "better" way I should be doing this. Specifically, the following code doesn't compile because of a String/Text mismatch:
data Atom = AString T.Text deriving (Show, Eq)
-- s-cargot parser
myParser = mkParser parseString
parseString :: Parser Atom
parseString = AString . T.pack <$> stringLiteral lexer
-- `String` here needs to be `Text`, but that doesn't match the return type of `makeTokenParser` so doesn't compile.
lexer :: GenTokenParser String u Identity
lexer = makeTokenParser haskellDef
Sorry I'm not going to be much use coming up with a solution, but thanks to everyone who has spent brain cycles on this! :)
@mrkkrp searching for "haskell text.parsec lexer modules" didn't uncover anything that looks obviously relevant. Do you have a specific link in mind?
@xaviershay See Megaparsec.
ah thanks. I wonder if that's API compatible with s-cargot? Will have a play. I mean, I could just implement sexpr parsing myself (or presumably copy an existing Megaparsec implementation), shouldn't be too difficult.