brittany
Support extension UnicodeSyntax
brittany discards Unicode symbols that GHC supports. For example, it will format something like
f :: A → B → C
as
f :: A -> B -> C
I don't use that extension, but it would certainly be nice to have it supported in brittany.
Related additional features might also include some flags/commands to refactor/convert between unicode and non-unicode versions. (Current behaviour is like const False; you ask for id and we might want to allow the user to choose between those two and const True applied to "unicodeness".) I am not sure how much unicode syntax there is or how consistently its users apply it.
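The three options above can be sketched as a small policy type. This is only an illustration: UnicodePolicy, its constructors, and applyPolicy are hypothetical names, not anything that exists in brittany.

```haskell
-- Hypothetical names throughout; none of this is brittany's actual API.

-- | How the formatter should treat the "unicodeness" of a syntax token.
data UnicodePolicy
  = ForceAscii    -- ^ like @const False@: always emit ASCII (current behaviour)
  | Preserve      -- ^ like @id@: keep whatever the source used
  | ForceUnicode  -- ^ like @const True@: always emit Unicode
  deriving (Eq, Show)

-- | Turn a policy into a function on "was this token Unicode in the source?".
-- The result says whether the Unicode form should be written out.
applyPolicy :: UnicodePolicy -> Bool -> Bool
applyPolicy ForceAscii   = const False
applyPolicy Preserve     = id
applyPolicy ForceUnicode = const True
```

A flag or config entry could then select one of the three constructors, with Preserve being the behaviour requested in this issue.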
I agree - it would be nice if the "Unicoding" could be automatically done by brittany.
As for "how much," brittany already keeps Unicode characters for a bunch of things. My guess would be that it is currently screwing up the ones in the UnicodeSyntax extension list, probably because those are the only ones hardcoded in GHC itself (as opposed to being provided by other packages).
So I just wrote a patch that makes brittany automatically transform UnicodeSyntax tokens into their Unicode equivalent whenever -XUnicodeSyntax is passed as a GHC option, which I think is reasonable. I'll think about a pull request - it requires some paperwork at work.
Now for a little bit more detail, in case those who are more familiar with the code have better ideas: the problem is that brittany, when building function signatures, expressions, and such, writes down hardcoded strings for the following tokens: ::, ->, forall, <-, and =>. It does that in various places in the following files:
- src/Language/Haskell/Brittany/Internal/Layouters/Decl.hs
- src/Language/Haskell/Brittany/Internal/Layouters/Expr.hs
- src/Language/Haskell/Brittany/Internal/Layouters/Pattern.hs
- src/Language/Haskell/Brittany/Internal/Layouters/Stmt.hs
- src/Language/Haskell/Brittany/Internal/Layouters/Type.hs
Changing the hardcoded strings to parameterized ones is not a big deal: it took me about an hour, and that was the first time I saw the code.
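As a rough idea of what "parameterized" could mean here, the five tokens can be looked up through one function instead of being written as string literals in each layouter. This is a minimal sketch and not the patch itself: SyntaxToken and renderToken are hypothetical names, and the Bool argument is assumed to come from whether -XUnicodeSyntax was passed.

```haskell
-- Hypothetical sketch; the type and function names are not from brittany.

-- | The syntax tokens that the layouters currently emit as fixed strings.
data SyntaxToken = DoubleColon | RightArrow | Forall | LeftArrow | DoubleArrow
  deriving (Eq, Show, Enum, Bounded)

-- | Render a token. The first argument is True when the Unicode form
-- should be used (assumed here to mean UnicodeSyntax is enabled).
renderToken :: Bool -> SyntaxToken -> String
renderToken False tok = case tok of
  DoubleColon -> "::"
  RightArrow  -> "->"
  Forall      -> "forall"
  LeftArrow   -> "<-"
  DoubleArrow -> "=>"
renderToken True tok = case tok of
  DoubleColon -> "∷"
  RightArrow  -> "→"
  Forall      -> "∀"
  LeftArrow   -> "←"
  DoubleArrow -> "⇒"
```

Each place that currently writes a literal such as "::" would call something like renderToken useUnicode DoubleColon instead, where useUnicode is whatever flag the layouter has access to.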
But a better idea would be to keep the proper tokens. I think the AST has the information there (for example, AnnDcolonU for Unicode, versus AnnDcolon for ASCII). That requires diving into the code to an extent I probably won't find the time for. Hopefully it is not too hard for those who know the code inside-out!
But a better idea would be to keep the proper tokens. I think the AST has the information there (for example, AnnDcolonU for Unicode, versus AnnDcolon for ASCII).
Right, and brittany already has the utils to query for the presence of those annotations for relevant nodes (via hasAnnKeyword). For example, see this test for the AnnSimpleQuote annotation thingy.
You need to be sure to run this test on the correct node, but by using --dump-ast-full this should not be too hard to get right.
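To make the idea concrete, here is a self-contained toy model of that check. It does not use brittany's or GHC's real types: Keyword only mirrors a few constructor names from GHC's annotation keywords, NodeAnns stands in for "the annotations attached to one node", and hasKeyword is a simplified stand-in for a query like hasAnnKeyword, whose actual signature is not shown in this thread.

```haskell
-- Toy model only: these types stand in for GHC's annotation keywords and
-- brittany's annotation lookup; the real APIs look different.

-- | A few annotation keywords, named after their GHC counterparts.
-- The trailing U marks the Unicode variant of a token.
data Keyword = AnnDcolon | AnnDcolonU | AnnRarrow | AnnRarrowU
  deriving (Eq, Show)

-- | Hypothetical: the keywords recorded for a single AST node.
type NodeAnns = [Keyword]

-- | Simplified stand-in for an annotation query on one node.
hasKeyword :: Keyword -> NodeAnns -> Bool
hasKeyword = elem

-- | Pick the separator for a type signature based on what the source used.
typeSigSeparator :: NodeAnns -> String
typeSigSeparator anns
  | hasKeyword AnnDcolonU anns = "∷"
  | otherwise                  = "::"

-- | Same idea for the function arrow.
functionArrow :: NodeAnns -> String
functionArrow anns
  | hasKeyword AnnRarrowU anns = "→"
  | otherwise                  = "->"
```

The point of the sketch is that the choice is made per node from the annotations, so a file mixing both styles would be reproduced as written rather than normalised one way or the other.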
If someone would like to contribute a patch to make brittany keep the Unicode syntax (→, ∷, ←, etc.), where in the code would that someone best start?