
Source code formatting tool

Open paf31 opened this issue 9 years ago • 51 comments

paf31 avatar Feb 17 '16 17:02 paf31

I can think of a few tools that I would like to use and build that might have similar prerequisites to this one.

I'd also like it if they could be written in PureScript.

Is there some intermediate representation that could be output from the compiler that would be sufficient to achieve this? Perhaps an AST as JSON?

paulyoung avatar Mar 02 '16 12:03 paulyoung

This would probably best be done in Haskell, for two reasons: you would have access to the IR already used in psc, and it could easily be pulled into psc if it becomes stable and popular enough.

michaelficarra avatar Mar 02 '16 15:03 michaelficarra

I'm a fan of hindent: it lets you reformat individual declarations, and IMO this is better than full source formatters.

I use it when I'm writing PureScript as well, usually adding a few $ operators is enough to make code parseable with haskell-src-exts but I'd love to have a similar tool with full PureScript support.

ozanmakes avatar Mar 02 '16 20:03 ozanmakes

Good points, @michaelficarra.

paulyoung avatar Mar 04 '16 20:03 paulyoung

For what it's worth, hindent is a full source formatter.

One of the things I don't like about it is that it supports multiple styles. I think the main benefit of a formatter is that it sets the community style, like Go's gofmt or Python's PEP8. I'm a fan of Elm's style guide, which can be enforced with elm-format. I would love it if PureScript had something similar.

tfausak avatar Apr 22 '16 14:04 tfausak

how bad is an approach like this? bear in mind that it's very ugly code atm since it's only a quick proof of concept

https://gist.github.com/archaeron/27c7bdef909c626d7c1c95490243a920

archaeron avatar Apr 25 '16 19:04 archaeron

The main issue with pretty printing is inserting minimal parentheses. The AST does not indicate where they should go.
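To illustrate the problem, here is a minimal sketch of precedence-driven parenthesization. The `Expr` type, the precedence numbers, and the function names are assumptions made up for this example; they are not the PureScript compiler's AST or its pretty printer.

```haskell
-- Hypothetical expression type, not the PureScript compiler's AST.
data Expr
  = Lit Int
  | Add Expr Expr -- treated as left-associative, precedence 6
  | Mul Expr Expr -- treated as left-associative, precedence 7

-- | Print with the fewest parentheses that still preserve the tree.
-- The Int is the precedence required by the surrounding context.
pretty :: Int -> Expr -> String
pretty _ (Lit n) = show n
pretty p (Add l r) = parensIf (p > 6) (pretty 6 l ++ " + " ++ pretty 7 r)
pretty p (Mul l r) = parensIf (p > 7) (pretty 7 l ++ " * " ++ pretty 8 r)

-- | Wrap a string in parentheses only when the flag is set.
parensIf :: Bool -> String -> String
parensIf True s = "(" ++ s ++ ")"
parensIf False s = s

-- | Top-level entry point: nothing outside demands parentheses.
render :: Expr -> String
render = pretty 0
```

With these definitions, `render (Mul (Add (Lit 1) (Lit 2)) (Lit 3))` needs parentheses around the sum, while `render (Add (Lit 1) (Mul (Lit 2) (Lit 3)))` needs none. The printer has to work this out from precedence and associativity, because the tree itself carries no parentheses.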

paf31 avatar Apr 25 '16 19:04 paf31

shouldn't it insert exactly the parentheses the user wanted? i.e. not change them?

sometimes I use parentheses where I know I wouldn't need them, just to make the code clearer

archaeron avatar Apr 25 '16 19:04 archaeron

Even then, they are lost after desugaring. So this could work, but only if you used the data straight out of the parser.

Edit: oh, I see that's what you're doing, sorry 😄
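A rough sketch of that distinction, assuming a simplified AST in which the parser records user-written parentheses as an explicit node. The types and function names here are hypothetical, and `stripParens` is only a stand-in for whatever a real desugaring pass does.

```haskell
-- Hypothetical, simplified AST; real parser output is much richer.
data Expr
  = Var String
  | App Expr Expr
  | Parens Expr -- parentheses exactly as the user wrote them

-- | Print the tree as parsed: parentheses appear only where a Parens
-- node records them, so the user's choices are kept verbatim.
printExpr :: Expr -> String
printExpr (Var v) = v
printExpr (App f x) = printExpr f ++ " " ++ printExpr x
printExpr (Parens e) = "(" ++ printExpr e ++ ")"

-- | Stand-in for desugaring: once Parens nodes are dropped, this
-- naive printer can no longer reproduce the original source.
stripParens :: Expr -> Expr
stripParens (Parens e) = stripParens e
stripParens (App f x) = App (stripParens f) (stripParens x)
stripParens e = e
```

Printing `App (Var "f") (Parens (App (Var "g") (Var "x")))` directly gives back the user's parentheses, whereas printing it after `stripParens` does not, which is why a formatter along these lines would want the data straight out of the parser.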

paf31 avatar Apr 25 '16 19:04 paf31

No problem.

Do you think this approach could work? Or is it better to try a different one? Totally legit to say it won't work :)

Deciding that you could simplify the parens in your code is probably best left to a linter.

archaeron avatar Apr 25 '16 20:04 archaeron

I think it can definitely work.

paf31 avatar Apr 25 '16 20:04 paf31

Looks like we started hacking on the same thing @archaeron https://github.com/kRITZCREEK/ps-pretty :D

kritzcreek avatar Apr 25 '16 20:04 kritzcreek

@kRITZCREEK awesome! I'm sure your version will be better :) Very good job on your tooling by the way

I just wanted this really bad, so I tried something.

archaeron avatar Apr 25 '16 21:04 archaeron

@archaeron Looking at your code so far I seem to be losing hard :D It seems you've got the instance stuff for ansi-wl-pprint down. Let's continue to work on this!

kritzcreek avatar Apr 25 '16 21:04 kritzcreek

@kRITZCREEK how do you want to proceed? I can make my code public. Or I could try to help you with yours.

Not sure if my approach is the best. GHC complains about orphan instances :D

I have one wish for the formatter. I'd very much like it if there were an option to format so that it doesn't align to previous lines.

archaeron avatar Apr 25 '16 21:04 archaeron

I'll respond tomorrow, I need to sleep for today ;)

kritzcreek avatar Apr 25 '16 21:04 kritzcreek

fair enough. good night

archaeron avatar Apr 25 '16 21:04 archaeron

here is my try if you want to test it out: https://github.com/archaeron/purescript/tree/psc-format

archaeron avatar Apr 26 '16 09:04 archaeron

This line might need to change to make psc-format useful 😉

tfausak avatar Apr 26 '16 13:04 tfausak

@tfausak should I change it to: https://github.com/kRITZCREEK/ps-pretty/blob/master/src/Lib.hs#L14 :grin:

archaeron avatar Apr 26 '16 13:04 archaeron

:D I think it's quite clear we're just exploring here ;)

kritzcreek avatar Apr 26 '16 13:04 kritzcreek

update:

@archaeron and I have been to the ZuriHac Haskell hackathon and decided to work on psc-format.

We have a very basic implementation ready: we can "format" a simple file, and the output still compiles after we restructure it. It probably won't handle every PureScript file yet, since a bunch of stuff is still missing, but we have made good progress.

(1) Right now there is still no real formatting going on, but it should be easy enough to implement from here on out. We are happy to hear some opinions on how things should be formatted.

As a first step we propose to define a format which will be hardcoded. Further down the road, we will try to let the user set more options to customize the formatting.

(2) Also, if anyone can take a look at the source to check whether this seems to be a feasible option, that would be greatly appreciated. Since this is the product of a 2 day hackathon, we will have to do some cleanup work soon, so the idea right now is to figure out whether the general direction makes sense or not.

(3) Last, there is also the question of whether we should leave this in the PureScript repo or move it to a separate project.

To try it out: clone the repo https://github.com/archaeron/purescript, check out the branch psc-format, build with stack install, and then run psc-format --input Main.purs --output Main2.purs to test.

JorisM avatar Jul 24 '16 14:07 JorisM

This looks great! I'll be sure to check it out later this week.

I'd love to distribute something like this alongside the compiler actually.

paf31 avatar Jul 25 '16 20:07 paf31

I sort of resurrected this and got it compiling again, and pushed the result to a new psc-format branch on purescript/purescript. One thing that strikes me is that we now have two separate approaches to pretty-printing within the compiler: the approach taken by Language.PureScript.Pretty, which uses the boxes library, and the approach taken in this branch, using ansi-wl-pprint. Is there something about ansi-wl-pprint that makes it better than boxes for this? Do you remember why you went for ansi-wl-pprint?

Also, to what extent is the Language.PureScript.Pretty hierarchy only intended for printing partially desugared code? For example, I tried it out on examples/passing/Console.purs and it got into an infinite loop (although I haven't yet worked out for certain whether this is happening inside the new psc-format code or in the existing Language.PureScript.Pretty code).

edit: I've answered my own questions: having read the paper, the approach taken by ansi-wl-pprint does seem to me to be a bit closer to what we would want for a source code formatter (although perhaps it's too early to tell). Also, the infinite loop was indeed caused by Language.PureScript.Pretty.Values. Since the formatter will only deal with entirely non-desugared code I think it does make sense to have a separate section of the library devoted to it.
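For readers unfamiliar with the difference, the core idea behind Wadler-style printers such as ansi-wl-pprint is choosing between a flat layout and a broken layout depending on the available width. The sketch below shows only that one decision for a bracketed list; `renderList` is a made-up helper using only base, not a function from ansi-wl-pprint or boxes.

```haskell
import Data.List (intercalate)

-- | Render a bracketed list on one line if it fits within the given
-- width; otherwise put each element on its own line with leading commas.
renderList :: Int -> [String] -> String
renderList width items
  | length oneLine <= width = oneLine
  | otherwise = multiLine
  where
    oneLine = "[" ++ intercalate ", " items ++ "]"
    multiLine = case items of
      [] -> "[]"
      (x : xs) -> intercalate "\n" (("[ " ++ x) : map (", " ++) xs ++ ["]"])
```

A real library generalizes this choice to arbitrary nested documents, which is much of what makes it attractive for a source formatter; this sketch makes the decision once, for a single list.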

hdgarrood avatar Jan 05 '17 04:01 hdgarrood

@hdgarrood awesome news :) I've been thinking of taking it up again, but sadly I can't use PureScript at work at the moment :(

ansi-wl-pprint was chosen more or less at random at that time. Although now that I know more about pretty-printing I'd choose it again over boxes. With ansi-wl-pprint you can annotate nodes when pretty-printing, meaning that you can output a highlighted AST.
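As a rough illustration of what annotating nodes buys you, here is a tiny hand-rolled document type with an annotation constructor. Everything in it (`Doc`, `Token`, `renderWith`, the `<kw>` markers) is hypothetical and written for this example; it is not the ansi-wl-pprint API.

```haskell
-- Hypothetical mini document type, not the ansi-wl-pprint API.
data Doc ann
  = Text String
  | Cat (Doc ann) (Doc ann)
  | Annotate ann (Doc ann)

-- | Made-up token classes used as annotations.
data Token = Keyword | Identifier
  deriving (Eq, Show)

-- | Plain rendering ignores annotations entirely.
renderPlain :: Doc ann -> String
renderPlain (Text s) = s
renderPlain (Cat a b) = renderPlain a ++ renderPlain b
renderPlain (Annotate _ d) = renderPlain d

-- | Highlighted rendering wraps each annotated span in markers
-- chosen by the caller (ANSI codes, HTML tags, and so on).
renderWith :: (ann -> (String, String)) -> Doc ann -> String
renderWith _ (Text s) = s
renderWith f (Cat a b) = renderWith f a ++ renderWith f b
renderWith f (Annotate ann d) = open ++ renderWith f d ++ close
  where
    (open, close) = f ann

-- | The fragment "let x" with its two tokens annotated.
example :: Doc Token
example =
  Cat (Annotate Keyword (Text "let")) (Cat (Text " ") (Annotate Identifier (Text "x")))

-- | One possible marker scheme, here HTML-like tags.
markers :: Token -> (String, String)
markers Keyword = ("<kw>", "</kw>")
markers Identifier = ("<id>", "</id>")
```

The same `example` document yields plain text through `renderPlain` and marked-up text through `renderWith markers`, so the formatter and a highlighter could share one printing pass.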

archaeron avatar Jan 05 '17 19:01 archaeron

Cool, thanks! If anyone wants to keep up with what I've done, I've started pushing commits to the psc-format branch on hdgarrood/purescript instead so that I don't generate so many messages in the IRC channel.

hdgarrood avatar Jan 05 '17 20:01 hdgarrood

@hdgarrood I'm unable to compile your psc-format branch. Is it just me? If not, could you get it to compile so I can have a play?

shmish111 avatar Jan 25 '17 11:01 shmish111

@shmish111 I've rebased and cleaned up the commit history a bit. In short I've come to the conclusion that wl-pprint-text isn't going to be suitable (see https://github.com/hdgarrood/purescript/commit/bd2040787b769449369382b46086e989507e057e), so I've started looking at a new approach. I haven't yet got as far as getting it all to compile though.

If you want to play with what I had before, which does compile and does sort of work (in a very, very loose sense), just check out the commit right before that one, https://github.com/hdgarrood/purescript/commit/73c0fd555924b08a29fc1fb514985e6a84368bc5.

hdgarrood avatar Jan 25 '17 17:01 hdgarrood

thanks @hdgarrood

WRT the actual style: who will decide this, and are there any code format guidelines anywhere already? (I couldn't find any.) I personally found the Elm style to be very good. It seems a bit verbose at first, but it's all based around making commit diffs easier, i.e. avoiding changing a line if it doesn't need to be.

shmish111 avatar Jan 27 '17 11:01 shmish111

There's this, which is pretty good: https://github.com/ianbollinger/purescript-style-guide

garyb avatar Jan 27 '17 12:01 garyb