WurstScript icon indicating copy to clipboard operation
WurstScript copied to clipboard

Pretty printer alpha

Open karlek opened this issue 7 years ago • 10 comments

This is the alpha implementation of the pretty printer; I want to say thank you to @peq who wrote the initial structure of the printer, and to @Frotty who has answered my many specific questions about the wurst parser.


The current state of the pretty printer is that it correctly reconstructs wurst files. However, it doesn't retain neither explicit newlines inside statements nor line comments. Line comments aren't saved during parsing, which needs to be implemented first. Retaining explicit newlines is a little more difficult to implement, one thought is by using WPos of the element and compare it to the WPos of its parent and thereby inserting explicit newlines.


Future features:

  • Variable alignment
  • Precedence aware ExprBinary parsing, i.e: insert whitespace to highlight algebraic precedence. Examples: (1*5 + 3) and (3 + 1)
  • Newline preserving prettying.
  • Needs more tests

Some more questions:

Why is WurstDoc a modifier? It feels wrong compared to other modifiers such as public, constant, etc.

How does multiple module usages look like?

// Like this?
use A, B
// Or like this:
use A
use B

I wrote a script that got the AST for each file in the WurstStdlib2 and didn't encounter any of the following tokens:

  • ModuleInstanciation
  • WPackages, can you have more than one package in a file?
  • WBlock is it something like this:
// Statements.
{
   // Statments.
}
  • VisibilityPublicread
  • VisibilityDefault

karlek avatar Feb 03 '18 18:02 karlek

Hi,

thanks for your work!

I added a suggestion for tracking comments on this branch: https://github.com/wurstscript/WurstScript/tree/pretty-printer-comments

Maybe you can work with this as a starting point?

The approach there is to store all comments in a list and then add them in the printer based on their position (line/column) in the source code. For example, if we print an element with position 100, the printer will first print all comments before position 100.

One problem with this is, that the printer cannot change the order of elements, since that would mess up the comment order. To support reordering, some smarter mapping from comments to AST elements could be done in a preprocessing step.

Another problem is, that comment positions are not kept precisely, e.g. it is not possible to distinguish 1 /* comment */ + 2 and 1 + /* comment */ 2. If we want this, we probably need a different approach. Maybe we can have a look at how http://scalameta.org/ does it, since they apparently support this. However, I am a bit afraid that a more detailed representation of the source code would lead to a higher memory usage and some Wurst users already have reported performance problems related to that.

peq avatar Mar 01 '18 23:03 peq

Hi,

It has been really fun working on this and I've grown more accustomed to wurst as a pleasant side effect.

Well, there's a few parts in the pretty printer that will often change the token order, the best example is ClassDef. Since ClassDef exposes .getConstructors(), .getInnerClasses(), etc. as methods we lose the ordering of the their tokens.

I tried your branch, but there seems to be some problems with the parser.

2018-03-06-141445_2560x1440_scrot

karlek avatar Mar 06 '18 13:03 karlek

I tried your branch, but there seems to be some problems with the parser.

Oh, I somehow missed a continue. Should be fixed now on the branch.

However, after thinking a bit more about it, I guess we should use the approach to store the tokens attached to the AST nodes. That is more precise and does not have the reordering problem. To keep memory usage low, we can make the parser configurable and only generate the extra information if we need it for detailed pretty printing. I think I'm gonna try that variant on the weekend.

peq avatar Mar 06 '18 22:03 peq

Okay so what is our plan with this? We want to keep comments, so in its current state it's somewhat unacceptable. @peq @karlek

Frotty avatar Jun 10 '18 11:06 Frotty

Pull Request Test Coverage Report for Build 519

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • 605 unchanged lines in 7 files lost coverage.
  • Overall coverage decreased (-0.5%) to 54.492%

Files with Coverage Reduction New Missed Lines %
src/main/java/de/peeeq/wurstio/WurstCompilerJassImpl.java 13 58.58%
src/main/java/de/peeeq/wurstio/languageserver/WurstLanguageServer.java 14 0.0%
src/main/java/de/peeeq/wurstscript/RunArgs.java 27 65.29%
src/main/java/de/peeeq/wurstscript/WurstParser.java 27 64.0%
src/main/java/de/peeeq/wurstio/languageserver/WurstTextDocumentService.java 48 0.0%
src/main/java/de/peeeq/wurstio/Main.java 98 0.0%
src/main/java/de/peeeq/wurstscript/attributes/prettyPrint/PrettyPrinter.java 378 0.0%
<!-- Total: 605
Totals Coverage Status
Change from base Build 518: -0.5%
Covered Lines: 16703
Relevant Lines: 30652

💛 - Coveralls

coveralls avatar Jun 10 '18 11:06 coveralls

@Frotty, currently WurstScript lacks complete parsing of comments (line comments, and block comments).

As @peq pointed out:

However, after thinking a bit more about it, I guess we should use the approach to store the tokens attached to the AST nodes. That is more precise and does not have the reordering problem. To keep memory usage low, we can make the parser configurable and only generate the extra information if we need it for detailed pretty printing...

However, if @peq doesn't have the time, maybe I can take a stab at it. However, I don't have any experience with ANTLR so I'll need to ask a lot of questions if I do.

Edit: There's also some parts of the code that will change place after a pretty print. For example at the moment there's no way of knowing in which order class definitions where (i.e class variables, constructors, ondestroy etc).

I don't think this is a problem, the pretty printer is just opinionated and that's that. So we just say that class variables are first, then constructors, etc.

karlek avatar Jun 11 '18 03:06 karlek

I would suggest the following:

  1. I adapt the parser, so that it (optionally) stores comments and maybe other tokens as well.
  2. @karlek you adapt the pretty printer to include the comments

peq avatar Jun 11 '18 17:06 peq

I started adding the comments to the AST on the karlek-pretty-printer branch: https://github.com/wurstscript/WurstScript/tree/karlek-pretty-printer

To get the comments, you can use the functions getCommentsBefore and getCommentsAfter as I did here: https://github.com/wurstscript/WurstScript/blob/karlek-pretty-printer/de.peeeq.wurstscript/src/main/java/de/peeeq/wurstscript/attributes/prettyPrint/PrettyPrinter.java#L68-L100

See testComments in PrettyPrintTest for a testcase that is now partially working. The test also prints the AST to see where the comment tokens are stored.

peq avatar Jun 14 '18 22:06 peq

Any chance of you looking into this again in the near Future @karlek ?

Frotty avatar Jul 19 '18 09:07 Frotty

@Frotty No, not until I get back to Sweden in late August.

karlek avatar Jul 20 '18 06:07 karlek

To finally get this done I merged the two branches and fixed a bunch of issues such as binary expression formatting breaking parentheses and added support for differentiating between wurst, jurst and jass files. Then I implemented Jass formatting, which I find more useful because Wurst code is usually already formatted, while Jass code, e.g. from the war3map.j file, is not. Some issues still remain with the Wurst formatting, which should be addressed in future PRs, but at least this can be merged now.

Frotty avatar Oct 26 '23 10:10 Frotty