gen [POC] Prune Unused Methods in Default Mode

In default mode, (where the user just annotates the type and doesn't specify exactly which methods should be generated by using tags) only the methods that are used should be generated, and gen should automatically detect these instances.

Here, I demonstrate a proof-of-concept for how this could work. I apologize for the wall-o-text, there's a TL;DR at the end.

Note: This PR is a very rough Proof-Of-Concept, and I don't expect it to be merged. Specifically, the last commit (c32665f) is quite incomplete. This is based on commits from PR #64 because it has functionality that I think is necessary for gen to work on a changing codebase with this feature.

In commit f508d49 I add the method typewriter.Package.GetSelectorsOn(types.Type). In the AST, selector nodes are all the dot-expressions. They have the form x.Y where x is any expression (often an identifier, but not necessarily), and Y is an identifier. Note that there doesn't exist a 'method-call node', instead, a method call is just a function call where the function to call is determined by the result of the selector node instead of a raw identifier. (Think (x.Method)() not x(.Method()).) It turns out we don't want to consider function calls at all, just because we don't want to miss cases where the user saves the result of a selector and calls it later, like: fn := x.Method; fn(); (which is perfectly valid).

GetSelectorsOn returns a []string of all the unique identifiers that were selected on expressions of the same type as the passed types.Type. It works by walking the AST while maintaining the current types.Scope so the expression part of every ast.SelectorExpr can be evaluated to a specific type. A mapping from ast.Node to types.Scope is provided by passing a types.Info struct to the types.Config.Check method. Unfortunately, this mapping contains only the scope nodes themselves so a scope hierarchy must be maintained while walking the tree. Worse, the scope node of a given node is not always an ancestor. For example, the scope of ast.FuncDecl.Body statements is actually under ast.FuncDecl.FuncType. This isn't impossible to work around, it just adds a bit of special case handling, and I haven't tested all possible scope node types for these special cases.

Also, I changed genwriter.evaluateTags in commit c32665f to use the new GetSelectorsOn method to prune the generated methods to only the ones that are used.

The current, not ideal, workflow for new projects looks like this:

Start writing your project files, annotate a type with // +gen and use the not-yet-generated plural type and methods as needed.
Run gen. Because the plural type doesn't yet exist, type checking cannot determine which selectors are called on it yet, so it generates all methods.
Run gen again immediately afterwards. Now that the plural exists, more detailed typechecking can occur. Now it gets a list of exactly what methods were used throughout the project, and it generates those methods.
You may need to run gen a couple more times to fully prune the project. I'm not sure why this is necessary. (I told you, rough edges :smile:)

Ideally, steps 2-4 would be collapsed down to one step where the user runs gen only once (which might execute itself in the background multiple times). The workflow for existing projects starts at step 2 above.

TL;DR: Here's a test file with instructions for how it would work. The current procedure is less than ideal and could be greatly improved.

Jul 11 '14 21:07 infogulch

This is awesome, I think. Could we somehow automate this pruning process by running through the gen process automatically several times?

Jul 11 '14 21:07 freeeve

This is awesome, I think.

lol.

Yes, surely it's possible. If you notice the line just above the TL;DR, I mention something similar.

Jul 11 '14 21:07 infogulch

Lawdy. Awesome indeed. Gimme some time to digest.

Jul 12 '14 02:07 clipperhouse

Absolutely, take your time. I tried to write down everything I learned, but it's very dense and certainly not light reading material.

Jul 12 '14 19:07 infogulch

A few thoughts as I’ve let this digest a bit. I see several use cases here, the test gist is really helpful in splitting them out.

1: use of a plural type before generation. This is part of the broader problem/solution where the generation can continue in the presence of errors coming from types.Check: if the type info is OK, it doesn’t matter if there are semantic problems elsewhere in the code. I’d probably put it behind a -force flag, though @infogulch makes the argument continuing in the presence of errors might be a safe default behavior.

2: pruning. Detect which methods are used and prune those that aren’t. This could be generalized for anything, no? It’s dead code detection.

3: auto-generation of newly-used methods. Detect methods called on gen types that are not yet gen’d, and gen them.

In the case of 2 and 3, they are two sides of the same coin. They remind me of goimports, which I use.

Since I am heading toward a more explicit API (likely in the form of a SliceWriter which supersedes genwriter), I would like to see those discovered/pruned methods appear/disappear in the +gen directive. So if you use Each it magically appears/disappears in +gen slice:"Sort,Each".

I’d be inclined to implement gen -auto -watch flags. The former does the goimports/pruning magic, the latter watches for file changes. They can be used independently or together.

Jul 14 '14 03:07 clipperhouse

Good breakdown. A few clarifications.

You said use case 1 is "use of plural type before generation," but it's a bit more. After the plural type itself is generated, it allows new methods to be called on it even if they had been pruned in the past. This means that use case 3 requires 1 to work. If you've pruned the implementation to only have the used methods, and then the user calls a new method, that method will (by definition) not exist yet, and therefore cause a type error. If 1 is missing, gen will give up before it is able to search the AST for that new method. That's why I included the code for issue #64 (which is essentially use case 1) here.

The idea for making things appear in the // +gen comment is interesting, but I wonder if there would be conflicts. E.g. someone wants to manually add a new method and gen just overwrites and removes it.

Jul 14 '14 15:07 infogulch

I agree with Joe about adding to the +gen comments. I think those should exist to override the default 'magic' behavior. Also, I'd appreciate the ability to generate all methods regardless so that my auto complete engine can work on methods I haven't used. This is not possible if gen always prunes all unused methods.

Jul 14 '14 16:07 Logiraptor