Rubberduck
Rubberduck copied to clipboard
Implementing extract method
This is going to be first iteration of restoring Extract Method
functionality that was disabled from RD, this time NOT regex'ing stuff.
Because there are numerous cases to test, we want to get start with most simplest, dumbest implementation.
Basically, the first iteration should aim to:
- expand the selection to complete statements a) verify it's within a single procedure b) verify for complicating factors (e.g. GoTos, Exits, ....)
- collect all local variables used
- copy the statements
- show the dialog a) user choose whether variables are parameters or local b) user names the method
- add the
Dim
statements for local variables - insert the method at end of module
- replace the original selection with a call to the method, passing in parameters
Note that with 1b, we will disallow selections where it includes flow of control statements that can go to some places not necessarily within the user's selection. That possibly could also include error handling, and of course any Exit
, GoTo
, GoSub
.
Maybe the initial iteration can indicate why it can't extract method (e.g. Exit statement exists in the selection
somewhere) so at least it's easy for user to manually fix while we make it smarter in subsequent iterations?
Thoughts? Suggestions?
I'm not sure there's such a thing as as dumb implementation of this refactoring, but I agree it's a very complex one to tackle, with a ton of traps along the way. An iterative implementation seems like a very good idea.
The original implementation was rather naive, but did a few things pretty well.
Selection
The refactoring works off the user's selection in the active code pane... like all refactorings, really. The difference is that here we aren't working with a selected Declaration
or IdentifierReference
, we're working with a multi-line QualifiedSelection
that encompasses multiple statements.
All refactorings need an ICommand
implementation, which is constructor-injected into the corresponding CommandMenuItem
- there's a naming convention here: FooCommand
gets injected into FooCommandMenuItem
, so ExtractMethodCommand
would go into ExtractMethodCommandMenuItem
. The command's CanExecute
implementation determines under which conditions the command can be executed; when the method returns false
, the corresponding menu item is disabled. Because every single command needs to evaluate its CanExecute
result just before the code pane's context menu is displayed, the CanExecute
method must be optimized for performance as much as possible, so as to avoid introducing a noticeable delay in the VBE displaying its context menu.
While it definitely sucks that a disabled command doesn't explain why it's disabled, I think consistency matters, and if all commands are contextually disabled, then Extract Method should be no different; IMO it's the wiki's job (or the website's? the RD News blog's? Stack Overflow?) to explain how to use a given command and why it's disabled under such or such condition.
So when should the refactoring be enabled?
-
Selection begins and ends inside the body of a single member procedure. The
DeclarationFinder
already contains a method that returns the selected declaration - we could write something similar to cache/return the member procedure that's selected, if applicable (returnnull
if it's not applicable), so validating this point would amount to nothing more than anull
-check. -
Selection does not contain references to line numbers or line labels outside of it. That's trickier, but then again the
DeclarationFinder
should be helpful, given line numbers and labels each have a correspondingDeclaration
- we can inspect theirReferences
collection and compare theirSelection
to the user's selection usinguserSelection.Contains(labelRef.Selection)
anduserSelection.Contains(labelDeclaration.Selection)
to validate that all line/label jumps are self-contained. -
Selection does not jump out of the procedure scope. So
Exit Sub
/Function
/Property
in the selection would disable the refactoring. I don't see a way to get these to work, even in future iterations. -
Selection does not include partial blocks. We can't extract the selection into its own method if the selection includes an
If
without anEnd If
, or aFor
without aNext
, or aWith
without anEnd With
.
I think that covers it.
Variables (and constants) in scope
The refactoring needs to know about what's in-scope for the selection:
- Locals declared and used inside the selection, but not outside of it, should be moved to the extracted procedure.
- Locals declared inside the selection but only used outside of it, should be moved closer to usage, i.e. not be extracted, just moved somewhere underneath the selection, as close as possible to their first use.
- Locals declared before the selection, only used inside the selection, should be moved to the extracted procedure.
- Parameters used inside the selection should be passed as parameters to the extracted procedure. If passed
ByRef
and assigned within the selection, should be passedByRef
as well.
Mechanics
So the refactoring looks at what's in-scope, and before it can pop a preview box and prompt for a name, it needs to:
-
Determine the possible inputs; these would be things declared outside the selection, and referenced within.
- Inputs that are assigned and then used in the original/calling procedure after the selection, must be passed by reference (
ByRef
). - Inputs should otherwise be passed
ByVal
by default.
- Inputs that are assigned and then used in the original/calling procedure after the selection, must be passed by reference (
-
Determine the possible outputs; these would be things assigned within the selection, and referenced after it.
- If there's only one possible output value, we extract a
Function
procedure. - If there are multiple output values, we extract a
Function
procedure, pick/infer which to return (let the user select from a dropdown), and pass other return valuesByRef
. - If there is no possible output, we make a
Sub
.
- If there's only one possible output value, we extract a
- Determine the body of the extracted method; the method's signature, the locals that move into the new procedure, the selected content. The resulting procedure should be indented accordingly with the indenter settings.
As the user specifies a name for the extracted method, and picks/confirms inputs and outputs, the preview box updates accordingly.
Okaying the preview actually creates the previewed member immediately underneath the calling procedure it was extracted from, and replaces all lines in the selection with a call to that extracted method. For a function, the selected return value determines which local variable receives the returned value. A re-parse is then requested on the parser state.
The extracted method is always Private
.
Oh, also - the refactoring needs to be able to work off a QualifiedSelection
, so that we can invoke it programmatically, say, from a hypothetical quick-fix that suggests extracting a GoSub...Return
subroutine into its own method... ref. issue #13 (that's right - one from the original wishlist!)
Edge case for Exit Sub
: if Exit Sub is a the end of the selection, we can probably exclude it from our selection. Not sure whether it's worth the hassle though.
If the Exit Sub is in a conditional, we would also exclude the conditional block, so long as the Exit is the only statement in the block.
Any other placement of Exit [..] can not be heuristically removed
Thanks for the additional points; good to think about them.
I suppose the preview could look something similar to this:
Just to be clear, the first iteration will be dumb in the sense it won't try to refactor the variables from original procedure. It'll collect all identifiers and thus leave up to the user to decide whether it's a new local variable, a new input, or a new output.
Subsequent iterations will be a bit more smarter, automatically deleting variables from the original procedure that are now unused after extraction, auto-guessing which's output and which's parameters or handling edge cases of Exit
s. But I think the first version should be manual. Easier to automate clutches when they exist. :)
The dialog seems to have disappeared, but this is what it looked like in v1.4.3:
As for variables in the original method, as long as the refactoring doesn't leave the code uncompilable, it's fine: a variable that's declared but never assigned or referred to will be the target of an inspection result after reparse anyway.
Agreed, especially on the point that we must never write uncompilable code. That's just bad juju.
Two questions.
-
Given that we must be quick'n'nimble in
CanExecute
, is it allowable to short-circuit some of the checks which may mean that the menu item is enabled but may fail to execute when we do additional analysis? Not saying that I will do that but just want to know if this is an option in case the analysis requires too much bit twiddling. If it's allowable, do we have a convention for that scenario? -
Just confirming that
QualifiedSelection
would allow me to expand the selection to the entire statements? Think about how block comment work - it's not necessary to select the actual start / end; you can start somewhere in middle of the first line, highlight n lines, and stop somewhere in middle of the last line. Commenting or uncommenting would then expand to the start/end. I think it's much easier doing that way (even though dem:
could kink up the works)
- Let's try to include all checks, and if there's a problem, then we can consider moving the less-common checks to the
Execute
implementation, and display a message box that explains why the selection isn't valid. - A
QualifiedSelection
is nothing more than aSelection
with aQualifiedModuleName
, IOW it tells you selection coordinates and what module it's in (theSelection
by itself isn't concerned with what module it belongs to).
To be honest, the part about identifying code blocks is the hard part. We might even need to implement some caching somewhere (seems doing that in the DeclarationFinder
would be abusing its purpose), so that we have a very fast/efficient way of telling, given a QualifiedSelection
, whether the selection is encompassing a whole block or not.
I wouldn't expand the user's selection: we should work with what the user is giving us.
If it's invalid, it's invalid.
e.g.
x If foo < bar Then
x 'do stuff
x '....
End If
Should not be expanded to:
x If foo < bar Then
x 'do stuff
x '....
x End If
However convenient that might be. At least not in the first iteration, when other more important cases aren't handled.
In order to figure out code blocks, we'll need a parse tree listener (to pick up starting & ending line positions of all code blocks) that runs during one of the resolver passes (could be together with the "identifier references pass" I guess), and something to abstract the blocks; perhaps a QualifiedSelection
would be enough, but if additional metadata is needed then perhaps a small CodeBlock
class or struct would do the trick.
Actually, I wasn't talking about expanding selections vertically - only horizontally. Not going to try to add/remove extra lines; only to expand the start to the column 0 and the end to the column
I do think we have to implement a way to know when it's a statement for extract method to be meaningful. It simply won't any sense if we don't work at the statement level. That's also why I think selection has to be expanded at least to the statement. Otherwise, it'll be asking too much from the user to tell them that they must select everything from If
to End Ifand not
footo
End`
Oh - in my mind the selection columns are simply ignored; we work with the StartLine
and EndLine
of the user's selection, simple as that :wink:
@retailcoder calling for statement separator code weirdnesses with users... should be strongly documented what we assume
@Vogel612 :+1: good point - we can keep that in mind for a future iteration!
@Vogel612 assign to me, add to epic