csharplang icon indicating copy to clipboard operation
csharplang copied to clipboard

Champion "static local functions" (16.3, Core 3)

Open jcouv opened this issue 7 years ago • 50 comments

The proposal is to declare a local function as non-capturing. The keyword to indicate that would be static.

Note: if we do this, then we should also consider allowing parameters and locals of the local function to shadow parameters and locals from the containing method.

  • [ ] Proposal added
  • [ ] Discussed in LDM
  • [ ] Decision in LDM
  • [ ] Finalized (done, rejected, inactive)
  • [ ] Spec'ed

Related discussion thread, including a discussion of "capture lists" (a more elaborate feature): https://github.com/dotnet/csharplang/issues/1277

Note: EE scenarios are a concern.

Do we want to allow some other modifiers (like unsafe)?

LDM history:

  • https://github.com/dotnet/csharplang/blob/master/meetings/2018/LDM-2018-09-10.md
  • https://github.com/dotnet/csharplang/blob/master/meetings/2019/LDM-2019-01-16.md (shadowing in lambdas)

jcouv avatar May 24 '18 21:05 jcouv

To be clear, would this fit @benaadams's definition of static?

A static method in a class hides its enclosing scope; you can't access and of its instance data or call any of its instance methods - only call static functions and access static data.

Likewise a static local function shouldn't have access the the parent functions local state or the classes instance state, much like a regular static function.

The static local function is providing scoping so as not to pollute the class with a static function which is only called in a single method.

bondsbw avatar May 24 '18 22:05 bondsbw

@bondsbw Yes, that's what I'd have in mind. The local function would not be able to access variables from parent method or instance fields. It could only access variables that it declares (parameters or locals) and static fields.

jcouv avatar May 24 '18 23:05 jcouv

Is there a big reason to do this for local functions, aside avoiding accidental mutations? Aren't local functions significantly more efficient with captures?

HaloFour avatar May 25 '18 01:05 HaloFour

@HaloFour The primary motivation for this proposal is readability/maintainability. Knowing that some code is isolated makes it easier to understand, review and re-use. That's the same general argument as with any kind of isolation like object encapsulation or functional programming.

There is also a practical motivation. There are good reasons why parameters or locals of local functions should not shadow variables from the enclosing scope. But in practice, there are many pieces of pure logic that make sense to extract as local function, but where this restriction becomes inconvenient (prevents you from using obvious variable names). Having static local functions and additionally allowing static local functions to declare their own variables regardless of context solves that problem (while avoiding the pitfalls of shadowing).

jcouv avatar May 25 '18 02:05 jcouv

What’s the Benefit over a private static function? Local functions benefit is exactly capturing

popcatalin81 avatar May 25 '18 07:05 popcatalin81

@popcatalin81 more fine-grained access control, primarily

orthoxerox avatar May 25 '18 07:05 orthoxerox

@HaloFour it isn't just mutations; it is also to incredibly useful to avoid accidentally creating the capture machinery, especially in hot paths. This can have a surprising amount of overhead, especially in scenarios like async. Again, on hot paths, a common pattern in some async pipes is to try to avoid the async machinery completely if it will often be synchronous, falling back into an async completion if it turns out to be incomplete - which means all the local state (including the capture machinery) is going to end up on the heap (because of how await works when tasks are incomplete); doing this manually can have significant savings for performance critical code, and it turns out that local functions as long as you don't capture are a very tidy way of doing it

@popcatalin81 scoping and expressiveness. I actually strongly disagree with you that the benefit of local functions is "exactly capturing" - I've written plenty of local functions, and the ones that make use of capture are by far the minority. Well, let me rephrase: the ones where I intentionally used capture are the minority. But yes, functionally they're just like regular private static methods; in fact, my usual "did I capture without realising it?" check is exactly to move the local function out of the method (so: a private static), see if the code still compiles, then move it back (and remove the modifiers).

mgravell avatar May 25 '18 08:05 mgravell

@mgravell

If the goal is to avoid creating async machinery for capturing then I don't see why it would be limited to avoiding capture on local functions where said machinery requires significantly less overhead.

And, like most proposals that involve limiting functionality/syntax for the purpose of improving code generation and performance, why is this not suitably managed by an analyzer? Most people will not care about this degree of optimization.

I've never reached for a local function unless I wanted to capture. I see zero point to them otherwise.

HaloFour avatar May 25 '18 12:05 HaloFour

@jcouv

Knowing that some code is isolated makes it easier to understand, review and re-use. Having static local functions and additionally allowing static local functions to declare their own variables regardless of context solves that problem (while avoiding the pitfalls of shadowing).

IMO, those two statements are in direct contradiction. Allowing a local function to shadow variables would negatively impact the readability of said function. Now anytime you see a local function you have to double-back to the signature to see what that identifier refers to.

HaloFour avatar May 25 '18 12:05 HaloFour

@HaloFour Any design involves some trade-offs and tensions. I recognize it is a fair concern to discuss and experiment with. From experience with static methods, I've found them to reduce the cognitive load rather than increase it, despite their introducing a "mode". When reading a static local function with variables defined independently of the parent scope, I suspect the same modality will kick in so it won't be confusing.

jcouv avatar May 25 '18 15:05 jcouv

@jcouv

Any design involves some trade-offs and tensions. I recognize it is a fair concern to discuss and experiment with.

I agree. Consider this just part of that discussion. 😁 Despite my arguments I'm fairly ambivalent about the proposal.

Where this feature tries to address performance of certain aspects of generated code, particularly around async method machinery, it feels like this could be easily (and appropriately) addressed via analyzers. By itself this feature seems like it would barely scratch that surface. Local function capture, by itself, is not the performance problem. It's only in conjunction with those other features like async methods or delegates where the compiler resorts to the less efficient machinery. So why target a language change that barely scratches the actual problem?

I'm much more opposed to the notion of the shadowing. I agree that having to come up with a lot of different names for identifiers can be restrictive. The team was very quick to dismiss that argument in the context of leaky out declaration variables and pattern variables.

From experience with static methods, I've found them to reduce the cognitive load rather than increase it, despite their introducing a "mode".

Current static methods don't have to deal with other locals being potentially in scope, so I don't feel that this comparison is appropriate. The rules regarding shadowing fields don't change between instance or static methods.

HaloFour avatar May 25 '18 15:05 HaloFour

@HaloFour

Current static methods don't have to deal with other locals being potentially in scope

You can conceptually map captures to instance fields,

class C
{
   int name = 0;
   void Method1() => Use(name); // OK
   static void Method2() => Use(name); // Error
}

void M()
{
   int name = 0;
   void Local1() => Use(name); // OK
   static void Local2() => Use(name); // Error
}

Also, methods can shadow fields with parameters and locals regardless of being static.

alrz avatar May 25 '18 15:05 alrz

@HaloFour To me, shadowing is a feature. I occasionally shadow deliberately in order to prevent uncontrolled access to a field; var controlledAccess = Interlocked.Exchange(ref this.controlledAccess, null); for example.

I've long been wishing I could declare local function parameters that shadow locals that would otherwise have been captured, in order to prevent their capture and accidental use out-of-band, while still capturing others or simply while declaring a fully-static helper method closer to where it is relevant.

jnm2 avatar May 25 '18 15:05 jnm2

@HaloFour

And, like most proposals that involve limiting functionality/syntax for the purpose of improving code generation and performance, why is this not suitably managed by an analyzer?

For me this has nothing to do with improving code generation, it's all about improving code readability.

Local functions are making it much more common to have top level functions which are several hundred lines long. The logic gets split between the top level method body and a number of local functions whose behavior is specific to the top level method. Before local functions such code would be split into a number of top level method bodies.and the helper functions would be needlessly polluting the containing types scope. Now the entirety of the logic can be contained with in the top level method body.

Such local functions are harder to review than they would be as a top level function. The silent capture of state can affect how they execute and must be accounted for. There is nothing inherently wrong with this and many times it's hugely beneficial to the developer. In the cases where no capture is done, or when the developer specifically wants to avoid it, the static modifier helps significantly with readability. It tells me as a reviewer that I can review this local function in isolation. No need to scan for subtle capture problems.

For small functions this really isn't that important. But in larger methods it can make a huge difference.

It does have the additional benefit of ensuring local functions are capture free and hence will be a tiny bit more efficient as we don't even have to pass a struct closure around and have local indirection. If that were the primary motivation of this feature I'd be against it. Not worth the cost IMHO.

I'm much more opposed to the notion of the shadowing.

I'm still a bit on the fence about this. In the absence of a static modifier I'd be 100% against shadowing. It creates too much opportunity subtle behavior changes. Imagine you delete a parameter that shadows a local in the top level method, oops now you just introduced a capture you probably didn't intend.

Restricting shadowing to local functions marked as static though removes this problem and most of the issues around shadowing. It also makes local functions significantly more usable. I didn't really appreciate this problem until I started using local functions a lot in C#.

jaredpar avatar May 25 '18 16:05 jaredpar

@alrz

You can conceptually map captures to instance fields

I don't think that's the intuition that most people have regarding captured locals.

Also, methods can shadow fields with parameters and locals regardless of being static.

Yes, but never without a way to explicitly refer to that shadowed field, and you're always required to explicitly qualify the field anytime such a local is in scope.

@jaredpar

Local functions are making it much more common to have top level functions which are several hundred lines long.

Sounds like a good argument for removing local functions from the language. 😉

But seriously, cramming hundreds of lines into a single method definition doesn't sound like a good idea, with or without local functions. In my opinion that's exactly when those chunks of logic should be top-level functions, specifically so that they can maintained and tested in isolation from one another.

I can't fathom a local function that's more than maybe 4-5 LOC, and I would count that against the LOC of the function in which its contained.

Such local functions are harder to review than they would be as a top level function.

Which is probably why they shouldn't be local functions, regardless of whether they capture. Locality isn't an excuse to hide a massive ball of logic behind a blackbox.

Restricting shadowing to local functions marked as static though removes this problem and most of the issues around shadowing.

The technical problems, maybe. But I still see the problem of having to mentally parse through the contents of the local function and upon seeing any given identifier having to double-back to the signature to see whether the lexical scoping has been reset. IPU forbid having to deal with that when nesting local functions (which I hope ain't becoming a thing.)

It's not that a might refer to multiple things, it's that you can't look at a single LOC within the body of a top-level function and have any idea what a is.

HaloFour avatar May 25 '18 19:05 HaloFour

@HaloFour

But seriously, cramming hundreds of lines into a single method definition doesn't sound like a good idea, with or without local functions

I used to believe the same. Then I spent a bunch of time coding in F# which has local functions and gradually I opened up to the advantages it provides. There is value in keeping all the logic necessary for an operation located together.

Locality isn't an excuse to hide a massive ball of logic behind a blackbox.

The reverse is polluting a type with a number of methods that are either a) useful to a single method or worse b) only legal to be called from a specific method. Over times types grow and local functions help avoid polluting types with a number of members that simply aren't generally useful. When I type . into the editor I want to see the members that are useful to me, not some one time helper that I should never be calling.

It's not that a might refer to multiple things, it's that you can't look at a single LOC within the body of a top-level function and have any idea what a is.

That is true today without local functions. In addition to a parameter or local the value a may refer to instance field, instance member, static field, or static member all of which can come from some combination of the type hierarchy, containing types or members brought in by a using static directive. Lookup is anything but simple.

jaredpar avatar May 25 '18 21:05 jaredpar

This has MANY speed advantages. We could declare a static lambda like this:

Object slow = "Any Object Here"

Action f = () => Console.WriteLine(slow); //obj is shadowed and slow in critical paths

Action f = static () => Console.WriteLine(slow); //Error slow not declared
Action f = static (x) => Console.WriteLine(x); //We dont capture, we pass as argument, which does not need to to have the exact same reference, and we dont capture more than we need.

dangi12012 avatar May 26 '18 00:05 dangi12012

@dangi12012

The proposal above doesn't touch on lambdas, although they are probably worth bringing up. You really wouldn't be able to avoid the allocation of the delegate itself, particularly if the delegate instance escaped the method in which it was declared.

HaloFour avatar May 26 '18 01:05 HaloFour

@HaloFour

This has MANY speed advantages.

Only for lambdas which this proposal doesn't currently cover. It's a natural extension to consider but the focus for now is on local functions. For local functions there really isn't a significant speed up static vs. non static in the general case.

jaredpar avatar May 26 '18 02:05 jaredpar

I think this thread presents a better case for improved capturing mechanisms of local functions rather than static modifier.

Example: instead of capturing in a compiler generated class the following could be viable

  • Generate a static method with parameters if variables are not mutated, the method is not assigned to a delegate and the number of parameters is small.

  • Generate ref parameters and ref returns if the variables are mutated and the method is not captured in a delegate and the number of parameters is small.

This does not give perfect control to those who whish it, but makes refactoring a lot easier compared to static non static local function.

As a second benefit, even local functions that are not marked static will benefit performance wise, by eliminating captures when possible.

popcatalin81 avatar May 26 '18 03:05 popcatalin81

Static delegates are semi-related (both to the static lambda discussion above and to the static local functions here): https://github.com/dotnet/csharplang/blob/master/proposals/static-delegates.md

tannergooding avatar May 26 '18 05:05 tannergooding

What about when you want to disable capturing but still have access to instance fields?

JamesNK avatar May 26 '18 10:05 JamesNK

@JamesNK I can see the merit of a "no capture, but instance" scenario... You're right in that this aspect of the proposal is being influenced - perhaps incorrectly - by the tempting convenience of that static keyword. Maybe that is the wrong keyword. But I'm at a loss for a better one.

Thinking pragmatically, perhaps static with the proposed semantic is fine - if you need it, just pass in this explicitly as an argument? Since a local function can't be overridden (polymorphism), virt-call seems an unnecessary step anyway.

mgravell avatar May 26 '18 10:05 mgravell

@popcatalin81

If a local function is not assigned to a delegate, a non-allocating capturing mechanism is already used. Though it's not done the way you suggest, instead, a struct with the state is created and passed to the local function by ref.

svick avatar May 26 '18 12:05 svick

@svick If it is implemented the way you say then there really is no need for a static Delegate. The only advantage would be that you cant create a heavyweight captured class with static by accident.

dangi12012 avatar May 26 '18 21:05 dangi12012

@mgravell

I can see the merit of a "no capture, but instance" scenario... You're right in that this aspect of the proposal is being influenced - perhaps incorrectly - by the tempting convenience of that static keyword. Maybe that is the wrong keyword. But I'm at a loss for a better one.

Let's just go through few of the options:

  1. Disallow this and pass it explicitly when you need it as noted in your post.
  2. Allow this and lose the ability to decide whether you need it but beyond that it might also confuse people.
  3. Introduce the concept of a capture list which was discussed and iirc was rejected.
  4. Use something like attributes but I think that it goes against the design of the language.
  5. Introduce a new keyword into the language but as far as I can tell people are struggling to come up with something that makes sense.

Looking at the options the first one looks like the most sensible approach to take.

iam3yal avatar May 26 '18 23:05 iam3yal

What about when you want to disable capturing but still have access to instance fields?

So here is a real world example of where no capturing + shadowing of variables would be useful, while still having access to instance fields is required.

https://github.com/JamesNK/Newtonsoft.Json/blob/ab67fd558f7f4cbfb43550e1379931fe54f7c183/Src/Newtonsoft.Json/Linq/JObject.Async.cs#L64-L74

JamesNK avatar May 27 '18 01:05 JamesNK

@JamesNK Good and all but what's the issue with something like this?

static async Task AwaitProperties(JPropertyKeyedCollection properties, Task task, int i, JsonWriter writer, CancellationToken cancellationToken, JsonConverter[] converters)
{
    // Do something with properties
}

AwaitProperties(_properties, ...)

iam3yal avatar May 27 '18 02:05 iam3yal

@eyalsk totally agree - the simplicity and obviousness of "1" (which is the original suggestion in the cross-reference issue) is a major selling point compared to all other options. Not perfect in every conceivably way, but a good balance of utility and simplicity.

mgravell avatar May 27 '18 08:05 mgravell

I have two questions related to this issue:

  1. What happens if IL generation / Jit is improved in the future and static local functions reasoning might go away? will this feature be left and then documented in the language as being there for historical reasons?

  2. And (I hate this question deeply, but I'm going to ask it for clarification reasons) Why is an analyzer insufficient for the scenarios listed as arguments for static local functions?

popcatalin81 avatar Jun 28 '18 13:06 popcatalin81