ruby-style-guide
ruby-style-guide copied to clipboard
`extend self` and `module_function` are not the same
extend self and module_function are not the same. The guide says:
Favor the use of module_function over extend self when you want to turn a module's instance methods into class methods. https://github.com/bbatsov/ruby-style-guide#module-function
Unfortunately, extend self keeps the instance methods public, but module_function makes them private.
# `extend self` makes the copies public
module Pub
extend self
def x; end
end
class PubClass
include Pub
end
PubClass.new.x #=> nil
# `module_function` makes the copies private
module Priv
def x; end
module_function :x
end
class PrivClass
include Priv
end
PrivClass.new.x
# NoMethodError: private method `x' called for #<PrivClass:0x007fba1c0e7e90>
Is making them private the point?
Per the ruby documentation:
The instance-method versions are made private. http://ruby-doc.org/core-2.2.2/Module.html#method-i-module_function
There is another difference that might matter:
No method copying: If you want to modify a method's behavior via meta-programming, you only need to do this in one place
(from http://idiosyncratic-ruby.com/8-self-improvement.html#advantages-of-extend-self)
Can someone from maintainers give an answer on this question? Why do you need to prefer one functionality over another. How does it go together?
I'm still hoping to know why as well.
Confusing me too
I don't see any situation where extend self would actually be the correct choice.
- You have a module that provides a utility method such as
Math.pi - You want to use
piin the instance methods of your own class without having to typeMath.piall the time
Is there any other situation where you would want to take a module method like Math.pi or FileUtils.mkdir_p as an instance method in your own class?
A: You create a local method def pi; Math.pi; end
- Constant lookup + Double method lookup on use
- It makes no sense to have your instances offer
.piso you make it private, now you have the same situation as with module_function but with extra lookups.
B: You include Math
def self vs extend self vs module_function
using def self.pi
include Math will not give you any sort of accessor to .pi, not YourClass.pi, not instance.pi, nothing. You must use Math.pi and that's it.
using extend self with public methods
include Math will give your instances a public instance method .pi.
using extend self with private methods
include Math will give you a private instance method .pi but Math.pi now says "private method pi called"
using module_function :pi
include Math will give your instances a private .pi method and Math.pi still works just fine. You can do something like def pi; super + 1; end in your class to overload it or make it public.
When to use what?
- You want to provide a module that adds instance methods to whatever is including it:
module Foo
private # if you want them to be private
def foo
end
end
- You want to provide a module that adds instance methods and has some module level utility methods of its' own that should not end up into any instances of classes that include the module:
module Foo
# won't appear as YourClass.foo or your_instance.foo when you `include Foo`
def self.foo
"foo"
end
def bar
"the bar is open"
end
end
- You want to provide a module that offers singleton/utility methods. If someone happens to include your module, it does pretty much what you would expect it to do. This is what stdlib does.
module Foo
# Foo.all_good? works, including Foo will give instances a private all_good? method
module_function
def all_good?
true unless false
end
end
This got a bit lengthy and off topic. After writing this, my solution to the original issue would be to change it to "never use extend self".
@kke but what if I need private methods in the module?
@ifokeev you shouldn't need instance methods in the module. Private module methods can probably be achieved with ~~private and module_function~~ and your code will still work if included.
Edit: nope, use module_function and private_class_method
@kke show me, please, the correct way of using module_function if I have one public and 10 private methods
@ifokeev You need to use private_class_method in addition.
module Bedtime
def now?
after_midnight?
end
module_function :now?
def after_midnight?
Time.now.hour < 12
end
module_function :after_midnight?
private_class_method :after_midnight?
def work_in_the_morning?
(0..4).cover?(Time.now.wday)
end
module_function :work_in_the_morning?
private_class_method :work_in_the_morning?
end
class Test
include Bedtime
def print
puts now?
end
end
# Test.new.print => works
# Test.new.now? => private method now? called
# Test.now? => undefined method 'now?'
# Bedtime.after_midnight? => private method after_midnight? called
# Bedtime.now? => works
Btw, FileUtils in stdlib has defined private_module_function that does that with single line: https://apidock.com/ruby/v1_9_3_392/FileUtils/private_module_function/class
@kke your code doesn't look good. That's why I prefer extend self. I didn't find any other solution.
And your code doesn't work correctly :)
You can write it a bit shorter:
module Foo
private_class_method def bar(foo)
puts "yo #{foo}"
end
module_function :bar
end
Maybe:
module PrivateModuleMethod
def module_private(sym)
module_function sym
private_class_method sym
end
end
module Foo
extend PrivateModuleMethod
module_private def foo(bar)
puts bar
end
end
I suppose that with some trickery it could even be def module_private.foo(bar)
But that's another story. The story here is that extend self almost never does what you actually wanted
(or should want) to do. It's pretty but incorrect.
After 14 months, this issue has not attracted the attention of any maintainers, so I'm closing it. Thanks to everyone who provided feedback. I think if someone were to summarize the discussion here, a useful PR could be written.
After 14 months, this issue has not attracted the attention of any maintainers, so I'm closing it. Thanks to everyone who provided feedback. I think if someone were to summarize the discussion here, a useful PR could be written.
The maintainers are extremely busy. PR welcome!
module_function does not work with attr_accessor, visibility modifiers and methods which define new ones (ex, ActiveSupport's Module#delegate). So module_function is useful only for very simple cases, while extend self provides robust features.
@jaredbeck @bbatsov Should not this rule be removed?
I wouldn't mind resurrecting this thread.
tldnr: I would reverse that rule, i.e. favor extend self and avoid module_function, or at least not provide any definite recommendation and list the caveats of each if there is disagreement.
I ❤️ extend self.
First, it adds zero cognitive load. Everyone knows what extend does (or should), everyone knows what self is within a Module (or should). module_function requires extra knowledge and having a precise mental model of what it does is difficult.
Failures
A major plus for extend self: it works 🎉
module_function does not work as expected in many cases:
module MyMod
def foo
end
module_function # called too late to work for :foo
include OtherMod # won't work
attr_reader :bar # won't work
alias baz foo # won't work
delegate ... # won't work
def_delegator ... # won't work either
end
Visibility
To illustrate visibility issues, here's a typical example in the wild:
module Concurrent
def dataflow(*inputs, &block)
dataflow_with(Concurrent.global_io_executor, *inputs, &block)
end
module_function :dataflow
def dataflow_with(executor, *inputs, &block)
call_dataflow(:value, executor, *inputs, &block)
end
module_function :dataflow_with
private
def call_dataflow(method, executor, *inputs, &block)
# ...
end
module_function :call_dataflow
end
extend self
What if instead of the bunch of module_function, the author had used extend self?
The only downside of extend self is that if a user actually include Concurrent, then dataflow and dataflow_with are public.
I don't see this as an important downside because I fail to see when or why anyone would call some_instance.dataflow instead of Concurrent.dataflow, but if it happens, it's the same public code being called anyways.
For anyone who really cares, it's easily remedied too:
class MyClass
include Concurrent
private *Concurrent.instance_methods
end
# or the module can do it itself:
module Concurrent
extend self
def self.included(base) # Mark our methods as private for anyone including us
base.private *instance_methods
end
end
module_function gotchas
Did you spot the problem in the example code?
The main issue is that one can actually call Concurrent.call_dataflow, even though that method is meant to be private. This feels like a more important issue than someone calling a public method through an instance.
I find the multiple calls to module_function ugly and error prone (best not forget a single one!). Note that if you instead call module_function at the beginning, it won't work either, because confusingly module_function ignores private methods:
module Mod
module_function
private def private_1 # Singleton method will be created, but will be public
end
private
def private_2 # No singleton method created at all!
end
end
Note: private_1 is not ignored because it is first created as public and then right after is made private, while private_2 is created private and is thus ignored unless one calls module_function :private_2 explicitly 🤯
Conclusion
I personally never use module_function and I had to review what it did exactly before commenting here. I will now proceed to forget its confusing intricacies and continue using extend self
@bbatsov did you get a chance to read this?
Just now.
While I no longer remember the original reasoning I do recall it was a combination of several factors:
- some of the points made by @kke
- the fact that with
extend selfyou end up with two version of each method, which is rarely what you need in a utility type of module - the
extend selfidiom is confusing for many people; obviously that's subjective and impacts the most people without much Ruby experience. On the other handmodule_functionclearly communicates the intent, at least IMHO.
Looking at the examples, however, it seems that many people are discussing modules that go beyond a pure utility module where module_function excels, so at the very least we should clarify this aspect of the recommendation and the reasoning behind it. I definitely won't argue there are cases where extend self will be needed or will be the preferable approach.
First, it adds zero cognitive load. Everyone knows what extend does (or should), everyone knows what self is within a Module (or should). module_function requires extra knowledge and having a precise mental model of what it does is difficult.
Well, in my experience that has rarely been the case. :laughing:
The only downside of extend self is that if a user actually include Concurrent, then dataflow and dataflow_with are public.
Yeah, that's what I meant in the second point.
Btw, these days you can also use module_function like this:
module_function def dataflow(*inputs, &block)
dataflow_with(Concurrent.global_io_executor, *inputs, &block)
end
@jaredbeck
Is making them private the point?
Yep, that was the point. :smile:
As remarked earlier I didn't expect people would be including utility modules, as those are typically used stand-alone. Obviously, if also plan to re-use some methods in classes the module_function advice doesn't make much sense.
Yep, [making them private] was the point. 😄
Cool. Yeah, are we ready to turn this discussion into a PR, and add an explanation to this rule?
I have a case with modules inheritance, without classes.
module BaseSpecHelper
def environment_specs_coefficients
{
-> { ENV['CI'] } => 1,
-> { RUBY_PLATFORM == 'java' } => 3,
-> { Gem::Platform.local.os == 'darwin' } => 1
}
end
extend self
end
# some specs using `BaseSpecHelper.environment_specs_coefficients`
module CLISpecHelper
include BaseSpecHelper
def environment_specs_coefficients
super.merge(
lambda do
RUBY_PLATFORM == 'java' && ENV['CI'] && is_a?(Filewatcher::Spec::ShellWatchRun)
end => 2
)
end
extend self
end
# some specs for CLI
I don't need classes here, just scoped static methods.
I'd be glad to resolve it with module_function, class << self or something else, but I didn't find any way do this except extend self.
So, extend self seems pretty uniq and if it's so — there is should not be any cop with suggestion to replace it, at least without surrounding code analysis.
I agree with the above. I have an issue with Module B including Module A. I then have Class A including Module B.
See this example:
module ModuleA
private_class_method
module_function
def insert
puts "inserting"
end
end
module ModuleB
include ModuleA
private_class_method
module_function
def select
puts "selecting"
end
end
class ClassA
include ModuleB
# lets consider this a base class
def common_base_class_method
puts 'base class'
end
end
class ClassB
include ClassA
def run
select
insert #fails here with NoMethodError!!!
end
end
I can't use Module A function in Class A with module_function. I need extend self.
If you're trying to prevent people from creating 2 copies of methods, move this rule to RuboCop-performance, please.
I think there's enough of an argument that extend self and module_function are NOT the same and doesn't work in some cases.
Please undo this or provide a way to disable this in the rubocop.yml file.
or provide a way to disable this in the rubocop.yml file.
You can do it:
Style/ModuleFunction:
Enable: false
Or even do inline a code:
# rubocop:disable Style/ModuleFunction
extend self
# rubocop:enable Style/ModuleFunction
@AlexWayfer Here you go!
module BaseSpecHelper
module_function
def environment_specs_coefficients
1
end
end
module CLISpecHelper
extend BaseSpecHelper
module_function
def environment_specs_coefficients
super + 1
end
end
BaseSpecHelper.environment_specs_coefficients # => 1
CLISpecHelper.environment_specs_coefficients # => 2
I'm not insisting to burn extend self with fire. @marcandre's arguments show a number of serious culprits of module_function.
Still, I believe both extend self and module_function have their usages.
@pirj: Your example is error prone:
class Anything
include CLISpecHelper
end
Anything.new.environment_specs_coefficients # => Error, no `super` method
both extend self and module_function have their usages.
Which ones are they?
It's one datapoint, but I've survived all my Ruby programming years so far without module_function but I can't with extend nor with self.
Again - I'll clarify the this guideline about module_function was always intended for the case of a module with functions (class methods) that are only meant to be used directly (with the module as the receiver). I see here that most people are discussing including modules in classes and what happens then, which is completely orthogonal IMO. It may be that my experience was such that I never wanted to mixin class methods as instance methods, but I do understand such use-cases.
One thing is clear - we need more precises wording at this point. :-)
@marcandre
Anything.new.environment_specs_coefficients
NoMethodError: private method `environment_specs_coefficients' called for #<Anything:0x000056243acd4768>
and this is intentional. environment_specs_coefficients is intended to be called on CLISpecHelper/BaseSpecHelper, not to be included.
Just tested on 2.5 and 3.0. I was frightened this might have changed at some point of time.
The purpose of this is to pass BaseSpecHelper or CLISpecHelper as an argument, so those methods can be called on it. Not to be included so that those methods are called on self.
Sorry, forgot that they would be private. So a method from Anything calling environment_specs_coefficients would trigger an error.
class Anything
include CLISpecHelper
def foo
environment_specs_coefficients
end
end
Anything.new.foo # => no super defined
Again - I'll clarify the this guideline about
module_functionwas always intended for the case of a module with call functions that it's meant to be used directly.
It's not clear to me what use case there is. For example, are there any example of uses of module_function in the RuboCop source where extend self wouldn't be at least as good if not better?
If they are truly meant to be used directly, and only directly, why not define them as singleton methods directly?
module Example
class << self
def call_me_directly
#...
end
end
end