jruby icon indicating copy to clipboard operation
jruby copied to clipboard

Including a module using Include_package does not immediately bring names in it to new context

Open jmiettinen opened this issue 6 years ago • 9 comments

Environment

jmiettin% bin/jruby --version                                          
jruby 9.2.5.0 (2.5.0) 2018-12-06 6d5a228 Java HotSpot(TM) 64-Bit Server VM 25.201-b09 on 1.8.0_201-b09 +jit [darwin-x86_64]
jmiettin% uname -a
Darwin jmiettin.local 18.2.0 Darwin Kernel Version 18.2.0: Thu Dec 20 20:46:53 PST 2018; root:xnu-4903.241.1~1/RELEASE_X86_64 x86_64

Nothing relevant installed, just using the bare-bones JRuby installed with ruby-build.

Expected Behavior

  • My expectation is that using include_package within a module brings all the classes in that package into that module and when that module is then included elsewhere, those names are also brought there.

Actual Behavior

  • The classes added by include_package to Kernel are not immediately available. They show up available after they're first used through the module they're included into.
  • The script to reproduce this is shown in the end.

The following script prints:

Unable to find AtomicInteger
Able to find AtomicInteger

When I'd expect it to print one of those messages twice.

The script to reproduce:

# encoding: utf-8
module Foo
  include_package 'java.util.concurrent.atomic'
end

module Bar
  include Foo
end

begin
  Bar::AtomicInteger.new(2).get
  puts 'Able to find AtomicInteger'
rescue NameError => _
  puts 'Unable to find AtomicInteger'
  # Should this perhaps work?
end
Foo::AtomicInteger.new(2).get
# This works now. Should it?
if Bar::AtomicInteger.new(2).get
  puts 'Able to find AtomicInteger'
else
  puts 'Unable to find AtomicInteger'
end

jmiettinen avatar Feb 06 '19 15:02 jmiettinen

The behavior you're seeing (unfortunately) fits what I would expect.

Bar::AtomicInteger fails the first time because the AtomicInteger constant has not been defined, and include_package defined its const_missing on Foo, not on Bar.

Foo::AtomicInteger works as expected.

Bar::AtomicInteger now works the second time because Foo::AtomicInteger is now defined, and Bar includes Foo.

It would be great if we could just fill in all the constants with classes from a given package. Sadly, there's not actually a way to do what you (and we) would like this to do.

Because new classes may be loaded into the JVM at any time, there's no API to get a complete listing of all classes in a given package. That's why this works using const_missing; we don't know until a class is requested whether it exists or not.

There are ways to hack around it, like reading every single entry in every jar file that's loaded and inferring from .class files that there are Java classes...but they're not super reliable and usually require maintaining a cache somewhere (since building that list of classes can be expensive).

There may also be more capabilities for querying a package in Java 9+, at least for packages that are properly described and modularized. However I don't think we'll see that as a typical case for many years.

headius avatar Feb 06 '19 15:02 headius

I can understand that from the implementation point of view. In JVM, classes can be created out of thin air (same for Ruby-classes).

It's just the that semantics of include_package are not very clear to me. It's implemented through const_missing, but is that the definition? Would it be surprising if such changes (new places to look constants from) would transfer to other contexts where such a module is included as that's what happens if we've already used the constant through the original module?

I am not sure if there are any similar methods in MRI Ruby.

jmiettinen avatar Feb 06 '19 15:02 jmiettinen

We certainly could define different semantics for include_package, but we are somewhat limited by the available hook mechanisms. I have not researched the state of the art in querying for all JVM classes in a given package...there may be a solution that direction.

The behavior you are seeing does match const_missing because that's really all it does. Here's the code for include_package: https://github.com/jruby/jruby/blob/master/core/src/main/ruby/jruby/java/core_ext/module.rb#L10-L50

So, there's two directions this could go. If we can find a way to discover all classes, we could just define the constants. If we can't do that, then perhaps there are behavioral tweaks that make the behavior of include_package less surprising?

headius avatar Feb 06 '19 15:02 headius

FWIW, even javac works this way; importing all classes from a package with * just forces javac to search all starred packages in turn for any unresolved class names you use. Unlike Ruby, they have a compile phase where they can make that determination.

headius avatar Feb 06 '19 15:02 headius

This will sound snarky to newer users but I sometimes think we should remove include_package because it just represents something difficult to do with Java. It creates confusion and by the time we explain it they just explicitly pull in the classes they use. All new users assume this is an analogue to import something.* but if you don't understand how javac resolves that stuff you think there is some reflective capability in Java to get this info.

enebo avatar Feb 06 '19 15:02 enebo

As a followup to my last comment I think it exists mostly because we really wish it was available and people assume there is a way to do it. The fact is misses expectations ends up creating some amount of frustration. Even if we were really good at documenting stuff, I think, people would assume behavior based on the name alone withoutreading what it actually does. Anyways some random thoughts since we see this confusion happen over and over.

enebo avatar Feb 06 '19 16:02 enebo

@enebo Perhaps a better name? include_package does make it sound like it pulls in everything, like an include. Maybe something like delegate_to_package or lazy_package_import?

headius avatar Feb 06 '19 16:02 headius

@headius yeah I don't know. lazy definitely would imply something different than what include_package implies. To be honest some people love to use include_package for tiny scripts which are just java classes. OTOH people then also get confused when they to it from Object and cannot figure out the naming conflicts with String/Integer/etc...

enebo avatar Feb 06 '19 16:02 enebo

I built up a jruby application via (I figured out I have to use java_import rather than import):

java_import javax.swing.JButton
java_import javax.swing.JFrame
java_import javax.swing.JLabel
java_import javax.swing.JPanel
java_import javax.swing.JTextArea
java_import javax.swing.JScrollBar
java_import javax.swing.JTextField
java_import javax.swing.JSpinner

I thought I could simplify this to that:

java_import javax.swing.*

That works in vanilla java yes? That we can use a *.

Instead jruby told me to use include_package. Hmmm.

Would it be possible to use '*' as an automatic call to include_package internally? We now kind of have to remember 3 different ways for jruby ... the old import variant... then java_import ... and include_package.

So perhaps that should be simplified in the long run.

rubyFeedback avatar Dec 20 '21 00:12 rubyFeedback