did_you_mean
did_you_mean copied to clipboard
exponential growth in did-you-mean based on object graph depth * width
I’m seeing exponential growth in time when generating message
. Reproduction and timings here:
https://gist.github.com/zenspider/4fbf1a5d237095c52184a57c86cdb5eb
Thanks for reporting. I was actually able to reproduce this issue without did_you_mean
:
diff --git a/trip_dispatcher.rb b/trip_dispatcher.rb
index 09ddc8e..56166ec 100644
--- a/trip_dispatcher.rb
+++ b/trip_dispatcher.rb
@@ -35,7 +35,7 @@ end
p width
(10..50).step(10).each do |depth|
t0 = Time.now
- Build.new(depth, width).go rescue $!.message # HACK: this MUST be accessed
+ Build.new(depth, width).inspect
puts "%3d = %7.3f" % [depth, (Time.now - t0)]
end
puts
Alternatively, you could also use the --disable-did_you_mean
option to disable it:
$ ruby --disable-did_you_mean trip_dispatcher.rb
The issue is that there is a complex circular reference between Graph
and Node
, and the #inspect
method on the Graph
object ends up creating an extremely large string object. #inspect
is smart enough to handle simple circular references, but in this particular case, it's not quite working as expected.
What is unexpected here though is that DYM is making the situation worse by calling the #message
method twice. This is a bit trycky to deal with as DYM's #to_s
method would have to pass the message object around to avoid extra memory allocation.
Apparently, JRuby handles this better than MRI but crashes once it runs out of memory. I'll look to see if the C function that gets invoked under the hood could be improved.
@yuki24 did you find any solution for this problem?
Unfortunately not. It's a bit hard to detect a circular reference in DYM and I'm not sure if there's anything we can do other than making DYM faster (which is also hard at this point).
I don't think that is a problem o DYM though, we can reproduce that without it just by using inspect
. I was just wondering if you find the source of the problem at least.
I think I'm running into this same issue with an object graph that contains a lot of back-references:
# Breakpoint set just before calling trace.message on a NameError
trace.receiver.inspect.length
# five minutes later...
# => 3542020365
# 3.54GB of text!
This is the line of code that hangs when it calls trace.message
: https://github.com/mike-bourgeous/mb-util/blob/891225432f51e9448a639e3a7d1f6af04ba9b187/lib/mb/util/text_methods.rb#L83
I'm going to try working around the issue by defining a simpler #inspect
on all of my classes.
Obviously this isn't the fault of DoYouMean, but does DYM need to use inspect
, or could it use another approach, e.g. concatenating instance_variables
and methods
and so on?
Should we file an issue with Ruby to get an implementation of inspect
that doesn't explode like this?
Also, could a means be provided for bypassing DYM at runtime?
Ruby's #inspect
is already smart enough to detect circular references:
$ ruby -v
ruby 2.7.3p183 (2021-04-05 revision 6847ee089d) [arm64-darwin20]
a = {}
a[0] = a
a.inspect # => "{0=>{...}}"
class Foo
attr_accessor :bar
end
foo = Foo.new
foo.bar = foo
foo.inspect # =>"#<Foo:0x000000014a37f668 @bar=#<Foo:0x000000014a37f668 ...>>"
The only places we use #inspect
are in:
https://github.com/ruby/did_you_mean/blob/9c4ccac47de4722ed22d1a33791e641bc4ba9664/lib/did_you_mean/spell_checkers/key_error_checker.rb#L10-L18
which I do not believe is as common as NameError
or NoMethodError
. As noted above, there is very little we can do to optimize this behaviour.
I just think that if you already know your objects have circular or back references, it would be your responsibility to make sure the #inspect
method wouldn't explode because of the object references.
Ruby's #inspect is already smart enough to detect circular references
huh... that was also me, but like in the 1.8 days... So maybe something has regressed performance wise in #inspect
since then?
For now I have removed the use of #inspect
method to avoid running into this situation.
@zenspider Would you mind creating a new gist for the code you uploaded 6 years ago if you still have it? I know it's been too long, but there may be something we could do to make #inspect
aware of circular references more accurately.
@yuki24 unfortunately, I don't think I have any of the context to reproduce the gist... Tho I might have backups somewhere that go back to then... lemme poke around, but I wouldn't hold my breath.