Introduce guessed receiver types
Motivation
This PR adds an experimental feature, guessed receiver types, where we try to guess the type of a receiver based on its identifier.
Implementation
The relevant part of the implementation is all in TypeInferrer; everything else just shows users why we picked a certain type.
The idea is to try to guess the types like this:
- Take the raw receiver slice
- Sanitize that name to be camel case and discard @ symbols
- First, try to resolve the name inside the current nesting. If we find something, return that
- Otherwise, search for the first type that matches the unqualified name of the identifier
More details in the Markdown documentation.
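As a rough illustration of the steps above, here is a minimal, self-contained sketch. It is not the code in TypeInferrer: the flat `KNOWN_TYPES` list stands in for the real index, the helper names are hypothetical, and resolution inside the nesting is simplified to walking the lexical namespaces from innermost to outermost.

```ruby
# Hypothetical stand-in for the index: a flat list of fully qualified type names.
KNOWN_TYPES = ["Pathname", "Shop::Product", "Billing::Product", "Shop::Checkout::Cart"].freeze

# Turn a raw receiver slice such as "@line_item" into a constant-like name ("LineItem").
def sanitize_receiver_name(raw)
  raw.delete("@")
    .split("_")
    .reject(&:empty?)
    .map { |part| part[0].upcase + part[1..] }
    .join
end

# Simplified resolution: try the innermost namespace first, then walk outwards.
def resolve_in_nesting(name, nesting, known_types)
  nesting.length.downto(0) do |depth|
    candidate = (nesting.first(depth) + [name]).join("::")
    return candidate if known_types.include?(candidate)
  end
  nil
end

# Resolve inside the nesting first; otherwise fall back to the first type
# whose unqualified name matches.
def guess_receiver_type(raw, nesting, known_types)
  name = sanitize_receiver_name(raw)
  resolve_in_nesting(name, nesting, known_types) ||
    known_types.find { |type| type.split("::").last == name }
end

p guess_receiver_type("pathname", [], KNOWN_TYPES)
p guess_receiver_type("@product", ["Billing"], KNOWN_TYPES)
p guess_receiver_type("product", ["Other"], KNOWN_TYPES)
```

In this sketch the second call resolves inside the `Billing` nesting, while the third finds nothing under `Other` and falls back to the first type whose unqualified name is `Product`.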
Validation
I used Spoom's access to the Sorbet LSP to compare the guessed types against the actual types reported by Sorbet. I also compared 4 approaches.
In the Ruby LSP repo, these are the accuracy results for each approach:
- First resolve, then fall back to unqualified name: 15% of correct types
- Unqualified only: 11%
- Resolve with nesting only: 9%
- Fuzzy search: 2% (in addition to having the worst accuracy, fuzzy search was also unbearably slow)
In Core, the analysis script took way too long to finish, so I sampled a subset of the codebase. The results there were worse than in the Ruby LSP codebase, peaking at about 5% of correct types.
Surely, the level of accuracy will vary a lot between different codebases. That said, I still believe the experiment is worth a try and would love to hear feedback from users about its usefulness.
Script:
# typed: strict
# frozen_string_literal: true

require "spoom"
require "ruby_lsp/internal"

class Visitor < Prism::Visitor
  extend T::Sig

  sig { returns(T.nilable(RubyLsp::Document)) }
  attr_accessor :document

  sig { returns(Integer) }
  attr_reader :total, :correct

  sig { returns(T::Hash[String, T.nilable(String)]) }
  attr_reader :comparison

  sig { params(inferrer: RubyLsp::TypeInferrer, lsp_client: Spoom::LSP::Client).void }
  def initialize(inferrer, lsp_client)
    @inferrer = inferrer
    @lsp_client = lsp_client
    @total = T.let(0, Integer)
    @correct = T.let(0, Integer)
    @document = T.let(nil, T.nilable(RubyLsp::Document))
    super()
  end

  sig { params(node: Prism::CallNode).void }
  def visit_call_node(node)
    receiver_loc = node.receiver&.location
    return super unless receiver_loc

    receiver = node.receiver
    unless receiver.is_a?(Prism::CallNode) || receiver.is_a?(Prism::LocalVariableReadNode) ||
        receiver.is_a?(Prism::InstanceVariableReadNode)
      return super
    end

    hover = @lsp_client.hover(T.must(@document).uri.to_s, receiver_loc.start_line - 1, receiver_loc.start_column)

    if hover
      hovered_type = if /returns\((.*)\)/ =~ hover.contents
        T.must(T.must(hover.contents.match(/returns\((.*)\)/))[1])
      else
        hover.contents
      end
      return super if hovered_type == "T.untyped" || hovered_type == "T::Private::Methods::DeclBuilder"

      loc = T.must(node.message_loc)
      node_context = T.must(@document).locate_node(
        {
          line: loc.start_line - 1,
          character: loc.start_column,
        },
        node_types: [Prism::CallNode],
      )
      type = @inferrer.infer_receiver_type(node_context)
      @total += 1

      if type
        parts = type.split("::")
        parts.reject! { |e| e.include?("<Class:") }
        corrected_type = parts.join("::")

        if hovered_type.include?(corrected_type)
          @correct += 1
        end
      end
    end

    super
  end
end

index = RubyIndexer::Index.new
index.index_all
inferrer = RubyLsp::TypeInferrer.new(index)

workspace_path = Dir.pwd
client = Spoom::LSP::Client.new(
  Spoom::Sorbet::BIN_PATH,
  "--lsp",
  "--enable-all-experimental-lsp-features",
  "--disable-watchman",
)
client.open(workspace_path)

begin
  visitor = Visitor.new(inferrer, client)
  files = Dir.glob("#{workspace_path}/**/*.rb")
  RubyVM::YJIT.enable

  Signal.trap("INT") do
    puts "Total: #{visitor.total}"
    puts "Correct: #{visitor.correct}"
    puts "Accuracy: #{100 * (visitor.correct.to_f / visitor.total)}"
    client.close
    exit
  end

  files.each_with_index do |file, index|
    document = RubyLsp::RubyDocument.new(
      source: File.read(file),
      version: 1,
      uri: URI::Generic.from_path(path: File.expand_path(file)),
    )
    visitor.document = document
    Prism.parse_file(file).value.accept(visitor)
    print("\033[M\033[0KCompleted #{index + 1}/#{files.length}")
  end

  puts "Total: #{visitor.total}"
  puts "Correct: #{visitor.correct}"
  puts "Accuracy: #{100 * (visitor.correct.to_f / visitor.total)}"
ensure
  client.close
end
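To make the scoring rule in the script easier to follow, here is a small standalone sketch of how a guess is counted as correct. The helper names are hypothetical, and the singleton type string `Foo::<Class:Foo>` in the example is an assumption about the inferrer's output format, based on the `<Class:` filter in the script. The logic mirrors the script: singleton parts are dropped and the remainder is matched as a substring of the type Sorbet reports.

```ruby
# Drop singleton-class parts such as "<Class:Foo>" from a guessed type name.
def normalize_guess(type)
  type.split("::").reject { |part| part.include?("<Class:") }.join("::")
end

# A guess counts as correct when the normalized name appears anywhere
# in the type string reported by Sorbet.
def correct_guess?(guessed, hovered)
  hovered.include?(normalize_guess(guessed))
end

p normalize_guess("Foo::<Class:Foo>")
p correct_guess?("Pathname", "T.nilable(Pathname)")
p correct_guess?("Foo", "FooBar")
p correct_guess?("String", "Integer")
```

Because the comparison is a substring match, wrapped types like `T.nilable(Pathname)` count as hits, but so would an unrelated type whose name merely contains the guess (the `Foo` versus `FooBar` case), so the reported accuracy is, if anything, slightly generous.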
Automated Tests
Added tests.
Manual Tests
Type any existing class name as a variable. After typing a dot, you should see completion options for that type (e.g.: pathname.).
Another thing we could do, especially for the benefit of tests, is to match on a type name followed by a number, e.g. product_1.
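A minimal sketch of what that suggestion might look like, assuming the numeric suffix is stripped before the identifier is camel-cased. Both helper names are hypothetical and nothing here is part of the PR.

```ruby
# Hypothetical helper: drop a trailing numeric suffix such as "_1" or "2".
def strip_numeric_suffix(identifier)
  identifier.sub(/_?\d+\z/, "")
end

# Discard @ symbols, strip the suffix, then camel-case what remains.
def guessable_type_name(identifier)
  strip_numeric_suffix(identifier.delete("@"))
    .split("_")
    .reject(&:empty?)
    .map { |part| part[0].upcase + part[1..] }
    .join
end

p guessable_type_name("product_1")
p guessable_type_name("@line_item_2")
p guessable_type_name("pathname")
```

One trade-off of this approach: identifiers that legitimately end in digits (for example `sha256`) would also lose their suffix, so a real implementation might try the unmodified name first.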
Do you think we could package the script into a flag, like ruby-lsp --report-guess-type-accuracy? (return early if spoom is not available)
It will help us continuously evaluate this feature in the future, and we can ask some community users who also use Sorbet to give us results too.
Talked to Stan and we agreed to ship this and follow up with an executable to estimate the type accuracy of guessed types.