v6d
v6d copied to clipboard
Improve the implementation of `find_most_precise_match`
Describe your problem
Improve the implementation of find_most_precise_match
using a trie-like data structure, rather than traverse a sorted dict in reverse order every time.
https://github.com/v6d-io/v6d/blob/43537d34a3bd598da66ac5fc28d5d400503a1594/python/vineyard/core/utils.py?_pjax=%23js-repo-pjax-container%2C%20div%5Bitemtype%3D%22http%3A%2F%2Fschema.org%2FSoftwareSourceCode%22%5D%20main%2C%20%5Bdata-pjax-container%5D#L20-L36
Additional context
Add any other context about the problem here.
Maybe I can try this. After this week of final exam, ε=ε=ε=┏(゜ロ゜;)┛
Maybe I can try this. After this week of final exam, ε=ε=ε=┏(゜ロ゜;)┛
Hi @Y-jiji Thanks for being interested in this issue! I would add more detail about the issue later today.
The find_most_precise_match
in the following way:
Image we have two object types vineyard::python::Tensor
and vineyard::python::DataFrame
(note that they have the same prefix/namespace vineyard::python
). Now we have a generic python value resolver that has the following signature:
def python_value_resolver(obj: vineyard.Object) -> object:
And has been registered as
get_resolver_context().register('vineyard::python', python_value_resolver)
Now we have a specialized resolver which takes a vineyard::python::Tensor
and resolves it to pytorch.Tensor
, then we could have the following resolver:
def pytorch_tensor_resolver(obj: vineyard.Object) -> object:
get_resolver_context().register('vineyard::python::Tensor', pytorch_tensor_resolver)
Consider we have a vineyard object with type vineyard::python::Tensor
, then we need to find the most specified resolver for it (the pytorch_tensor_resolver
in this case). We currently achieve the goal by use a sorted dict as the registry, and traverse the dict in the reversed order to find the first matched resolver (the resolver key is the prefix of the object type name), that is exactly how current find_most_precise_match
works.
You can see that there would be many redundant comparison in the process and it can be optimzied by some string processing/search tricks to make it faster.