marisa-trie
marisa-trie copied to clipboard
Needs find-longest key method
Or we could just call it a "longest" method - or "prefix" method (singular). Is there an efficient way to find the longest key that is a prefix of a string that I'm overlooking? It could be done with the current implementation by simply using prefixes, then finding the longest match, but there should be a much more efficient way possible by taking advantage of the Trie properties.
I can give it a go if you like - unless it's already implemented and I'm simply missing it.
Not sure if I want the same thing, but it sounds like it -
given
Trie([u'some.long.name', u'some.short.name'])
I'd like a way to get back
[u'some.long', u'some.short'])
I'm not sure if this is possible using the public API of marisa-trie
the C++ library. Need to check and then maybe forward the issue upstream.
When I originally posted that, I for some reason didn't realize 'prefixes' returned them in order (duh), so what I needed was basically already implemented satisfactorily.
What you stuaxo want is something else. I'm not completely sure what you want though...
In the end, I implemented this using a different library, + what I needed turned out to be slightly different - I needed something like this:
t = Trie([u'some.long.name', u'some.short.name'])
t.longest_unique_prefixes()
u['some.']
t = Trie([u'some.long.name', u'some.short.name', 'blah.1', 'blah.2'])
t.longest_unique_prefixes()
u['some.', 'blah.']
^ I guess what I ~~wanted~~ needed here is the 'roots' of the Trie.