marisa-trie icon indicating copy to clipboard operation
marisa-trie copied to clipboard

Needs find-longest key method

Open KCzar opened this issue 10 years ago • 5 comments

Or we could just call it a "longest" method - or "prefix" method (singular). Is there an efficient way to find the longest key that is a prefix of a string that I'm overlooking? It could be done with the current implementation by simply using prefixes, then finding the longest match, but there should be a much more efficient way possible by taking advantage of the Trie properties.

I can give it a go if you like - unless it's already implemented and I'm simply missing it.

KCzar avatar Oct 12 '14 18:10 KCzar

Not sure if I want the same thing, but it sounds like it -

given

Trie([u'some.long.name', u'some.short.name']) I'd like a way to get back

[u'some.long', u'some.short'])

stuaxo avatar Apr 28 '16 09:04 stuaxo

I'm not sure if this is possible using the public API of marisa-trie the C++ library. Need to check and then maybe forward the issue upstream.

superbobry avatar May 05 '16 15:05 superbobry

When I originally posted that, I for some reason didn't realize 'prefixes' returned them in order (duh), so what I needed was basically already implemented satisfactorily.

What you stuaxo want is something else. I'm not completely sure what you want though...

KCzar avatar May 05 '16 16:05 KCzar

In the end, I implemented this using a different library, + what I needed turned out to be slightly different - I needed something like this:

t = Trie([u'some.long.name', u'some.short.name'])
t.longest_unique_prefixes()
u['some.']
t = Trie([u'some.long.name', u'some.short.name', 'blah.1', 'blah.2'])
t.longest_unique_prefixes()
u['some.', 'blah.']

stuaxo avatar May 05 '16 16:05 stuaxo

^ I guess what I ~~wanted~~ needed here is the 'roots' of the Trie.

stuaxo avatar May 05 '16 16:05 stuaxo