tantivy

`TokenizerManager` name is a bit misleading

Open fmassot opened this issue 2 years ago • 3 comments

The `TokenizerManager` is, in fact, more of a `TextAnalyzerManager`:

pub struct TokenizerManager {
    tokenizers: Arc<RwLock<HashMap<String, TextAnalyzer>>>,
}

I would be in favor of renaming it, though I don't fully understand the implications.

What do you think?

fmassot avatar Mar 20 '23 20:03 fmassot
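To make the naming point concrete, here is a minimal sketch of a registry with the same field layout as the struct quoted above. The `TextAnalyzer` type here is a hypothetical stand-in (a single `description` field) so the example compiles on its own, and the `register`/`get` signatures are simplified assumptions rather than tantivy's exact API. What it shows is that every value stored and returned is an analyzer, not a bare tokenizer.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for tantivy's `TextAnalyzer`; the real type wraps a
/// tokenizer plus its token filters.
#[derive(Clone, Debug, PartialEq)]
pub struct TextAnalyzer {
    pub description: String,
}

/// Same field layout as the struct quoted in the issue.
#[derive(Clone, Default)]
pub struct TokenizerManager {
    tokenizers: Arc<RwLock<HashMap<String, TextAnalyzer>>>,
}

impl TokenizerManager {
    /// Registers (or replaces) the analyzer stored under `name`.
    pub fn register(&self, name: &str, analyzer: TextAnalyzer) {
        self.tokenizers
            .write()
            .expect("lock poisoned")
            .insert(name.to_string(), analyzer);
    }

    /// Returns a clone of the analyzer stored under `name`, if any.
    pub fn get(&self, name: &str) -> Option<TextAnalyzer> {
        self.tokenizers
            .read()
            .expect("lock poisoned")
            .get(name)
            .cloned()
    }
}
```

Because the map sits behind an `Arc`, cloning the manager yields a handle to the same registry, so an analyzer registered through one clone is visible through the others.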

Admittedly this has nothing to do with the name, but since `arc-swap` is already part of the dependency closure, I wonder whether `ArcSwap<HashMap<String, TextAnalyzer>>` would be a better fit for this data structure, which is mostly written during initialization and then only read?

adamreichold avatar Mar 20 '23 20:03 adamreichold
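The access pattern behind this suggestion can be sketched with the standard library alone. This is not the `arc-swap` crate: it approximates the idea with `RwLock<Arc<HashMap<..>>>`, where readers hold the lock only long enough to clone the `Arc`, and the rare writer copies the map and publishes a new one. `SnapshotRegistry` and the simplified `TextAnalyzer` are hypothetical names introduced for illustration only.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for tantivy's `TextAnalyzer`.
#[derive(Clone, Debug, PartialEq)]
pub struct TextAnalyzer {
    pub description: String,
}

/// Read-mostly registry: the whole map lives behind one `Arc` that is
/// replaced wholesale on every write.
#[derive(Default)]
pub struct SnapshotRegistry {
    current: RwLock<Arc<HashMap<String, TextAnalyzer>>>,
}

impl SnapshotRegistry {
    /// Read path: take the lock briefly, clone the `Arc`, and release it.
    /// Lookups then run on the returned snapshot without holding any lock.
    pub fn snapshot(&self) -> Arc<HashMap<String, TextAnalyzer>> {
        let guard = self.current.read().expect("lock poisoned");
        Arc::clone(&*guard)
    }

    /// Write path: copy the current map, modify the copy, then publish it.
    /// Snapshots taken earlier keep seeing the old map.
    pub fn register(&self, name: &str, analyzer: TextAnalyzer) {
        let mut guard = self.current.write().expect("lock poisoned");
        let mut next: HashMap<String, TextAnalyzer> = (**guard).clone();
        next.insert(name.to_string(), analyzer);
        *guard = Arc::new(next);
    }
}
```

The trade-off is that each write copies the whole map, which is acceptable only if registrations really are confined to initialization. With `ArcSwap` the read path would not need the lock at all, which is the main attraction of the proposal.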

Ok with me.

fulmicoton avatar Mar 21 '23 01:03 fulmicoton

I think we should change that to:

pub struct TokenizerManager {
    tokenizers: ArcSwap<HashMap<String, Box<dyn Tokenizer>>>,
}

`TextAnalyzer`, on the other hand, is actually a `TokenizerBuilder`.

PSeitz avatar Mar 21 '23 01:03 PSeitz