mochi

19 comments by mochi

Using the modified code to test the new version of pgvecto.rs, I found that the QPS is very low. I need to investigate further to determine the reason.

I have another question. Since version `v1.1.0`, Qdrant can use `quantization_config` to create an index. Below is the index configuration I used. The dataset used is still 5...

When I disable the `optimizers_config` option, Qdrant consumes 22GB of memory for a dataset of 5 million 768-dimensional vectors with only the `quantization_config` configuration. Is this expected? ```json "quantization_config": { "type":...
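For context on the 22GB figure, here is a rough back-of-the-envelope estimate. It is only a sketch under stated assumptions: the original vectors are stored as 4-byte floats, the quantized copy uses 1 byte per dimension (as with int8 scalar quantization; the actual quantization type is cut off in the comment above), and both copies are held in RAM. Graph index and payload overhead are not included.

```python
# Rough memory estimate for the dataset described above.
# Assumptions (not confirmed by the comment): float32 originals,
# a 1-byte-per-dimension quantized copy, and both copies kept in RAM.
NUM_VECTORS = 5_000_000
DIM = 768
BYTES_PER_FLOAT32 = 4
BYTES_PER_QUANTIZED_DIM = 1  # assumption: int8 scalar quantization


def to_gb(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes (10^9 bytes)."""
    return n_bytes / 1e9


original_bytes = NUM_VECTORS * DIM * BYTES_PER_FLOAT32
quantized_bytes = NUM_VECTORS * DIM * BYTES_PER_QUANTIZED_DIM
total_bytes = original_bytes + quantized_bytes

print(f"original vectors:  {to_gb(original_bytes):.2f} GB")
print(f"quantized vectors: {to_gb(quantized_bytes):.2f} GB")
print(f"total (no index):  {to_gb(total_bytes):.2f} GB")
```

Under these assumptions the vectors alone account for most of the observed usage, so a figure in the low twenties of GB would not be surprising if the original vectors stay in memory alongside the quantized ones.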

@PSeitz Thank you for your response. I tried using the `raw tokenizer` locally, but I encountered a problem. The `raw tokenizer` can only be applied to text columns without spaces....

I added some `println` statements in the `tokenizer` of Tantivy, and I noticed that the `index_writer` indeed does not tokenize the string during the writing process. However, when executing a...

Thank you very much for your answer, my problem has been solved.💗❤️

I want to know the difference between these two pieces of code: `parse_query("\"Alick a01\"")` and `parse_query("Alick a01")`. Is it correct that, regardless of the tokenizer, `parse_query("\"Alick a01\"")` will...
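The distinction being asked about can be illustrated with a toy positional index. This is a minimal sketch, not Tantivy's implementation: `tokenize`, `build_index`, `term_query` and `phrase_query` are hypothetical helpers, the tokenizer is a simple lowercase whitespace split, and the unquoted query is assumed to use OR semantics between terms.

```python
from collections import defaultdict


def tokenize(text):
    """Toy tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs):
    """Map token -> doc_id -> list of positions where the token occurs."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in enumerate(docs):
        for pos, tok in enumerate(tokenize(text)):
            index[tok][doc_id].append(pos)
    return index


def term_query(index, query):
    """Unquoted query: match documents containing ANY of the tokens."""
    hits = set()
    for tok in tokenize(query):
        hits.update(index.get(tok, {}))
    return hits


def phrase_query(index, query):
    """Quoted query: tokens must appear consecutively and in order."""
    toks = tokenize(query)
    if not toks:
        return set()
    hits = set()
    for doc_id, positions in index.get(toks[0], {}).items():
        for start in positions:
            if all(start + i in index.get(tok, {}).get(doc_id, [])
                   for i, tok in enumerate(toks)):
                hits.add(doc_id)
                break
    return hits


docs = ["Alick a01", "Alick b02", "a01 Alick"]
index = build_index(docs)
print(sorted(term_query(index, "Alick a01")))
print(sorted(phrase_query(index, "Alick a01")))
```

In this toy model the unquoted form matches every document that contains either token, while the quoted form matches only the document where the tokens are adjacent and in order.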

@PSeitz Additionally, I've noticed that when indexing the following two phrases, searching for one of them yields related results for both sentence 1 and sentence 2. However, if the word...

Here, I have provided a reproducible code snippet, where `str_vec` stores two sentences, both containing the word `'a'`. During the search process, when searching for the first sentence, the `query...

So how should I apply stop words during the text indexing stage and searching stage? I have tried using the `en_stem` tokenizer for indexing, but the results are still consistent...
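The general idea behind stop words can be sketched with a toy inverted index: the same analysis step has to run on both the indexed text and the query text. This is an illustrative model only, not Tantivy's API; `analyze`, `build_stop_index`, `search` and the stop-word list are hypothetical.

```python
# Toy illustration: apply the same stop-word filter at index time and
# at query time. The stop-word list here is an arbitrary assumption.
STOP_WORDS = frozenset({"a", "the", "of"})


def analyze(text, stop_words):
    """Lowercase, split on whitespace, and drop stop words."""
    return [t for t in text.lower().split() if t not in stop_words]


def build_stop_index(docs, stop_words):
    """Map each surviving token to the set of documents containing it."""
    index = {}
    for doc_id, text in enumerate(docs):
        for tok in analyze(text, stop_words):
            index.setdefault(tok, set()).add(doc_id)
    return index


def search(index, query, stop_words):
    """Return documents matching ANY analyzed query token."""
    hits = set()
    for tok in analyze(query, stop_words):
        hits |= index.get(tok, set())
    return hits


docs = ["a quick fox", "a lazy dog"]

# Without stop words, the shared word "a" makes both documents match.
unfiltered = build_stop_index(docs, frozenset())
print(sorted(search(unfiltered, "a quick fox", frozenset())))

# With the same filter on both sides, only the intended document matches.
filtered = build_stop_index(docs, STOP_WORDS)
print(sorted(search(filtered, "a quick fox", STOP_WORDS)))
```

In the unfiltered case the common word alone is enough to pull in the second sentence, which matches the behavior described in the comments above; filtering it on both sides removes that spurious match.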