Wolf Garbe

61 comments by Wolf Garbe

@ricdtech I can't replicate the issue. I get 0% CPU usage with Docker Desktop on Windows 11. `docker stats`:

```
CONTAINER ID   NAME   CPU %   MEM USAGE / LIMIT   MEM...
```

You are right, there is much room for improvement. But unfortunately, the day has only 24 hours. We are currently working on the documentation, but this time on the REST...

> I didn't know how to use advanced queries, such as returning all matching queries. When I used Length=int.Max in the query, the server crashed, and 0 resulted in no...

English, German, and Russian are supported. Japanese, Korean, and Chinese are currently only supported if both documents and queries are pre-tokenized by a tokenizer like https://github.com/messense/jieba-rs in a pre-processing step....
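The pre-processing step described above can be sketched with a toy greedy longest-match segmenter. This is only an illustration of the idea: the dictionary, sample text, and `segment` function are hypothetical stand-ins for a real tokenizer such as jieba-rs.

```python
# Toy forward-maximum-matching word segmenter. A real pipeline would
# use a proper tokenizer (e.g. jieba-rs) instead of this dictionary.
DICT = {"我们", "喜欢", "全文", "搜索", "引擎"}

def segment(text: str, max_len: int = 4) -> list[str]:
    """Greedy longest-match segmentation against DICT.

    At each position, try the longest candidate first; fall back to a
    single character when nothing in the dictionary matches.
    """
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            cand = text[i : i + length]
            if length == 1 or cand in DICT:
                tokens.append(cand)
                i += length
                break
    return tokens

# Pre-tokenize: join segments with spaces so documents and queries
# reach the index already word-separated.
pretokenized = " ".join(segment("我们喜欢全文搜索引擎"))
```

The same `segment` call would be applied to both documents and queries, so that both sides of the match see identical token boundaries.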

[SeekStorm v0.11.0](https://github.com/SeekStorm/SeekStorm/releases/tag/v0.10.0) has been released. The new tokenizer UnicodeAlphanumericZH implements Chinese word segmentation. ![image](https://github.com/user-attachments/assets/c697b4c4-e27b-472f-b30a-746a87dfed2a)

@inboxsphere What would be your use case? Prefix/substring search? Or something else? For word segmentation a specialized word segmenting algorithm is more efficient than n-gram tokenizing. I'm afraid the index...
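To illustrate the efficiency point: a character n-gram tokenizer emits a token at nearly every character position, while a word segmenter emits one token per word, so an n-gram index carries considerably more postings per document. A minimal sketch (function name and sample text are illustrative, not SeekStorm's implementation):

```python
def char_ngrams(text: str, n: int = 2) -> list[str]:
    """All overlapping character n-grams of a string."""
    return [text[i : i + n] for i in range(len(text) - n + 1)]

# A 6-character string yields 5 overlapping bigrams; a dictionary
# segmenter would emit roughly 3 words for the same text.
bigrams = char_ngrams("全文搜索引擎")
```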

@inboxsphere I see. The Chinese tokenizer (UnicodeAlphanumericZH) already handles mixed Chinese/Latin text. We could extend this so that when unknown (not in the Chinese dictionary) and non-Latin words (different Unicode...

I'm planning to do something similar with an S3 object storage compatible index (cloud-native split of storage and compute) https://github.com/SeekStorm/SeekStorm?tab=readme-ov-file#roadmap SeekStorm has both an index (inverted index that stores posting...
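The storage/compute split hinges on compute nodes fetching only the byte ranges they need from object storage instead of loading the whole index. A toy simulation of a ranged object GET against a local buffer (the function, offsets, and payload are illustrative assumptions, not SeekStorm's actual block layout):

```python
import io

def ranged_get(obj: io.BytesIO, start: int, end: int) -> bytes:
    """Simulate an S3-style ranged GET (bytes=start-end, inclusive)
    against a local buffer standing in for a remote object."""
    obj.seek(start)
    return obj.read(end - start + 1)

# Stand-in for one stored posting block; a compute node fetches only
# the range it needs rather than the whole object.
store = io.BytesIO(b"posting-list-bytes-for-one-term-block")
chunk = ranged_get(store, 0, 6)
```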

> Given SeekStorm's architecture of writing 50MB blocks and the constraint of 100KB max value size in FoundationDB, do you think SeekStorm could be adapted to split its index into...

To add support for distributed key-value stores as a backend for SeekStorm, we need to solve two tasks: 1. **Write/read** the 50MB data blocks (per index level, per index segment)...
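Task 1 under FoundationDB's 100KB value limit can be sketched as splitting each large block into ordered sub-key/value pairs and reassembling it with a range read. The key layout, names, and chunk size handling here are assumptions for illustration, not SeekStorm's actual format:

```python
CHUNK = 100_000  # FoundationDB caps values at 100KB

def block_to_kv(index_key: str, block: bytes) -> list[tuple[str, bytes]]:
    """Split one large block into <=100KB chunks under zero-padded
    sub-keys, so a lexicographic range scan returns them in order."""
    return [
        (f"{index_key}/{i:06d}", block[off : off + CHUNK])
        for i, off in enumerate(range(0, len(block), CHUNK))
    ]

def kv_to_block(pairs: list[tuple[str, bytes]]) -> bytes:
    """Reassemble a block from its sub-key/value pairs (a KV store
    would return these from a single range read)."""
    return b"".join(v for _, v in sorted(pairs))

block = bytes(250_000)  # stand-in for one (much larger) index block
pairs = block_to_kv("idx0/seg3/level1", block)  # -> 3 chunks
assert kv_to_block(pairs) == block
```

Zero-padding the chunk counter keeps the sub-keys sorted lexicographically, which is what makes a single range read sufficient for reassembly.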