milli icon indicating copy to clipboard operation
milli copied to clipboard

Improve and simplify operations on external documents ids

Open Kerollmops opened this issue 4 years ago • 1 comments
trafficstars

We need to remove the soft and hard external documents ids maps because it is just a lot of code to maintain that is not worth it. That is only interesting on the first additions of a small number of documents.

Kerollmops avatar Jan 25 '21 14:01 Kerollmops

Now that we added soft deletion into milli, this ExternalDocumentsIds struct is even useless. I would like to remove it and just replace it with a single fst::Map. It would be easier to understand and work with than a double soft + hard combo.

One requirement is a performance test just to make sure it doesn't impact the indexation time when adding or removing documents.

Kerollmops avatar Sep 27 '22 15:09 Kerollmops

I'd like to try this

yenwel avatar Oct 12 '22 09:10 yenwel

Hello @yenwel

Thanks for your interest in this project 🔥 You are definitely more than welcome to open a PR for this!

FYI, we prefer not assigning people to our issues because sometimes people ask to be assigned and never come back, which discourages the volunteer contributors from opening a PR to fix this issue. We will accept and merge the first PR that fixes correctly and well implements the issue following our contributing guidelines.

We are looking forward to reviewing your PR 😊

curquiza avatar Oct 12 '22 10:10 curquiza

Hi I'm having problem getting started following the contributing guidelines. If I try to do cargo run --release I get the error "error: a bin target must be available for cargo run"

yenwel avatar Oct 13 '22 11:10 yenwel

Hum weird, I don't have the issue

Two possibilities

  • update your rust version: rustup update
  • maybe you have launched cargo run --release in a sub folder of the project. Run it at the root of Meilisearch

curquiza avatar Oct 13 '22 11:10 curquiza

I run this at the root of the project Milli. If I run in the subproject of e.g. cli this works fine.

yenwel avatar Oct 13 '22 11:10 yenwel

Oh sorry, I realized I confused this repo with another one! In this repo, we indeed not have any binary to create, you need to launch directly the tests with cargo test, I will update the CONTRIBUTING!

curquiza avatar Oct 13 '22 11:10 curquiza

Oh snap I should have launched a PR myself to get some hacktoberfest points

yenwel avatar Oct 13 '22 11:10 yenwel

Oh sorry! You are right! I wanted to be quick to avoid other confusion, but I should have asked you if you wanted to do it However you have a lot of issues that are still unsolved in our organization if you are interesting in contributing 😄 https://github.com/search?q=org%3Ameilisearch+label%3Ahacktoberfest&state=open&type=Issues

curquiza avatar Oct 13 '22 11:10 curquiza

@Kerollmops do you want to completely remove the struct or replace its internal dual structure of soft and hard maps with just one map. Just deleting it's seems not trivial since it also now has a member for the softdeletedids.

yenwel avatar Oct 13 '22 14:10 yenwel

@yenwel I would like to replace it with a single map, please.

Kerollmops avatar Oct 13 '22 15:10 Kerollmops

@yenwel I would like to replace it with a single map, please.

Here's the replacement of soft and hard with a single map: https://github.com/meilisearch/milli/pull/666

tbd: need to figure out benchmarking in https://github.com/meilisearch/milli/tree/main/benchmarks

msvaljek avatar Oct 17 '22 19:10 msvaljek