libdict

Publish on crates.io?

Open baskerville opened this issue 5 years ago • 13 comments

This crate doesn't seem to be published on crates.io?

baskerville avatar Apr 19 '19 19:04 baskerville

This crate never got past the implementation draft stage, mostly due to lack of time. If development resumes, publishing it to crates.io would be an option. Are you using it?

humenda avatar Apr 19 '19 21:04 humenda

> Are you using it?

I was planning to.

baskerville avatar Apr 20 '19 07:04 baskerville

> Are you using it?
> I was planning to.

Ok. There are still a few rough edges and the memory-mapped access still needs sorting out. Ideally you could start using it from Git and we fix issues in the API as you encounter them. We would publish a release afterwards.

humenda avatar Apr 20 '19 08:04 humenda

@baskerville You've worked on the crate quite a bit. It looks fine to me. I'm not sure whether the user should be able to choose memory-mapped files through the Dictionary struct or whether this should be transparent. But otherwise, we could make a release.

humenda avatar Oct 30 '19 10:10 humenda

Once #15 and #16 are merged, I think it would be fine to publish from the master branch. I haven't looked thoroughly into the mmap branch yet.

baskerville avatar Oct 30 '19 11:10 baskerville

I have rebased the mmap branch, but have not pushed it yet. I could not decide whether the user of the library should be able to choose between memory-mapped I/O and normal file access. I started working on a generic version that would allow both, but this feels like overdoing it. What do you think?

humenda avatar Oct 31 '19 12:10 humenda
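
For readers following the thread, here is a rough sketch of what a generic version supporting both access strategies could look like. It is not the code from the mmap branch: the names `ByteSource`, `FileSource`, `SliceSource` and `read_definition` are made up for illustration. To stay within the standard library, the second strategy works on anything that exposes a byte slice. An actual memory map (for example from a crate such as `memmap2`, which is an assumption and not something the thread names) would be plugged in at that point.

```rust
use std::fs::File;
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical abstraction: "give me `len` bytes starting at `offset`".
trait ByteSource {
    fn read_at(&mut self, offset: u64, len: usize) -> io::Result<Vec<u8>>;
}

/// Ordinary file access: each lookup performs a seek and a read.
struct FileSource {
    file: File,
}

impl ByteSource for FileSource {
    fn read_at(&mut self, offset: u64, len: usize) -> io::Result<Vec<u8>> {
        self.file.seek(SeekFrom::Start(offset))?;
        let mut buf = vec![0u8; len];
        self.file.read_exact(&mut buf)?;
        Ok(buf)
    }
}

/// Slice-backed access. `B` is assumed to be something like a memory map
/// or a `Vec<u8>`; here it only has to expose its contents as `&[u8]`.
struct SliceSource<B: AsRef<[u8]>> {
    bytes: B,
}

impl<B: AsRef<[u8]>> ByteSource for SliceSource<B> {
    fn read_at(&mut self, offset: u64, len: usize) -> io::Result<Vec<u8>> {
        let data = self.bytes.as_ref();
        let total = data.len() as u64;
        // Reject ranges that overflow or run past the end of the data.
        let end = offset
            .checked_add(len as u64)
            .filter(|&e| e <= total)
            .ok_or_else(|| io::Error::new(io::ErrorKind::UnexpectedEof, "range out of bounds"))?;
        Ok(data[offset as usize..end as usize].to_vec())
    }
}

/// Code written against the trait does not care which strategy is used.
fn read_definition<S: ByteSource>(source: &mut S, offset: u64, len: usize) -> io::Result<String> {
    let bytes = source.read_at(offset, len)?;
    String::from_utf8(bytes).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

With this shape, the caller picks the strategy by constructing either a `FileSource` or a `SliceSource` and passing it to `read_definition`. Whether that extra type parameter is worth it is exactly the "overdoing it" question raised above.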

I don't think it is worth trying to encapsulate both approaches.

What are the disadvantages of the mmap approach?

baskerville avatar Oct 31 '19 13:10 baskerville

There are two disadvantages when using memory-mapped files:

  1. It requires virtual address space, which is potentially a problem for dict servers with many databases on a 32-bit system. My quick calculations, however, suggest that this is a non-issue :).
  2. If the file is deleted or changed (e.g. by an upgrade of the database), the corresponding signals need to be caught. I couldn't be bothered to implement this, but it shouldn't be too hard.

humenda avatar Oct 31 '19 13:10 humenda

It might be worth writing a benchmark (maybe a randomized lookup of all the terms in the example dictionary?). I'm worried that the performance improvement from mmap would be imperceptible.

Won't the memory usage increase when mmap is used?

baskerville avatar Nov 01 '19 08:11 baskerville
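
A minimal sketch of the kind of benchmark suggested above: look up every term once, in random order, and time the whole pass. Nothing here is libdict's API. The `lookup` closure stands in for whatever lookup call is being measured, and `XorShift64`, `shuffle` and `bench_lookups` are hypothetical helpers written so that the sketch needs no external crate.

```rust
use std::time::{Duration, Instant};

/// Tiny xorshift generator, only meant for shuffling benchmark input.
struct XorShift64(u64);

impl XorShift64 {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Fisher-Yates shuffle driven by the generator above.
fn shuffle<T>(items: &mut [T], rng: &mut XorShift64) {
    for i in (1..items.len()).rev() {
        let j = (rng.next_u64() % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
}

/// Looks up every term once in shuffled order.
/// Returns the number of successful lookups and the elapsed time.
fn bench_lookups<F>(terms: &[String], seed: u64, mut lookup: F) -> (usize, Duration)
where
    F: FnMut(&str) -> bool,
{
    let mut order: Vec<&str> = terms.iter().map(|t| t.as_str()).collect();
    // A zero state would make xorshift emit only zeros, so clamp the seed.
    shuffle(&mut order, &mut XorShift64(seed.max(1)));

    let start = Instant::now();
    let mut hits = 0;
    for term in &order {
        if lookup(*term) {
            hits += 1;
        }
    }
    (hits, start.elapsed())
}
```

Running `bench_lookups` once with a closure backed by normal file access and once with one backed by mmap, using the same seed, would give two timings over an identical lookup order. A single pass is noisy, so repeating it a few times and comparing the medians is probably the least that is needed before drawing conclusions.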

> It might be worth writing a benchmark (maybe

Sorry, I lack the time for this. mmap is expected to be faster because seeking in a file is a system call and hence a privilege switch. In contrast, the kernel maps the file's pages transparently, and in more than 80 % of all cases seeking and reading involve no system call and are hence dramatically faster.

> Won't the memory usage increase when mmap is used?

This depends on your application and your point of view. On constrained systems this might indeed be the case. However, most modern operating systems cache frequently used files in RAM anyway, so this wouldn't be an issue for those. Still, when optimising a dict server for the common case, it might be desirable not to memory-map infrequently used files. So thanks for the point: the library user should be able to decide which strategy to pick.

Are you willing to pick this up? I've got some code that I could tidy up and upload to the mmap branch or you could start from scratch on your own.

Thanks

humenda avatar Nov 01 '19 11:11 humenda

> If the file is deleted or changed (e.g. by an upgrade of the database), the corresponding signals need to be caught. I couldn't be bothered to implement this, but it shouldn't be too hard.

I don't think this is possible via signals. The only option that I see is using advisory file locks but that requires both parties to try and lock the file before doing anything to it. I'm currently doing it with advisory file locks on my end.

mscofield0 avatar Apr 14 '22 11:04 mscofield0

Could you please publish it to crates.io if you accept #18 and #19? @humenda

mscofield0 avatar Apr 14 '22 13:04 mscofield0

> Could you please publish it to crates.io if you accept #18 and #19? @humenda

Sure.

humenda avatar Apr 14 '22 20:04 humenda