UrsaDB process gets killed on big set of files
Environment information
- Mquery version (from the /status page): 1.2.0
- Ursadb version (from the /status page): 1.4.2+afe5144
- Installation method:
- [ ] Generic docker-compose
- [ ] Dev docker-compose
- [x] Native (from source)
- [ ] Other (please explain)
Reproduction Steps
I have successfully installed mquery on a bare-metal machine with the following configuration:
- OS: Ubuntu 18.04
- CPU: 4 cores, 8 threads
- RAM: 4GB
- Storage: ~1TB
It works smoothly on small sets of files, but when I tried it on ~16,000 PE samples (~25GB), the UrsaDB process got killed after some time (I will attach a screenshot below).
Expected behaviour
Mquery successfully indexes the 25GB set of samples and allows running queries on it.
Actual behaviour
The UrsaDB process gets killed some time after pressing the "reindex" button, so it is impossible to index a sample set of this size.
Screenshots

Additional context
Maybe the problem is the configuration of the machine itself (not enough RAM, etc.). If so, I would be grateful if you could point out the likely cause, or state the minimum requirements for running mquery. I have also read about the utils.index method and will test it, but it would be great to be able to use the standard way of reindexing.
UPD: indexing the files in batches with utils.index also ends with the UrsaDB process being killed.
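One way to check whether the kill comes from the kernel's out-of-memory killer (a plausible cause on a 4GB machine, though not confirmed here) is to search the kernel log for its "Killed process" lines. Below is a minimal sketch, assuming the process shows up under the name `ursadb` and that the log text comes from `dmesg` or `/var/log/kern.log`; the function name `find_oom_kills` is made up for illustration.

```python
import re

# The kernel OOM killer logs a line containing "Killed process <pid> (<name>)".
OOM_KILL_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text, name_fragment="ursadb"):
    """Return (pid, process_name) pairs for OOM kills whose name contains name_fragment.

    name_fragment defaults to "ursadb", which is an assumption about how
    the process appears in the kernel log.
    """
    kills = []
    for line in log_text.splitlines():
        match = OOM_KILL_RE.search(line)
        if match and name_fragment in match.group(2).lower():
            kills.append((int(match.group(1)), match.group(2)))
    return kills
```

To use it, save the output of `dmesg` to a file, read that file into a string, and pass it to `find_oom_kills`. A non-empty result would point at memory exhaustion rather than a crash inside UrsaDB.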
Sorry for not responding to this issue earlier. I didn't know how to reproduce it (and forgot about it later), but that's not an excuse for ignoring it. I was also not active in this project for some time.
Some thoughts:
- I (and other people) routinely index millions of files with mquery/ursadb, so it's certainly possible.
- How much RAM do you have? It's possible that you can trim down ursadb's configuration a bit (by default it assumes quite a lot of RAM is available).
- I use the utils.index script for most of my indexing needs (it's nice for large datasets, because - in contrast to the raw indexing method, which is transactional - it can be stopped in the middle and resumed).
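To illustrate the difference between a single transactional run and a resumable one, here is a generic sketch of batch indexing with a checkpoint file. It is not the actual utils.index implementation: `index_resumable`, the `index_batch` callback, and the checkpoint format are all hypothetical, and the callback stands in for whatever actually sends a batch of paths to the database.

```python
import json
from pathlib import Path


def batches(paths, batch_size):
    """Yield consecutive slices of at most batch_size paths."""
    for start in range(0, len(paths), batch_size):
        yield paths[start:start + batch_size]


def index_resumable(paths, index_batch, checkpoint_file, batch_size=1000):
    """Index paths batch by batch, recording finished files in a checkpoint.

    index_batch is a hypothetical callback that indexes one list of paths.
    If the run is interrupted, calling this again skips files already
    listed in the checkpoint.
    """
    checkpoint = Path(checkpoint_file)
    done = set(json.loads(checkpoint.read_text())) if checkpoint.exists() else set()
    for batch in batches(sorted(paths), batch_size):
        pending = [p for p in batch if p not in done]
        if not pending:
            continue
        index_batch(pending)
        # Persist progress only after the batch has succeeded.
        done.update(pending)
        checkpoint.write_text(json.dumps(sorted(done)))
    return len(done)
```

Smaller batches also bound how much work is lost if the process is killed partway through, at the cost of more individual indexing calls.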
I realise it's probably not important for you anymore. In this case, if the problem turns out to be non-reproducible, I think I'll have to close the issue unresolved.