
[Question]: About the maximum amount of data that can be stored and processed

Open emocat17 opened this issue 9 months ago • 2 comments

I'm considering using a database for RAG and AI search. The amount of data to be stored may be extremely large (at the PB level), and retrieval accuracy requirements are high. So I have a few questions:

  • What is the maximum amount of data that this database can currently handle?
  • If deployed locally, can this database be expanded or migrated when storage space runs out?
  • Can this database store multiple file formats? For example, TXT, XLS/XLSX/CSV, PDF, JPEG/JPG/PNG, BMP, DOC/DOCX, JSON, HTML?

I'm already aware of some performance comparisons between this database and Elasticsearch, but I'd still like to know whether, at the data scale described above, it can handle these issues better, more conveniently, and with higher accuracy than Elasticsearch.

THANKS!!!

emocat17 avatar Mar 07 '25 03:03 emocat17

  1. It depends on the disk and memory of your machine. Infinity itself doesn't limit the capacity.
  2. We are developing the backup and restore function. Until then, you can export the data as CSV, Parquet, or JSONL files, but the indexes are not included in the export.
  3. This database stores vector, full-text, and tensor data, but not the files themselves.
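
As a rough illustration of point 1, the sketch below estimates the raw size of a set of dense vectors and compares it with a disk budget. It is a back-of-the-envelope calculation, not part of Infinity's API: the function names, the 4-byte float32 assumption, and the 20% headroom are all hypothetical, and the estimate ignores index, full-text, and metadata overhead.

```python
import shutil

def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size of dense vectors, assuming float32 (4 bytes per value) by default."""
    return num_vectors * dim * bytes_per_value

def fits(required_bytes: int, available_bytes: int, headroom: float = 0.2) -> bool:
    """True if the data fits while leaving `headroom` (a fraction) of the space free."""
    return required_bytes <= available_bytes * (1 - headroom)

if __name__ == "__main__":
    # Hypothetical workload: one billion 1024-dimensional float32 vectors.
    needed = raw_vector_bytes(1_000_000_000, 1024)
    free = shutil.disk_usage(".").free  # free bytes on the current volume
    print(f"raw vectors: {needed / 1e12:.3f} TB, free disk: {free / 1e12:.3f} TB")
    print("fits with 20% headroom:", fits(needed, free))
```

Actual usage will be higher once indexes are built, so a figure like this is only a lower bound for capacity planning.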

The benchmark comparison of Infinity and ES that we provide was run on the same hardware configuration. As for your question, we think the answer is yes: Infinity will be better.

JinHai-CN avatar Mar 10 '25 03:03 JinHai-CN

THANK YOU VERY MUCH

emocat17 avatar Mar 10 '25 03:03 emocat17