feat: Integration of ClickHouse storage
Goals:
ClickHouse allows fast insertion and search over large volumes of data. We use ClickHouse internally to search over a large corpus of embeddings. Because DocArray provides a great Pythonic API for manipulating unstructured data, we decided to integrate it with ClickHouse for our internal needs. I hope this PR will also be valuable to other engineers.
Empirically, all DocArray functionality works properly with ClickHouse. However, ClickHouse is not a transactional database, so it does not guarantee atomicity of operations. Because of this, many tests in tests/unit/array/test_advance_indexing.py do not pass for now. I'm looking for a way to overcome this.
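To make the atomicity concern concrete, here is a toy sketch in plain Python. It is not the ClickHouse backend code from this PR and does not use the DocArray or ClickHouse APIs. The `ToyNonTransactionalStore` class and its methods are hypothetical, and it assumes (purely for illustration) that an update is implemented as a delete followed by an insert.

```python
class ToyNonTransactionalStore:
    """Hypothetical in-memory stand-in for a store without transactions."""

    def __init__(self):
        self.rows = {}

    def delete(self, doc_id):
        self.rows.pop(doc_id, None)

    def insert(self, doc_id, value, fail=False):
        if fail:
            raise RuntimeError("simulated failure during insert")
        self.rows[doc_id] = value

    def replace(self, doc_id, value, fail_on_insert=False):
        # Two independent steps: nothing rolls back the delete
        # if the insert fails afterwards.
        self.delete(doc_id)
        self.insert(doc_id, value, fail=fail_on_insert)


store = ToyNonTransactionalStore()
store.insert("doc1", "old")

try:
    store.replace("doc1", "new", fail_on_insert=True)
except RuntimeError:
    pass

# The old value is gone and the new one was never written.
print("doc1" in store.rows)  # False
```

With a transactional store, the failed `replace` would roll back and `doc1` would still hold its old value. Tests that assume that behaviour could fail against a non-atomic backend in a similar way.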