patelprateek
Sorry for being unclear. My question was about distributed index builds, since indexing takes quite long for 100M docs or in cases where we have streaming elements...
@psobot Curious whether you are still working on this? With this change, do you see a performance impact on the old fp32 distance computation?
@psobot: IIUC, for the new storage types (uint8, uint16 ...) it seems we rely on compiler vectorization; we don't have support for explicit vectorized code like fp32...
@yurymalkov I also have a use case. Can you point me to the C++ code interface method as well? Thanks.
@yurymalkov I have similar requirements as well, since I have a pipeline where we receive embeddings in a streaming, online way. Is my understanding correct that this approach is equivalent...
Thanks for the info. The use case I am thinking of is evaluating an expression like distance < 0.xyz and metadata_color = "red". I was under the impression that the...
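A minimal sketch of the kind of combined expression described in the comment above, written as a plain post-filter over search results. Everything here is an assumption for illustration: the `Hit` type, the `filter_hits` helper, and the `0.25` threshold are hypothetical and do not come from any library's API.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical search result: an id, a distance, and one metadata field."""
    doc_id: int
    distance: float
    metadata_color: str


def filter_hits(hits, max_distance, color):
    """Keep hits that satisfy: distance < max_distance and metadata_color == color."""
    return [
        h for h in hits
        if h.distance < max_distance and h.metadata_color == color
    ]


# Made-up candidates standing in for the output of a nearest-neighbor query.
hits = [
    Hit(doc_id=1, distance=0.12, metadata_color="red"),
    Hit(doc_id=2, distance=0.40, metadata_color="red"),
    Hit(doc_id=3, distance=0.05, metadata_color="blue"),
]

# 0.25 is an arbitrary stand-in for the "0.xyz" threshold in the comment.
matching = filter_hits(hits, max_distance=0.25, color="red")
matching_ids = [h.doc_id for h in matching]
```

Filtering after the search like this can return fewer results than requested when the predicate is selective, which is one reason to ask whether the index can evaluate the predicate during traversal instead.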
Maybe my question wasn't clear. I understand what theta sketches are, but I am trying to understand how you build auto sharding for some high-cardinality segments when constructing theta...
AFAIK, FP16C allows converting between fp16 and fp32 and should be available on most AVX CPUs. AVX512-FP16 provides the ability to perform math on fp16 directly, but is only supported...
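To illustrate the "convert, then compute" pattern mentioned above, here is a rough sketch that stores vectors as fp16 bytes and widens them before doing any arithmetic. It uses Python's `struct` half-precision format (`e`) and does not touch the FP16C instructions themselves; Python floats are 64-bit, so they only stand in for the fp32 compute type. The helper names are hypothetical.

```python
import struct


def to_fp16_bytes(values):
    """Pack floats as little-endian IEEE 754 half-precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)


def from_fp16_bytes(buf):
    """Widen stored fp16 values back to Python floats before any math."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))


def dot_fp16_storage(a_buf, b_buf):
    """Dot product over fp16-stored vectors, computed in the wider type."""
    a = from_fp16_bytes(a_buf)
    b = from_fp16_bytes(b_buf)
    return sum(x * y for x, y in zip(a, b))


# These values are exactly representable in fp16, so no rounding occurs.
a_buf = to_fp16_bytes([0.5, 1.0, 2.0])
b_buf = to_fp16_bytes([2.0, 4.0, 0.25])
result = dot_fp16_storage(a_buf, b_buf)
```

The storage cost is 2 bytes per component, while the arithmetic runs entirely in the wider type, which is the same split the comment describes for conversion-only hardware support.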
I also use LZ4 compression for all levels, btw (I thought LZ4 can easily crunch through a few GB/sec and should not be a bottleneck).
@mdcallag: Yes, I am doing a bulk load of ~250 GB or more of data, so I turn off all compaction and do a final compaction when all my ~250 GB...