
[MEDI] Clear up the key type in VectorStoreWriter

Open roji opened this issue 2 months ago • 2 comments

VectorStoreWriter currently generates a random GUID as the chunk ID and stores it as a string in the database; Qdrant has a special hack to make it use GUID types, as it doesn't support string keys.

In general, wherever possible, using Guid rather than string should always be preferred; databases typically store Guids in a compact, efficient way, and allow efficient indexing over them.
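To make the size difference concrete, here is a minimal sketch (in Python rather than .NET, purely for illustration) comparing the canonical text form of a random UUID with its raw binary form. How a given database actually lays out a GUID column is not shown here and varies by engine.

```python
import uuid

# A random (version 4) UUID, analogous to a freshly generated Guid.
key = uuid.uuid4()

as_text = str(key)    # canonical hyphenated form, e.g. for a string column
as_bytes = key.bytes  # raw 128-bit value, as a native GUID/UUID type can hold it

print(len(as_text))   # 36 characters
print(len(as_bytes))  # 16 bytes

# The binary form round-trips back to the same key.
assert uuid.UUID(bytes=as_bytes) == key
```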

One approach (probably the better one) is to simply ensure all MEVD providers support Guids, storing them as strings where they are not natively supported (#12182); this would free the MEDI layer from having to deal with this.
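A rough sketch of that idea, in Python for illustration only: the `KeyCodec` class and its `native_guid_support` flag are hypothetical names invented here, not MEVD types, and the real provider work in #12182 may look quite different. The point is that the caller always works with a GUID and the provider decides how to persist it.

```python
import uuid
from dataclasses import dataclass
from typing import Union


@dataclass
class KeyCodec:
    """Hypothetical provider-side key mapping; not an actual MEVD API."""

    native_guid_support: bool

    def to_storage(self, key: uuid.UUID) -> Union[uuid.UUID, str]:
        # Providers with a native GUID/UUID type keep the value as-is;
        # the others fall back to the canonical string form.
        return key if self.native_guid_support else str(key)

    def from_storage(self, stored: Union[uuid.UUID, str]) -> uuid.UUID:
        # Callers always get a GUID back, whatever the storage format was.
        return stored if isinstance(stored, uuid.UUID) else uuid.UUID(stored)


# The writer generates the key itself, so no store-specific branch is needed.
chunk_id = uuid.uuid4()
for codec in (KeyCodec(native_guid_support=True), KeyCodec(native_guid_support=False)):
    stored = codec.to_storage(chunk_id)
    assert codec.from_storage(stored) == chunk_id
```

With a mapping like this inside each provider, the layer above would not need a special case for any particular store.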

We should clear this up before GA. Aside from the problematic hard-coded exception for Qdrant, changing the column type from string to GUID would be a pretty impactful breaking change later.

roji, Oct 27 '25 18:10

One approach (probably the better one) is to simply ensure all MEVD providers support Guid

This would be great! And would play nicely with auto key generation (because we can always generate Guid even if the storage engine can't do it for us).

adamsitnik, Oct 28 '25 16:10

Yeah, I think we'll end up going this way (had a brief discussion with @westey-m on it - I'll look into it soon).

roji, Oct 28 '25 21:10