UUID support in search stream API endpoint

Open noonat opened this issue 1 year ago • 2 comments

Is your feature request related to a problem? Please describe. I am using Quickwit to ingest logs, and would like to use the ClickHouse integration to find associated data in ClickHouse. However, the join field is a UUID, which is currently stored as a string in Quickwit. The stream endpoint can only return int64 or uint64 values, so there is no way for us to use it with UUIDs.

Describe the solution you'd like It would be ideal if either:

  • UUID was supported as a first-class type in the index and allowed as a stream field, returned in ClickHouse native UUID format. (This is a binary format where the UUID is split into 2 uint64 values.)
  • or if strings were allowed as a stream field
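To make the "two uint64 values" idea concrete, here is a minimal sketch of splitting a UUID into its high and low 64-bit halves and joining them back. The helper names `split_uuid` and `join_uuid` are hypothetical, and the sketch deliberately does not claim to reproduce ClickHouse's exact on-wire byte order, which would need to be checked separately.

```python
import uuid

MASK64 = (1 << 64) - 1  # keeps the low 64 bits of an integer


def split_uuid(value: str) -> tuple[int, int]:
    """Return (high, low) unsigned 64-bit halves of a UUID string."""
    n = uuid.UUID(value).int  # the UUID as a 128-bit integer
    return n >> 64, n & MASK64


def join_uuid(hi: int, lo: int) -> str:
    """Rebuild the canonical UUID string from its two 64-bit halves."""
    return str(uuid.UUID(int=(hi << 64) | lo))


hi, lo = split_uuid("123e4567-e89b-12d3-a456-426614174000")
print(hex(hi), hex(lo))
print(join_uuid(hi, lo))
```

Both halves fit in a u64, so each could be stored in its own integer field.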

Describe alternatives you've considered I can't see any alternatives, as storing the UUID as an integer would require 128-bit integer support. The current workaround I am using is to use a custom HTTP server to emulate the stream endpoint and do paged requests to the search endpoint, which is obviously far less performant than using the stream endpoint would be.
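For illustration, the paging workaround described above might look roughly like the sketch below. `fetch_page` is a hypothetical callable standing in for one request to the search endpoint with an offset and a page size; how it maps onto real request parameters is an assumption and is left out on purpose.

```python
from typing import Callable, Iterator


def iter_hits(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 1000,
) -> Iterator[dict]:
    """Yield every hit by requesting one page at a time.

    fetch_page(offset, limit) is a hypothetical wrapper around a single
    search request; it returns at most `limit` hits starting at `offset`.
    """
    offset = 0
    while True:
        hits = fetch_page(offset, page_size)
        yield from hits
        if len(hits) < page_size:  # a short page means there is nothing left
            return
        offset += len(hits)


# Usage with an in-memory stand-in for the search endpoint.
fake_index = [{"id": i} for i in range(25)]
pages = iter_hits(lambda off, n: fake_index[off:off + n], page_size=10)
print(sum(1 for _ in pages))
```

Each page costs a full request round trip, which is one reason this approach is slower than a single streamed response.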

noonat avatar Feb 20 '24 18:02 noonat

Would it solve your problem if we supported streaming two ids instead of only one? You would have to have two u64 fields instead of a single one, take care of formatting the doc for ingestion on your own, and see how it can be plugged into ClickHouse that way.

fulmicoton avatar Feb 22 '24 01:02 fulmicoton

Yes, that seems like it would be a valid workaround. It would require a little more work to query on the value -- or I suppose we could persist the full UUID string alongside the split u64s.
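As a rough sketch of that idea, a pre-ingestion step could keep the UUID string for querying and add the two u64 halves next to it. The field names (`request_id`, `request_id_hi`, `request_id_lo`) and the helper `to_ingest_doc` are hypothetical, chosen only for illustration.

```python
import json
import uuid


def to_ingest_doc(record: dict, uuid_field: str = "request_id") -> str:
    """Return one JSON line holding the UUID string plus its two u64 halves."""
    n = uuid.UUID(record[uuid_field]).int  # 128-bit integer form of the UUID
    doc = dict(record)  # copy so the caller's record is left untouched
    doc[uuid_field] = str(uuid.UUID(int=n))  # canonical lowercase string
    doc[f"{uuid_field}_hi"] = n >> 64  # upper 64 bits
    doc[f"{uuid_field}_lo"] = n & ((1 << 64) - 1)  # lower 64 bits
    return json.dumps(doc)


print(to_ingest_doc({
    "request_id": "123E4567-E89B-12D3-A456-426614174000",
    "message": "user logged in",
}))
```

Note that the halves can exceed 2^53, so any consumer that parses these numbers as floating point would lose precision.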

If this were the workaround, do you think it would be feasible to eventually aggregate on a pair of u64s as well?

noonat avatar Feb 23 '24 19:02 noonat