db-benchmark

What `unsorted data` really implies

Open nickitat opened this issue 1 year ago • 2 comments

Hi 👋🏼 I'd like to ask for some clarification. Does the unsorted mode imply that no particular sorting is required (so we're free to use none or any of our choice), or that the data is required to have no ordering (or sort-based index, etc.)? Thx.

Dec 02 '24 17:12 nickitat

From what I have gathered, sorted mode applies only to data generation, meaning you can generate sorted input data.

If the order of the output data is different, it should not affect result verification, since that is done by calculating aggregate values over the answer table.
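To illustrate why row order should not matter for that kind of check, here is a minimal sketch. The function name `aggregate_check` and the column layout are hypothetical, chosen for illustration; this is not db-benchmark's actual verification code. It only assumes the check reduces the answer table to a row count and per-column sums.

```python
import random


def aggregate_check(rows):
    """Reduce an answer table to order-independent aggregates.

    rows: list of (group_key, v1, v2) tuples with integer values
    (hypothetical layout, for illustration only).
    """
    return (
        len(rows),                # number of rows in the answer
        sum(r[1] for r in rows),  # sum over the first value column
        sum(r[2] for r in rows),  # sum over the second value column
    )


# A tiny made-up answer table.
answer = [("id001", 10, 7), ("id002", 4, 12), ("id003", 9, 1)]

# The same rows in a different order.
shuffled = answer[:]
random.Random(0).shuffle(shuffled)

# Both orderings reduce to the same aggregates.
assert aggregate_check(answer) == aggregate_check(shuffled)
```

Integers are used here on purpose: with floating-point columns, sums taken in a different order can differ in the last bits, so a real check would likely need a tolerance.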

Jan 13 '25 15:01 Tmonster

> From what I have gathered, sorted mode applies only to data generation, meaning you can generate sorted input data.
>
> If the order of the output data is different, it should not affect result verification, since that is done by calculating aggregate values over the answer table.

Let me clarify what I really meant to ask: is it OK to create tables with indexes (e.g., ordered by the join columns) in the unsorted mode? I thought it might not be fair if those indexes improved performance.

Jan 13 '25 15:01 nickitat