Skip useless values in primary key in memory.

Open alexey-milovidov opened this issue 5 years ago • 6 comments

Suppose there is a primary key (x, y) where x is an arbitrary column and y is a variable-length column such as String.

When loading the index into memory, we can avoid storing the actual value of y for a mark if its value of x differs from the value of x in both the previous and the next mark. We can store the default value instead.

Example:

Instead of: ... (123, 'hello'), (124, 'world'), (125, 'goodbye') ...

We can store: ... (123, 'hello'), (124, ''), (125, 'goodbye') ...

This works because 124 is equal to neither 123 nor 125, so the actual value of the y column in that mark is not important.
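A minimal Python sketch of the rule above, for illustration only. It is not ClickHouse code: the function name `drop_redundant_suffix` and the representation of marks as `(x, y)` tuples are assumptions, and so is the choice to leave the first and last mark untouched, since the proposal does not say how boundary marks are handled.

```python
from typing import List, Tuple

Mark = Tuple[int, str]  # hypothetical in-memory mark: (x, y)


def drop_redundant_suffix(marks: List[Mark]) -> List[Mark]:
    """Replace y with the default ('') when x differs from both neighbours."""
    result = []
    last = len(marks) - 1
    for i, (x, y) in enumerate(marks):
        # Assumption: boundary marks have only one neighbour, so keep them as is.
        is_inner = 0 < i < last
        if is_inner and marks[i - 1][0] != x and marks[i + 1][0] != x:
            result.append((x, ""))  # store the default value instead of y
        else:
            result.append((x, y))
    return result


marks = [(123, "hello"), (124, "world"), (125, "goodbye")]
compact = drop_redundant_suffix(marks)
```

With the three marks from the example, only the middle mark loses its y value, because it is the only one with a neighbour on each side and a different x in both.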

alexey-milovidov avatar May 25 '20 23:05 alexey-milovidov

@CurtizJ RFC.

alexey-milovidov avatar May 25 '20 23:05 alexey-milovidov

What happens if we run a query like SELECT * FROM table WHERE x = 124 and y = 'zoo'? Wouldn't it scan 2 granules instead of 1?

UnamedRus avatar May 29 '20 18:05 UnamedRus

@UnamedRus You are right, we need some modification to this scheme.

alexey-milovidov avatar May 29 '20 19:05 alexey-milovidov

What happens if we run a query like SELECT * FROM table WHERE x = 124 and y = 'zoo'? Wouldn't it scan 2 granules instead of 1?

@UnamedRus How does y = 'zoo' make a difference? IIUC, 'zoo' > '' and 'zoo' > 'world'.

amosbird avatar Apr 12 '22 16:04 amosbird

Because when we skip useless values, a default value in the index does not mean the actual value is the default; it means that anything could be there.

Suppose we have a query like this: SELECT * FROM table WHERE x = 124 and y = 'zoo'

And marks like these:

0 - 123, hello
1 - 124, world
2 - 125, goodbye

Without skipping, it will read only granule 1-2.

With skipping, it needs to read granules 0-1 and 1-2, because it's possible that the mark actually held zoo rather than world.
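The argument can be checked with a toy model. The sketch below is not how ClickHouse analyses its index; it assumes that the granule between mark i and mark i+1 holds keys in the closed range between those two marks, and it represents a dropped y as `None`, meaning "unknown". An unknown y is treated as the smallest possible value when the mark is a lower bound and as the largest possible value when it is an upper bound. All names are hypothetical.

```python
from typing import List, Optional, Tuple

Mark = Tuple[int, Optional[str]]  # y is None when the value was dropped


def lower_bound(mark: Mark) -> Tuple[int, int, str]:
    x, y = mark
    # An unknown y can only be bounded from below by the empty string.
    return (x, 0, "" if y is None else y)


def upper_bound(mark: Mark) -> Tuple[int, int, str]:
    x, y = mark
    # An unknown y may be arbitrarily large: the flag 1 sorts after any real y.
    return (x, 1, "") if y is None else (x, 0, y)


def candidate_granules(marks: List[Mark], x: int, y: str) -> List[Tuple[int, int]]:
    """Return the (i, i+1) mark ranges that may contain the key (x, y)."""
    key = (x, 0, y)
    return [
        (i, i + 1)
        for i in range(len(marks) - 1)
        if lower_bound(marks[i]) <= key <= upper_bound(marks[i + 1])
    ]


exact = [(123, "hello"), (124, "world"), (125, "goodbye")]
skipped = [(123, "hello"), (124, None), (125, "goodbye")]

without_skipping = candidate_granules(exact, 124, "zoo")
with_skipping = candidate_granules(skipped, 124, "zoo")
```

Under these assumptions the exact index selects only range 1-2, while the index with the dropped value selects both 0-1 and 1-2, matching the description above.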

UnamedRus avatar Apr 12 '22 16:04 UnamedRus

#60091

alexey-milovidov avatar Feb 17 '24 05:02 alexey-milovidov

This has been proven to be wrong. But see https://github.com/ClickHouse/ClickHouse/issues/60091

alexey-milovidov avatar Feb 20 '24 08:02 alexey-milovidov