pgcat icon indicating copy to clipboard operation
pgcat copied to clipboard

hold duplicate sqls and return it together

Open DeoLeung opened this issue 3 years ago • 5 comments

Is your feature request related to a problem? Please describe.

the application may issue duplicate queries at the same time

backend A sends select a from b, backend B also sends select a from b at the same time

my pg will suffer from processing identical queries especially when the sql consumes disk io

Describe the solution you'd like

support holding the duplicate queries, only issue one to pg at a time, when it returns , return the same result to all clients

maybe add a set dedup = true to only apply to certain queries

Describe alternatives you've considered

normally we do it in applications using redis lock, but we need extra storage to distribute the result, like store it in a redis key

Additional context Add any other context or screenshots about the feature request here.

DeoLeung avatar Aug 18 '22 05:08 DeoLeung

You're describing query caching. I think this is a valuable and dangerous feature: just because the client issues the same query, doesn't mean it will return the same result both times. I think this one merits some thought, but it definitely does have a precedent, e.g. PgPool-II has this feature.

FWIW, Postgres has good caching characteristics, so if you issue the same query twice, the second result will return much quicker because most necessary buffers will most likely be in RAM, unless of course the query scans a very large table, in which case the pooler wouldn't be able to cache it anyway.

levkk avatar Aug 18 '22 12:08 levkk

we could done caching in our application

I think in pgcat we can do it a little bit different, just return the result to all clients running the same sql immediately, without holding the postgres result, not really caching it.

ideas come from #62 , maybe pgcat do not need heavy processing on the package :)

introducing extra set dedup is to avoid situation where we just want the db the handle the concurrency

DeoLeung avatar Aug 19 '22 00:08 DeoLeung

immediately is a very long time in terms of computing. Even if it seems like <1ms for the user, during this time, the data could have changed many times on the server. Pgcat just doesn't have this kind of information, especially when multiple clients are involved, so what you're really describing is caching.

levkk avatar Aug 19 '22 15:08 levkk

we are experimenting readyset, which does the sql caching.

what I describe above is more like the --query-caching inrequestpath

maybe we could do this in readyset or alike, rather than implementing it in pgcat.

DeoLeung avatar Jan 06 '23 02:01 DeoLeung

ReadySet is a lightweight SQL caching engine that sits between your application and database and turns even the most complex SQL reads into lightning-fast lookups. Unlike other caching solutions, ReadySet requires no changes to your application code.

This is interesting. Query caching is a difficult problem, I'd be curious to know what your experience with ReadySet is like if you're willing to share.

levkk avatar Jan 06 '23 20:01 levkk