hold duplicate sqls and return it together
Is your feature request related to a problem? Please describe.
the application may issue duplicate queries at the same time
backend A sends select a from b, backend B also sends select a from b at the same time
my pg will suffer from processing identical queries especially when the sql consumes disk io
Describe the solution you'd like
support holding the duplicate queries, only issue one to pg at a time, when it returns , return the same result to all clients
maybe add a set dedup = true to only apply to certain queries
Describe alternatives you've considered
normally we do it in applications using redis lock, but we need extra storage to distribute the result, like store it in a redis key
Additional context Add any other context or screenshots about the feature request here.
You're describing query caching. I think this is a valuable and dangerous feature: just because the client issues the same query, doesn't mean it will return the same result both times. I think this one merits some thought, but it definitely does have a precedent, e.g. PgPool-II has this feature.
FWIW, Postgres has good caching characteristics, so if you issue the same query twice, the second result will return much quicker because most necessary buffers will most likely be in RAM, unless of course the query scans a very large table, in which case the pooler wouldn't be able to cache it anyway.
we could done caching in our application
I think in pgcat we can do it a little bit different, just return the result to all clients running the same sql immediately, without holding the postgres result, not really caching it.
ideas come from #62 , maybe pgcat do not need heavy processing on the package :)
introducing extra set dedup is to avoid situation where we just want the db the handle the concurrency
immediately is a very long time in terms of computing. Even if it seems like <1ms for the user, during this time, the data could have changed many times on the server. Pgcat just doesn't have this kind of information, especially when multiple clients are involved, so what you're really describing is caching.
we are experimenting readyset, which does the sql caching.
what I describe above is more like the --query-caching inrequestpath
maybe we could do this in readyset or alike, rather than implementing it in pgcat.
ReadySet is a lightweight SQL caching engine that sits between your application and database and turns even the most complex SQL reads into lightning-fast lookups. Unlike other caching solutions, ReadySet requires no changes to your application code.
This is interesting. Query caching is a difficult problem, I'd be curious to know what your experience with ReadySet is like if you're willing to share.