cstore_fdw
cstore_fdw copied to clipboard
Enable support for UPDATE/DELETE
It was already mentioned on issue #2 and is referenced in the TODO file, but it was last updated more than one year ago.
I guess it would not hurt to keep an issue open for this TODO item and raise awareness for it :)
IMHO I think this would be a difficult feature to implement in a columnar store. The key benefits are around absolute query performance. Better CTE support I think should be the priority.
I agree it might be difficult to implement in a columnar store. However, and while the general append-only use case is good enough, there might be a wider audience out there that would be using cstore if support for occasional UPDATE/DELETE would be implemented. MVHO of course :)
Exactly, for the occasional UPDATE/DELETE, it's not worth the huge effort.
The problem is that those that need this, occasionally, but sometimes, may not use cstore_fdw at all since it would not be supported. So implementing it may widen the user base.
But of course, balance between this and effort is on Citus (and/or community, of course). At least I wanted to raise awareness :) Thanks!
Both delete and update are in our radar for some time. However, due to columnar nature and our file layout, it is quite costly to that.
@sriharsha9992 did some work related to deleting rows from a cstore_fdw table. I am not quite sure of the outcome.
One way to simulate delete is mark a row as deleted, and not return that row during queries. Deleted rows would not be reported in query results, but it would still take disk space. A vacuum like operation on the table may re-organize the data and get rid of unused space as postgresql does.
v1.5 is going to be released in a week or so. Would you be interested in discussing next feature set ?
I am going to start a thread in our mailing list at https://groups.google.com/forum/#!forum/cstore-users after release.
Appreciate if you could share your thoughts.
I love cstore_fdw being able to get such a considered and helpful response from @mtuncer within hours of an issue being posted is amazing.
Hi ,
I did succeed in implementing DELETE and UPDATE using a executor hook !!
DELETE
- Introduced a new variable DeleteArray ( bool * ) in StripeMetadataList . It stores the information on whether the row is deleted or not for the whole stripe .
- Implemented a executor hook . Check whether it is a delete operation . Fork the ForeignScan part and will fill up deleteArray after performing the condition check (ExecQual)
- If there is no Qual , just truncate the table by calling the truncateCstoreTables method
UPDATE
- Implemented as DELETE and INSERT .
- Implemented a executor hook . Check whether it is a update operation . Fork the ForeignScan part and will fill up deleteArray after performing the condition check (ExecQual)
- ExecForeignUpdate would have the same code as ExecForeignInsert
As @mtuncer pointed out , I perform a vaccum like operation to free up the disk space . Currently , I just perform INSERT INTO TABLE_NEW SELECT * FROM TABLE , which is kind of a round-about solution . Planning to fork VACCUM TABLE command and clean up the table in future .
That's really cool @sriharsha9992 Is that code submitted as a PR?
@mtuncer I'm subscribed now to the cstore-users group and I'll be happy to provide feedback on future directions when asked for :)
Hi @sriharsha9992, those are great news.
Do you plan yo submit a pull request?
Do you need help with it?
any update for this enhancement?
@dawuwu -- We haven't prioritized native UPDATE / DELETE
support so far because it's a relatively complex change. That said, we're actively gathering feedback and prioritizing features in every release cycle.
Could you elaborate a bit more about your use case and your motivation for needing UPDATE / DELETE
?
I am cleaning up my code right now !! I will send you guys the patch for DELETE and UPDATE . You can review it and push it into the code if it meets your standard.
@sriharsha9992 - did you find time to do this?
+1 for this feature
Are there plans to provide this feature in an upcoming version of cstore_fdw? Is there a fork or a patch containing the code @sriharsha9992 described somewhere to take a look at and tinker around with?
In my case, I have a constant influx of new data and a need for periodically bulk shedding old data. So I would love to see support for some kind of bulk delete that does not require a complete re-import of all data or duplicating it into a new table that nearly doubles the storage requirement for the server. I never need update, and I see that as a perfectly valid restriction.
@Dead2 I can suggest using partitioned tables.
You can have a regular partitioned table, range partitioned on some time column. You can have cstore_fdw tables as partitions of that table.
When you need to retire old data, you can simply drop the related cstore_fdw partition.
There were some restrictions on using partitioned tables with foreign tables in PG10. PG11 has better support on those.
@mtuncer True, that would work for one of my tables. The others have rows that need to stay until they are no longer referenced from the first table. I guess I should have specified that in my previous comment, I'd need to do deletion on random rows, and that is probably harder to support from a cstore_fdw point of view.
@sriharsha9992 Thanks for a great feature. Do you have a PR on this ? I think community can help if more cleanup is required.