cstore_fdw icon indicating copy to clipboard operation
cstore_fdw copied to clipboard

Enable support for UPDATE/DELETE

Open ahachete opened this issue 8 years ago • 19 comments

It was already mentioned on issue #2 and is referenced in the TODO file, but it was last updated more than one year ago.

I guess it would not hurt to keep an issue open for this TODO item and raise awareness for it :)

ahachete avatar Aug 24 '16 15:08 ahachete

IMHO I think this would be a difficult feature to implement in a columnar store. The key benefits are around absolute query performance. Better CTE support I think should be the priority.

jhngrant avatar Aug 24 '16 15:08 jhngrant

I agree it might be difficult to implement in a columnar store. However, and while the general append-only use case is good enough, there might be a wider audience out there that would be using cstore if support for occasional UPDATE/DELETE would be implemented. MVHO of course :)

ahachete avatar Aug 24 '16 15:08 ahachete

Exactly, for the occasional UPDATE/DELETE, it's not worth the huge effort.

jhngrant avatar Aug 24 '16 15:08 jhngrant

The problem is that those that need this, occasionally, but sometimes, may not use cstore_fdw at all since it would not be supported. So implementing it may widen the user base.

But of course, balance between this and effort is on Citus (and/or community, of course). At least I wanted to raise awareness :) Thanks!

ahachete avatar Aug 24 '16 15:08 ahachete

Both delete and update are in our radar for some time. However, due to columnar nature and our file layout, it is quite costly to that.

@sriharsha9992 did some work related to deleting rows from a cstore_fdw table. I am not quite sure of the outcome.

One way to simulate delete is mark a row as deleted, and not return that row during queries. Deleted rows would not be reported in query results, but it would still take disk space. A vacuum like operation on the table may re-organize the data and get rid of unused space as postgresql does.

v1.5 is going to be released in a week or so. Would you be interested in discussing next feature set ?

I am going to start a thread in our mailing list at https://groups.google.com/forum/#!forum/cstore-users after release.

Appreciate if you could share your thoughts.

mtuncer avatar Aug 25 '16 10:08 mtuncer

I love cstore_fdw being able to get such a considered and helpful response from @mtuncer within hours of an issue being posted is amazing.

jhngrant avatar Aug 25 '16 10:08 jhngrant

Hi ,

I did succeed in implementing DELETE and UPDATE using a executor hook !!

DELETE

  1. Introduced a new variable DeleteArray ( bool * ) in StripeMetadataList . It stores the information on whether the row is deleted or not for the whole stripe .
  2. Implemented a executor hook . Check whether it is a delete operation . Fork the ForeignScan part and will fill up deleteArray after performing the condition check (ExecQual)
  3. If there is no Qual , just truncate the table by calling the truncateCstoreTables method

UPDATE

  1. Implemented as DELETE and INSERT .
  2. Implemented a executor hook . Check whether it is a update operation . Fork the ForeignScan part and will fill up deleteArray after performing the condition check (ExecQual)
  3. ExecForeignUpdate would have the same code as ExecForeignInsert

As @mtuncer pointed out , I perform a vaccum like operation to free up the disk space . Currently , I just perform INSERT INTO TABLE_NEW SELECT * FROM TABLE , which is kind of a round-about solution . Planning to fork VACCUM TABLE command and clean up the table in future .

sriharsha9992 avatar Aug 25 '16 10:08 sriharsha9992

That's really cool @sriharsha9992 Is that code submitted as a PR?

@mtuncer I'm subscribed now to the cstore-users group and I'll be happy to provide feedback on future directions when asked for :)

ahachete avatar Aug 25 '16 11:08 ahachete

Hi @sriharsha9992, those are great news.

Do you plan yo submit a pull request?

Do you need help with it?

martinmanzo avatar Nov 09 '16 22:11 martinmanzo

any update for this enhancement?

Danzondo avatar Feb 21 '17 14:02 Danzondo

@dawuwu -- We haven't prioritized native UPDATE / DELETE support so far because it's a relatively complex change. That said, we're actively gathering feedback and prioritizing features in every release cycle.

Could you elaborate a bit more about your use case and your motivation for needing UPDATE / DELETE?

ozgune avatar Feb 22 '17 00:02 ozgune

I am cleaning up my code right now !! I will send you guys the patch for DELETE and UPDATE . You can review it and push it into the code if it meets your standard.

sriharsha9992 avatar Feb 22 '17 05:02 sriharsha9992

@sriharsha9992 - did you find time to do this?

mjgp2 avatar Oct 18 '17 17:10 mjgp2

+1 for this feature

wibrt avatar Feb 02 '18 16:02 wibrt

Are there plans to provide this feature in an upcoming version of cstore_fdw? Is there a fork or a patch containing the code @sriharsha9992 described somewhere to take a look at and tinker around with?

martinspaeth avatar Apr 28 '18 15:04 martinspaeth

In my case, I have a constant influx of new data and a need for periodically bulk shedding old data. So I would love to see support for some kind of bulk delete that does not require a complete re-import of all data or duplicating it into a new table that nearly doubles the storage requirement for the server. I never need update, and I see that as a perfectly valid restriction.

Dead2 avatar Nov 22 '18 09:11 Dead2

@Dead2 I can suggest using partitioned tables.

You can have a regular partitioned table, range partitioned on some time column. You can have cstore_fdw tables as partitions of that table.

When you need to retire old data, you can simply drop the related cstore_fdw partition.

There were some restrictions on using partitioned tables with foreign tables in PG10. PG11 has better support on those.

mtuncer avatar Nov 22 '18 10:11 mtuncer

@mtuncer True, that would work for one of my tables. The others have rows that need to stay until they are no longer referenced from the first table. I guess I should have specified that in my previous comment, I'd need to do deletion on random rows, and that is probably harder to support from a cstore_fdw point of view.

Dead2 avatar Nov 22 '18 18:11 Dead2

@sriharsha9992 Thanks for a great feature. Do you have a PR on this ? I think community can help if more cleanup is required.

senthilsivanath avatar Sep 16 '19 11:09 senthilsivanath