
Ideas

ankane opened this issue 4 years ago • 12 comments

  • [ ] Add option to wrap the destination sync in a transaction and use delete instead of truncate - #102
  • [ ] Add option to delete rows in SQL clause - #110
  • [ ] Change preferred variable syntax ($1) and quote by default
  • [ ] Add support for variables with commas - #206
  • [ ] Only defer necessary constraints - #209
  • [ ] Better error message when primary keys don't match
  • [ ] Skip columns with incompatible types (currently warns when different)
  • [ ] Add option to open transaction on source when manually specifying order of tables (for consistent snapshot)
  • [ ] Retry on connection errors
  • [ ] Support syncing to a different table name. Ideally use Git remote syntax (pgsync src:dst), which is currently used for variables. Could change variables to use pgsync product=123.
  • [ ] Add to_safe_hosts to replace to_safe (with * for all)

ankane avatar Jun 05 '20 07:06 ankane

Is there already an ability to limit rows without using a group? If not, that would be good (and I'd attempt a PR).

probablykabari avatar Nov 03 '20 19:11 probablykabari

Hey @RipTheJacker, you should be able to do:

pgsync table "limit 1000"

ankane avatar Nov 03 '20 23:11 ankane

@ankane I was thinking of it as part of the config, essentially the same as the group config but one level higher. But maybe this way is better practice. ~~It's unclear in the docs, but does the groups config observe the exclude config?~~ I see it works across all tables when using pgsync * "limit N", which is all right.
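For reference, a per-table SQL clause can already be expressed inside a group in .pgsync.yml. A rough sketch (group and table names here are placeholders, not from this thread):

```yaml
# Sketch of a .pgsync.yml group with per-table SQL clauses;
# "limited" and the table names are example placeholders
groups:
  limited:
    users: "limit 1000"
    orders: "where created_at > now() - interval '30 days'"
```

Then `pgsync limited` would sync only those rows, which covers the row-limiting use case without a new top-level option.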

probablykabari avatar Nov 04 '20 15:11 probablykabari

@ankane I'm wondering if it's possible to mix the usage of --in-batches with --defer-constraints-v2. I have two large tables that refer to each other, and I'm not sure how to handle that. It seems --in-batches is restricted to a single table, but to make my scenario work, I'd need the two tables to be synced within the same session. Please let me know if you have any thoughts/suggestions. I'd love to contribute.

caioariede avatar Jun 09 '21 17:06 caioariede

Hey @caioariede, I'm not sure how you could combine the two, but feel free to fork and experiment.

ankane avatar Jun 09 '21 20:06 ankane

Is it possible to set defer_constraints in the YAML? We effectively have to use it all the time, and being able to set it by default in our config would help us clean up some scripts and CLI usage.
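To illustrate, a hypothetical top-level entry in .pgsync.yml might look like this (this is the desired config being asked about, not an existing pgsync setting):

```yaml
# Hypothetical: make constraint deferral the default for every sync,
# instead of passing --defer-constraints on each CLI invocation.
# Not a documented pgsync option at the time of this comment.
defer_constraints: true
```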

geoffharcourt avatar Mar 08 '22 20:03 geoffharcourt

I needed to replace names, emails, and addresses.

I was experimenting with Faker; do you think it's a good idea?

alextakitani avatar Mar 08 '23 18:03 alextakitani

@alextakitani one thing to keep in mind with libraries like Faker is that they can get pretty slow if you're relying on random data. With small datasets that can be okay though.

caioariede avatar Mar 08 '23 18:03 caioariede

Indeed @caioariede, that was my experience.

I'm using Faker inside Postgres: https://gitlab.com/dalibo/postgresql_faker

But not everyone can install extensions in production (that's my case).

So I ended up downloading a backup, restoring it locally, and then generating a sync with fake data.
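For simpler masking needs, pgsync's own data rules in .pgsync.yml may be enough, without Faker or an extension. A sketch (column names are examples; check the pgsync README for the full list of built-in rule names):

```yaml
# Sketch of column masking via pgsync data rules;
# the column names are placeholders for your schema
data_rules:
  email: unique_email
  first_name: random_letter
  last_name: random_letter
```

These rules rewrite the listed columns as rows are synced, so the destination never receives the real values.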

alextakitani avatar Mar 08 '23 21:03 alextakitani

What if there were an option to "skip" the offending row? For example, if there was a chance to skip the row that triggers the error (insert or update on table "xxxx" violates foreign key constraint "xxxx") and continue synchronization? @ankane

Natgho avatar May 18 '23 14:05 Natgho

@Natgho once that error happens, the wrapping transaction is polluted and can't continue, so I think that would be hard to do.

geoffharcourt avatar May 19 '23 22:05 geoffharcourt