
Impressive Speedup using activerecord-import

Open aiomaster opened this issue 6 years ago • 0 comments

I have very large tables that I want to anonymize. A simple run of the anonymization code took nearly 40 minutes! So I tried optimizing the code a little and got it down to 5 minutes by using the activerecord-import gem. I update my records on a PostgreSQL 10 database using the Blacklist strategy. The trick is to not save every single record individually, but to collect them and use the import method of activerecord-import with its on-duplicate-key-update option. The problem is that it only works for MySQL and PostgreSQL that way. To test this, just add the gem 'activerecord-import', use my fork, and run the anonymization against a MySQL or PostgreSQL database.
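
To illustrate the collect-then-import idea, here is a minimal sketch in plain Ruby. It is not the code from the fork: `BatchBuffer`, the `table` hash, and the `User` model named in the comment are hypothetical stand-ins, and the hash update only simulates what a single bulk upsert would do in the database.

```ruby
# Hypothetical helper (not part of data-anonymization or activerecord-import):
# buffers changed rows and hands them to a block in batches.
class BatchBuffer
  attr_reader :flush_count

  def initialize(batch_size, &flusher)
    raise ArgumentError, "batch_size must be positive" unless batch_size.positive?

    @batch_size = batch_size
    @flusher = flusher
    @buffer = []
    @flush_count = 0
  end

  # Queue one row; write the batch out once it is full.
  def <<(row)
    @buffer << row
    flush if @buffer.size >= @batch_size
    self
  end

  # Write out whatever is still queued (call once at the end of the run).
  def flush
    return if @buffer.empty?

    @flusher.call(@buffer)
    @flush_count += 1
    @buffer = []
  end
end

# Stand-in for a database table keyed by primary key (assumption for the demo).
table = {
  1 => "alice@example.com",
  2 => "bob@example.com",
  3 => "carol@example.com"
}

buffer = BatchBuffer.new(2) do |rows|
  # With activerecord-import on PostgreSQL, this block would be roughly:
  #   User.import rows, on_duplicate_key_update: { conflict_target: [:id], columns: [:email] }
  # (User is a hypothetical model.) Here the upsert is simulated on the hash.
  rows.each { |row| table[row[:id]] = row[:email] }
end

# Anonymize every row, but only "write" once per batch instead of once per row.
table.keys.each { |id| buffer << { id: id, email: "user#{id}@anon.invalid" } }
buffer.flush

puts "batches written: #{buffer.flush_count}"
```

The point of the design is that the number of write statements depends on the batch size rather than on the row count, which is where a speedup like the one described would come from.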

I could open a pull request, but I have only tested my own case and don't know whether something else is broken. Are you interested in such a feature?

aiomaster, Jun 19 '18 13:06