add this recipe for large bulk_inserts to the bulk_insert docs
Migrated issue, originally created by Florian Rüchel (@javex)
I recently had a problem where MySQL would not accept a bulk_insert of 80k rows at once; I had to feed it 10k-row chunks instead. So I split the list into 10k chunks and gave each one to alembic.
I don't know if it would make sense, but I'd suggest adding a chunk_size parameter to bulk_insert. For reference, I did it like this:
for index, item in enumerate(data):
    if index % 10000 == 0:
        if index != 0:
            op.bulk_insert(table, tmpdata)
        tmpdata = []
    tmpdata.append(item)
if tmpdata:
    op.bulk_insert(GeoIP.__table__, tmpdata)
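(For illustration only: the suggestion above would amount to a call like the one below. The chunk_size keyword is the parameter being proposed in this issue, not something op.bulk_insert actually accepts.)

# Hypothetical proposed API from this issue; not part of Alembic.
op.bulk_insert(GeoIP.__table__, data, chunk_size=10000)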
Let me know what you think. I'd be happy to create a patch if you like the idea.
Michael Bayer (@zzzeek) wrote:
Here's a super easy and more efficient recipe for that; I'd put it in a new documentation section called "database-specific recipes", since we have some hacks for PostgreSQL as well:
while data:
    chunk = data[0:10000]
    data = data[10000:]
    op.bulk_insert(table, chunk)
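(A minimal, self-contained sketch of that recipe inside a migration script, for reference; the accounts table, the revision identifiers, and the generated rows are illustrative only and not part of this issue.)

import sqlalchemy as sa
from alembic import op

# revision identifiers, used by Alembic (placeholder values).
revision = "1234abcd5678"
down_revision = None


def upgrade():
    # Lightweight table object for bulk_insert; no ORM model needed.
    accounts = sa.table(
        "accounts",
        sa.column("id", sa.Integer),
        sa.column("name", sa.String),
    )

    # In practice the 80k rows would come from a fixture file or similar.
    data = [{"id": i, "name": "user_%d" % i} for i in range(80000)]

    # Feed bulk_insert one 10k-row slice at a time so MySQL accepts it.
    while data:
        chunk = data[0:10000]
        data = data[10000:]
        op.bulk_insert(accounts, chunk)


def downgrade():
    op.execute("DELETE FROM accounts")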
I think a bulk insert of 80k rows in a migration script is a very unusual case.
Florian Rüchel (@javex) wrote:
Great recipe, thank you very much :) That's a better solution than integrating it into alembic.
Changes by Michael Bayer (@zzzeek):
- changed title from "Chunks for bulk_insert" to "add this recipe for large bulk_inserts to the bulk"