databases
databases copied to clipboard
Inserting to database using execute_many much slower than using fetch_all
Hi,
I noticed that inserting multiple rows to postgres database with execute_many is much slower than with fetch_all
"inserting with execute_many": "Time (hh:mm:ss.ms) 0:00:02.538554",
"inserting with fetch_all": "Time (hh:mm:ss.ms) 0:00:00.249998",
execute_many does not currently invoke executemany of the Postgres driver. I don't know why :smile: I have this in my code:
class ExecuteManyConnection(databases.core.Connection):
"""Connection with a better execute_many()."""
async def execute_many(self,
query: Union[ClauseElement, str],
values: List[Mapping]) -> None:
"""Leverage executemany() if connected to PostgreSQL for better performance."""
if not isinstance(self.raw_connection, asyncpg.Connection):
return await super().execute_many(query, values)
sql, args = self._compile(query, values)
async with self._query_lock:
await self.raw_connection.executemany(sql, args)
def _compile(self, query: ClauseElement, values: List[Mapping]) -> Tuple[str, List[list]]:
compiled = query.compile(dialect=self._backend._dialect)
compiled_params = sorted(compiled.params.items())
sql_mapping = {}
param_mapping = {}
for i, (key, _) in enumerate(compiled_params):
sql_mapping[key] = "$" + str(i + 1)
param_mapping[key] = i
compiled_query = compiled.string % sql_mapping
processors = compiled._bind_processors
args = []
for dikt in values:
series = [None] * len(compiled_params)
args.append(series)
for key, val in dikt.items():
series[param_mapping[key]] = processors[key](val) if key in processors else val
return compiled_query, args
It seems to work fine in our production. If you PR this change, I'll happily merge it (I cannot merge my own PRs, and getting a review from another maintainer is almost impossible).
@vmarkovtsev what is going on with the other maintainers?
They just disappeared haha. I had to spend a titanic effort to make a new release last time. My own PR is hanging since June 2020, for example. My guess is that the original author is using something different now, or changed jobs.
While I have solved many problems that people reported here: performance, locks, etc., I don't have enough time to properly fork the project.
With the latest asyncpg v0.22 release: https://github.com/MagicStack/asyncpg/releases/tag/v0.22.0, this might be a good timing to upgrade execute_many to invoke the vastly improved executemany()
Agree :+1:
Same with mysql, execute_many does not use executemany() from aiomysql thus being way slower.
Please update ! The execute_many is as slow as you guys' progress 😹 @vmarkovtsev @tomchristie
@yinshurman The fix shouldn't be too hard. I'll do a simple draft.
Looks like there's an open PR linked. Any chance of this getting through? @vmarkovtsev ?
Would also love to see this PR Merged
Reigniting this thread, did this ever get merged?
Shout for this PR for the performance improvement
I've added some comments to #468 and am happy to review if someone is interested in taking it over.
Any news on this?