graph-node
[Bug] Speed up counting entities for copy/graft
We need to do something about the entity_count for grafts. Right now, when all data has been copied, graph-node will fire off a big query that counts the entities in the graft; that query can take hours in very large subgraphs.
There are a few different ways to handle that:
- give up on accurate entity counts and set the count for copies/grafts to some fast estimate (either the count from the source, or the estimate that analyze comes up with)
- count entities while we copy them. We'd have to turn queries of the form `insert into dst select * from src` into `with ranges as (insert into .. returning block_range) select count(*) from ranges where block_range @> int32::MAX` and then store the counts for each batch in `copy_table_state`. After data copying has finished, the entity count is a simple aggregation over `copy_table_state`
- keep counting entities as a separate step, but break it into batches along `vid` just like the actual copying does. That would require quite a bit more bookkeeping as counting can now be interrupted by node restarts
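To make the second option a bit more concrete, here is a rough in-memory sketch of counting while copying. It is not graph-node code: `Row`, `CopyTableState`, `next_vid` and `batch_counts` are made-up names for illustration, and the real `copy_table_state` table may look quite different. The sketch assumes that a row counts as an entity when its block range is unbounded, which is what `block_range @> int32::MAX` checks for.

```python
from dataclasses import dataclass, field
from typing import List, Optional

INT32_MAX = 2**31 - 1  # stands in for int32::MAX from the query above


@dataclass
class Row:
    vid: int
    lower: int            # first block at which this version is visible
    upper: Optional[int]  # exclusive upper bound; None means unbounded

    def is_current(self) -> bool:
        # Mirrors `block_range @> int32::MAX`: the range contains the max block.
        return self.lower <= INT32_MAX and (self.upper is None or self.upper > INT32_MAX)


@dataclass
class CopyTableState:
    # Hypothetical stand-in for one row of copy_table_state.
    next_vid: int = 0                                      # where to resume copying
    batch_counts: List[int] = field(default_factory=list)  # one count per finished batch


def copy_table(src: List[Row], dst: List[Row], state: CopyTableState,
               batch_size: int, max_batches: Optional[int] = None) -> bool:
    """Copy rows in vid batches and record the current-entity count per batch.

    Returns True when the table is fully copied, False when the run stopped
    early (max_batches simulates a node restart in the middle of a copy).
    """
    last_vid = max((r.vid for r in src), default=-1)
    done = 0
    while state.next_vid <= last_vid:
        if max_batches is not None and done >= max_batches:
            return False
        lo, hi = state.next_vid, state.next_vid + batch_size
        batch = [r for r in src if lo <= r.vid < hi]
        # Assumption: in a real database these three steps share one
        # transaction, so a restart never leaves a batch half-recorded.
        dst.extend(batch)
        state.batch_counts.append(sum(1 for r in batch if r.is_current()))
        state.next_vid = hi
        done += 1
    return True


def entity_count(state: CopyTableState) -> int:
    # After copying has finished, the count is a plain sum over the batches.
    return sum(state.batch_counts)
```

Since the per-batch count is stored together with the resume point, an interrupted copy can pick up at `next_vid` without recounting anything, which is the main attraction of this option over a separate batched counting pass.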