FOSElasticaBundle icon indicating copy to clipboard operation
FOSElasticaBundle copied to clipboard

[Feature] More optimal PagerPersister by Primary Key range.

Open oleg-andreyev opened this issue 2 years ago • 0 comments

It's well know that limit/offset is performing bad (at least for SQL/MySQL) as soon OFFSET is getting closed to the end.

There is one technique to use SELECT by RANGE of PK.

First SQL

SELECT MIN(id), MAX(id) FROM table

Iterative SQL

SELECT * FROM table WHERE id BETWEEN :start AND :end

Where

START = MIN(ID)
END = START + BATCH SIZE

START = END
END = START + BATCH SIZE

etc.

Pros: There is no full table scan. Cons: it's not guaranteed that will fetch exact amount of records from database (it could be less)


Small bench:

root@20a7c2baad01:/app# time php -d memory_limit=-1 ./bin/console f:e:p --pager-persister=in_place -v --no-debug
Resetting products
   2400/405457 [>---------------------------]   0%  1 min
Populating products^C

real    1m22,809s
user    0m34,826s
sys     0m7,067s
root@20a7c2baad01:/app# time php -d memory_limit=-1 ./bin/console f:e:p --pager-persister=optimized_in_place -v --no-debug
Resetting products
   3160/405457 [>---------------------------]   0%  1 min
Populating products^C

real    1m16,909s
user    0m42,173s
sys     0m7,466s
root@20a7c2baad01:/app# 

as you can see for ~1 min it loaded, ~3160 and it's ~31.67% more.

oleg-andreyev avatar Feb 22 '23 20:02 oleg-andreyev