tsinfer
Improve ancestor fetching
Since #828 was merged, we no longer access ancestors in order when matching, and we create a separate `chunk_iterator` for each ancestor grouping. For large datasets on high-latency filesystems, we are now spending more time reading ancestors than matching them!
This could be fixed by some sort of chunk cache that the iterator uses, along with caching the other, non-genotype arrays that are currently read for every `chunk_iterator`.
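
A rough sketch of the chunk cache idea, just to make the proposal concrete. Everything here is hypothetical: `ChunkCache`, `iter_ancestors`, and `make_loader` are illustrative names rather than tsinfer API, and the loader is a stand-in for whatever slow read actually fetches a chunk from storage. The point is that several groupings can share one small LRU cache, so out-of-order access does not reload the same chunk repeatedly.

```python
from collections import OrderedDict


class ChunkCache:
    """Small LRU cache keyed by chunk index (hypothetical, not tsinfer API)."""

    def __init__(self, load_chunk, max_chunks=4):
        self.load_chunk = load_chunk  # slow function: chunk index -> chunk data
        self.max_chunks = max_chunks
        self._chunks = OrderedDict()
        self.misses = 0  # number of times we had to hit the store

    def get(self, chunk_index):
        if chunk_index in self._chunks:
            # Mark as most recently used and serve from memory.
            self._chunks.move_to_end(chunk_index)
            return self._chunks[chunk_index]
        self.misses += 1
        chunk = self.load_chunk(chunk_index)
        self._chunks[chunk_index] = chunk
        if len(self._chunks) > self.max_chunks:
            # Evict the least recently used chunk.
            self._chunks.popitem(last=False)
        return chunk


def iter_ancestors(cache, ancestor_ids, chunk_size):
    """Yield (ancestor_id, data) in the requested order via the shared cache."""
    for ancestor_id in ancestor_ids:
        chunk = cache.get(ancestor_id // chunk_size)
        yield ancestor_id, chunk[ancestor_id % chunk_size]


def make_loader(chunk_size):
    """Stand-in for a high-latency read; returns fake per-ancestor records."""
    def load_chunk(chunk_index):
        start = chunk_index * chunk_size
        return [f"ancestor-{i}" for i in range(start, start + chunk_size)]
    return load_chunk


CHUNK_SIZE = 4
cache = ChunkCache(make_loader(CHUNK_SIZE), max_chunks=2)
# Two groupings that both touch chunks 0 and 1, out of order.
groups = [[5, 1, 6], [2, 7, 0]]
results = [list(iter_ancestors(cache, group, CHUNK_SIZE)) for group in groups]
print(cache.misses)
```

In this toy run the six lookups cause only two chunk loads, because the second grouping is served entirely from the cache. The non-genotype arrays could be handled the same way, or simply read once and held alongside the cache, assuming they fit in memory.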