Ben Jeffery

Results 116 issues of Ben Jeffery

There's no point doing the fetch from disk on the main process, as all the dask workers are idle while it does this.

Since #828 was merged we no longer access ancestors in order when matching, and create a separate `chunk_iterator` for each ancestor grouping. For large datasets on high-latency filesystems we are...

#827 added resume for ancestor matching - we should also do this for sample matching.

https://github.com/tskit-dev/tsinfer/pull/827 introduced resume. Once we're happy with the API and functionality it should be documented.

Ancestor matching with 100k diploid sample data sets is looking like it will take more than a month if the number of cores available is in the 20-30 range (a...

There are a few ways to go about this; for example, some CPUs have specific instructions for this. After some research the most portable and robust way appears to be...

Document the workflow from VCF to inference via an sgkit dataset, with a couple of examples.

The API for this currently only accepts a path. It would be better if it accepted either a path or a zarr store as an argument. For example, allowing the...
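A minimal sketch of the path-or-store idea, not the actual tsinfer API. The helper name `normalise_store` is hypothetical, and it treats any `MutableMapping` as a stand-in for a zarr store. That is an assumption that would need checking against the zarr version in use.

```python
import os
from collections.abc import MutableMapping


def normalise_store(path_or_store):
    """Hypothetical helper: classify the argument as a path or a store.

    Returns a (kind, value) tuple so the caller can decide how to open it.
    """
    if isinstance(path_or_store, (str, os.PathLike)):
        # A filesystem location: normalise to a plain string path.
        return "path", os.fspath(path_or_store)
    if isinstance(path_or_store, MutableMapping):
        # Assumption: store-like objects expose the MutableMapping interface.
        return "store", path_or_store
    raise TypeError(
        f"Expected a path or a store-like mapping, got {type(path_or_store).__name__}"
    )


# Usage: both argument styles go through the same entry point.
print(normalise_store("data.zarr")[0])
print(normalise_store({})[0])
```

Dispatching on type at a single entry point would leave existing callers that pass a path unaffected while letting others pass an already-open store.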

The error should include a code snippet for filtering to a single contig.

See also https://github.com/pystatgen/sgkit/issues/464