Tom White
> I assume these are two-digit base-10 values, which compress poorly when converted to float32? This is a common issue with converting these types of per-genotype floating-point values,...
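To illustrate the compression point above (a minimal sketch using synthetic data, not the actual VCF): two-digit decimals stored as text are short, low-entropy strings, while their float32 bit patterns are byte streams that general-purpose compressors often handle less well. The values below are made up purely for the demonstration.

```python
import bz2
import random
import struct

random.seed(0)

# Hypothetical per-genotype values with two base-10 digits of precision
# (0.00 .. 0.99); illustrative only, not drawn from the test VCF.
values = [random.randrange(100) / 100 for _ in range(50_000)]

# Text encoding: roughly what a VCF stores, e.g. "0.37"
as_text = "\n".join(f"{v:.2f}" for v in values).encode()

# float32 encoding: what a naive conversion to a float32 Zarr array stores
as_f32 = struct.pack(f"<{len(values)}f", *values)

print(len(bz2.compress(as_text)), len(bz2.compress(as_f32)))
```

Comparing the two printed sizes shows how the choice of encoding, not just the compressor, drives the final file size.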
I now have a representative VCF that is 17MB in size (compressed). Running `vcf_to_zarr` on it with the default settings:

```python
vcf_to_zarr(input=vcf_file, fields=["INFO/*", "FORMAT/*"], output=output, max_alt_alleles=1)
```

produced Zarr files...
@ravwojdyla pointed out that #80 is related.
With #943 I was able to get the Zarr size down to about 16% larger than genozip on the test file (using bzip2).
Thanks for opening this @sanjaynagi!

> The only thing I was going to do was fix the seed for the simulated data, and ensure the values returned are the same...
> Switching the order seems okay to me... It should only matter for types that implement both. And if a type supports `__array_namespace__`, ideally we should be prioritizing that as...
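A minimal sketch of the prioritization being discussed (the helper name, the legacy fallback, and the demo class are assumptions for illustration, not sgkit's actual code): when a type implements both protocols, check the array API standard's `__array_namespace__` before any legacy dispatch mechanism.

```python
def get_array_namespace(x):
    """Return the array namespace for x, preferring the array API
    standard's `__array_namespace__` protocol over legacy dispatch.
    (Hypothetical helper for illustration.)"""
    if hasattr(x, "__array_namespace__"):
        # Array API standard: the object reports its namespace directly.
        return x.__array_namespace__()
    if hasattr(x, "__array_function__"):
        # Legacy NumPy protocol: fall back to NumPy semantics.
        import numpy
        return numpy
    raise TypeError(f"{type(x).__name__} is not an array-API object")


class DemoArray:
    """Toy type implementing both protocols; the standard one should win."""

    def __array_namespace__(self, api_version=None):
        import math
        return math  # stand-in namespace, just for the demo

    def __array_function__(self, func, types, args, kwargs):
        raise NotImplementedError
```

Here `get_array_namespace(DemoArray())` returns the stand-in namespace from `__array_namespace__`, never touching the legacy path, which is the ordering proposed above.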
> Because of the way cubed retries things, the traceback doesn't give me any more useful context than this though. Is there a way to resurface the useful part of...
> > The default in dask is `None`
>
> Is it? [Looks like it's `True`](https://docs.dask.org/en/stable/_modules/dask/array/reductions.html#reduction) to me?

It's confusing because the default for `concatenate` in [`blockwise` is `None`](https://docs.dask.org/en/stable/generated/dask.array.blockwise.html#dask.array.blockwise), but...
> #### For `method='blockwise'`
>
> an error from what looks like a potential bug inside `cubed.rechunk` (from the call to `rechunk_for_blockwise` inside `flox.core.groupby_reduce`):
>
> ```python
> File ~/Documents/Work/Code/cubed/cubed/primitive/rechunk.py:124, in...
> ```
> #### For `method='cohorts'`
>
> I got to the point of cubed needing to implement `.blocks` as expected. From what I can tell, Flox only uses the shape of blocks,...
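If flox really only needs the shape of the block grid, that shape can be derived from the chunk structure alone, without materializing a `.blocks` accessor. The helper below is a sketch under that assumption; it is not cubed's or dask's actual implementation.

```python
import math

def num_blocks(shape, chunksizes):
    """Number of blocks along each axis of a regularly chunked array.

    `shape` and `chunksizes` are per-axis tuples; the last block on an
    axis may be partial, hence the ceiling division. (Illustrative helper.)
    """
    return tuple(math.ceil(s / c) for s, c in zip(shape, chunksizes))

# A 10x25 array in 4x10 chunks has a 3x3 grid of blocks
# (blocks of length 4, 4, 2 on axis 0 and 10, 10, 5 on axis 1).
print(num_blocks((10, 25), (4, 10)))  # -> (3, 3)
```

In dask this grid shape is what `array.blocks.shape` reports, so exposing just this much might be enough for the flox code path described above.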