Gert Hulselmans
### Problem description Enable flake8-pytest-style PT lints in ruff config. I am not very familiar with pytest, so I don't know how big of a problem it is, but it might...
### Polars version checks - [X] I have checked that this issue has not already been reported. - [X] I have confirmed this bug exists on the [latest version](https://pypi.org/project/polars/) of...
### Problem description Deprecate `avg` in favor of `mean` or expose them at all levels From Discord: > I just bumped my head on this, but is there a reason...
### Problem description Write only one dictionary when sinking to IPC. It would be great if, when writing categorical data to an IPC sink, only one unified dictionary is written,...
### Problem description Support gzipped files as input for batched CSV reader. ```python In [6]: batched_reader = pl.read_csv_batched("test.tsv.gz", sep="\t") In [7]: batched_reader.next_batches(1) --------------------------------------------------------------------------- ComputeError Traceback (most recent call last) in...
#### Describe your feature request Read CSV files created by R. By default, R creates annoying CSV files in which the column name for the index column is not written....
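To illustrate the kind of file described above, here is a minimal standard-library sketch of one possible workaround. It assumes the header is either one field shorter than the data rows or starts with an empty name; the function `read_r_csv` and the placeholder column name `index` are hypothetical and not part of any library.

```python
import csv
import io


def read_r_csv(text, index_name="index"):
    """Parse CSV text whose header may lack a name for the first (index) column.

    `index_name` is a hypothetical placeholder name, chosen for illustration.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return [], []
    header, data = rows[0], rows[1:]
    if data and len(header) == len(data[0]) - 1:
        # Header is one field short: the index column has no name at all.
        header = [index_name] + header
    elif header and header[0] == "":
        # Header has an empty first field: the index column name is blank.
        header = [index_name] + header[1:]
    return header, data


# Example input with a header that is one field shorter than the data rows.
r_style = "a,b\nrow1,1,2\nrow2,3,4\n"
header, data = read_r_csv(r_style)
```

With this input, `header` gains the placeholder name as its first entry and `data` is left untouched, so the columns line up again.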
It would be nice if BED files could be converted to d4 files. Similar to ``` bedtools genomecov -bg -i ${bed_file} -g ${genome_file}> ${bed_graph_file} ```
bcftools --write-index sometimes creates indexes older than the data file. ``` # Create BCF files from VCF files and create index on the fly. for i in $(seq 1 200);...
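A quick way to detect the symptom described above is to compare modification times. The sketch below uses only the Python standard library; the helper `index_is_stale` and the file names are hypothetical, and the timestamps are set by hand to simulate the problem rather than produced by bcftools.

```python
import os
import tempfile


def index_is_stale(data_path, index_path):
    """Return True if the index file was last modified before the data file."""
    return os.stat(index_path).st_mtime_ns < os.stat(data_path).st_mtime_ns


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file names, for illustration only.
    data = os.path.join(tmp, "example.bcf")
    index = data + ".csi"
    for path in (data, index):
        open(path, "wb").close()
    # Simulate the reported symptom: the index is older than the data file.
    os.utime(data, ns=(2_000_000_000, 2_000_000_000))
    os.utime(index, ns=(1_000_000_000, 1_000_000_000))
    stale = index_is_stale(data, index)
```

Tools that compare these timestamps may treat such an index as out of date, which is why the ordering matters.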
containt ==> contain typo: https://github.com/jorgecarleitao/parquet2/blob/7a5fc27039b192f255908154a0aba2e75f6ed5a1/src/read/metadata.rs#L40 https://github.com/jorgecarleitao/parquet2/blob/7a5fc27039b192f255908154a0aba2e75f6ed5a1/src/read/stream.rs#L33
Support NCBI sequence_report.jsonl as input for CollectAlternateContigNames, as it seems that NCBI now prefers that format, instead of the old https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/001/405/GCF_000001405.26_GRCh38/GCF_000001405.26_GRCh38_assembly_report.txt ``` datasets download genome accession GCF_000001405.26 --include seq-report --filename...