Nalini Ganapati
Currently, only VCF files can be ingested. Refactor the code to provide an API for ingesting from multiple sources. This API should also be callable from the Java/Python/R bindings.
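As a starting point for discussion, here is a minimal Python sketch of what a source-agnostic ingest API could look like. Everything in it is hypothetical (`VariantSource`, `InMemorySource`, `ingest`, and the dict-based record shape); none of these names exist in GenomicsDB today, and the real API would live in the C++ layer and be exposed through the bindings.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Iterator, List

# Hypothetical record shape: a plain dict per variant record.
Record = Dict[str, object]


class VariantSource(ABC):
    """Hypothetical interface: anything that can yield variant records."""

    @abstractmethod
    def records(self) -> Iterator[Record]:
        """Yield records one at a time, regardless of the backing format."""


class InMemorySource(VariantSource):
    """Toy source used for illustration; a VCF-backed source would be another subclass."""

    def __init__(self, rows: Iterable[Record]) -> None:
        self._rows = list(rows)

    def records(self) -> Iterator[Record]:
        return iter(self._rows)


def ingest(sources: Iterable[VariantSource]) -> List[Record]:
    """Hypothetical entry point: consume every source through the same interface."""
    ingested: List[Record] = []
    for source in sources:
        ingested.extend(source.records())
    return ingested
```

The point of the sketch is only that the ingest path depends on the interface rather than on VCF specifically, so a new source type means a new subclass and no changes to `ingest`.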
The C unit tests are meant to test the basic functionality exposed by `/src/main/cpp` and `libtiledbgenomicsdb.so`, so they should not have any MPI dependencies. However, `gt_mpi_gather` can be used for testing...
GATK forum users have started requesting support for macOS and Linux arm64 architectures; see this [forum post](https://gatk.broadinstitute.org/hc/en-us/community/posts/5462468688539-Is-ARM64-Linux-MacOS-architecture-officially-supported-?page=1#community_comment_5972596856091).
This is something to keep in mind when we start using this functionality. _Originally posted by @jPleyte in https://github.com/GenomicsDB/GenomicsDB/pull/186#discussion_r794222104_
It looks like zstd has really good decompression performance, and we should at least investigate this.
The default is `all rows` when no rows are specified, so the GenomicsDB API calls `query_variants` and `query_variant_calls` should consider all rows. Thanks @jacmarjorie for pointing out this issue.
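The intended defaulting behavior can be sketched as follows. This is an illustration only: `effective_row_ranges`, the inclusive `(start, end)` tuple representation, and the `num_rows` parameter are assumptions made for the example, not the actual GenomicsDB implementation or signatures.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed representation: inclusive (start, end) row index pairs.
RowRange = Tuple[int, int]


def effective_row_ranges(
    requested: Optional[Sequence[RowRange]], num_rows: int
) -> List[RowRange]:
    """Return the row ranges a query should scan.

    When the caller passes no ranges (None or empty), fall back to a
    single range covering every row, i.e. the `all rows` default.
    """
    if not requested:
        # No rows specified: cover the whole array, or nothing if it is empty.
        return [(0, num_rows - 1)] if num_rows > 0 else []
    return list(requested)
```

With this shape, both query entry points could share one helper, so the `all rows` default is applied in one place.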
This could be part of a separate workflow. We could just run `run_spark_hdfs.py` for pull requests.
This is a placeholder to
- Hold issues we are seeing with logging currently
- Collect ideas on how to improve logging