williambrandler

76 comments by williambrandler

Hey @edg1983, reading BGEN files that are indexed with bgenix should work fine on cloud object storage. You do not need to read the index itself, just the bgen like you...

Ah ok, thanks. Here is the offending line of code: https://github.com/projectglow/glow/blob/8b0bcd6b2f7320c3a5bd186bdcfa4707af303b47/core/src/main/scala/io/projectglow/bgen/BgenFileFormat.scala#L149 It is using SQLite to access the index, but it cannot find the SQLite classes and driver on the class...

@edg1983 were you able to resolve this? Glow depends on `sqlite-jdbc 3.20.1`.

Thanks @edg1983. Can we work together to contribute this container back to Glow? It should be straightforward as we already have a container for running [Glow in Databricks](https://github.com/projectglow/glow/blob/master/docker/README.md) and a...

This is great, thanks. I would like to translate this into something anyone can use. Do you have the Dockerfiles in a repo that I could look at and see if...

Thanks @edg1983, I am working on a container here: https://github.com/projectglow/glow/pull/503 Please could you test `projectglow/open-source-glow:1.1.2` (https://hub.docker.com/r/projectglow/open-source-glow/tags) to see if it works the same as your container? Acknowledged you in the documentation...

Hey @bcajes, yes, this config is changed without warning, and it would be useful to explicitly give a warning or put this in the docs. Are there other configs you...

Another config we usually add for the regression step (which uses pandas UDFs and Arrow) is `"spark.sql.execution.arrow.maxRecordsPerBatch": 100`.
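A rough sketch of how that setting could be applied when building a Spark session or calling `spark-submit`. The helper names `apply_conf` and `to_submit_args` and the `REGRESSION_CONF` dict are made up for illustration; only the config key and the value 100 come from the comment above.

```python
# Value taken from the comment above; passed as a string here for illustration.
REGRESSION_CONF = {"spark.sql.execution.arrow.maxRecordsPerBatch": "100"}


def apply_conf(builder, conf):
    """Call builder.config(key, value) for every entry and return the builder.

    `builder` is expected to behave like pyspark's SparkSession.builder.
    """
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder


def to_submit_args(conf):
    """Render the same settings as spark-submit style --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


# With pyspark installed (not imported here so the sketch runs without it):
#   from pyspark.sql import SparkSession
#   spark = apply_conf(SparkSession.builder, REGRESSION_CONF).getOrCreate()
print(to_submit_args(REGRESSION_CONF))
```

Keeping the settings in one dict means the same values can feed both a notebook session and a job submission, assuming that fits your setup.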

Hey @dberma15, PLINK binary PED files can be read with Glow. What is your use case for PLINK files, and is the data in any other format (VCF / BGEN)?...
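For reference, a minimal sketch of reading a PLINK fileset through Glow's `plink` data source. It assumes a Spark session with Glow already installed, and that the `.bed`, `.bim` and `.fam` files share one prefix; `read_plink` and the example prefix are hypothetical names used only here.

```python
def read_plink(spark, prefix):
    """Load a PLINK binary fileset as a DataFrame via Glow's "plink" reader.

    `spark` is a SparkSession with Glow on the classpath (assumption).
    `prefix` is the shared path prefix of the .bed/.bim/.fam files.
    """
    bed_path = f"{prefix}.bed"
    return spark.read.format("plink").load(bed_path)


# Example call, assuming `spark` exists and the path is a placeholder:
#   df = read_plink(spark, "/path/to/cohort")
#   df.printSchema()
```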

Ah, I believe those SnpEff input files are in Browser Extensible Data (BED) format, not PLINK binary PED (BED) format, which awkwardly has the same suffix. You can read Browser Extensible...