Frank Austin Nothaft

15 comments by Frank Austin Nothaft

OOC, does this work with Spark 2.1.x? Spark 2.2.0 moves the Parquet dependency version, which causes classpath conflicts. We're working on a fix for this; see https://github.com/bigdatagenomics/adam/pull/1518.

Oh, interesting! It hadn't worked for me, but that was on one of the RCs. I'll retest today.

Hi @jpdna! Can you push the branch where you're working? I'd like to play around with the pom to see if we can shade the upstream transitive dependency. I think...

That's the same approach we use upstream in ADAM, so +1 from me!

@calbach In ADAM, we implement the same ordering as the GA4GH protocol. ``` Some Picard / GATK tools depend on this and assert/fail if this is not the case. ```...

> As an aside, this reference technology won't be adopted unless it can be communicated easily, and it is at risk of becoming very hard for anyone outside of this...

> In the absence of practical experience with these structures, I'm concerned that a lot of effort is going into potential problems that won't end up being very serious in...

w00t! Thanks for pushing this through!

Yes, this is still an issue. We can rewrite the region join APIs to use Spark SQL, which appears to yield a large performance gain. Moving to Spark SQL would...