java.lang.ArrayIndexOutOfBoundsException when writing to vcf
I imported vcfs from several projects and combined them into one delta table. I am now trying to write from the delta table to a vcf, and I keep getting java.lang.ArrayIndexOutOfBoundsException when it tries to write to vcf.
Can you give me suggestions for what might cause this problem? It seems to be related to the genotypes column. It works if I only select genotypes.calls.
Py4JJavaError                             Traceback (most recent call last)
/databricks/spark/python/pyspark/sql/readwriter.py in save(self, path, format, mode, partitionBy, **options)
   1134             self._jwrite.save()
   1135         else:
-> 1136             self._jwrite.save(path)
   1137
   1138     @since(1.4)

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1302
   1303         answer = self.gateway_client.send_command(command)
-> 1304         return_value = get_return_value(
   1305             answer, self.gateway_client, self.target_id, self.name)
   1306

/databricks/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    108     def deco(*a, **kw):
    109         try:
--> 110             return f(*a, **kw)
    111         except py4j.protocol.Py4JJavaError as e:
    112             converted = convert_exception(e.java_exception)

/databricks/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    324             value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325             if answer[1] == REFERENCE_TYPE:
--> 326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
    328                     format(target_id, ".", name), value)

Py4JJavaError: An error occurred while calling o541.save.
Hey Sandra, not sure what the issue is! Could you print the schema for the dataframe, share some more info about the dataset (number of variants and samples), and post the code and the full stack trace?
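For anyone hitting the same error, the details requested above could be gathered with something like the sketch below. It is only an illustration: the helper name `summarize_for_bug_report` is hypothetical, and it assumes the DataFrame has a `genotypes` array column with one element per sample. Reporting the distinct genotype-array lengths is an extra check, not a confirmed cause; it only shows whether rows from the combined projects disagree on the number of samples.

```python
def summarize_for_bug_report(df):
    """Collect schema, variant count, and per-row sample counts from a Spark DataFrame.

    Assumption: `df` has a `genotypes` array column (one element per sample).
    """
    # Compact one-line schema string; df.printSchema() prints a tree instead.
    schema = df.schema.simpleString()

    # Number of rows, i.e. number of variants.
    num_variants = df.count()

    # Distinct lengths of the genotypes array across rows. More than one
    # value means rows disagree on how many samples they carry.
    rows = df.selectExpr("size(genotypes) AS n_samples").distinct().collect()
    sample_counts = [row["n_samples"] for row in rows]

    return {
        "schema": schema,
        "num_variants": num_variants,
        "sample_counts": sample_counts,
    }
```

With a real table this might be called as `summarize_for_bug_report(spark.read.format("delta").load(path))`, where `path` is a placeholder for the Delta table location, and the returned dictionary pasted into the issue alongside the full Java stack trace.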
Closing since we don't have a reproduction