
Mango submit fails with parquet class not found

Open SatyaGsk opened this issue 7 years ago • 1 comments

I am using mango-distribution-0.0.1 on Hadoop 2.6.0-cdh5.14.4, Spark 2.2.0, and Scala 2.11.8. mango-submit fails with a Parquet class-not-found error. I tried passing the Parquet package on the command line, but it did not help, as the messages below show. I have also included the layout of the data files on HDFS.

[sm@bluedata750 bin]$ ./mango-submit --packages org.apache.parquet:parquet-hadoop:1.8.2 /user/sm/hg19.17.2bit -genes /user/sm/ensGene.bb -reads /user/sm/chr17.7500000-7515000.sam.adam -variants /user/sm/chr17.adam -show_genotypes -discover
Using spark-submit=/usr/bin/spark2-submit
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/parquet/hadoop/metadata/CompressionCodecName
    at org.bdgenomics.utils.cli.ParquetArgs$class.$init$(ParquetArgs.scala:40)
    at org.bdgenomics.mango.cli.VizReadsArgs.<init>(VizReads.scala:252)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.bdgenomics.utils.cli.Args4j$.apply(Args4j.scala:34)
    at org.bdgenomics.mango.cli.VizReads$.apply(VizReads.scala:196)
    at org.bdgenomics.utils.cli.BDGCommandCompanion$class.main(BDGCommand.scala:33)
    at org.bdgenomics.mango.cli.VizReads$.main(VizReads.scala:125)
    at org.bdgenomics.mango.cli.VizReads.main(VizReads.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.parquet.hadoop.metadata.CompressionCodecName
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 21 more

[sm@bluedata750 bin]$ ./mango-submit /user/sm/hg19.17.2bit -genes /user/sm/ensGene.bb -reads /user/sm/chr17.7500000-7515000.sam.adam -variants /user/sm/chr17.adam -show_genotypes -discover
Using spark-submit=/usr/bin/spark2-submit
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/parquet/hadoop/metadata/CompressionCodecName
    at org.bdgenomics.utils.cli.ParquetArgs$class.$init$(ParquetArgs.scala:40)
    at org.bdgenomics.mango.cli.VizReadsArgs.<init>(VizReads.scala:252)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at org.bdgenomics.utils.cli.Args4j$.apply(Args4j.scala:34)
    at org.bdgenomics.mango.cli.VizReads$.apply(VizReads.scala:196)
    at org.bdgenomics.utils.cli.BDGCommandCompanion$class.main(BDGCommand.scala:33)
    at org.bdgenomics.mango.cli.VizReads$.main(VizReads.scala:125)
    at org.bdgenomics.mango.cli.VizReads.main(VizReads.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.parquet.hadoop.metadata.CompressionCodecName
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 21 more

[sm@bluedata750 bin]$ hadoop fs -ls /user/sm
Found 7 items
drwxrwx---+ - sm supergroup 0 2018-10-30 21:27 /user/sm/.sparkStaging
-rw-rw----+ 3 sm supergroup 5440756334 2018-10-30 21:03 /user/sm/LN44765.bed
drwxrwx---+ - sm supergroup 0 2018-11-09 07:19 /user/sm/chr17.7500000-7515000.sam.adam
drwxrwx---+ - sm supergroup 0 2018-10-30 21:27 /user/sm/chr17.adam
-rw-rw----+ 3 sm supergroup 91866 2018-10-17 10:28 /user/sm/chr17.vcf
-rw-rw----+ 3 sm supergroup 3344732 2018-11-09 07:28 /user/sm/ensGene.bb
-rw-rw----+ 3 sm supergroup 21252941 2018-11-09 07:21 /user/sm/hg19.17.2bit

SatyaGsk, Nov 09 '18 13:11

Please try

./mango-submit --packages org.apache.parquet:parquet-hadoop:1.8.2 -- /user/sm/hg19.17.2bit -genes /user/sm/ensGene.bb -reads /user/sm/chr17.7500000-7515000.sam.adam -variants /user/sm/chr17.adam -show_genotypes

akmorrow13, Nov 09 '18 13:11
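
The suggested command differs from the first failing one mainly in the standalone `--` placed after the `--packages` option. The sketch below illustrates why that separator can matter. It assumes, without confirmation from the Mango source, that mango-submit splits its command line at the first `--`, passing the tokens before it to spark-submit and the tokens after it to Mango, and that with no `--` every token is treated as a Mango argument. The function name `split_args` and the variables `spark_args` and `app_args` are hypothetical and exist only for this illustration.

```shell
# Hypothetical sketch: split a command line at the first "--".
# Assumption: tokens before "--" are meant for spark-submit and tokens after
# it are meant for the application. With no "--" at all, every token is
# treated as an application argument.
# Space-joined strings keep this POSIX sh; arguments that contain spaces
# would need bash arrays instead.
split_args() {
  spark_args=""
  app_args=""
  seen_dd=false
  for arg in "$@"; do
    if [ "$seen_dd" = false ] && [ "$arg" = "--" ]; then
      # First separator: everything after this goes to the application.
      seen_dd=true
    elif [ "$seen_dd" = true ]; then
      app_args="${app_args:+$app_args }$arg"
    else
      spark_args="${spark_args:+$spark_args }$arg"
    fi
  done
  # No separator seen: nothing is forwarded to spark-submit.
  if [ "$seen_dd" = false ]; then
    app_args=$spark_args
    spark_args=""
  fi
}

split_args --packages org.apache.parquet:parquet-hadoop:1.8.2 -- \
  /user/sm/hg19.17.2bit -genes /user/sm/ensGene.bb
echo "spark-submit options: $spark_args"
echo "application arguments: $app_args"
```

Under that assumption, a `--packages` option given without the separator would never reach spark-submit, which would be consistent with both runs in the report failing with the same stack trace.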