load error
I get the following error when running this code:
import nlu
nlu.load('elmo')
Configuration:
OS: Windows 10
Java version: 1.8.0_311 (Java 8)
PySpark version: 3.1.2
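For reference, a minimal sketch to double-check which versions are actually being picked up (assuming SPARK_HOME is set; the Ivy log below suggests it points at a Spark 3.2.0 install):

import os
import pyspark

# pip-installed PySpark version (reported as 3.1.2 above)
print(pyspark.__version__)

# If SPARK_HOME is set, its jars are used instead of PySpark's bundled ones;
# the log below loads ivy-2.5.0.jar from C:/Spark/spark-3.2.0-bin-hadoop3.2
print(os.environ.get("SPARK_HOME"))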
:: loading settings :: url = jar:file:/C:/Spark/spark-3.2.0-bin-hadoop3.2/jars/ivy-2.5.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
Ivy Default Cache set to: C:\Users\Lukas\.ivy2\cache
The jars for the packages stored in: C:\Users\Lukas\.ivy2\jars
com.johnsnowlabs.nlp#spark-nlp_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-f9a2f2a7-e7ac-44f5-a922-ae1493621cbc;1.0
    confs: [default]
    found com.johnsnowlabs.nlp#spark-nlp_2.12;3.3.4 in central
    found com.typesafe#config;1.4.1 in central
    found org.rocksdb#rocksdbjni;6.5.3 in central
    found com.amazonaws#aws-java-sdk-bundle;1.11.603 in central
    found com.github.universal-automata#liblevenshtein;3.0.0 in central
    found com.google.code.findbugs#annotations;3.0.1 in central
    found net.jcip#jcip-annotations;1.0 in central
    found com.google.code.findbugs#jsr305;3.0.1 in central
    found com.google.protobuf#protobuf-java-util;3.0.0-beta-3 in central
    found com.google.protobuf#protobuf-java;3.0.0-beta-3 in central
    found com.google.code.gson#gson;2.3 in central
    found it.unimi.dsi#fastutil;7.0.12 in central
    found org.projectlombok#lombok;1.16.8 in central
    found org.slf4j#slf4j-api;1.7.21 in central
    found com.navigamez#greex;1.0 in central
    found dk.brics.automaton#automaton;1.11-8 in central
    found org.json4s#json4s-ext_2.12;3.5.3 in central
    found joda-time#joda-time;2.9.5 in central
    found org.joda#joda-convert;1.8.1 in central
    found com.johnsnowlabs.nlp#tensorflow-cpu_2.12;0.3.3 in central
    found net.sf.trove4j#trove4j;3.0.3 in central
:: resolution report :: resolve 391ms :: artifacts dl 16ms
    :: modules in use:
    com.amazonaws#aws-java-sdk-bundle;1.11.603 from central in [default]
    com.github.universal-automata#liblevenshtein;3.0.0 from central in [default]
    com.google.code.findbugs#annotations;3.0.1 from central in [default]
    com.google.code.findbugs#jsr305;3.0.1 from central in [default]
    com.google.code.gson#gson;2.3 from central in [default]
    com.google.protobuf#protobuf-java;3.0.0-beta-3 from central in [default]
    com.google.protobuf#protobuf-java-util;3.0.0-beta-3 from central in [default]
    com.johnsnowlabs.nlp#spark-nlp_2.12;3.3.4 from central in [default]
    com.johnsnowlabs.nlp#tensorflow-cpu_2.12;0.3.3 from central in [default]
    com.navigamez#greex;1.0 from central in [default]
    com.typesafe#config;1.4.1 from central in [default]
    dk.brics.automaton#automaton;1.11-8 from central in [default]
    it.unimi.dsi#fastutil;7.0.12 from central in [default]
    joda-time#joda-time;2.9.5 from central in [default]
    net.jcip#jcip-annotations;1.0 from central in [default]
    net.sf.trove4j#trove4j;3.0.3 from central in [default]
    org.joda#joda-convert;1.8.1 from central in [default]
    org.json4s#json4s-ext_2.12;3.5.3 from central in [default]
    org.projectlombok#lombok;1.16.8 from central in [default]
    org.rocksdb#rocksdbjni;6.5.3 from central in [default]
    org.slf4j#slf4j-api;1.7.21 from central in [default]
    ---------------------------------------------------------------------
    |                  |            modules            ||   artifacts   |
    |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
    ---------------------------------------------------------------------
    |      default     |   21  |   0   |   0   |   0   ||   21  |   0   |
    ---------------------------------------------------------------------
:: retrieving :: org.apache.spark#spark-submit-parent-f9a2f2a7-e7ac-44f5-a922-ae1493621cbc
    confs: [default]
    0 artifacts copied, 21 already retrieved (0kB/0ms)
22/01/14 17:30:48 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
elmo download started this may take some time.
22/01/14 17:31:05 WARN ProcfsMetricsGetter: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped
EXCEPTION: Could not resolve singular Component for type=elmo and nlp_ref=elmo and nlu_ref=elmo and lang=en
Traceback (most recent call last):
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\pipe\component_resolution.py", line 708, in construct_component_from_identifier
    return Embeddings(get_default=False, nlp_ref=nlp_ref, nlu_ref=nlu_ref, lang=language,
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\components\embedding.py", line 98, in __init__
    else : self.model = SparkNLPElmo.get_pretrained_model(nlp_ref, lang)
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\components\embeddings\elmo\spark_nlp_elmo.py", line 14, in get_pretrained_model
    return ElmoEmbeddings.pretrained(name, language)
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\annotator.py", line 7760, in pretrained
    return ResourceDownloader.downloadModel(ElmoEmbeddings, name, lang, remote_loc)
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\pretrained.py", line 50, in downloadModel
    file_size = _internal._GetResourceSize(name, language, remote_loc).apply()
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\internal.py", line 231, in __init__
    super(_GetResourceSize, self).__init__(
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\internal.py", line 165, in __init__
    self._java_obj = self.new_java_obj(java_obj, *args)
  File "D:\.venv\python3.8_nlu\lib\site-packages\sparknlp\internal.py", line 175, in new_java_obj
    return self._new_java_obj(java_class, *args)
  File "D:\.venv\python3.8_nlu\lib\site-packages\pyspark\ml\wrapper.py", line 66, in _new_java_obj
    return java_obj(*java_args)
  File "D:\.venv\python3.8_nlu\lib\site-packages\py4j\java_gateway.py", line 1304, in __call__
    return_value = get_return_value(
  File "D:\.venv\python3.8_nlu\lib\site-packages\pyspark\sql\utils.py", line 111, in deco
    return f(*a, **kw)
  File "D:\.venv\python3.8_nlu\lib\site-packages\py4j\protocol.py", line 326, in get_return_value
    raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling z:com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.getDownloadSize.
: java.lang.NoClassDefFoundError: org/json4s/package$MappingException
    at org.json4s.ext.EnumNameSerializer.deserialize(EnumSerializer.scala:53)
    at org.json4s.Formats$$anonfun$customDeserializer$1.applyOrElse(Formats.scala:66)
    at org.json4s.Formats$$anonfun$customDeserializer$1.applyOrElse(Formats.scala:66)
    at scala.collection.TraversableOnce.collectFirst(TraversableOnce.scala:180)
    at scala.collection.TraversableOnce.collectFirst$(TraversableOnce.scala:167)
    at scala.collection.AbstractTraversable.collectFirst(Traversable.scala:108)
    at org.json4s.Formats$.customDeserializer(Formats.scala:66)
    at org.json4s.Extraction$.customOrElse(Extraction.scala:775)
    at org.json4s.Extraction$.extract(Extraction.scala:454)
    at org.json4s.Extraction$.extract(Extraction.scala:56)
    at org.json4s.ExtractableJsonAstNode.extract(ExtractableJsonAstNode.scala:22)
    at com.johnsnowlabs.util.JsonParser$.parseObject(JsonParser.scala:28)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$.parseJson(ResourceMetadata.scala:101)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$$anonfun$readResources$1.applyOrElse(ResourceMetadata.scala:129)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$$anonfun$readResources$1.applyOrElse(ResourceMetadata.scala:128)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
    at scala.collection.Iterator$$anon$13.next(Iterator.scala:593)
    at scala.collection.Iterator.foreach(Iterator.scala:943)
    at scala.collection.Iterator.foreach$(Iterator.scala:943)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
    at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
    at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
    at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:184)
    at scala.collection.mutable.ListBuffer.$plus$plus$eq(ListBuffer.scala:47)
    at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
    at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
    at scala.collection.AbstractIterator.to(Iterator.scala:1431)
    at scala.collection.TraversableOnce.toList(TraversableOnce.scala:350)
    at scala.collection.TraversableOnce.toList$(TraversableOnce.scala:350)
    at scala.collection.AbstractIterator.toList(Iterator.scala:1431)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$.readResources(ResourceMetadata.scala:128)
    at com.johnsnowlabs.nlp.pretrained.ResourceMetadata$.readResources(ResourceMetadata.scala:123)
    at com.johnsnowlabs.client.aws.AWSGateway.getMetadata(AWSGateway.scala:78)
    at com.johnsnowlabs.nlp.pretrained.S3ResourceDownloader.downloadMetadataIfNeed(S3ResourceDownloader.scala:62)
    at com.johnsnowlabs.nlp.pretrained.S3ResourceDownloader.resolveLink(S3ResourceDownloader.scala:68)
    at com.johnsnowlabs.nlp.pretrained.S3ResourceDownloader.getDownloadSize(S3ResourceDownloader.scala:145)
    at com.johnsnowlabs.nlp.pretrained.ResourceDownloader$.getDownloadSize(ResourceDownloader.scala:445)
    at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader$.getDownloadSize(ResourceDownloader.scala:577)
    at com.johnsnowlabs.nlp.pretrained.PythonResourceDownloader.getDownloadSize(ResourceDownloader.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.ClassNotFoundException: org.json4s.package$MappingException
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
    ... 51 more
Traceback (most recent call last):
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\__init__.py", line 236, in load
    nlu_component = nlu_ref_to_component(nlu_ref, authenticated=is_authenticated)
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\pipe\component_resolution.py", line 171, in nlu_ref_to_component
    resolved_component = resolve_component_from_parsed_query_data(language, component_type, dataset,
  File "D:\.venv\python3.8_nlu\lib\site-packages\nlu\pipe\component_resolution.py", line 320, in resolve_component_from_parsed_query_data
    raise ValueError(f'EXCEPTION : Could not create NLU component for nlp_ref={nlp_ref} and nlu_ref={nlu_ref}')
ValueError: EXCEPTION : Could not create NLU component for nlp_ref=elmo and nlu_ref=elmo
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "
Hi @filemon11, this looks like Spark is not properly set up on Windows. Can you make sure you follow all the steps involved in installing spark-nlp and pyspark on Windows?
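One thing worth checking from the log: the jars are loaded from C:/Spark/spark-3.2.0-bin-hadoop3.2 while PySpark reports 3.1.2, and Spark NLP 3.3.4 resolves json4s-ext 3.5.3, so the NoClassDefFoundError for org.json4s may simply come from a Spark/PySpark version mismatch. As a sketch of aligning everything on one Spark line (the exact pins here are my assumption, not official guidance):

pip install pyspark==3.1.2 spark-nlp==3.3.4 nlu

Then either point SPARK_HOME at a matching Spark 3.1.x distribution or leave it unset, so the Spark bundled with the pip-installed PySpark is used. A quick check in Python:

import sparknlp

# starts a Spark session and pulls the spark-nlp jar for the installed version
spark = sparknlp.start()
# should match the pip-installed PySpark, i.e. 3.1.2 in this sketch
print(spark.version)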
The Windows setup can be a bit tricky, but if you are just getting started we recommend using Google Colab, which instantly provides you with a working environment in your browser:
https://colab.research.google.com/drive/1j4Ek0JkBPmnK75qIxyYjVtYWNUPRbh9v?usp=sharing
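If you go the Colab route, the setup is roughly the following (the exact pyspark pin is an assumption based on the NLU docs around that time):

# in a Colab cell; pin pyspark to a version NLU supports
!pip install nlu pyspark==3.0.2

import nlu
nlu.load('elmo').predict('Hello world')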
@C-K-Loan, would it be possible to share some input on the Windows installation, please?