java.util.NoSuchElementException: Param binSampleCount does not exist on LightGBM

Open sarmientoj24 opened this issue 3 years ago • 5 comments

Describe the bug: binSampleCount does not exist

Instantiation of LightGBM

model = LightGBMClassifier(featuresCol="idfFeatures", labelCol="label")
pipeline_cv_lr = Pipeline().setStages(
    [nltk_cleaner, count_vectorizer, idf, model]
)

model_cv_lr = pipeline_cv_lr.fit(train_data)
predictions_cv_lr = model_cv_lr.transform(test_data)

Info (please complete the following information):

  • MMLSpark Version: mmlspark_2.11;1.0.0-rc3
  • Spark Version e.g. 2.4.5
  • Spark Platform: Local Jupyter

Error

Py4JJavaError: An error occurred while calling o80.getParam.
: java.util.NoSuchElementException: Param binSampleCount does not exist.
	at org.apache.spark.ml.param.Params$$anonfun$getParam$2.apply(params.scala:729)
	at org.apache.spark.ml.param.Params$$anonfun$getParam$2.apply(params.scala:729)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.ml.param.Params$class.getParam(params.scala:728)
	at org.apache.spark.ml.PipelineStage.getParam(Pipeline.scala:42)
	at sun.reflect.GeneratedMethodAccessor28.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:748)

AB#1164920

sarmientoj24 avatar May 10 '21 12:05 sarmientoj24

👋 Thanks for opening your first issue here! If you're reporting a 🐞 bug, please make sure you include steps to reproduce it.

welcome[bot] avatar May 10 '21 12:05 welcome[bot]

@sarmientoj24 sorry about the trouble you are having. This looks very strange, because this parameter definitely does exist in lightgbm. I wonder if you somehow have multiple environments installed, or your python/pyspark code is somehow not in sync with the scala/Java code. How did you install the library in your local jupyter notebook?

imatiach-msft avatar May 10 '21 14:05 imatiach-msft

@imatiach-msft it is installed like this

session = (
    SparkSession.builder.appName("person-classifier")
    .config("spark.jars.packages", "com.microsoft.ml.spark:mmlspark_2.11:1.0.0-rc3")
    .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven")
    .getOrCreate()
)
spark_context = sql.SQLContext(session)

sarmientoj24 avatar May 10 '21 15:05 sarmientoj24

@sarmientoj24 very strange, it should just work. What kind of environment/cluster are you running on? Maybe this installation method doesn't work there? Is it possible that multiple mmlspark versions are installed there somehow?

imatiach-msft avatar May 14 '21 04:05 imatiach-msft
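One way to test the "multiple versions" theory is to compare the JAR coordinate passed through spark.jars.packages with the version of the pip-installed Python wrapper. The following is a minimal sketch and not something from this thread: the helper names (parse_coordinate, find_package, same_release) are hypothetical, and the loose version comparison is an assumption, so a reported mismatch is a hint to investigate rather than proof.

```python
from typing import NamedTuple, Optional


class MavenCoordinate(NamedTuple):
    group: str
    artifact: str
    version: str


def parse_coordinate(coord: str) -> MavenCoordinate:
    """Split a 'group:artifact:version' string into its three parts."""
    parts = coord.strip().split(":")
    if len(parts) != 3:
        raise ValueError(f"expected group:artifact:version, got {coord!r}")
    return MavenCoordinate(*parts)


def find_package(packages_conf: str, artifact_prefix: str) -> Optional[MavenCoordinate]:
    """Return the first coordinate whose artifact starts with artifact_prefix."""
    for raw in packages_conf.split(","):
        if not raw.strip():
            continue
        coord = parse_coordinate(raw)
        if coord.artifact.startswith(artifact_prefix):
            return coord
    return None


def same_release(jar_version: str, wheel_version: str) -> bool:
    """Loose comparison (assumption): ignore case and '-' / '_' separators."""
    def norm(v: str) -> str:
        return v.strip().lower().replace("-", "").replace("_", "")
    return norm(jar_version) == norm(wheel_version)


# Usage inside a live session (assumes a SparkSession named `spark`
# and Python 3.8+ for importlib.metadata):
#
#   from importlib.metadata import version
#   conf = spark.sparkContext.getConf().get("spark.jars.packages", "")
#   jar = find_package(conf, "synapseml")   # or "mmlspark" for older releases
#   print(jar, version("synapseml"))
```

If the two versions differ, the Python wrapper may define params that the loaded JAR does not know about, which would be consistent with a "Param ... does not exist" error raised from getParam.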

@imatiach-msft I ran into a similar error, and I think it is due to an installation issue. I have Spark 3.1.2 in my local Jupyter notebook, and I followed the Python installation method from the official website (https://microsoft.github.io/SynapseML/).

I use this code for installation, as I use Spark 3.1.2:

spark = SparkSession.builder.appName("MyApp") \
    .config("spark.jars.packages", "com.microsoft.azure:synapseml_2.12:0.9.5-13-d1b51517-SNAPSHOT") \
    .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven") \
    .getOrCreate()

But it failed when I ran "import synapse"; it said no module was found. I also tried a Kaggle notebook and GCP Dataproc, and both gave the same error. Ultimately, I had to run pip install synapseml and pip install synapse so that I could import this module and run the LightGBM model.

Could you please help me with this installation issue if possible? Thanks a lot!

My error is shown below. (I one-hot encoded the categorical features, then ran a VectorAssembler over the numerical features and the one-hot encoded categorical features; my label is multiclass.)

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
/tmp/ipykernel_213/1295349156.py in <module>
----> 1 model = model.fit(updated_train)

/opt/conda/lib/python3.7/site-packages/pyspark/ml/base.py in fit(self, dataset, params)
    159                 return self.copy(params)._fit(dataset)
    160             else:
--> 161                 return self._fit(dataset)
    162         else:
    163             raise ValueError("Params must be either a param map or a list/tuple of param maps, "

/opt/conda/lib/python3.7/site-packages/synapse/ml/lightgbm/LightGBMClassifier.py in _fit(self, dataset)
   2015 
   2016     def _fit(self, dataset):
-> 2017         java_model = self._fit_java(dataset)
   2018         return self._create_model(java_model)
   2019 

/opt/conda/lib/python3.7/site-packages/pyspark/ml/wrapper.py in _fit_java(self, dataset)
    329             fitted Java model
    330         """
--> 331         self._transfer_params_to_java()
    332         return self._java_obj.fit(dataset._jdf)
    333 

/opt/conda/lib/python3.7/site-packages/synapse/ml/core/schema/Utils.py in _transfer_params_to_java(self)
    132                     self._java_obj.set(pair)
    133             if self.hasDefault(param):
--> 134                 pair = self._make_java_param_pair(param, self._defaultParamMap[param])
    135                 pair_defaults.append(pair)
    136         if len(pair_defaults) > 0:

/opt/conda/lib/python3.7/site-packages/synapse/ml/core/serialize/java_params_patch.py in _mml_make_java_param_pair(self, param, value)
     85     sc = SparkContext._active_spark_context
     86     param = self._resolveParam(param)
---> 87     java_param = self._java_obj.getParam(param.name)
     88     java_value = _mml_py2java(sc, value)
     89     return java_param.w(java_value)

/opt/conda/lib/python3.7/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1303         answer = self.gateway_client.send_command(command)
   1304         return_value = get_return_value(
-> 1305             answer, self.gateway_client, self.target_id, self.name)
   1306 
   1307         for temp_arg in temp_args:

/opt/conda/lib/python3.7/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
    109     def deco(*a, **kw):
    110         try:
--> 111             return f(*a, **kw)
    112         except py4j.protocol.Py4JJavaError as e:
    113             converted = convert_exception(e.java_exception)

/opt/conda/lib/python3.7/site-packages/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
--> 328                     format(target_id, ".", name), value)
    329             else:
    330                 raise Py4JError(

Py4JJavaError: An error occurred while calling o1001.getParam.
: java.util.NoSuchElementException: Param catSmooth does not exist.
	at org.apache.spark.ml.param.Params.$anonfun$getParam$2(params.scala:705)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.ml.param.Params.getParam(params.scala:705)
	at org.apache.spark.ml.param.Params.getParam$(params.scala:703)
	at org.apache.spark.ml.PipelineStage.getParam(Pipeline.scala:41)
	at jdk.internal.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.base/java.lang.Thread.run(Thread.java:829)

Haizhuolaojisite avatar Feb 01 '23 02:02 Haizhuolaojisite
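Both tracebacks fail in getParam while the Python wrapper copies its params to the JVM object, so it can help to list every param name the wrapper defines that the JVM side rejects. Below is a minimal sketch under stated assumptions: missing_on_jvm is a hypothetical helper, and the commented usage relies on PySpark's private `_java_obj` attribute together with the `hasParam` method of Spark's Scala `Params` trait.

```python
from typing import Callable, Iterable, List


def missing_on_jvm(
    param_names: Iterable[str],
    jvm_has_param: Callable[[str], bool],
) -> List[str]:
    """Return the param names the JVM-side object does not recognise.

    `jvm_has_param` is any callable that answers "does this param exist?";
    in a live session it would be the Java object's hasParam method.
    """
    return [name for name in param_names if not jvm_has_param(name)]


# Usage in a live session (assumes `estimator` is an already constructed
# LightGBMClassifier; `_java_obj` is a private PySpark attribute):
#
#   names = [p.name for p in estimator.params]
#   print(missing_on_jvm(names, estimator._java_obj.hasParam))
```

A non-empty result (for example one containing binSampleCount or catSmooth) would suggest that the Python package is newer than the JAR actually loaded by the session, which matches the out-of-sync explanation raised earlier in the thread.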