<class 'ModuleNotFoundError'>: No module named 'dataset'

Open ForJadeForest opened this issue 2 years ago • 3 comments

My code contains several .py files:

  • brainMRI.py
  • dataset.py
  • Unet.py

I want to use the bigdl backend to train the model:

if args.cluster_mode == "local":
    init_orca_context(memory=args.memory)

if args.backend == "bigdl":
    net = model_creator(config={})
    optimizer = optim_creator(model=net, config={"lr": 0.001})
    orca_estimator = Estimator.from_torch(model=net,
                                          optimizer=optimizer,
                                          loss=bce_dice_loss,
                                          metrics=[],
                                          backend=args.backend,
                                          )
    orca_estimator.fit(data=train_loader, epochs=args.epochs)

The cluster_mode is local, but I got the following error:

2022-06-27 11:07:40 ERROR TaskSetManager:70 - Task 0 in stage 1.0 failed 1 times; aborting job
Traceback (most recent call last):
  File "brainMRI.py", line 192, in <module>
    orca_estimator.fit(data=train_loader, epochs=args.epochs)
  File "/home/arda/anaconda3/envs/mainly/lib/python3.7/site-packages/bigdl/orca/learn/pytorch/pytorch_spark_estimator.py", line 168, in fit
    train_fset, val_fset = self._handle_data_loader(data, validation_data)
  File "/home/arda/anaconda3/envs/mainly/lib/python3.7/site-packages/bigdl/orca/learn/pytorch/pytorch_spark_estimator.py", line 94, in _handle_data_loader
    train_feature_set = FeatureSet.pytorch_dataloader(data, "", "")
  File "/home/arda/anaconda3/envs/mainly/lib/python3.7/site-packages/bigdl/dllib/feature/common.py", line 389, in pytorch_dataloader
    False, features, labels)
  File "/home/arda/anaconda3/envs/mainly/lib/python3.7/site-packages/bigdl/dllib/utils/file_utils.py", line 227, in callZooFunc
    raise e
  File "/home/arda/anaconda3/envs/mainly/lib/python3.7/site-packages/bigdl/dllib/utils/file_utils.py", line 221, in callZooFunc
    java_result = api(*args)
  File "/home/arda/anaconda3/envs/mainly/lib/python3.7/site-packages/py4j/java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/home/arda/anaconda3/envs/mainly/lib/python3.7/site-packages/py4j/protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o42.createFeatureSetFromPyTorch.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, localhost, executor driver): jep.JepException: jep.JepException: <class 'ModuleNotFoundError'>: No module named 'dataset'
        at com.intel.analytics.bigdl.orca.utils.PythonInterpreter$.threadExecute(PythonInterpreter.scala:98)
        at com.intel.analytics.bigdl.orca.utils.PythonInterpreter$.exec(PythonInterpreter.scala:108)
        at com.intel.analytics.bigdl.orca.net.PythonFeatureSet$$anonfun$loadPythonSet$1.apply(PythonFeatureSet.scala:96)
        at com.intel.analytics.bigdl.orca.net.PythonFeatureSet$$anonfun$loadPythonSet$1.apply(PythonFeatureSet.scala:86)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:823)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$23.apply(RDD.scala:823)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:346)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:310)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: jep.JepException: <class 'ModuleNotFoundError'>: No module named 'dataset'
        at /home/arda/anaconda3/envs/mainly/lib/python3.7/site-packages/pyspark/serializers.loads(serializers.py:587)
        at <string>.<module>(<string>:4)
        at jep.Jep.exec(Native Method)
        at jep.Jep.exec(Jep.java:478)
        at com.intel.analytics.bigdl.orca.utils.PythonInterpreter$$anonfun$1.apply$mcV$sp(PythonInterpreter.scala:106)
        at com.intel.analytics.bigdl.orca.utils.PythonInterpreter$$anonfun$1.apply(PythonInterpreter.scala:105)
        at com.intel.analytics.bigdl.orca.utils.PythonInterpreter$$anonfun$1.apply(PythonInterpreter.scala:105)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
        at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        ... 3 more

I tried running export PYTHONPATH=/the/path/to/brainMRI:$PYTHONPATH first, and then the script runs successfully.

How can I solve this problem from within the Python code instead?

ForJadeForest, Jun 27 '22 04:06

@qiuxin2012 Could you take a look at this issue? If there are extra files, do we need to add their directory to PYTHONPATH manually? On the Python side, the current working directory is automatically on the module search path, but on the Java side it is not?

hkvision, Jun 27 '22 06:06

Yes, jep uses a different implementation from Python.

qiuxin2012, Jun 28 '22 02:06

Then we will add export PYTHONPATH to the README for this example. Do you think this also needs to be added somewhere in the documentation? @qiuxin2012

hkvision, Jun 30 '22 02:06

Added export PYTHONPATH=/the/path/to/brainMRI:$PYTHONPATH to the README. Fixed.

hkvision, Sep 14 '22 02:09
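
Note on the original question (doing this from Python rather than from the shell): the thread settles on exporting PYTHONPATH in the shell and does not confirm an in-code equivalent. The following is a minimal, untested sketch of one possibility. It assumes that the jep interpreter inside the JVM inherits the environment of the driver process, so PYTHONPATH has to be set at the top of brainMRI.py before init_orca_context is called. The names prepend_to_pythonpath and script_dir are hypothetical and not part of BigDL.

```python
import os
import sys


def prepend_to_pythonpath(directory, environ=None):
    """Prepend `directory` to PYTHONPATH in `environ` and to sys.path.

    Hypothetical helper, not part of BigDL. Returns the new PYTHONPATH value.
    """
    environ = os.environ if environ is None else environ
    directory = os.path.abspath(directory)
    # Keep existing entries, dropping empty ones left by stray separators.
    existing = [p for p in environ.get("PYTHONPATH", "").split(os.pathsep) if p]
    if directory not in existing:
        existing.insert(0, directory)
    environ["PYTHONPATH"] = os.pathsep.join(existing)
    # Also make the directory importable in the current (driver) interpreter.
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return environ["PYTHONPATH"]


# Directory holding brainMRI.py, dataset.py and Unet.py. Falls back to the
# current working directory when __file__ is not defined (e.g. in a REPL).
if "__file__" in globals():
    script_dir = os.path.dirname(os.path.abspath(__file__))
else:
    script_dir = os.getcwd()

# Assumption: this runs before init_orca_context(...), so that the JVM (and
# the jep interpreter embedded in it) starts with the updated environment.
prepend_to_pythonpath(script_dir)
```

If this has no effect in a given setup, exporting PYTHONPATH in the shell, as recorded in the README fix above, remains the confirmed workaround.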