raydp
Can't start raydp when ray head node is not the same as the raydp node
I am trying to set up raydp on my Ray cluster, creating the Ray client like this:
import ray, raydp
ray.init(address='ray://10.112.80.176:10001')
spark = raydp.init_spark(app_name='RayDP Example',
                         num_executors=2,
                         executor_cores=2,
                         executor_memory='4GB')
But this results in the following error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Input In [4], in <cell line: 1>()
----> 1 spark = raydp.init_spark(app_name='RayDP Example',
2 num_executors=2,
3 executor_cores=2,
4 executor_memory='4GB'
5 )
File ~/raymodin/lib/python3.8/site-packages/raydp/context.py:126, in init_spark(app_name, num_executors, executor_cores, executor_memory, configs)
123 try:
124 _global_spark_context = _SparkContext(
125 app_name, num_executors, executor_cores, executor_memory, configs)
--> 126 return _global_spark_context.get_or_create_session()
127 except:
128 _global_spark_context = None
File ~/raymodin/lib/python3.8/site-packages/raydp/context.py:70, in _SparkContext.get_or_create_session(self)
68 return self._spark_session
69 self.handle = RayDPConversionHelper.options(name=RAYDP_OBJ_HOLDER_NAME).remote()
---> 70 spark_cluster = self._get_or_create_spark_cluster()
71 self._spark_session = spark_cluster.get_spark_session(
72 self._app_name,
73 self._num_executors,
74 self._executor_cores,
75 self._executor_memory,
76 self._configs)
77 return self._spark_session
File ~/raymodin/lib/python3.8/site-packages/raydp/context.py:63, in _SparkContext._get_or_create_spark_cluster(self)
61 if self._spark_cluster is not None:
62 return self._spark_cluster
---> 63 self._spark_cluster = SparkCluster(self._configs)
64 return self._spark_cluster
File ~/raymodin/lib/python3.8/site-packages/raydp/spark/ray_cluster.py:34, in SparkCluster.__init__(self, configs)
32 self._app_master_bridge = None
33 self._configs = configs
---> 34 self._set_up_master(None, None)
35 self._spark_session: SparkSession = None
File ~/raymodin/lib/python3.8/site-packages/raydp/spark/ray_cluster.py:40, in SparkCluster._set_up_master(self, resources, kwargs)
37 def _set_up_master(self, resources: Dict[str, float], kwargs: Dict[Any, Any]):
38 # TODO: specify the app master resource
39 self._app_master_bridge = RayClusterMaster(self._configs)
---> 40 self._app_master_bridge.start_up()
File ~/raymodin/lib/python3.8/site-packages/raydp/spark/ray_cluster_master.py:56, in RayClusterMaster.start_up(self, popen_kwargs)
54 self._gateway = self._launch_gateway(extra_classpath, popen_kwargs)
55 self._app_master_java_bridge = self._gateway.entry_point.getAppMasterBridge()
---> 56 self._set_properties()
57 self._host = ray.util.get_node_ip_address()
58 self._create_app_master(extra_classpath)
File ~/raymodin/lib/python3.8/site-packages/raydp/spark/ray_cluster_master.py:145, in RayClusterMaster._set_properties(self)
142 node = ray.worker.global_worker.node
144 options["ray.run-mode"] = "CLUSTER"
--> 145 options["ray.node-ip"] = node.node_ip_address
146 options["ray.address"] = node.redis_address
147 options["ray.redis.password"] = node.redis_password
AttributeError: 'NoneType' object has no attribute 'node_ip_address'
It seems to assume that the local machine is a node of the Ray cluster... Is there a way to configure raydp for a remote cluster?
Hi @tdeboer-ilmn, glad you tried raydp. You are right: raydp.init_spark is assumed to be called from inside the Ray cluster. If you need to use the Ray client with the current stable release, you have to wrap your driver program in a Ray actor, so that it is executed on a node in the Ray cluster. If you are willing to try raydp-nightly, you can call raydp.init_spark on your local machine and it works fine with the Ray client. However, to_spark does not work yet because Ray has not merged my PR.
RayDP now works directly in Ray client mode. Closing this as stale.