raven
Updating to ray2
Pull Request Description
What issue does this change request address?
#1972
What are the significant changes in functionality due to this change request?
Updates ray to version 2. Always passes PYTHONPATH to ray init.
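As a rough illustration of the second point, the sketch below builds keyword arguments for `ray.init` that always carry `PYTHONPATH` through Ray's `runtime_env` option. It assumes the variable is forwarded via `runtime_env["env_vars"]`; the actual mechanism used in `JobHandler.py` may differ. The helper name `buildRayInitKwargs` and its parameters are hypothetical and not part of RAVEN.

```python
import os

def buildRayInitKwargs(numCpus, includeDashboard=False, environ=None):
  """
    Hypothetical helper (not RAVEN code): assemble keyword arguments for ray.init.
    @ In, numCpus, int, number of CPUs ray should use
    @ In, includeDashboard, bool, optional, whether to start the ray dashboard
    @ In, environ, dict, optional, environment mapping (defaults to os.environ)
    @ Out, kwargs, dict, keyword arguments suitable for ray.init(**kwargs)
  """
  environ = os.environ if environ is None else environ
  # Always forward PYTHONPATH, even when it is unset (empty string),
  # so the workers see the same module search path as the driver.
  pythonPath = environ.get("PYTHONPATH", "")
  kwargs = {"num_cpus": int(numCpus),
            "include_dashboard": includeDashboard,
            # Assumption: the variable is passed through ray's runtime_env option.
            "runtime_env": {"env_vars": {"PYTHONPATH": pythonPath}}}
  return kwargs

kwargs = buildRayInitKwargs(4, environ={"PYTHONPATH": "/path/a:/path/b"})
print(kwargs["runtime_env"])
```

With ray installed, the result would be used as `ray.init(**kwargs)`. Building the arguments separately keeps the environment handling testable without starting a ray cluster.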
For Change Control Board: Change Request Review
The following review must be completed by an authorized member of the Change Control Board.
- [ ] 1. Review all computer code.
- [ ] 2. If any changes occur to the input syntax, there must be an accompanying change to the user manual and xsd schema. If the input syntax change deprecates existing input files, a conversion script needs to be added (see Conversion Scripts).
- [ ] 3. Make sure the Python code and commenting standards are respected (camelBack, etc.) - See on the wiki for details.
- [ ] 4. Automated Tests should pass, including run_tests, pylint, manual building and xsd tests. If there are changes to Simulation.py or JobHandler.py the qsub tests must pass.
- [ ] 5. If significant functionality is added, there must be tests added to check this. Tests should cover all possible options. Multiple short tests are preferred over one large test. If new development on the internal JobHandler parallel system is performed, a cluster test must be added setting, in <RunInfo> XML block, the node <internalParallel> to True.
- [ ] 6. If the change modifies or adds a requirement or a requirement based test case, the Change Control Board's Chair or designee also needs to approve the change. The requirements and the requirements test shall be in sync.
- [ ] 7. The merge request must reference an issue. If the issue is closed, the issue close checklist shall be done.
- [ ] 8. If an analytic test is changed/added, is the analytic documentation updated/added?
- [ ] 9. If any test used as a basis for documentation examples (currently found in raven/tests/framework/user_guide and raven/docs/workshop) have been changed, the associated documentation must be reviewed and assured the text matches the example.
Job Precheck on 743a8f0 : invalidated by @milljm
Job Test qsubs sawtooth on ca0493d : invalidated by @joshua-cogliati-inl
FAILED: Diff tests/cluster_tests/AdaptiveSobol/test_parallel_adaptive_sobol
Hm, mac failed with:
( 0.14 sec) Job Handler : DEBUG -> Initializing ray locally with num_cpus: 4
2022-09-30 14:55:25,799 ERROR node.py:742 -- Unable to succeed in selecting a random port.
Traceback (most recent call last):
File "/Users/civet/civet/build_0/raven/raven_framework.py", line 26, in <module>
sys.exit(main(True))
File "/Users/civet/civet/build_0/raven/ravenframework/Driver.py", line 203, in main
raven()
File "/Users/civet/civet/build_0/raven/ravenframework/Driver.py", line 155, in raven
simulation.initialize()
File "/Users/civet/civet/build_0/raven/ravenframework/Simulation.py", line 543, in initialize
self.jobHandler.initialize()
File "/Users/civet/civet/build_0/raven/ravenframework/JobHandler.py", line 140, in initialize
self.__initializeRay()
File "/Users/civet/civet/build_0/raven/ravenframework/JobHandler.py", line 224, in __initializeRay
self.rayServer = ray.init(num_cpus=int(self.runInfoDict['totalNumCoresUsed']),include_dashboard=db) if _rayAvail else \
File "/Users/civet/.conda/envs/raven_libraries/lib/python3.8/site-packages/ray/_private/client_mode_hook.py", line 105, in wrapper
return func(*args, **kwargs)
File "/Users/civet/.conda/envs/raven_libraries/lib/python3.8/site-packages/ray/_private/worker.py", line 1420, in init
_global_node = ray._private.node.Node(
File "/Users/civet/.conda/envs/raven_libraries/lib/python3.8/site-packages/ray/_private/node.py", line 267, in __init__
self._ray_params.update_pre_selected_port()
File "/Users/civet/.conda/envs/raven_libraries/lib/python3.8/site-packages/ray/_private/parameter.py", line 326, in update_pre_selected_port
raise ValueError(
ValueError: Ray component dashboard_agent_grpc is trying to use a port number 65534 that is used by other components.
Port information: {'gcs': 'random', 'object_manager': 'random', 'node_manager': 'random', 'gcs_server': 65534, 'client_server': 'random', 'dashboard': 'random', 'dashboard_agent_grpc': 65534, 'dashboard_agent_http': 52365, 'metrics_export': 65535, 'redis_shards': 'random', 'worker_ports': 'random'}
If you allocate ports, please make sure the same port is not used by multiple components.
Running test failed with exit code -15
(678F1/819) Failed ( 7.69sec)tests/framework/InternalParallelTests/ROMscikit
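For reading the "Port information" line in the log above, a small sketch like the following lists which fixed ports are claimed by more than one component. It is only an illustration of the check the error message describes, not ray's own implementation; the helper name `findPortConflicts` is hypothetical.

```python
from collections import defaultdict

def findPortConflicts(portInfo):
  """
    Hypothetical helper: find fixed ports assigned to more than one component.
    @ In, portInfo, dict, mapping of component name to port (int) or 'random'
    @ Out, conflicts, dict, mapping of port to the list of components sharing it
  """
  users = defaultdict(list)
  for component, port in portInfo.items():
    # Entries such as 'random' are not fixed ports, so they cannot collide here.
    if isinstance(port, int):
      users[port].append(component)
  return {port: comps for port, comps in users.items() if len(comps) > 1}

# Port information copied from the error message in the log.
portInfo = {'gcs': 'random', 'object_manager': 'random', 'node_manager': 'random',
            'gcs_server': 65534, 'client_server': 'random', 'dashboard': 'random',
            'dashboard_agent_grpc': 65534, 'dashboard_agent_http': 52365,
            'metrics_export': 65535, 'redis_shards': 'random',
            'worker_ports': 'random'}

conflicts = findPortConflicts(portInfo)
print(conflicts)
```

For this log the only shared port is 65534, claimed by both `gcs_server` and `dashboard_agent_grpc`.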
Job Test mac on ca0493d : invalidated by @joshua-cogliati-inl
ValueError: Ray component dashboard_agent_grpc is trying to use a port number 65534 that is used by other components.
Windows failed.
Job Test Fedora 31 on 6bbda33 : invalidated by @joshua-cogliati-inl
FAILED: Diff tests/framework/PostProcessors/EconomicRatio/timeDepDataset
checklist is good, and tests are green. PR can be merged.