amoro
[Bug]: AMS unexpectedly exits after a long full GC
What happened?
When the embedded Spark local terminal is enabled, the AMS process may exit if the Spark executor's heartbeats to the driver time out. This can happen when a full GC pauses the JVM for about a minute.
The related log is as follows:
ERROR [executor-heartbeater] [org.apache.spark.executor.Executor] [] - Exit as unable to send heartbeats to driver more than 60 times
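
To make the failure mode concrete, here is a minimal sketch of a consecutive-failure counter like the one the log message implies. It is a toy model, not Spark's actual code: the class name `HeartbeatMonitor`, its fields, and the `>=` comparison are assumptions made for illustration. Only the threshold of 60 comes from the log line above. The `exited` flag stands in for the process exit. With an embedded local-mode Spark, the executor presumably shares the JVM with AMS, so that exit would take AMS down with it.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Toy model of an executor-side heartbeat failure counter (hypothetical)."""

    max_failures: int = 60  # threshold quoted in the log message
    failures: int = 0       # consecutive failed heartbeats so far
    exited: bool = False    # stands in for the process exiting

    def report(self, success: bool) -> None:
        """Record the outcome of one heartbeat attempt."""
        if self.exited:
            return
        if success:
            # Any successful heartbeat resets the consecutive-failure count.
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            # In an embedded setup this would terminate the host process.
            self.exited = True


# Usage: 59 failures followed by a success do not trigger an exit,
# but 60 consecutive failures do.
monitor = HeartbeatMonitor()
for _ in range(59):
    monitor.report(False)
monitor.report(True)
survived = not monitor.exited

for _ in range(60):
    monitor.report(False)
print(survived, monitor.exited)
```

Under this model, any fix has to either keep the counter from reaching the threshold (for example by tolerating longer pauses) or keep the exit from reaching the AMS process. Which of these applies to AMS is not stated in this report.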
Affects Versions
0.5.1
What table formats are you seeing the problem on?
Iceberg
What engines are you seeing the problem on?
AMS
How to reproduce
No response
Relevant log output
No response
Anything else
No response
Are you willing to submit a PR?
- [X] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct