mrjob

progress indicators are wrong when steps run simultaneously

Open coyotemarin opened this issue 4 years ago • 1 comments

_parse_progress_from_resource_manager() assumes that at most one job is running on a cluster at any given time, which is no longer true now that clusters can run steps concurrently.

If we know a step's StartTime from the ListSteps API, it appears to be only a few seconds off from the Start Time shown in the resource manager UI. So that's a way we could possibly match up step progress correctly.

It would be really nice if the EMR API would tell us the mapping between EMR step IDs and YARN application IDs, but so far I haven't found one.

coyotemarin, Aug 19 '20 21:08

Since we now have code to talk to the resource manager API, we can guess the application ID for the step from the apps API (based on start time) and then get its progress from the app API.
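
A rough sketch of that matching, not the actual mrjob implementation: it assumes the app list has already been fetched from the resource manager's apps API (each entry carrying `id`, `startedTime` in epoch milliseconds, and `progress`), and that the step's start time is a timezone-aware `datetime`. The function name `guess_app_for_step` and the 30-second tolerance are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def guess_app_for_step(step_start, apps, tolerance=timedelta(seconds=30)):
    """Return the app dict whose startedTime is closest to step_start.

    step_start: timezone-aware datetime for when the step started.
    apps: list of dicts with at least 'startedTime' (epoch milliseconds).
    tolerance: hypothetical cutoff; returns None if nothing is this close.
    """
    step_ms = step_start.timestamp() * 1000
    best, best_delta = None, None

    for app in apps:
        started = app.get('startedTime')
        if started is None:
            continue  # skip entries without a usable start time
        delta = abs(started - step_ms)
        if best_delta is None or delta < best_delta:
            best, best_delta = app, delta

    if best is None or best_delta > tolerance.total_seconds() * 1000:
        return None
    return best


# made-up data: two concurrent applications on one cluster
step_start = datetime(2020, 8, 27, 21, 8, 0, tzinfo=timezone.utc)
step_ms = int(step_start.timestamp() * 1000)
apps = [
    {'id': 'application_0001', 'startedTime': step_ms - 600000, 'progress': 80.0},
    {'id': 'application_0002', 'startedTime': step_ms + 4000, 'progress': 12.5},
]
match = guess_app_for_step(step_start, apps)
```

This is still a guess: if two steps start within a few seconds of each other, the closest start time may pick the wrong application.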

coyotemarin, Aug 27 '20 21:08