ray icon indicating copy to clipboard operation
ray copied to clipboard

[Core] Fix condition in `IsAvailableForScheduling`

Open ofey404 opened this issue 3 years ago • 0 comments

Why are these changes needed?

Fix a logical flaw at a logic flaw at src/ray/raylet/worker.h:

  • !GetAssignedTaskId().IsNil() should be GetAssignedTaskId().IsNil().
  • Currently, IsAvailableForScheduling is only used for resource deadlock detection.
    • But it may cause subtle bug at resource management in the future.

Related issue number

Closes #27727

Checks

  • [x] I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • [x] I've run scripts/format.sh to lint the changes in this PR.
  • [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
  • [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • [ ] Unit tests
    • [ ] Release tests
    • [x] This PR is not tested :(

ofey404 avatar Aug 10 '22 02:08 ofey404

Oh no, I just set git config pull.rebase true and did something strange!

  • I would rollback it quickly...

ofey404 avatar Aug 11 '22 06:08 ofey404

Looks like lots of tests are failing

rkooo567 avatar Aug 17 '22 08:08 rkooo567

Looks like lots of tests are failing

Maybe the comment // No assigned task is wrong, but the condition is correct?

I discussed this possibility in #27727 , comment link.


The only usage shows !GetAssignedTaskId().IsNil() is correct, since task would not cause dead lock.

// src/ray/raylet/node_manager.cc
  for (const auto &worker : worker_pool_.GetAllRegisteredWorkers()) {
    if (worker->IsAvailableForScheduling()) {
      // Progress is being made in a task, don't warn.
      resource_deadlock_warned_ = 0;
      return;
    }
  }

So we might just edit the comment like this:

  bool IsAvailableForScheduling() const {
    return !IsDead()                        // Not dead
           && !GetAssignedTaskId().IsNil()  // Task won't cause dead lock, since progress is being made.
           && !IsBlocked()                  // Not blocked
           && GetActorId().IsNil();         // No assigned actor
  }

ofey404 avatar Aug 17 '22 09:08 ofey404

cc @ofey404 can you change the func name and merge the latest master?

rkooo567 avatar Sep 01 '22 02:09 rkooo567

cc @ofey404 can you change the func name and merge the latest master?

Sure.

ofey404 avatar Sep 01 '22 03:09 ofey404

cpp tests failing!

rkooo567 avatar Sep 01 '22 16:09 rkooo567

cpp tests failing!

I would rebase to the latest release and trigger CI again.

ofey404 avatar Sep 15 '22 06:09 ofey404