Miscellaneous sched issues

Open omor1 opened this issue 2 years ago • 0 comments

Collection of various issues in different MCA sched components; I'll make a PR to fix these at some point.

  • LFQ
    • Currently checks the local queue twice if the local queue is empty
      • Loop in sched_lfq_select should start at i = 1 to avoid this; hierarch_queues[0] is always task_queue
  • LHQ
    • Small memory leak by allocating hierarch_queues twice
    • Not sure how this functions when hwloc is disabled or broken; this probably hasn't been tested recently. I didn't see any checks for this, unlike in e.g. LFQ
  • LL
    • Minor typo in SDE counter description
  • LTQ
    • ~~Appears that it can lose tasks~~
      • ~~Scenario:~~
        1. ~~Local task_queue is empty~~
        2. ~~All other task_queue are empty~~
        3. ~~We obtain a heap from the system_queue~~
        4. ~~That heap has 3+ tasks so heap_split_and_steal returns a valid heap in new_heap~~
      • ~~heap gets pushed back to the local task_queue~~
      • ~~new_heap doesn't and all tasks on it disappear~~
      • ~~See lines 217–231~~
      • ~~Either push back new_heap too or just use heap_remove instead of heap_split_and_steal~~
        • ~~heap_split_and_steal is used when stealing a heap from a non-local task_queue so that we don't steal all tasks in that heap~~
        • ~~This isn't a concern when we get a heap from the system_queue~~
        • This is actually more a problem of documentation: heap_split_and_steal doesn't do exactly what its name implies
        • The function essentially returns a linked list of two heaps and both get pushed onto the local task_queue.
        • I think that it does cause the heap of tasks with shared inputs to be split into two sub-heaps though.
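
To illustrate the LFQ point above, here is a minimal sketch with stand-in types. Everything named `toy_*` is hypothetical and not the actual PaRSEC code; the only thing taken from the issue is that `hierarch_queues[0]` is the same queue as `task_queue`, and the select path is assumed to try the local queue first and then walk `hierarch_queues`. The sketch counts how often the empty local queue gets polled depending on where the loop starts.

```c
/* Hypothetical stand-ins for illustration; these are not the PaRSEC types. */
typedef struct {
    int tasks[4];
    int count;
    int probes;                      /* how many times this queue was polled */
} toy_queue_t;

typedef struct {
    toy_queue_t  *task_queue;        /* local queue */
    toy_queue_t **hierarch_queues;   /* hierarch_queues[0] == task_queue */
    int           nb_hierarch_queues;
} toy_sched_t;

/* Pop one task, or return -1 if the queue is empty. */
static int toy_pop(toy_queue_t *q)
{
    q->probes++;
    return (q->count > 0) ? q->tasks[--q->count] : -1;
}

/* Try the local queue first, then the hierarchy starting at first_idx. */
static int toy_select(toy_sched_t *s, int first_idx)
{
    int task = toy_pop(s->task_queue);
    if (task >= 0) return task;
    for (int i = first_idx; i < s->nb_hierarch_queues; i++) {
        task = toy_pop(s->hierarch_queues[i]);
        if (task >= 0) return task;
    }
    return -1;
}

/* Run one select with an empty local queue and one task in a remote queue.
 * Returns the number of times the local queue was polled, or -1 if the
 * remote task was not found. */
static int toy_local_probes(int first_idx)
{
    toy_queue_t local  = { {0},  0, 0 };
    toy_queue_t remote = { {42}, 1, 0 };
    toy_queue_t *queues[2] = { &local, &remote };
    toy_sched_t s = { &local, queues, 2 };
    int task = toy_select(&s, first_idx);
    return (task == 42) ? local.probes : -1;
}
```

With the loop starting at index 0, `toy_local_probes(0)` polls the empty local queue twice; starting at index 1, `toy_local_probes(1)` polls it once and still finds the remote task.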

omor1, Jun 15 '22 20:06