Houdini PDG Afanasy Scheduler

Open timurhai opened this issue 3 years ago • 9 comments

Hi everybody! This issue is created to discuss the PDG Afanasy scheduler implementation. Since work on the scheduler has started, I decided to create a new issue for a more concrete discussion.

timurhai avatar Jul 06 '21 11:07 timurhai

Here is the first commit.

It uses the dynamic method: when a work item is scheduled, a new block/task is appended to an existing job.

The work items of each TOP node are grouped into one block. In the future we should be able to set up Afanasy task parameters for a TOP node via block parameters (capacity, service, parser and so on). It also helps to visualize the job structure in GUIs.

For now the "control" job is just an empty task: an open Houdini scene with a running graph is needed.
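To illustrate the grouping described above, here is a minimal sketch in plain Python. It does not use the Afanasy or PDG APIs; `Job`, `Block`, `schedule` and the node/item names are hypothetical stand-ins, only meant to show how a job could grow dynamically with one block per TOP node.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical stand-in for an Afanasy block: one per TOP node."""
    name: str
    tasks: list = field(default_factory=list)


@dataclass
class Job:
    """Hypothetical stand-in for a job that grows while the graph cooks."""
    name: str
    blocks: dict = field(default_factory=dict)

    def schedule(self, node_name, item_name):
        """Append a task for a work item, creating the node's block on first use."""
        block = self.blocks.setdefault(node_name, Block(node_name))
        block.tasks.append(item_name)
        return block


# Work items arrive one by one as PDG schedules them (names are made up).
job = Job("pdg_graph")
job.schedule("ropfetch1", "ropfetch1_0")
job.schedule("ropfetch1", "ropfetch1_1")
job.schedule("ffmpeg1", "ffmpeg1_0")
```

Keeping one block per node is what would later allow per-node settings (capacity, service, parser) to live on the block rather than on each task.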

timurhai avatar Jul 06 '21 11:07 timurhai

There is a lot of work to do. Only a few features/callbacks are implemented, and there are no checks for any errors that can happen. For now it is a minimal version that works only if everything goes well.

timurhai avatar Jul 06 '21 11:07 timurhai

Would it perhaps be an idea to create the scheduler 100% in Python and replace the .hda?

This way it's easier to subclass it to make customizations.

lithorus avatar Jul 07 '21 21:07 lithorus

If it is possible, it would be better. Is it possible? (Maybe I missed something.)

timurhai avatar Jul 08 '21 07:07 timurhai

Yes, look at the other schedulers, in the templateBody class method.

I really hope they extend this beyond just TOP.

lithorus avatar Jul 08 '21 08:07 lithorus

It seems that layout is not supported by templateBody: https://www.sidefx.com/forum/topic/74776/?page=1#post-318968

timurhai avatar Jul 08 '21 08:07 timurhai

Hmmm... I will try to see if something can be done through "on creation" callbacks.

lithorus avatar Jul 08 '21 09:07 lithorus

"Submit Job As Graph" sends a job with 1 block and 1 task that cooks the TOP network. This job creates another, separate job and dynamically appends tasks to it. It works the same as cooking from a Houdini session (the task command does the same thing). So you can re-cook without opening Houdini if you delete the work items job and restart the graph job.

timurhai avatar Jul 13 '21 07:07 timurhai

By default, workItemResultServerAddr() returns the local host name and port. This address is used to notify PDG (in an open Houdini session) that an item is done, because an Afanasy task may not be finished yet when its item is part of a batch. This way PDG can start rendering as soon as the first frames of a simulation are finished, without waiting for the entire simulation task.

But on our farm, an artist machine is not reachable by name, only by IP.

The solution for finding the local IP address is taken from: https://stackoverflow.com/questions/166506/finding-local-ip-addresses-using-pythons-stdlib?page=1&tab=votes#tab-top

It may be better to create an option (checkbox) for this on the scheduler node.
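For reference, a minimal sketch of one common stdlib approach to finding the local IP (the UDP socket trick), combined with a switch that could back such a checkbox. This is not necessarily the exact answer used in the commit; `local_ip`, `result_server_host`, the `use_ip` flag and the probe address are assumptions made for illustration.

```python
import socket


def local_ip(probe_addr=("10.255.255.255", 1)):
    """Return the IP of the interface the OS would use to reach probe_addr.

    Connecting a UDP socket sends no packets; it only makes the OS pick
    a route, so the probe address does not need to be reachable.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(probe_addr)
        return s.getsockname()[0]
    except OSError:
        # No usable route (e.g. no network): fall back to loopback.
        return "127.0.0.1"
    finally:
        s.close()


def result_server_host(use_ip=False):
    """Hypothetical helper: choose what to advertise to farm tasks.

    use_ip mimics a checkbox on the scheduler node.
    """
    return local_ip() if use_ip else socket.gethostname()


print(result_server_host(use_ip=True))
```

On machines with several network interfaces the result depends on routing, so an explicit option on the node still seems safer than always switching to the IP.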

timurhai avatar Jul 14 '21 13:07 timurhai