Resource profiler not working?

Open atsuch opened this issue 7 years ago • 9 comments

Summary

I followed the instructions here (http://nipype.readthedocs.io/en/latest/users/resource_sched_profiler.html) to monitor the resources of each node, but I do not seem to be getting the profiles I need, namely runtime_memory_gb and runtime_num_threads. I get "N/A" for them in run_stats.log, and these attributes are absent from each node's results.pklz as well as from report.rst.

Also, this is not directly related to this problem, but somehow my terminal outputs are written vertically in the report.rst, as can be seen in the attached file...

Actual behavior

I attached an example report.rst from one of my nodes, as well as run_stats.log.

Attachments: report.txt, run_stats.log

Expected behavior

I have psutil in my conda env, and it could be imported without any problem, so I expected to get runtime info.

How to replicate the behavior

It might be specific to the versions of the packages I have in my conda env, but I have not figured out which package is the culprit.

Platform details:

I'm using nipype 1.0.2 in a Python 3.6 conda env with the default pickle, and psutil 5.4.3.

Execution environment

Choose one

  • Container [Tag: ???]
  • My python environment inside container [Base Tag: ???]
  • My python environment outside container

atsuch avatar May 09 '18 15:05 atsuch

Hi @atsuch, yes this is an open problem. We had to rewrite the resource profiler and I am guessing that the final conversion from the new profiler to the old one is not working. Also the documentation is very outdated.

I'll try to address this problem ASAP.

Thanks for reporting. I might contact you for follow up information and to confirm that fixes work for you.

oesteban avatar May 09 '18 16:05 oesteban

Thank you @oesteban for letting me know...!

I read a few issue threads on the new resource profiler but was not sure if this was an open issue. Meanwhile, could I perhaps try my pipeline in an older version of nipype to get the resource profile? Do you know from which version this is an issue?

atsuch avatar May 09 '18 16:05 atsuch

I would prefer that you try the new resource profiling framework and work out your own plotting. I've been meaning to work on this plotting myself for a long time. If you are up for trying, we can make it a concerted effort where I'd support you.

To enable the resource profiling, please check http://nipype.readthedocs.io/en/latest/users/config_file.html#resource-monitor

Then, it will generate a log file with all memory and cpu traces, so that you can build up the graphs you'd like on top.

The project here would be some D3 code to render these usage plots. WDYT?

oesteban avatar May 09 '18 18:05 oesteban

So buried in my previous comment: current nipype version will collect resource usage statistics. The broken part is the adaptor to the old plotting resources.

oesteban avatar May 09 '18 18:05 oesteban

@oesteban, OK, so after some struggles, I got the resource profiler to work. What happened was that I had only followed the steps on this page (http://nipype.readthedocs.io/en/latest/users/resource_sched_profiler.html), without reading the config page you mentioned, so I guess the resource monitoring was not enabled...

However, the docs should be linked so that whoever follows the resource_sched_profiler page also reads the config page, no?

Also, there is something very strange about the resource monitoring. First of all, I don't think I am getting the right information... Here is the new run_stats.log after enabling the resource monitoring:

run_stats.log

My runtime_threads are values like 173.417969... This doesn't make sense, does it?

Additionally, the wf prematurely terminates at a particular node (always at the same node, it seems) with this message:

Traceback (most recent call last):
  File "diff_test.py", line 58, in <module>
    'status_callback': log_nodes_cb})
  File "/homes_unix/tsuchida/anaconda2/envs/py3_ls/lib/python3.6/site-packages/nipype/pipeline/engine/workflows.py", line 595, in run
    runner.run(execgraph, updatehash=updatehash, config=self.config)
  File "/homes_unix/tsuchida/anaconda2/envs/py3_ls/lib/python3.6/site-packages/nipype/pipeline/plugins/base.py", line 170, in run
    self._task_finished_cb(jobid)
  File "/homes_unix/tsuchida/anaconda2/envs/py3_ls/lib/python3.6/site-packages/nipype/pipeline/plugins/base.py", line 385, in _task_finished_cb
    self._status_callback(self.procs[jobid], 'end')
  File "/homes_unix/tsuchida/anaconda2/envs/py3_ls/lib/python3.6/site-packages/nipype/utils/profiler.py", line 144, in log_nodes_cb
    'start': getattr(node.result.runtime, 'startTime'),
AttributeError: 'list' object has no attribute 'startTime'

So there seems to be something wrong with node.result.runtime object, at least for this node.

Sorry if my question is deviating from my original post... but let me know if you have any insights!

atsuch avatar May 11 '18 09:05 atsuch

I have the same problem with the node.result.runtime object! I think the problem only occurs with MapNodes: in that case log_nodes_cb is called with a list of runtimes in node.result.runtime. For me it was something like this:

    node.result.runtime = [Bunch(<runtime_node_1>), Bunch(<runtime_node_2>), Bunch(<runtime_node_3>)]

I propose to ignore MapNodes by checking for lists, since the subnodes have already called log_nodes_cb:

    if isinstance(node.result.runtime, list):
        return

Is there a cleaner way to do this? Also, it does not solve the original problem in this issue (which I also have!).

divetea avatar Jun 11 '18 11:06 divetea

For me the solution was:

from nipype import config
config.enable_resource_monitor()

In addition to the stuff with the logger as seen in this tutorial:

https://github.com/nipy/nipype/blob/master/doc/users/resource_sched_profiler.rst
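For readers wondering what "the stuff with the logger" amounts to, here is a minimal sketch using only the standard library. It assumes, as I recall from the tutorial, that log_nodes_cb writes one JSON record per finished node to a logger named "callback"; the helper name setup_callback_logger, the temporary path, and the stand-in record are made up for illustration, and the nipype calls are shown only as comments (plugin="MultiProc" is an assumption, not something stated in this thread).

```python
import json
import logging
import os
import tempfile


def setup_callback_logger(log_path):
    """Route records sent to the 'callback' logger into log_path."""
    logger = logging.getLogger("callback")  # name assumed from the tutorial
    logger.setLevel(logging.DEBUG)
    handler = logging.FileHandler(log_path)
    logger.addHandler(handler)
    return logger, handler


# Hypothetical location for the stats file, used for this demo only.
log_path = os.path.join(tempfile.mkdtemp(), "run_stats.log")
logger, handler = setup_callback_logger(log_path)

# With nipype installed, the fix from this thread would go here, before run():
#   from nipype import config
#   config.enable_resource_monitor()
#   from nipype.utils.profiler import log_nodes_cb
#   workflow.run(plugin="MultiProc",  # plugin choice is an assumption
#                plugin_args={"status_callback": log_nodes_cb})

# Stand-in for what the callback would log (values invented for the demo).
logger.debug(json.dumps({"name": "example_node",
                         "runtime_memory_gb": 0.5,
                         "runtime_threads": 2}))

# Close the handler so the file is flushed, then read the records back.
handler.close()
logger.removeHandler(handler)

with open(log_path) as stats_file:
    records = [json.loads(line) for line in stats_file if line.strip()]
```

Reading the file back one JSON object per line is also a quick way to check whether the memory and thread fields still come out as "N/A" after enabling the monitor.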

divetea avatar Jul 13 '18 13:07 divetea

Thanks for posting a solution :)

oesteban avatar Jul 13 '18 15:07 oesteban

I am still getting this issue in nipype 1.8.5. I created my own copy of the log_nodes_cb function and added the following as a workaround:

if isinstance(node, MapNode):
    return
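An alternative to copying log_nodes_cb is to wrap it. The sketch below is not part of nipype: skip_list_runtimes is a hypothetical helper, and the nodes are stand-in objects so the example runs without nipype installed. It applies the list check proposed earlier in this thread and only inspects node.result when the status is 'end', on the assumption that results may not exist before then.

```python
from functools import wraps
from types import SimpleNamespace


def skip_list_runtimes(callback):
    """Wrap a status callback so finished nodes with a list runtime are skipped."""
    @wraps(callback)
    def guarded(node, status):
        if status == "end":
            result = getattr(node, "result", None)
            runtime = getattr(result, "runtime", None)
            # In this thread, MapNodes were reported to carry a list of
            # runtimes, one per subnode, which breaks attribute lookups.
            if isinstance(runtime, list):
                return None
        return callback(node, status)
    return guarded


# Stand-in callback and nodes, for demonstration only.
calls = []


def record(node, status):
    calls.append((node.name, status))


plain = SimpleNamespace(
    name="plain",
    result=SimpleNamespace(runtime=SimpleNamespace(startTime="t0")),
)
mapped = SimpleNamespace(
    name="mapped",
    result=SimpleNamespace(runtime=[SimpleNamespace(startTime="t0")]),
)

guarded_record = skip_list_runtimes(record)
guarded_record(plain, "end")   # forwarded to the wrapped callback
guarded_record(mapped, "end")  # skipped: runtime is a list
```

With nipype, the idea would be to pass skip_list_runtimes(log_nodes_cb) as the 'status_callback' entry in plugin_args; I have not verified this against every nipype version.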

astewartau avatar Dec 09 '22 07:12 astewartau