test data processing issue

Open zheshiyige opened this issue 6 years ago • 7 comments

When I run the script on the test data, it runs very slowly and sometimes seems to make no progress. Could you tell me how to solve this issue? Thank you!

zheshiyige avatar Jul 14 '19 15:07 zheshiyige

First thing I would try is:

  1. In planner/naive_planner.py, change is_parallel from True to False. That will rule out parallelization as the cause.
  2. Under the cache directory, delete the train-planner directory (the system will re-instantiate it without needing to re-train anything).
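The two steps above can be sketched as shell commands; the paths are assumptions based on the repo layout described here, so adjust them to your checkout (or just edit the file by hand):

```shell
# 1. Disable parallelism (assumes the flag is written literally as
#    "is_parallel = True" in the file; edit manually if the match fails):
sed -i 's/is_parallel = True/is_parallel = False/' planner/naive_planner.py

# 2. Clear the cached planner so it is re-instantiated, not re-trained:
rm -rf cache/train-planner
```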

If it is still stuck (I believe it happens under tqdm), can you post a screenshot of the number it is stuck on, and the time?

AmitMY avatar Jul 14 '19 16:07 AmitMY

[Screenshot: WeChat Image_20190714094514]

Here is the number and time it is stuck on after following the above instructions; it looks like it is still stuck.

zheshiyige avatar Jul 14 '19 16:07 zheshiyige

Wow, 8 hours seems excessive. Can you also say how much total time has passed? This sometimes happens to me, and then after 20 minutes or so I see it has advanced a few more items; it was just that one item was very large to process.

AmitMY avatar Jul 14 '19 16:07 AmitMY

Actually, it only takes about 10 minutes to process 892/1862, but then there is no further progress; nothing changed for most of the 8 hours.

zheshiyige avatar Jul 14 '19 17:07 zheshiyige

Do I need to skip this test case? Is it because this case has too many possible plans, so it takes a long time to generate one?

zheshiyige avatar Jul 15 '19 17:07 zheshiyige

Sorry for the late response. You can skip this part of the pipeline, but then you will not get test-set evaluation results.

The test set is built from 2 parts, and the largest, hardest graphs to process are numbers 860-960. I'm unsure how I can further help here, except to say: try again, and make sure your machine has enough memory. If I knew exactly what part of the code is so hard for it, I would definitely try to help.

There is one immediate possible solution, which is to create a sub-pipeline for every test datum: you would run the code until it gets stuck, then rerun, and it would continue where it left off, hopefully with much cleaner memory (I suspect a memory leak).
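The per-datum sub-pipeline idea above amounts to checkpointing each result to disk so a rerun resumes instead of starting over. A minimal sketch, assuming a hypothetical `plan_fn` and cache directory name (neither comes from the repo):

```python
import os
import pickle

def run_resumable(data, plan_fn, cache_dir="plan-cache"):
    """Process each datum once, persisting results so that killing and
    rerunning the process (e.g. after it gets stuck) picks up where it
    left off with a fresh address space.
    """
    os.makedirs(cache_dir, exist_ok=True)
    results = []
    for i, datum in enumerate(data):
        path = os.path.join(cache_dir, f"{i}.pkl")
        if os.path.exists(path):           # finished on an earlier run
            with open(path, "rb") as f:
                results.append(pickle.load(f))
            continue
        result = plan_fn(datum)            # the expensive planning step
        with open(path, "wb") as f:        # checkpoint immediately
            pickle.dump(result, f)
        results.append(result)
    return results
```

Each rerun then only pays for the data not yet checkpointed, which is exactly the "continue where it left off" behavior described.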

AmitMY avatar Jul 18 '19 10:07 AmitMY

Hi, I also encountered this problem when running the code on the test set. I tried changing is_parallel from True to False as you suggested, but the problem still persists. I'm wondering if there is an update on this?

jeffersonHsieh avatar Mar 05 '20 14:03 jeffersonHsieh