
training on PASCAL Context

ghost opened this issue 5 years ago · 1 comment

Hello,

I am trying to use the pcontext.py in your repo to reproduce the results of another paper, EMANet (Expectation-Maximization Attention Networks, ICCV 2019), but I am currently ~4% below the mIoU values reported in the repo. Following the advice in #179 and #78, I train EMANet on 59 classes, without the background class, by applying `_mask_transform(self, mask)` during training:

https://github.com/zhanghang1989/PyTorch-Encoding/blob/d9dea1724e38362a7c75ca9498f595248f283f00/encoding/datasets/pcontext.py#L97

I also apply `_mask_transform(self, mask)` when evaluating mIoU on the test (val) set.
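For reference, my understanding of that transform (a minimal sketch on my side; the exact code is at the permalink above) is that it shifts every label down by one, so the background class 0 becomes -1 and the 59 foreground classes become 0..58:

```python
import numpy as np
import torch

def _mask_transform(mask):
    # Shift all labels down by one: background (0) maps to -1,
    # and the 59 foreground classes map to 0..58.
    target = np.array(mask).astype('int64') - 1
    return torch.from_numpy(target).long()
```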

However, I noticed that pcontext.py does not actually use `_mask_transform()` during training, yet it still sets `NUM_CLASS = 59`:

https://github.com/zhanghang1989/PyTorch-Encoding/blob/d9dea1724e38362a7c75ca9498f595248f283f00/encoding/datasets/pcontext.py#L19

https://github.com/zhanghang1989/PyTorch-Encoding/blob/d9dea1724e38362a7c75ca9498f595248f283f00/encoding/datasets/pcontext.py#L71-L95
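With the labels shifted this way, my training loss is set up to ignore the -1 background pixels. Here is a minimal sketch of my own loss setup (my assumption about how to train with 59 classes, not necessarily what this repo does internally):

```python
import torch
import torch.nn as nn

# ignore_index=-1 excludes the shifted background pixels from the loss.
criterion = nn.CrossEntropyLoss(ignore_index=-1)

logits = torch.randn(2, 59, 480, 480)          # (N, NUM_CLASS, H, W)
target = torch.randint(-1, 59, (2, 480, 480))  # -1 marks background pixels
loss = criterion(logits, target)
```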

Is there anything about the training procedure that I may be misunderstanding? Given the large gap (~4% mIoU), I assume the mistake is in my training of the network rather than in my multi-scale evaluation script.

Is it possible that the number of training iterations has such a large effect on performance? I am using 15k iterations (~48 epochs), as reported in the EMANet paper, but I noticed that in the CFNet paper you use 80 epochs (~25k iterations).
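For concreteness, here is the iteration/epoch conversion I am using, assuming the 4,998 PASCAL Context training images and batch size 16 from the EMANet setup (my assumption; please correct me if your setup differs):

```python
# Rough iteration <-> epoch conversion, assuming 4,998 training
# images and batch size 16 (the EMANet settings).
train_images = 4998
batch_size = 16
iters_per_epoch = train_images / batch_size  # ~312

print(15000 / iters_per_epoch)  # ~48 epochs      (EMANet: 15k iterations)
print(80 * iters_per_epoch)     # ~25k iterations (CFNet: 80 epochs)
```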

I would really appreciate your help. I look forward to hearing from you.

ghost · May 18 '20 08:05

I think you have come to the wrong repo.

I am only maintaining the code for my own papers.

zhanghang1989 · May 18 '20 17:05