affordance-net
A question about .sm file
Hi, thanks for your code! I have a question about the .sm files. I read convert_instance_png_to_sm.py. The image '0.png' has 3 objects, so it has 3 affordance masks: '0_1.png', '0_2.png', '0_3.png'. But for the pascal_voc dataset I can't divide the annotation into several masks like that, so what should I do? Also, how should I deal with the .sm files? I found these tips in pascal_voc.py:
    if cfg.TRAIN.MASK_REG:
        ## need more processing here
        # 1. create seg_mask_save for this obj (mask size equals to image size)
        # 2. Convert to bool:
        # seg_mask_save = seg_mask_save.astype(bool). #Note that in case multi label---> DO NOT convert to bool
        # 3. seg_mask_path = './data/cache/seg_mask_pascal2012_gt/' + str(index) + '_' + str(count) + '_segmask.sm'
        # 4. save into folder
        # with open(seg_mask_path, 'wb') as f_seg_save:
        #     cPickle.dump(seg_mask_save, f_seg_save, cPickle.HIGHEST_PROTOCOL)
        # print ("=======================index:" + str(index))
        # print ("=======================ix:" + str(ix))
        #index has form: index = "2008_000008" --> has to parse into integer number
        index_t = index.strip()
Can you fix this part? Thanks!
The segmentation groundtruth from Pascal dataset is only binary. If you don't care about the object parts/affordances, then you can simply just treat all masks equally. In this case, it becomes the instance segmentation problem, which is less complicated. Each .sm file is for one object and keeps the affordance IDs that this object has.
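For the binary case, the four steps listed in the pascal_voc.py comment could be sketched roughly as below. This is only an illustration, not the repository's code: the function name `save_instance_masks`, the `out_dir` argument, the instance-mask encoding (0 = background, k > 0 = pixels of object k) and the count starting at 1 are all assumptions. The file name pattern `<index>_<count>_segmask.sm` and the conversion to bool come from the comment itself. Pickle protocol 2 is used because it is the highest protocol Python 2's cPickle supports.

```python
import os
import pickle

import numpy as np


def save_instance_masks(instance_mask, index, out_dir, protocol=2):
    """Write one binary .sm file per object found in instance_mask.

    instance_mask: 2-D integer array, 0 = background, k > 0 = object k
                   (assumed encoding for this sketch).
    index:         image id string, e.g. "2008_000008".
    Returns the list of written file paths.
    """
    os.makedirs(out_dir, exist_ok=True)
    instance_ids = [i for i in np.unique(instance_mask) if i != 0]
    paths = []
    # Assumption: count starts at 1, like '0_1.png', '0_2.png', ...
    for count, inst_id in enumerate(instance_ids, start=1):
        # Step 1: mask with the same size as the image, for this object only.
        # Step 2: bool is fine here because there is a single foreground label.
        seg_mask_save = (instance_mask == inst_id).astype(bool)
        # Step 3: file name pattern taken from the pascal_voc.py comment.
        seg_mask_path = os.path.join(
            out_dir, str(index) + "_" + str(count) + "_segmask.sm"
        )
        # Step 4: save into the folder.
        with open(seg_mask_path, "wb") as f_seg_save:
            pickle.dump(seg_mask_save, f_seg_save, protocol)
        paths.append(seg_mask_path)
    return paths
```

If the masks carry several affordance IDs instead of one foreground label, skip the bool conversion, as the comment in pascal_voc.py warns.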
@nqanh Thanks! But I still don't understand. For the pascal_voc dataset, I find that SegmentationClass only has 2913 .png files, fewer than the number of training samples. If I want to use it with AffordanceNet and don't care about the object parts, what should I do?
And about your dataset (IIT): I downloaded the IIT_Affordances_2017 dataset. Can you tell me how to process it to get a dataset like yours? I can't find any .png files in it. I'm really anxious about this! I'd appreciate it if you have time to answer.
If you don't care about the object parts, then in your mask groundtruth, you'll have only 2 classes (background + foreground). If you prepare your data correctly, then AffordanceNet code works fine with 2 classes. You can visualize the groundtruth to understand more (there are already some discussions and code in other issues).
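As a quick sanity check before training, something like the following could confirm that a mask really contains only the two labels. It is a minimal sketch: the helper names `to_foreground_mask` and `label_ids` are hypothetical, and it assumes 0 is the background ID and that each .sm file is a pickled NumPy array.

```python
import pickle

import numpy as np


def to_foreground_mask(mask):
    """Collapse every non-zero affordance ID to one foreground label (1)."""
    return (np.asarray(mask) != 0).astype(np.uint8)


def label_ids(sm_path):
    """Return the sorted label IDs stored in one .sm file."""
    with open(sm_path, "rb") as f:
        mask = pickle.load(f)
    return sorted(int(v) for v in np.unique(mask))
```

For a correctly prepared 2-class file, `label_ids` should return at most `[0, 1]`. Files written under Python 2 may need `pickle.load(f, encoding="latin1")` when read from Python 3.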
The IIT_Affordances_2017 dataset does have the image files :)
@litingsjj Also, please change the number of classes in the .prototxt files.
@thanhtoando thanks!