ALFNet
Error when running train.py
When I run train.py, I run into the following error:
File "/ghome/zhenye/ALFNet-master/keras_alfnet/data_generators.py", line 7, in
I have the same problem.
@zhenyezi
num of training samples: 2112
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
File "train.py", line 35, in
Run git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git, then cd py-faster-rcnn/lib and run make. Then copy the utils folder from py-faster-rcnn into the utils folder of ALFNet, uncomment all "from .utils.bbox import box_op", and change "box_op" to "bbox_overlaps". It works for me.
@pnnnnnnn, are the trained results correct?
Still training. So far I've trained for 70 epochs and the total loss has dropped from 0.66 to 0.19.
@pnnnnnnn, what do you mean by "uncomment all "from .utils.bbox import box_op" and change "box_op" to "bbox_overlaps""? Should those lines be commented or uncommented?
Oh, sorry, it should be "comment": comment out all "from .utils.bbox import box_op" lines and change the remaining "box_op" to "bbox_overlaps".
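For readers applying the corrected instruction by hand, here is a minimal sketch of the same edit done as a text rewrite. The helper name `patch_source` is hypothetical and not part of ALFNet. It assumes `bbox_overlaps` is already imported in the patched file, as the traceback in this thread suggests for data_generators.py (`from .utils.cython_bbox import bbox_overlaps`).

```python
import re

# The import line that the thread says should be commented out.
OLD_IMPORT = "from .utils.bbox import box_op"


def patch_source(text):
    """Comment out the old import and rename remaining box_op uses.

    Hypothetical helper for illustration; it works on a string and does
    not touch any files.
    """
    patched = []
    for line in text.splitlines():
        if line.strip() == OLD_IMPORT:
            # Keep the line for reference, but disable it.
            patched.append("# " + line)
        else:
            # \b limits the rename to the whole identifier box_op.
            patched.append(re.sub(r"\bbox_op\b", "bbox_overlaps", line))
    return "\n".join(patched)


if __name__ == "__main__":
    sample = "from .utils.bbox import box_op\niou = box_op(a, b)"
    print(patch_source(sample))
```

Running it on a copy of each affected file first, and diffing the result, is safer than editing in place.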
@pnnnnnnn did you check that box_op and bbox_overlaps do the same thing?
there's no box_op function
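To sanity-check what the substituted function is expected to return, here is a NumPy sketch of pairwise IoU written in the style of py-faster-rcnn's `bbox_overlaps`. The name `bbox_overlaps_np` is hypothetical. The `+ 1` terms assume inclusive pixel coordinates, which is how I recall the Cython version working. This is a reference for comparison, not the repository's implementation.

```python
import numpy as np


def bbox_overlaps_np(boxes, query_boxes):
    """Pairwise IoU between boxes (N, 4) and query_boxes (K, 4).

    Boxes are [x1, y1, x2, y2]. Returns an (N, K) array.
    The +1 treats coordinates as inclusive pixel indices (an assumption
    matching the py-faster-rcnn convention).
    """
    boxes = np.asarray(boxes, dtype=np.float64).reshape(-1, 4)
    query_boxes = np.asarray(query_boxes, dtype=np.float64).reshape(-1, 4)

    box_area = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    query_area = (query_boxes[:, 2] - query_boxes[:, 0] + 1) * (
        query_boxes[:, 3] - query_boxes[:, 1] + 1
    )

    # Broadcast to (N, K): intersection width and height, clipped at zero.
    iw = (
        np.minimum(boxes[:, None, 2], query_boxes[None, :, 2])
        - np.maximum(boxes[:, None, 0], query_boxes[None, :, 0])
        + 1
    )
    ih = (
        np.minimum(boxes[:, None, 3], query_boxes[None, :, 3])
        - np.maximum(boxes[:, None, 1], query_boxes[None, :, 1])
        + 1
    )
    inter = np.clip(iw, 0, None) * np.clip(ih, 0, None)
    union = box_area[:, None] + query_area[None, :] - inter
    return inter / union
```

Comparing its output with the compiled `bbox_overlaps` on a few random boxes would show whether the replacement behaves as intended.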
@yongqiangzhang1 @zhangxydlut @MADONOKOUKI @pnnnnnnn Please try this compiled utils folder: utils.zip
"No module named cython_bbox" and "No module named bbox" are solved by your compiled utils.zip files. But there is a new error at "from nms.gpu_nms import gpu_nms": ImportError: No module named gpu_nms. Can you compile nms and upload the compiled nms folder? Thanks.
@yongqiangzhang1 You can give this a try: nms.zip
nms works, thank you very much.
Did you get the same MR as the paper?
Not yet (?). I've trained for 200 epochs (2k iterations per epoch, batch size 4, GPU 1050 Ti) and got 16.53 with the best model, and now I'm decreasing the lr from 1e-4 to 1e-5 for 100 epochs.
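The schedule described in this reply can be written down as a small sketch. The function names and the fixed drop point at epoch 200 are assumptions taken from the numbers quoted above. How ALFNet's own training loop sets the learning rate is not shown in this thread.

```python
def step_lr(epoch, base_lr=1e-4, drop_epoch=200, factor=0.1):
    """Learning rate for a 0-based epoch index: base_lr, then base_lr * factor."""
    return base_lr if epoch < drop_epoch else base_lr * factor


def images_seen(epochs, iters_per_epoch=2000, batch_size=4):
    """Number of training images processed (with repetition) after `epochs`."""
    return epochs * iters_per_epoch * batch_size


if __name__ == "__main__":
    print(step_lr(0), step_lr(250))
    print(images_seen(200))
```

With these settings, 200 epochs correspond to 200 × 2000 × 4 = 1,600,000 image presentations.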
Hi, still the same question: did you get the same MR as the paper? The best score I have got is 16.33. A BIG GAP.
The best I've got is 13.18. Maybe it's because of my small batch size (only 4) that I can't reach 12.01.
Hi @VideoObjectSearch, when I use nms.zip, I get this problem: ImportError: libcudart.so.8.0: cannot open shared object file: No such file or directory
I use CUDA 9.0. How can I compile it so that it works?
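The ImportError above indicates that the prebuilt gpu_nms binary was linked against the CUDA 8.0 runtime, so it would likely need rebuilding against the locally installed CUDA. A quick way to see which runtime the dynamic loader can actually find is sketched below. The helper name `can_load` is hypothetical, and the two library names are only the versions mentioned in this thread.

```python
import ctypes


def can_load(lib_name):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        # Raised when the library is missing or not on the loader path.
        return False


if __name__ == "__main__":
    for name in ("libcudart.so.8.0", "libcudart.so.9.0"):
        status = "loadable" if can_load(name) else "not found by the dynamic loader"
        print(name, status)
```

If only the 9.0 runtime is loadable, a binary built for 8.0 cannot be imported as-is, whatever the Python setup.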
Hi, when I use nms.zip, it solves the problem "ImportError: No module named gpu_nms", but a new problem comes up:
Traceback (most recent call last):
File "train.py", line 40, in
from keras_alfnet.model.model_2step import Model_2step
File "/home/by/ma/ALFNet-master/keras_alfnet/model/model_2step.py", line 7, in
from keras_alfnet import bbox_process
File "/home/by/ma/ALFNet-master/keras_alfnet/bbox_process.py", line 7, in
from nms_wrapper import nms
File "/home/by/ma/ALFNet-master/keras_alfnet/nms_wrapper.py", line 9, in
from nms.cpu_nms import cpu_nms
ImportError: /home/by/ma/ALFNet-master/keras_alfnet/nms/cpu_nms.so: undefined symbol: PyFPE_jbuf
How can I solve it? Thank you.
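An undefined-symbol error like this usually means the prebuilt cpu_nms.so was compiled against a different Python build than the one importing it, so rebuilding the extension locally is the likely fix. As a stopgap while the compiled module cannot be loaded, a pure-NumPy greedy NMS can stand in. The sketch below follows the style of py-faster-rcnn's `py_cpu_nms`. It is slower than the compiled version, and its exact equivalence to ALFNet's cpu_nms is an assumption.

```python
import numpy as np


def py_cpu_nms(dets, thresh):
    """Greedy non-maximum suppression.

    dets: array-like of shape (N, 5) with rows [x1, y1, x2, y2, score].
    thresh: IoU threshold; boxes overlapping a kept box by more than this
            are suppressed.
    Returns the list of kept row indices, highest score first.
    """
    dets = np.asarray(dets, dtype=np.float64).reshape(-1, 5)
    x1, y1, x2, y2, scores = dets[:, 0], dets[:, 1], dets[:, 2], dets[:, 3], dets[:, 4]

    # +1 assumes inclusive pixel coordinates (py-faster-rcnn convention).
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Intersection of the current best box with all remaining boxes.
        w = np.maximum(0.0, np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]) + 1)
        h = np.maximum(0.0, np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]) + 1)
        inter = w * h
        iou = inter / (areas[i] + areas[rest] - inter)
        # Keep only boxes that do not overlap the current one too much.
        order = rest[iou <= thresh]
    return keep
```

It could be wired in with a try/except around the failing import in nms_wrapper.py, falling back to this function when the compiled module is unavailable.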
I have the same problem. Did you find a CUDA 9 version of nms?
you can try nms.zip
Hi, when I run test.py, I also have the same problem. I use Python 3.5. @VideoObjectSearch
Traceback (most recent call last):
File "test.py", line 32, in
from keras_alfnet.model.model_1step import Model_1step
File "/home/ou/workplace/ALFNet/keras_alfnet/model/model_1step.py", line 1, in
from .base_model import Base_model
File "/home/ou/workplace/ALFNet/keras_alfnet/model/base_model.py", line 2, in
from keras_alfnet import data_generators
File "/home/ou/workplace/ALFNet/keras_alfnet/data_generators.py", line 7, in
from .utils.cython_bbox import bbox_overlaps
ImportError: /home/ou/workplace/ALFNet/keras_alfnet/utils/cython_bbox.so: undefined symbol: _Py_ZeroStruct
Hi, I have the same problem. Have you solved it?
@yongqiangzhang1 @pnnnnnnn Following the code, I can train for 150 epochs, but when I run test.py with the trained weights resnet_e3_l1.15433712553.hdf5, I get no test results: val_det.txt is empty. Why?