Using the Attention-based Sampler (AttSampler) in TASN without rebuilding MXNet
Hi there. I wrote a project that lets you use the attention-based sampler of TASN without rebuilding MXNet. The project is at https://github.com/wkcn/AttentionSampler
It is available for both MXNet and PyTorch.
The result (default setting):
INFO:root:Epoch[204] Train-att_net_accuracy=1.000000
INFO:root:Epoch[204] Train-part_net_accuracy=0.979167
INFO:root:Epoch[204] Train-master_net_accuracy=0.989583
INFO:root:Epoch[204] Train-part_net_aux_accuracy=0.979167
INFO:root:Epoch[204] Train-master_net_aux_accuracy=0.989583
INFO:root:Epoch[204] Train-distillation_loss=4.280940
INFO:root:Epoch[204] Time cost=20.882
INFO:root:Epoch[204] Validation-att_net_accuracy=0.806771
INFO:root:Epoch[204] Validation-part_net_accuracy=0.849132
INFO:root:Epoch[204] Validation-master_net_accuracy=0.856944
INFO:root:Epoch[204] Validation-part_net_aux_accuracy=0.870486
INFO:root:Epoch[204] Validation-master_net_aux_accuracy=0.867361
INFO:root:Epoch[204] Validation-distillation_loss=3.713491
INFO:root:Epoch[299] Train-att_net_accuracy=1.000000
INFO:root:Epoch[299] Train-part_net_accuracy=0.984375
INFO:root:Epoch[299] Train-master_net_accuracy=0.984375
INFO:root:Epoch[299] Train-part_net_aux_accuracy=1.000000
INFO:root:Epoch[299] Train-master_net_aux_accuracy=1.000000
INFO:root:Epoch[299] Train-distillation_loss=4.100089
INFO:root:Epoch[299] Time cost=20.978
INFO:root:Saved checkpoint to "./model/tasn-0300.params"
INFO:root:Epoch[299] Validation-att_net_accuracy=0.804986
INFO:root:Epoch[299] Validation-part_net_accuracy=0.856728
INFO:root:Epoch[299] Validation-master_net_accuracy=0.860485
INFO:root:Epoch[299] Validation-part_net_aux_accuracy=0.864754
INFO:root:Epoch[299] Validation-master_net_aux_accuracy=0.869023
INFO:root:Epoch[299] Validation-distillation_loss=3.620270
I hope it will be helpful for you!
Hi, I tried your method, but I get this error:
File "./AttentionSampler/attention_sampler/attention_sampler.py", line 26, in forward
    self.F.broadcast_minimum(threshold, attx, out=attx)
File "<string>", line 48, in broadcast_minimum
File "/usr/local/lib/python3.6/site-packages/mxnet-1.3.1-py3.6.egg/mxnet/_ctypes/ndarray.py", line 92, in _imperative_invoke
    ctypes.byref(out_stypes)))
File "/usr/local/lib/python3.6/site-packages/mxnet-1.3.1-py3.6.egg/mxnet/base.py", line 253, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [10:27:32] src/operator/tensor/./elemwise_binary_broadcast_op.h:68: Check failed: l == 1 || r == 1 operands could not be broadcast together with shapes [12,1] [12,512,1]
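For context on the error itself: a broadcast requires each pair of aligned dimensions to match or be 1. NumPy's broadcasting rules are close enough to MXNet's `broadcast_minimum` to illustrate why `[12,1]` and `[12,512,1]` are incompatible (the shapes below are taken from the error message; the reshape is only an illustration of the rule, not necessarily the project's actual fix):

```python
import numpy as np

# Shapes from the error message: threshold is [12, 1], attx is [12, 512, 1].
threshold = np.zeros((12, 1))
attx = np.zeros((12, 512, 1))

# Aligning trailing dimensions pairs 12 with 512; neither is 1, so it fails.
try:
    np.minimum(threshold, attx)
    ok = True
except ValueError:
    ok = False
print(ok)  # False: the shapes cannot broadcast

# Reshaping threshold to [12, 1, 1] makes every aligned dimension compatible.
print(np.minimum(threshold.reshape(12, 1, 1), attx).shape)  # (12, 512, 1)
```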
@tiancity-bytedance Thank you for the report. I will check it.
Thanks! Looking forward to your response.
Hello, can you tell me the meaning of the function mobula.func.cumsum?
File "./AttentionSampler/attention_sampler/attention_sampler.py", line 53, in forward
    mobula.func.cumsum(N, attx, attxi, att_size)
File "/mnt/cephfs_hl/vc/zhangtiancheng/finegrained/code/tasn-master/MobulaOP/mobula/func.py", line 184, in __call__
    using_async=using_async)
File "/mnt/cephfs_hl/vc/zhangtiancheng/finegrained/code/tasn-master/MobulaOP/mobula/func.py", line 89, in __call__
    func = self.loader(self, arg_types, ctx, **self.loader_kwargs)
File "/mnt/cephfs_hl/vc/zhangtiancheng/finegrained/code/tasn-master/MobulaOP/mobula/op/loader.py", line 539, in op_loader
    _build_lib(cpp_fname, code_buffer, ctx, dll_fname)
File "/mnt/cephfs_hl/vc/zhangtiancheng/finegrained/code/tasn-master/MobulaOP/mobula/op/loader.py", line 274, in _build_lib
    build_path_ctx = os.path.join(build_path, ctx)
File "/usr/local/lib/python3.6/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
File "/usr/local/lib/python3.6/genericpath.py", line 149, in _check_arg_types
    (funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'NoneType'
@tiancity-bytedance
mobula.func.cumsum is similar to np.cumsum
Thanks for your response, but np.cumsum takes just one array parameter. What do the four parameters of func.cumsum mean?
@tiancity-bytedance
cumsum_kernel(const int N, const T* X, T* I, const int att_size)
The four parameters are the batch size, the input, the output, and the number of elements per batch, respectively.
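In other words, the kernel performs an independent cumulative sum over each batch's att_size elements. A rough NumPy emulation of that behavior (my own sketch based on the description above, not the MobulaOP source):

```python
import numpy as np

def cumsum_like_kernel(N, X, att_size):
    """Emulate cumsum_kernel(N, X, I, att_size): for each of the N batches,
    write the running sum of its att_size elements into the output I."""
    I = np.empty_like(X)
    for n in range(N):  # one independent scan per batch
        start = n * att_size
        I[start:start + att_size] = np.cumsum(X[start:start + att_size])
    return I

x = np.array([1.0, 2.0, 3.0, 10.0, 20.0, 30.0])
print(cumsum_like_kernel(2, x, 3))  # [ 1.  3.  6. 10. 30. 60.]
```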
Thanks! It works for me! Have you tested the accuracy?
Sorry, I have not tested it yet; I've been busy recently. I will train it.
Hi, @tiancity-bytedance. I have tested it and got 86~87% accuracy on CUB-200-2011.
Setting:
- Number of GPUs: 4
- Batch Size: 48
- MobulaOP/mobula/config.yaml: USING_ASYNC_EXEC: 0
INFO:root:Epoch[299] Train-att_net_accuracy=1.000000
INFO:root:Epoch[299] Train-part_net_accuracy=0.984375
INFO:root:Epoch[299] Train-master_net_accuracy=0.984375
INFO:root:Epoch[299] Train-part_net_aux_accuracy=1.000000
INFO:root:Epoch[299] Train-master_net_aux_accuracy=1.000000
INFO:root:Epoch[299] Train-distillation_loss=4.100089
INFO:root:Epoch[299] Time cost=20.978
INFO:root:Saved checkpoint to "./model/tasn-0300.params"
INFO:root:Epoch[299] Validation-att_net_accuracy=0.804986
INFO:root:Epoch[299] Validation-part_net_accuracy=0.856728
INFO:root:Epoch[299] Validation-master_net_accuracy=0.860485
INFO:root:Epoch[299] Validation-part_net_aux_accuracy=0.864754
INFO:root:Epoch[299] Validation-master_net_aux_accuracy=0.869023
INFO:root:Epoch[299] Validation-distillation_loss=3.620270
Hi, can you help me with this error?
Error in CustomOp.forward: Traceback (most recent call last):
File "C:\Users\DIT\Anaconda3\envs\mxnet\lib\site-packages\mxnet\operator.py", line 987, in forward_entry
    aux=tensors[4])
File "d:\software\mobulaop\mobula\glue\mx.py", line 103, in forward
    out = self._forward(*in_data)
File "./AttentionSampler/attention_sampler\attention_sampler.py", line 44, in forward
    attxi = F.cumsum(attx, 1)
AttributeError: module 'mxnet.ndarray' has no attribute 'cumsum'
Hi @vb123er951, the function mx.nd.cumsum is not supported in older versions of MXNet.
Please use the latest version, such as MXNet 1.6 : )
Hi @wkcn, thank you for the reply. I am using Windows, but MXNet 1.6 does not seem to be supported on Windows... Is there any other solution?
Hi @vb123er951, I have updated the code; it now supports older versions of MXNet that lack cumsum.
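For anyone curious how a cumulative sum can be expressed without a cumsum operator: one common trick is a matrix product with a lower-triangular matrix of ones. A hedged NumPy sketch of the idea (an illustration only, not necessarily what the updated code does):

```python
import numpy as np

def cumsum_axis1(x):
    """Cumulative sum along axis 1 for a (batch, n) array, using only matmul."""
    n = x.shape[1]
    # tri.T has ones on and above the diagonal, so output column i
    # sums input elements 0..i of each row.
    tri = np.tril(np.ones((n, n), dtype=x.dtype))
    return x @ tri.T

a = np.array([[1.0, 2.0, 3.0]])
print(cumsum_axis1(a))  # [[1. 3. 6.]]
```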
@wkcn Thank you very much!
The cumsum problem is solved now; trying to solve other problems...
I have this problem:
AttributeError: module "mobula.op" has no attribute "AttSamplerGrid"
Can anyone help me? Thanks a lot!