Gaze-Attention
Reproducing results
Hi and thanks for the great work.
I'm having difficulty reproducing the results reported on the EGTEA Gaze+ dataset. Using your provided trained weights and following the code usage guide, I get these numbers on the different splits:
acc: 36.85, 49.21 / 0:22:27
acc: 47.65, 57.44 / 0:15:22
acc: 50.41, 60.14 / 0:15:07
How can I reproduce the reported 69.73%?
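To put a number on the gap, here is a minimal sketch that parses the three log lines quoted above and averages each column. The thread does not say what the two values per line mean, nor whether 69.73% is a per-split or an averaged figure, so the column names and the comparison are assumptions.

```python
import re

# Log lines as quoted in this thread. The split labels are not shown there,
# so each entry is only identified by its position in the post.
log_lines = [
    "acc: 36.85, 49.21 / 0:22:27",
    "acc: 47.65, 57.44 / 0:15:22",
    "acc: 50.41, 60.14 / 0:15:07",
]

# Matches "acc: <first>, <second>"; the meaning of the two values is an assumption.
PATTERN = re.compile(r"acc:\s*([\d.]+),\s*([\d.]+)")


def parse_acc(line):
    """Return the two accuracy values found on one log line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"no accuracy values found in: {line!r}")
    return float(match.group(1)), float(match.group(2))


pairs = [parse_acc(line) for line in log_lines]
mean_first = sum(p[0] for p in pairs) / len(pairs)
mean_second = sum(p[1] for p in pairs) / len(pairs)

reported = 69.73  # figure the post is trying to reproduce
print(f"mean of first values:  {mean_first:.2f} (gap {reported - mean_first:.2f})")
print(f"mean of second values: {mean_second:.2f} (gap {reported - mean_second:.2f})")
```

Every split falls well short of the target, which points to a systematic cause such as the input data rather than run-to-run noise.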
I'm using the default parameters:
parser.add_argument('--mode', default='test', help='train | test')
parser.add_argument('--crop', type=int, default=224, help='for spatial cropping')
parser.add_argument('--trange', type=int, default=24, help='temporal range')
parser.add_argument('--stride', type=int, default=8, help='pooling stride for gaze prediction')
parser.add_argument('--b', type=int, default=1, help='batch size')
parser.add_argument('--wd', type=float, default=4e-5, help='weight decay')
parser.add_argument('--it1', type=int, default=8000, help='first decay point')
parser.add_argument('--it2', type=int, default=15000, help='second decay point')
parser.add_argument('--iters', type=int, default=18000, help='number of max iterations for training')
parser.add_argument('--lr', type=float, default=0.032, help='learning rate')
parser.add_argument('--ngpu', type=int, default=1, help='number of GPUs to use')
parser.add_argument('--eps', type=float, default=1000, help='epsilon for the gradient estimator')
parser.add_argument('--anneal', type=float, default=1e-3, help='anneal rate for epsilon')
parser.add_argument('--datapath', default='dataset', help='path to dataset')
parser.add_argument('--datasplit', type=int, default=1, help='data split for the cross validation')
parser.add_argument('--weight', default='weights/', help='path to the weight file for the base network')
parser.add_argument('--seed', type=int, default=1, help='random seed')
parser.add_argument('--test_sparse', action='store_true', help='whether to test sparsely for fast evaluation')
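For anyone comparing runs, the sketch below rebuilds a subset of the options listed above and shows how the defaults resolve when only `--datasplit` is overridden. It is an illustration, not the repository's script: the `build_parser` helper and the assumption of three splits (taken from the three results quoted earlier) are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical helper: a subset of the options quoted in this thread."""
    parser = argparse.ArgumentParser(description="subset of the evaluation options")
    parser.add_argument('--mode', default='test', help='train | test')
    parser.add_argument('--datapath', default='dataset', help='path to dataset')
    parser.add_argument('--datasplit', type=int, default=1,
                        help='data split for the cross validation')
    parser.add_argument('--weight', default='weights/',
                        help='path to the weight file for the base network')
    parser.add_argument('--test_sparse', action='store_true',
                        help='whether to test sparsely for fast evaluation')
    return parser


# Assumed: three cross-validation splits, matching the three results in the post.
for split in (1, 2, 3):
    args = build_parser().parse_args(['--datasplit', str(split)])
    # Only --datasplit changes; --weight and --datapath keep their defaults.
    print(vars(args))
```

Since `--weight` and `--datapath` keep their defaults when only `--datasplit` changes, it may be worth confirming that the weight file and the extracted frames match the split being evaluated. Whether the repository requires that is not stated in this thread.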
I think that the problem is the optical flow.
For the optical flow frames, I used this repository: link.
If you build it,
will be created. Then, you can use this file to extract the flow frames.
I hope it helps.