WS_DAN
Embedding dimension
The depth of the feature_maps (i.e., the depth of Mixed_6e from Inception_v3) is 768, and by default 32 attention_maps are generated. After the BAP module, the width and height of the tensor are reduced, leaving a tensor of shape (N, 32, 768), right?
It is then normalized and reshaped to (N, 32*768) as the embedding. What confuses me is that this seems a bit too large for an embedding. In other metric learning papers I've read, most approaches do not generate an embedding larger than 512.
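For concreteness, here is a minimal PyTorch sketch of the shape flow I am describing. The tensor names, sizes, and the exact normalization are my assumptions for illustration, not the repo's actual code; the pooling step follows the bilinear attention pooling idea from the WS-DAN paper.

```python
import torch
import torch.nn.functional as F

# Illustrative sizes (assumptions): batch N, M=32 attention maps,
# C=768 channels from Mixed_6e, and some spatial resolution H x W.
N, M, C, H, W = 8, 32, 768, 26, 26

feature_maps = torch.randn(N, C, H, W)    # Mixed_6e output of Inception_v3
attention_maps = torch.randn(N, M, H, W)  # the 32 attention maps

# Bilinear attention pooling: for each attention map, spatially pool the
# element-wise product with the feature maps -> one 768-d part feature.
parts = torch.einsum('nmhw,nchw->nmc', attention_maps, feature_maps) / (H * W)
# parts: (N, 32, 768) -- width and height are gone, as stated above

# Normalize each part feature, then flatten to the final embedding.
# (L2 normalization here is an assumption; WS-DAN's normalization may differ.)
embedding = F.normalize(parts, dim=-1).view(N, M * C)
print(embedding.shape)  # torch.Size([8, 24576]), i.e. (N, 32*768)
```

So the resulting embedding is 24576-dimensional, which is what prompted my question about its size relative to the 512-or-less embeddings typical in metric learning work.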