
Semantic Conflict about variable 'max_len'

JJJYmmm opened this issue on Sep 02, 2023 · 0 comments

Hi Shariatnia, thanks for your tutorial! I have a question about the variable max_len. I first see max_len in the Tokenizer class, where I think its role is to limit the maximum number of objects:

```python
labels = labels.astype('int')[:self.max_len]
bboxes = self.quantize(bboxes)[:self.max_len]
```
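
For context, here is a minimal, runnable sketch of how I understand the Tokenizer to build a sequence. The class name TokenizerSketch, the token codes, and the default numbers are my assumptions for illustration, not the tutorial's exact code:

```python
import numpy as np

class TokenizerSketch:
    # Hypothetical simplification of the tutorial's Tokenizer.
    def __init__(self, num_classes=80, num_bins=224, max_len=300):
        self.num_bins = num_bins
        self.max_len = max_len
        self.BOS_code = num_classes + num_bins
        self.EOS_code = self.BOS_code + 1

    def quantize(self, x):
        # map normalized coordinates in [0, 1] to integer bins
        return (x * (self.num_bins - 1)).astype('int')

    def __call__(self, labels, bboxes):
        # max_len truncates the list of OBJECTS here ...
        labels = np.array(labels).astype('int')[:self.max_len]
        bboxes = self.quantize(np.array(bboxes))[:self.max_len]
        tokenized = [self.BOS_code]
        for label, bbox in zip(labels, bboxes):
            tokenized.extend(map(int, bbox))  # 4 coordinate tokens
            tokenized.append(int(label))      # 1 label token
        tokenized.append(self.EOS_code)
        # ... so the output length is 5 * num_objects + 2, not max_len
        return tokenized
```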

But in the collate_fn function used by the DataLoader, I think max_len limits the maximum length of the input token sequence:

```python
if max_len:  # pad token ids: [B, seq_len] -> [B, max_len]
    pad = torch.ones(seq_batch.size(0),
                     max_len - seq_batch.size(1)).fill_(pad_idx).long()
    seq_batch = torch.cat([seq_batch, pad], dim=1)
```
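
To make the conflict concrete, here is a self-contained sketch of just this padding step (pad_to_max_len is my own name, not the tutorial's). If the tokenizer really emitted 5 tokens per object for max_len objects, the pad width would go negative and torch.ones would raise:

```python
import torch

def pad_to_max_len(seq_batch, max_len, pad_idx):
    # seq_batch: token ids of shape [B, seq_len]
    if max_len:
        # if seq_len > max_len, the pad width is negative and torch.ones fails
        pad = torch.ones(seq_batch.size(0),
                         max_len - seq_batch.size(1)).fill_(pad_idx).long()
        seq_batch = torch.cat([seq_batch, pad], dim=1)
    return seq_batch

batch = torch.zeros(2, 1502).long()            # 300 objects * 5 tokens + bos/eos
pad_to_max_len(batch, max_len=300, pad_idx=0)  # RuntimeError: negative dimension
```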

I have checked where the two values come from: both are CFG.max_len, so it's not a coincidence.

I think the value in the second place should be 5 times the value in the first place (excluding the eos and bos tokens), because each object corresponds to 5 tokens: 4 quantized bbox coordinates plus 1 class label. I don't know if I'm right; looking forward to your reply.
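
A quick back-of-the-envelope check of this relationship, using an illustrative value for CFG.max_len that I am assuming for the example:

```python
# Illustrative numbers only; the actual CFG.max_len comes from the tutorial.
max_objects = 300                              # Tokenizer keeps at most this many objects
tokens_per_object = 5                          # 4 bbox coordinates + 1 label
seq_len = tokens_per_object * max_objects + 2  # + bos and eos -> 1502

# If collate_fn also receives max_len = 300, its pad width is
# 300 - 1502 = -1202, which is invalid. Consistent values would be
# collate_max_len = tokens_per_object * max_objects + 2.
```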
