
Donut model size beyond 768*2 max_length

nik13 opened this issue 1 year ago · 3 comments

Is there any way to go beyond the max_length of 768*2? I tried training the model with 768*4 as the max_length and sufficient GPU power, but it gives an internal CUDA error (not related to memory usage).

Is there any way to achieve a greater max_length, or is it just a model limitation?

nik13 commented Apr 28 '23 08:04

I am also looking for the answer to this.

lusid commented Jun 01 '23 19:06

Any conclusions?

sjtu-cz commented Apr 17 '24 05:04

I think you would need to interpolate the position embeddings of the pre-trained text decoder for the model to go beyond 768 tokens.

As seen here: https://github.com/clovaai/donut/blob/4cfcf972560e1a0f26eb3e294c8fc88a0d336626/donut/model.py#L188-L195
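Roughly, the same resizing could look like the sketch below when applied to the Hugging Face checkpoint. This is only a sketch: it assumes the `naver-clova-ix/donut-base` checkpoint loaded as a `VisionEncoderDecoderModel` with an MBart decoder, so the attribute paths (`model.decoder.model.decoder.embed_positions`) and the +2 offset of MBart's learned positional embedding may need adjusting for your setup.

```python
import torch
import torch.nn.functional as F
from transformers import VisionEncoderDecoderModel


def resize_abs_pos_emb(weight: torch.Tensor, max_length: int) -> torch.Tensor:
    """Truncate or linearly interpolate absolute position embeddings to max_length."""
    if weight.shape[0] > max_length:
        return weight[:max_length, ...].clone()
    return (
        F.interpolate(
            weight.permute(1, 0).unsqueeze(0),  # (1, hidden_dim, old_len)
            size=max_length,
            mode="linear",
            align_corners=False,
        )
        .squeeze(0)
        .permute(1, 0)  # back to (new_len, hidden_dim)
    )


new_max_length = 768 * 4  # hypothetical target length

model = VisionEncoderDecoderModel.from_pretrained("naver-clova-ix/donut-base")

# MBart's learned positional embedding reserves 2 extra rows (offset),
# so the weight matrix needs new_max_length + 2 positions.
pos_emb = model.decoder.model.decoder.embed_positions
new_weight = resize_abs_pos_emb(pos_emb.weight.data, new_max_length + 2)
pos_emb.weight = torch.nn.Parameter(new_weight)
pos_emb.num_embeddings = new_weight.shape[0]

# Keep the configs in sync so generation respects the new length.
model.config.decoder.max_position_embeddings = new_max_length
model.decoder.config.max_position_embeddings = new_max_length
```

Note that the interpolated positions are only an initialization; you would still need to fine-tune on long sequences for the decoder to make good use of them.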

NielsRogge commented Apr 17 '24 06:04