If I input more than the max_seq_length?

Open runwean opened this issue 1 year ago • 4 comments

I see that the sgpt-bloom-7b1-msmarco model has a maximum sequence length of 300, but if I input more than that, for example more than 400 Chinese characters, it still produces an embedding. However, increasing the input beyond about 500 characters no longer seems to change the resulting vector.

Can I input a maximum of 500 Chinese characters?

runwean avatar Aug 09 '23 08:08 runwean

Yes, you can input more characters. The result may not change because inputs are truncated to max_seq_length before encoding — you need to increase max_seq_length. Check this issue: https://github.com/Muennighoff/sgpt/issues/23#issuecomment-1486379896

If it still does not work, please provide the exact code you are using.

Muennighoff avatar Aug 09 '23 08:08 Muennighoff
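For anyone landing here later: a minimal, model-free sketch of the truncation behavior discussed above. The real knob in sentence-transformers is the `model.max_seq_length` attribute (see the linked issue); the tokenizer and "embedding" below are toy stand-ins, not the actual model.

```python
# Toy sketch of why text beyond max_seq_length stops affecting the
# embedding: the encoder truncates the token sequence to max_seq_length
# before encoding, so any tokens past the limit are simply discarded.

MAX_SEQ_LENGTH = 300  # the model's default limit mentioned above

def encode_sketch(tokens, max_seq_length=MAX_SEQ_LENGTH):
    # Stand-in for model.encode(): only the first max_seq_length
    # tokens can influence the output.
    kept = tokens[:max_seq_length]
    return tuple(kept)  # placeholder for the pooled embedding vector

short = list(range(400))           # 400 tokens: already over the limit
longer = short + list(range(100))  # 500 tokens: even further over

# Both inputs truncate to the same first 300 tokens, so the "embedding"
# is identical - matching what runwean observed.
assert encode_sketch(short) == encode_sketch(longer)

# Raising the limit (analogous to `model.max_seq_length = 500` in
# sentence-transformers) lets the extra tokens matter again.
assert encode_sketch(short, 500) != encode_sketch(longer, 500)
```

Whether the embeddings are still *good* at 500 tokens is a separate question, since the model was trained on shorter sequences — that is the open point in this thread.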

Thanks for the reply. My understanding is that because the model was trained on inputs of up to 300 tokens, raising the input length to, say, 500 may give similar results, but a much larger increase could hurt quality, since the training samples were never that long 🤔

runwean avatar Aug 09 '23 09:08 runwean

Yeah, it'd be really interesting to know how performance is at longer sequences. If you run any experiments and have any data on how it performs, would be amazing if you could share it 🚀

Muennighoff avatar Aug 09 '23 09:08 Muennighoff

Thank you!

runwean avatar Aug 09 '23 12:08 runwean