
In the load_input_to_shared function of fairseq/fairseq/modules/cuda_utils.cu, is a negative index used, risking undefined behaviour?

Open JoshuaGhost opened this issue 1 year ago • 0 comments

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

I searched the issues and the docs. It seems no one mentioned this question or a similar one.

What is your question? (with code)

In the function load_input_to_shared in fairseq/fairseq/modules/cuda_utils.cu, is a negative index used, risking undefined behaviour?

As I read the code, when the padding length is less than the sequence block size, the left overhang (for the first block, the padding) is loaded all at once:

...
  if (iteration > 0) {
    if (padding_l < SB) {
      // load all at once
      if (tid < padding_l) {
        output[tid] =
            (no_prev) ? input[inputOffset - padding_l + tid] : output[tid + SB];  // <- here
      }
    } else {
...

But when it is called from, e.g., this line in lightconv_cuda_kernel.cu, for the first block inputOffset is 0, padding_l is larger than 0, and tid is 0 for the first thread. The resulting index inputOffset - padding_l + tid equals -padding_l, which is negative. I trained a model with the lightconv_layer as a component, and everything worked fine without any error.
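To make the arithmetic concrete, here is a minimal host-side C++ sketch of the index computation, based only on the excerpt quoted above. It is not fairseq code: the helper names are hypothetical, the values SB = 32 and padding_l = 3 are picked for illustration, and inputOffset = iteration * SB is an assumption about how the caller advances through the sequence. Going only by the quoted excerpt, the read sits behind the iteration > 0 guard, so the sketch also records whether the read would be reached at all.

```cpp
// Host-side sketch of the index arithmetic in the quoted excerpt.
// Hypothetical helper names; not part of fairseq.
#include <cassert>

// Index used by the global-memory read in the quoted line.
constexpr int left_overhang_index(int inputOffset, int padding_l, int tid) {
  return inputOffset - padding_l + tid;
}

// True if, per the guards visible in the excerpt, thread `tid`
// would read from `input` (and not from shared memory or not at all).
constexpr bool reads_global_input(int iteration, int padding_l, int tid,
                                  bool no_prev) {
  return iteration > 0 && tid < padding_l && no_prev;
}

// First block: the arithmetic alone gives -padding_l for tid == 0 ...
static_assert(left_overhang_index(0, 3, 0) == -3, "index is -padding_l");
// ... but the quoted `iteration > 0` guard is false for iteration == 0.
static_assert(!reads_global_input(0, 3, 0, true), "read is not reached");

// Later block, assuming inputOffset = iteration * SB with SB = 32.
static_assert(reads_global_input(1, 3, 0, true), "read is reached");
static_assert(left_overhang_index(1 * 32, 3, 0) == 29, "index is in range");

// Runtime check for one block in the padding_l < SB branch: every
// read that the guards allow must use a non-negative index.
inline void check_block(int iteration, int SB, int padding_l) {
  const int inputOffset = iteration * SB;  // assumption about the caller
  for (int tid = 0; tid < SB; ++tid) {
    if (reads_global_input(iteration, padding_l, tid, /*no_prev=*/true)) {
      assert(left_overhang_index(inputOffset, padding_l, tid) >= 0);
    }
  }
}
```

Calling check_block(0, 32, 3) and check_block(1, 32, 3) passes under these assumptions. This only models the excerpt; it does not show what the real kernel does for other branches or other callers.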

My question is: why is there no problem with using negative indices, and could this result in undefined behaviour?

What have you tried?

I tried to train a model with a lightconv_layer as a component layer, and everything works well without any error. This makes me even more confused.

What's your environment?

  • fairseq Version (main):
  • PyTorch Version (2.0.1)
  • OS (Ubuntu 20.04):
  • How you installed fairseq (pip):
  • Build command you used (python setup.py build within the folder lightconv_layer):
  • Python version: 3.8
  • CUDA/cuDNN version: CUDA: 11.8.0
  • GPU models and configuration: Nvidia A100 with driver version: 525.125.06
  • Any other relevant information: the training runs in a docker container, whose image is built based on nvidia/cuda:11.8.0-runtime-ubuntu20.04

JoshuaGhost · Dec 21 '23 13:12