
Error in DJL 0.14.0 with roberta model

aamirbutt opened this issue 4 years ago · 6 comments

With DJL 0.14.0 and a RoBERTa model, I am getting the following error on predict. Please note that the exact same code works with DJL 0.13.0. I have written my own RobertaTokenizer and Translator; however, for the sake of this problem, I have hard-coded the inputs so the tokenizer is not needed. The processInput code is at the end.

The error is as follows:

0 [main] DEBUG ai.djl.repository.zoo.DefaultModelZoo  - Scanning models in repo: class ai.djl.repository.SimpleRepository, file:/mnt/d/code/djltest/../zirai/aamir/dotakb/reranker/outputs/roberta_squad2_output/traced.pt
29 [main] DEBUG ai.djl.repository.zoo.ModelZoo  - Loading model with Criteria:
        Application: UNDEFINED
        Input: class ai.djl.modality.nlp.qa.QAInput
        Output: interface java.util.List
        Engine: PyTorch
        ModelZoo: ai.djl.localmodelzoo

29 [main] DEBUG ai.djl.repository.zoo.ModelZoo  - Searching model in specified model zoo: ai.djl.localmodelzoo
43 [main] DEBUG ai.djl.engine.Engine  - Found EngineProvider: PyTorch
43 [main] DEBUG ai.djl.engine.Engine  - Found default engine: PyTorch
55 [main] WARN ai.djl.repository.SimpleRepository  - Simple repository pointing to a non-archive file.
61 [main] DEBUG ai.djl.repository.zoo.ModelZoo  - Checking ModelLoader: ai.djl.localmodelzoo:traced.pt UNDEFINED [
        ai.djl.localmodelzoo/traced.pt/traced.pt {}
]
69 [main] DEBUG ai.djl.repository.MRL  - Preparing artifact: file:/mnt/d/code/djltest/../zirai/aamir/dotakb/reranker/outputs/roberta_squad2_output/traced.pt, ai.djl.localmodelzoo/traced.pt/traced.pt {}
69 [main] DEBUG ai.djl.repository.SimpleRepository  - Skip prepare for local repository.
Loading:     100% |████████████████████████████████████████|
309 [main] DEBUG ai.djl.util.cuda.CudaUtils  - cudart library not found.
314 [main] DEBUG ai.djl.pytorch.jni.LibUtils  - Using cache dir: /home/aamir/.djl.ai/pytorch/1.9.1-cpu-linux-x86_64
316 [main] INFO ai.djl.pytorch.jni.LibUtils  - Extracting /jnilib/linux-x86_64/cpu/libdjl_torch.so to cache ...
444 [main] DEBUG ai.djl.pytorch.jni.LibUtils  - Loading pytorch library from: /home/aamir/.djl.ai/pytorch/1.9.1-cpu-linux-x86_64/0.14.0-cpu-libdjl_torch.so
1167 [main] INFO ai.djl.pytorch.engine.PtEngine  - Number of inter-op threads is 4
1168 [main] INFO ai.djl.pytorch.engine.PtEngine  - Number of intra-op threads is 8
6384 [main] INFO ai.zir.djl.Predictor  - Model loaded successfully...
Enter your question and context to get predicted answers via Bert model.

Enter your question or enter exit to finish:
what is the height of mount everest?
Enter your context or enter exit to finish:
There are certain tall and deep things in this world. for example, the depth of mariana trench is 80000 feet and the height of everst is 32000 feet
ai.djl.translate.TranslateException: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/transformers/models/roberta/modeling_roberta.py", line 13, in forward
    qa_outputs = self.qa_outputs
    roberta = self.roberta
    _0 = (roberta).forward(input_ids, attention_mask, )
          ~~~~~~~~~~~~~~~~ <--- HERE
    _1 = torch.split((qa_outputs).forward(_0, ), 1, -1)
    start_logits, end_logits, = _1
  File "code/__torch__/transformers/models/roberta/modeling_roberta.py", line 46, in forward
    _10 = torch.to(extended_attention_mask, 6)
    attention_mask0 = torch.mul(torch.rsub(_10, 1.), CONSTANTS.c0)
    _11 = (embeddings).forward(input_ids, input, )
           ~~~~~~~~~~~~~~~~~~~ <--- HERE
    _12 = (encoder).forward(_11, attention_mask0, )
    return _12
  File "code/__torch__/transformers/models/roberta/modeling_roberta.py", line 73, in forward
    incremental_indices = torch.mul(torch.add(_13, CONSTANTS.c1), mask)
    input0 = torch.add(torch.to(incremental_indices, 4), CONSTANTS.c2)
    _14 = (word_embeddings).forward(input_ids, )
           ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _15 = (token_type_embeddings).forward(input, )
    embeddings = torch.add(_14, _15)
  File "code/__torch__/torch/nn/modules/sparse.py", line 10, in forward
    input_ids: Tensor) -> Tensor:
    weight = self.weight
    inputs_embeds = torch.embedding(weight, input_ids, 1)
                    ~~~~~~~~~~~~~~~ <--- HERE
    return inputs_embeds

Traceback of TorchScript, original code (most recent call last):
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/functional.py(2044): embedding
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/sparse.py(158): forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/home/aamir/.local/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py(131): forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/home/aamir/.local/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py(837): forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/home/aamir/.local/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py(1498): forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/home/aamir/.local/lib/python3.8/site-packages/torch/jit/_trace.py(958): trace_module
/home/aamir/.local/lib/python3.8/site-packages/torch/jit/_trace.py(741): trace
/mnt/d/code/zirai/aamir/dotakb/reranker/save_torchscript.py(42): save_model_as_torchscript
/mnt/d/code/zirai/aamir/dotakb/reranker/save_torchscript.py(65): <module>
RuntimeError: index out of range in self

        at ai.djl.inference.Predictor.batchPredict(Predictor.java:186)
        at ai.djl.inference.Predictor.predict(Predictor.java:123)
        at ai.zir.djl.Predictor.predictRoberta(Predictor.java:99)
        at ai.zir.djl.DjlTest.processInputs(DjlTest.java:63)
        at ai.zir.djl.DjlTest.main(DjlTest.java:43)
Caused by: ai.djl.engine.EngineException: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/transformers/models/roberta/modeling_roberta.py", line 13, in forward
    qa_outputs = self.qa_outputs
    roberta = self.roberta
    _0 = (roberta).forward(input_ids, attention_mask, )
          ~~~~~~~~~~~~~~~~ <--- HERE
    _1 = torch.split((qa_outputs).forward(_0, ), 1, -1)
    start_logits, end_logits, = _1
  File "code/__torch__/transformers/models/roberta/modeling_roberta.py", line 46, in forward
    _10 = torch.to(extended_attention_mask, 6)
    attention_mask0 = torch.mul(torch.rsub(_10, 1.), CONSTANTS.c0)
    _11 = (embeddings).forward(input_ids, input, )
           ~~~~~~~~~~~~~~~~~~~ <--- HERE
    _12 = (encoder).forward(_11, attention_mask0, )
    return _12
  File "code/__torch__/transformers/models/roberta/modeling_roberta.py", line 73, in forward
    incremental_indices = torch.mul(torch.add(_13, CONSTANTS.c1), mask)
    input0 = torch.add(torch.to(incremental_indices, 4), CONSTANTS.c2)
    _14 = (word_embeddings).forward(input_ids, )
           ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _15 = (token_type_embeddings).forward(input, )
    embeddings = torch.add(_14, _15)
  File "code/__torch__/torch/nn/modules/sparse.py", line 10, in forward
    input_ids: Tensor) -> Tensor:
    weight = self.weight
    inputs_embeds = torch.embedding(weight, input_ids, 1)
                    ~~~~~~~~~~~~~~~ <--- HERE
    return inputs_embeds

Traceback of TorchScript, original code (most recent call last):
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/functional.py(2044): embedding
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/sparse.py(158): forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/home/aamir/.local/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py(131): forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/home/aamir/.local/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py(837): forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/home/aamir/.local/lib/python3.8/site-packages/transformers/models/roberta/modeling_roberta.py(1498): forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1090): _slow_forward
/home/aamir/.local/lib/python3.8/site-packages/torch/nn/modules/module.py(1102): _call_impl
/home/aamir/.local/lib/python3.8/site-packages/torch/jit/_trace.py(958): trace_module
/home/aamir/.local/lib/python3.8/site-packages/torch/jit/_trace.py(741): trace
/mnt/d/code/zirai/aamir/dotakb/reranker/save_torchscript.py(42): save_model_as_torchscript
/mnt/d/code/zirai/aamir/dotakb/reranker/save_torchscript.py(65): <module>
RuntimeError: index out of range in self

        at ai.djl.pytorch.jni.PyTorchLibrary.moduleForward(Native Method)
        at ai.djl.pytorch.jni.IValueUtils.forward(IValueUtils.java:46)
        at ai.djl.pytorch.engine.PtSymbolBlock.forwardInternal(PtSymbolBlock.java:126)
        at ai.djl.nn.AbstractBlock.forward(AbstractBlock.java:126)
        at ai.djl.nn.Block.forward(Block.java:122)
        at ai.djl.inference.Predictor.predictInternal(Predictor.java:137)
        at ai.djl.inference.Predictor.batchPredict(Predictor.java:177)
        ... 4 more
        

To make this simpler, I have hard-coded the input in processInput and padded/masked it to a length of 128. Here is what the processInput in the translator looks like:

        @Override
        public NDList processInput(TranslatorContext translatorContext, QAInput qaInput) throws Exception {
            NDManager manager = translatorContext.getNDManager();
            // Hard-coded array of roberta indices.
            long[] indices = new long[] {0, 12196, 16, 5, 6958, 9, 14206, 15330, 7110, 116, 2, 2, 37099, 9, 15330, 7110, 16, 2107, 151, 1730, 2};
            int INPUT_LENGTH = 128;
            long[] finalIndices = new long[INPUT_LENGTH];
            long[] attentionMasks = new long[INPUT_LENGTH];
            // Pad token ids with 1 (the RoBERTa <pad> id) and mask out the padded positions, up to INPUT_LENGTH
            System.arraycopy(indices, 0, finalIndices, 0, indices.length);
            Arrays.fill(finalIndices, indices.length, INPUT_LENGTH, 1);
            Arrays.fill(attentionMasks, 0, indices.length, 1);
            Arrays.fill(attentionMasks, indices.length, INPUT_LENGTH, 0);

            NDArray indicesArray = manager.create(finalIndices);
            NDArray attentionMaskArray = manager.create(attentionMasks);
            // The order matters
            return new NDList(indicesArray, attentionMaskArray);
        }
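
As a sanity check on the padding logic above, the same arrays can be built standalone with the plain JDK (no DJL required; the class name PaddingCheck is just for illustration). The 21 hard-coded token ids are followed by pad ids (1) up to position 127, and the attention mask flips from 1 to 0 at the same boundary:

```java
import java.util.Arrays;

public class PaddingCheck {
    static final int INPUT_LENGTH = 128;

    // Returns {tokenIds, attentionMask}, both padded to INPUT_LENGTH, mirroring processInput above.
    static long[][] pad(long[] indices) {
        long[] finalIndices = new long[INPUT_LENGTH];
        long[] attentionMasks = new long[INPUT_LENGTH];
        System.arraycopy(indices, 0, finalIndices, 0, indices.length);
        Arrays.fill(finalIndices, indices.length, INPUT_LENGTH, 1); // 1 is RoBERTa's <pad> id
        Arrays.fill(attentionMasks, 0, indices.length, 1);          // real tokens are attended
        Arrays.fill(attentionMasks, indices.length, INPUT_LENGTH, 0); // padding is masked out
        return new long[][] {finalIndices, attentionMasks};
    }

    public static void main(String[] args) {
        // Same hard-coded RoBERTa token ids as in processInput (21 tokens).
        long[] indices = {0, 12196, 16, 5, 6958, 9, 14206, 15330, 7110, 116, 2, 2,
                37099, 9, 15330, 7110, 16, 2107, 151, 1730, 2};
        long[][] out = pad(indices);
        // total length, first pad id, mask at last real token, mask at first pad
        System.out.println(out[0].length + " " + out[0][indices.length] + " "
                + out[1][indices.length - 1] + " " + out[1][indices.length]);
        // prints: 128 1 1 0
    }
}
```

All ids here are within the RoBERTa vocabulary, which is why the `index out of range in self` from `torch.embedding` is surprising for this input.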

The predict code is simply:

            var predictor = model.newPredictor(translator);
            predictor.predict(new QAInput(question, paragraph));

The error appears at this line: predictor.predict(new QAInput(question, paragraph));

aamirbutt avatar Jan 13 '22 15:01 aamirbutt

@frankfliu Can you please take a look at this? I was having this problem on an Inferentia instance where DJL 0.14.0 was required. However, I found out that the error lies with DJL 0.14.0 itself and has nothing to do with Inferentia. The same code works fine with 0.13.0.

aamirbutt avatar Jan 13 '22 15:01 aamirbutt

@aamirbutt The difference between 0.13.0 and 0.14.0 is the PyTorch version; can you try using PyTorch 1.9.0?

export PYTORCH_VERSION=1.9.0

You can also try running it with Python and see if PyTorch 1.9.1 has the same issue.

frankfliu avatar Jan 13 '22 19:01 frankfliu

The problem looks to be with DJL 0.14.0. Here is the working output from DJL 0.13.0, which apparently loads PyTorch 1.9.1:

0 [main] DEBUG ai.djl.repository.zoo.DefaultModelZoo  - Scanning models in repo: class ai.djl.repository.SimpleRepository, file:/D:/code/zirai/aamir/dotakb/reranker/outputs/roberta_squad2_output/traced.pt
4 [main] DEBUG ai.djl.repository.zoo.ModelZoo  - Loading model with Criteria:
	Application: UNDEFINED
	Input: class ai.djl.modality.nlp.qa.QAInput
	Output: interface java.util.List
	Engine: PyTorch
	ModelZoo: ai.djl.localmodelzoo

4 [main] DEBUG ai.djl.repository.zoo.ModelZoo  - Searching model in specified model zoo: ai.djl.localmodelzoo
10 [main] DEBUG ai.djl.engine.Engine  - Found EngineProvider: PyTorch
13 [main] DEBUG ai.djl.engine.Engine  - Found default engine: PyTorch
16 [main] WARN ai.djl.repository.SimpleRepository  - Simple repository pointing to a non-archive file.
18 [main] DEBUG ai.djl.repository.zoo.ModelZoo  - Checking ModelLoader: ai.djl.localmodelzoo:traced.pt UNDEFINED [
	ai.djl.localmodelzoo/traced.pt/traced.pt {}
]
21 [main] DEBUG ai.djl.repository.MRL  - Preparing artifact: file:/D:/code/zirai/aamir/dotakb/reranker/outputs/roberta_squad2_output/traced.pt, ai.djl.localmodelzoo/traced.pt/traced.pt {}
21 [main] DEBUG ai.djl.repository.SimpleRepository  - Skip prepare for local repository.
Loading:     100% |========================================|
112 [main] DEBUG ai.djl.util.cuda.CudaUtils  - No cudart library found in path.
123 [main] DEBUG ai.djl.pytorch.jni.LibUtils  - Using cache dir: C:\Users\Aamir\.djl.ai\pytorch
209 [main] DEBUG ai.djl.pytorch.jni.LibUtils  - Loading pytorch library from: C:\Users\Aamir\.djl.ai\pytorch\1.9.1-cpu-win-x86_64\0.13.0-cpu-djl_torch.dll
1023 [main] INFO ai.djl.pytorch.engine.PtEngine  - Number of inter-op threads is 4
1318 [main] INFO ai.djl.pytorch.engine.PtEngine  - Number of intra-op threads is 8
9886 [main] INFO ai.zir.djl.Predictor  - Model loaded successfully...
Enter your question and context to get predicted answers via Bert model.

Enter your question or enter exit to finish:
what is the height of mount everest?
Enter your context or enter exit to finish:
There are certain tall and deep things in this world. for example, the depth of mariana trench is 80000 feet and the height of everst is 32000 feet
Answer: 32000 feet
Press Enter to continue...

I tried your suggestion of loading PyTorch 1.9.0, and the problem persists.

aamirbutt avatar Jan 14 '22 05:01 aamirbutt

This is really strange. Would you mind building DJL from source and using git bisect to find which commit causes the issue?

You need to manually build the PyTorch JNI and then test your code:

          call "C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\VC\Auxiliary\Build\vcvarsall.bat" amd64
          gradlew :engines:pytorch:pytorch-native:compileJNI -Ppt_version=1.9.1

frankfliu avatar Jan 14 '22 05:01 frankfliu

I don't have VS2019 license, unfortunately.

aamirbutt avatar Jan 17 '22 05:01 aamirbutt

The Visual Studio Community edition is free.

frankfliu avatar Jan 17 '22 05:01 frankfliu