
ai.djl.engine.EngineException

Open ling976 opened this issue 2 years ago • 5 comments

Error description

    Loading the model fails with ai.djl.engine.EngineException

Error message

    Exception in thread "main" ai.djl.engine.EngineException: 
    Unknown type name 'transformers.models.bert.modeling_bert.BertForSequenceClassification':
    transformers.models.bert.modeling_bert.BertForSequenceClassification
    at ai.djl.pytorch.jni.PyTorchLibrary.moduleLoad(Native Method)

Code

    System.setProperty("DJL_CACHE_DIR", "E://Python/cache/");
    Path modelDir = Paths.get("./modules/codegen-350M-mono/codegen-350M-mono.pt");
    Criteria<String, NDList> criteria =
            Criteria.builder()
                    .setTypes(String.class, NDList.class)
                    .optModelPath(modelDir)
                    .optModelName("codegen-350M-mono")
                    .optDevice(Device.gpu())
                    .optProgress(new ProgressBar())
                    .optEngine("PyTorch")
                    .optTranslator(new MyTranslator())
                    .build();

    ZooModel<String, NDList> model = ModelZoo.loadModel(criteria);
    Predictor<String, NDList> predictor = model.newPredictor();

    NDList classifications = predictor.predict("def add func");

Exception location

    The exception is thrown at line 523 of ai.djl.pytorch.jni.PyTorchLibrary, a native method:

    native long moduleLoad(
              String path,
              int[] device,
              boolean mapLocation,
              String[] extraFileNames,
              String[] extraFileValues,
              boolean trainParam);

Test environment

  Java 17
  DJL 0.22.0
  RTX 3060 Ti GPU
  PyTorch 1.13.1+cu116

ling976 avatar Apr 03 '23 12:04 ling976

Are you able to load your TorchScript model from Python using:

torch.jit.load("codegen-350M-mono.pt")

frankfliu avatar Apr 03 '23 14:04 frankfliu

The model was converted with the following script:

    model_name = "Salesforce/codegen-350M-mono"
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    test_text = "def hello_world():"

    completion = model.generate(**tokenizer(test_text, return_tensors="pt"))
    print(completion)

    model.eval()
    store = torch.jit.trace(model, completion)
    store = torch.jit.script(model)
    torch.jit.save(store, "./codegen-350M-mono.pt")
    torch.jit.load("./codegen-350M-mono.pt")

ling976 avatar Apr 03 '23 14:04 ling976

I don't think your script works:

  1. store = torch.jit.trace(model, completion) most likely won't work; you have to trace the decoder model only
  2. you overwrite store with jit.script, so you are not actually using the traced module
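To illustrate the intended pattern, here is a minimal, self-contained sketch: trace once with example inputs and save that traced module, without overwriting it with torch.jit.script afterwards. A toy module stands in for the real CodeGen decoder (TinyDecoder and the file name are hypothetical, not part of the original report):

```python
import torch
import torch.nn as nn

# Toy stand-in for the decoder; hypothetical, only to show the trace/save flow.
class TinyDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(100, 16)
        self.proj = nn.Linear(16, 100)

    def forward(self, input_ids):
        return self.proj(self.embed(input_ids))

model = TinyDecoder().eval()
example = torch.randint(0, 100, (1, 8))  # example input_ids for tracing

# Trace the forward pass with example inputs; do NOT overwrite the
# result with torch.jit.script afterwards.
traced = torch.jit.trace(model, example)
torch.jit.save(traced, "tiny_decoder.pt")

# Verify the artifact round-trips in Python before handing it to DJL.
loaded = torch.jit.load("tiny_decoder.pt")
assert torch.allclose(loaded(example), model(example))
```

If this round-trip load fails in Python with the same "Unknown type name" error, the problem is in the export step rather than in DJL.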

frankfliu avatar Apr 03 '23 15:04 frankfliu

So what should I do here?

ling976 avatar Apr 03 '23 15:04 ling976

What is the correct way to do this, i.e. to convert codegen-350M-mono?

ling976 avatar Apr 04 '23 17:04 ling976

You can take a look this script: https://github.com/deepjavalibrary/djl/blob/master/examples/src/main/python/trace_gpt2.py

You can also convert the model to onnx: https://github.com/deepjavalibrary/djl/blob/master/examples/src/main/java/ai/djl/examples/inference/nlp/TextGeneration.java#L172-L183

frankfliu avatar Jul 11 '24 00:07 frankfliu