Linear class's loadMetadata method does not initialize inputShapes
version:
ai.djl.nn.core.Linear#loadMetadata
This method overrides the parent's loadMetadata, so inputShapes is never initialized.
Could you provide test code that shows the bug, along with the expected behavior?
I took a look at ai.djl.nn.core.Linear#loadMetadata and didn't find any part that causes inputShapes not to initialize. Could you elaborate on why you think overriding the parent's loadMetadata causes inputShapes not to be initialized?
ai.djl.nn.AbstractBaseBlock#loadMetadata:

```java
protected void loadMetadata(byte loadVersion, DataInputStream is)
        throws IOException, MalformedModelException {
    if (loadVersion != version) {
        throw new MalformedModelException(
                "Cannot load parameters for "
                        + this.getClass().getCanonicalName()
                        + ", expected version "
                        + version
                        + ", got "
                        + loadVersion
                        + ".");
    }
    readInputShapes(is);
}

protected void readInputShapes(DataInputStream is) throws IOException {
    int len = is.readInt();
    Shape[] shapes = new Shape[len];
    for (int i = 0; i < len; ++i) {
        shapes[i] = Shape.decode(is);
    }
    if (inputShapes == null) {
        // load inputShapes from parameter file if Block has not been initialized
        inputShapes = shapes;
    }
}
```

ai.djl.nn.core.Linear#loadMetadata:

```java
@Override
public void loadMetadata(byte loadVersion, DataInputStream is)
        throws IOException, MalformedModelException {
    switch (loadVersion) {
        case VERSION:
            units = is.readLong();
            inputFeatures = is.readLong();
            break;
        case 3:
            units = is.readLong();
            if (is.readBoolean()) {
                throw new IllegalArgumentException("flatten is not supported!");
            }
            inputFeatures = is.readLong();
            break;
        case 2:
            if (is.readBoolean()) {
                throw new IllegalArgumentException("flatten is not supported!");
            }
            inputFeatures = is.readLong();
            break;
        case 1:
            inputFeatures = Shape.decode(is).size();
            break;
        default:
            throw new MalformedModelException("Unsupported encoding version: " + loadVersion);
    }
    inputShape = Shape.decode(is);
}
```

As the code shows, the override never assigns inputShapes, yet ai.djl.nn.AbstractBaseBlock#isInitialized depends on it:

```java
@Override
public boolean isInitialized() {
    if (inputShapes == null) {
        return false;
    }
    for (Parameter param : getParameters().values()) {
        if (!param.isInitialized()) {
            return false;
        }
    }
    return true;
}
```
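To make the failure mode concrete, here is a minimal, self-contained sketch of the same pattern in plain Java. `BaseBlock`, `LinearLike`, and `OverrideBugDemo` are hypothetical stand-ins, not DJL classes: the subclass override reads its own metadata but never assigns `inputShapes`, so `isInitialized()` keeps returning false after loading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for AbstractBaseBlock: the parent's loader assigns inputShapes.
class BaseBlock {
    protected long[] inputShapes;

    protected void loadMetadata(DataInputStream is) throws IOException {
        int len = is.readInt();
        long[] shapes = new long[len];
        for (int i = 0; i < len; i++) {
            shapes[i] = is.readLong();
        }
        if (inputShapes == null) {
            inputShapes = shapes;
        }
    }

    public boolean isInitialized() {
        return inputShapes != null;
    }
}

// Hypothetical stand-in for Linear: the override reads its own fields
// but never touches inputShapes, so the parent's invariant is broken.
class LinearLike extends BaseBlock {
    long inputFeatures;

    @Override
    protected void loadMetadata(DataInputStream is) throws IOException {
        inputFeatures = is.readLong();
        // note: no super.loadMetadata(is) call and no inputShapes assignment
    }
}

public class OverrideBugDemo {
    static boolean loadAndCheck() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new DataOutputStream(bos).writeLong(4L); // pretend parameter file
        LinearLike block = new LinearLike();
        block.loadMetadata(new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
        return block.isInitialized();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(loadAndCheck()); // prints false: block still looks uninitialized
    }
}
```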
A possible fix in Linear#loadMetadata, assigning inputShapes after decoding:

```java
@Override
public void loadMetadata(byte loadVersion, DataInputStream is)
        throws IOException, MalformedModelException {
    switch (loadVersion) {
        case VERSION:
            units = is.readLong();
            inputFeatures = is.readLong();
            break;
        case 3:
            units = is.readLong();
            if (is.readBoolean()) {
                throw new IllegalArgumentException("flatten is not supported!");
            }
            inputFeatures = is.readLong();
            break;
        case 2:
            if (is.readBoolean()) {
                throw new IllegalArgumentException("flatten is not supported!");
            }
            inputFeatures = is.readLong();
            break;
        case 1:
            inputFeatures = Shape.decode(is).size();
            break;
        default:
            throw new MalformedModelException("Unsupported encoding version: " + loadVersion);
    }
    inputShape = Shape.decode(is);
    Shape input = inputShape.add(inputFeatures);
    inputShapes = new Shape[] {input};
}
```
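To illustrate why assigning the field in the override is enough, here is a minimal, self-contained sketch in plain Java. `ShapeAwareBlock`, `LinearFixed`, and `OverrideFixDemo` are hypothetical stand-ins, not the actual patch: the override rebuilds the full input shape from its decoded metadata and hands it to the parent field, so `isInitialized()` passes after loading.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for AbstractBaseBlock: isInitialized() requires inputShapes.
class ShapeAwareBlock {
    protected long[] inputShapes;

    public boolean isInitialized() {
        return inputShapes != null;
    }
}

// Stand-in for the fixed Linear: after decoding its own metadata,
// the override also reconstructs and assigns inputShapes.
class LinearFixed extends ShapeAwareBlock {
    long inputFeatures;
    long batchAxis; // decoded input shape without the feature axis

    protected void loadMetadata(DataInputStream is) throws IOException {
        inputFeatures = is.readLong();
        batchAxis = is.readLong();
        // the fix: rebuild the full input shape and assign the parent field
        inputShapes = new long[] {batchAxis, inputFeatures};
    }
}

public class OverrideFixDemo {
    static boolean loadAndCheck() throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(bos);
        dos.writeLong(4L);  // inputFeatures
        dos.writeLong(32L); // batch axis
        LinearFixed block = new LinearFixed();
        block.loadMetadata(new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
        return block.isInitialized();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(loadAndCheck()); // prints true
    }
}
```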
Thanks for sharing the code. However, it is still hard to read, and I had a hard time understanding your point. Could you follow the issue template?
Please see the attachment 《TestDjl.zip》
```java
/**
 * 1. The first run saves the network weight file.
 * 2. The second run throws an error:
 *
 * Exception in thread "main" java.lang.IllegalStateException: No gradient attached to this
 * NDArray, please call array.setRequiresGradient() on your NDArray or block.setInitializer()
 * on your Block
 *     at ai.djl.mxnet.engine.MxNDArray.getGradient(MxNDArray.java:258)
 *     at ai.djl.training.Trainer.lambda$checkGradients$3(Trainer.java:350)
 *     at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
 *     at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
 *     at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
 *     at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
 *     at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
 *     at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
 *     at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
 *     at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
 *     at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
 *     at ai.djl.training.Trainer.checkGradients(Trainer.java:345)
 *     at ai.djl.training.Trainer.step(Trainer.java:208)
 *     at ai.djl.training.EasyTrain.fit(EasyTrain.java:61)
 *     at com.luosoy.RNModel.train(RNModel.java:156)
 *     at com.luosoy.TestMxNetError.main(TestMxNetError.java:50)
 *
 * The cause is that the Linear class's loadMetadata method overrides the parent's method,
 * so the parent's inputShapes field is never assigned.
 *
 * @param args
 */
```
I have created a PR to solve the issue of inputShapes from AbstractBaseBlock not being initialized. But I'm not clear how it solves the exception:
Exception in thread "main" java.lang.IllegalStateException: No gradient attached to this NDArray, please call array.setRequiresGradient() on your NDArray or block.setInitializer() on your Block
Also, the attachment 《TestDjl.zip》 is not visible.
Could you provide the test code which can be used to check if this fix really works?