openvino
[CPU] rnn: add pd to cache for preparing memory accordingly
Details:
The original implementation used `auto itpd = descs[0].createPrimitiveDescriptorIterator(getEngine(), dnnl::primitive_attr());` to recreate a primitive descriptor from `descs[0]` in order to query the actual memory descriptors used by the current primitive. This only works on a primitive cache miss, when `descs[0]` has been correctly set in `builder()`; on a cache hit, `descs[0]` may not be updated accordingly, so an incorrect memory descriptor was queried.
The design is now changed to always query directly from the current primitive, which works in both the cache-hit and cache-miss cases.
Tickets:
- 88266
@maxnick Can you please help to review this PR?
The valid link to the validation results: https://openvino-ci.toolbox.iotg.sclab.intel.com/job/private-ci/job/github_trigger/job/openvino/job/PR-12571/2/
@usstq @maxnick Do you have any agreement on the solution?
@dmitry-gorokhov @maxnick I removed the confusing initialization of `internalBlobDesc` in the constructor, since we are using another overload of `prepareMemory`. The PR is updated, please review again, thanks!
@usstq I am just wondering whether there is a possibility to cover the issue with functional tests? It was caught by e2e testing, so it sounds like there are no specifics there.
@dmitry-gorokhov sure, will try to add one
@usstq As I can see the test was added. Can I proceed with the review?
@usstq just a kind reminder
@dmitry-gorokhov @maxnick I tried to add a test case that dynamically switches between batch sizes 1 and 20 to trigger the WA branch `if (SL != 1 || B < optimalBatchSize)` inside rnn.cpp, and surprisingly even the original buggy implementation passes this new test.
Then I checked the test case. It seems to be related to the fact that the W, R, B matrices were initialized with random numbers ranging from 1 to 10; after passing through the sigmoid activation function, this produces saturated output values near 1, and the correctness checks failed to detect the error.
If it does make sense to use another range of input data, just overload the corresponding input preparation method to define a different value range.
@maxnick I changed some data ranges; the CPU tests are good now, but a GNA test failed. I don't have a GNA environment, so maybe I can only change the data range specifically for this particular newly introduced test?
Yes. Since the changes were initially made in the CPU-specific SL test, I thought the overloads of the input tensor initializations would also be introduced at the CPU-specific SL test level.
@maxnick I reverted some changes, and now only the dynamic batch switching test uses the special initial W/R/B values and abs_threshold. These special values and the threshold are tuned so that the bug is reported by the original rnn implementation and is fixed with this patch.
@iefode, are you OK with the changes to the common `makeLSTMS` builder?