MIOpen icon indicating copy to clipboard operation
MIOpen copied to clipboard

[Question] Where to delete the dropoutDesc in RNNDescriptor

Open wenobug opened this issue 4 years ago • 13 comments

I found miopen::deref(&dropoutDesc) = new miopen::DropoutDescriptor(); in rnn.cpp, but didn't known where to delete it. Can you help me determine if this is a bug. https://github.com/ROCmSoftwarePlatform/MIOpen/blob/4949d29f70b6ac26b834713493d9e33759ba5f41/src/rnn.cpp#L282-L371

wenobug avatar Jan 07 '22 08:01 wenobug

Hi @ce1adon , if you have time could you help to take a look? Thanks!

junliume avatar Jan 07 '22 16:01 junliume

It will be deleted after the lifecycle of the API.

ce1adon avatar Jan 07 '22 17:01 ce1adon

@ce1adon Hmm... It looks like memleak (rather small, but anyway), or I do not understand something? If RNNDescriptor instance owns an instance of DropoutDescriptor that resides in the heap, why don't we delete dropout descriptor in ~RNNDescriptor()?

atamazov avatar Jan 12 '22 22:01 atamazov

...Because of miopenGetRNNDescriptor_V2() which must return ptr to dropout descriptor to the user, and that dropout descriptor must be alive even after RNN descriptor is destroyed?

atamazov avatar Jan 12 '22 22:01 atamazov

@atamazov You can add the delete with ~RNNDescriptor()

ce1adon avatar Jan 12 '22 23:01 ce1adon

@ce1adon Please consider the following sequence:

  • (1) RNNDescriptor created
  • (2) The user reads dropout descriptor by means of miopenGetRNNDescriptor_V2
  • (3) RNNDescriptor deleted, thus invalidating dropout descriptor
  • (4) The user passes dropout descriptor to miopenSetRNNDescriptor_V2 -> undefined behavior

atamazov avatar Jan 13 '22 13:01 atamazov

@ce1adon Please consider the following sequence:

  • (1) RNNDescriptor created
  • (2) The user reads dropout descriptor by means of miopenGetRNNDescriptor_V2

I believe setrnndesc will be run first before you can get anything. This get function is to get an existing desc.

  • (3) RNNDescriptor deleted, thus invalidating dropout descriptor

see above

  • (4) The user passes dropout descriptor to miopenSetRNNDescriptor_V2 -> undefined behavior

you can delete the very first defined dropout desc, which is independent with rnn desc.

ce1adon avatar Jan 13 '22 20:01 ce1adon

@ce1adon @atamazov @junliume let's see RNNDriver in MIOpenDriver:

    RNNDriver() : Driver()
    {
        miopenCreateTensorDescriptor(&inputTensor);
        miopenCreateTensorDescriptor(&hiddenTensor);
        miopenCreateTensorDescriptor(&weightTensor);
        miopenCreateTensorDescriptor(&outputTensor);

        miopenCreateRNNDescriptor(&rnnDesc);
        workspace_dev    = nullptr;
        reservespace_dev = nullptr;
        data_type        = (sizeof(Tgpu) == 4) ? miopenFloat : miopenHalf;

        miopenCreateDropoutDescriptor(&DropoutDesc);
    }
    ~RNNDriver() override
    {
        miopenDestroyTensorDescriptor(outputTensor);
        miopenDestroyTensorDescriptor(weightTensor);
        miopenDestroyTensorDescriptor(hiddenTensor);
        miopenDestroyTensorDescriptor(inputTensor);

        miopenDestroyRNNDescriptor(rnnDesc);
    }

First, I don't find where to destroy the DropoutDesc created in the constructor. Then, in the constructor, the code of miopenCreateRNNDescriptor is:

RNNDescriptor::RNNDescriptor()
{
    nLayers                     = 1;
    hsize                       = 0;
    nHiddenTensorsPerLayer      = 0;
    rnnMode                     = miopenRNNTANH;
    dirMode                     = miopenRNNunidirection;
    biasMode                    = miopenRNNNoBias;
    algoMode                    = miopenRNNdefault;
    inputMode                   = miopenRNNlinear;
    dataType                    = miopenFloat;
    typeSize                    = 4;
    workspaceScale              = 1;
    miopen::deref(&dropoutDesc) = new miopen::DropoutDescriptor();
}

Here create the dropoutDesc but no where to destroy it, the setrnndesc will only replace the pointer dropoutDesc to another place, but don't destroy the previous one that created by new miopen::DropoutDescriptor().

In the end, RNNDriver::SetRNNDescriptorFromCmdLineArgs() may execute the following command:

    else
    {
        miopenSetRNNDescriptor(
            rnnDesc, wei_hh, layer, inMode, directionMode, mode, biasMode, algo, data_type);
    }

This setrnndesc will also create the dropoutDesc in the heap but no where to destroy:

RNNDescriptor::RNNDescriptor(int hsz,
                             int layers,
                             miopenRNNMode_t rmode,
                             miopenRNNInputMode_t inMode,
                             miopenRNNDirectionMode_t bidir,
                             miopenRNNBiasMode_t bmode,
                             miopenRNNAlgo_t amode,
                             miopenDataType_t dType)
{
    hsize                       = hsz;
    nLayers                     = layers;
    inputMode                   = inMode;
    dirMode                     = bidir;
    rnnMode                     = rmode;
    algoMode                    = amode;
    biasMode                    = bmode;
    dataType                    = dType;
    miopen::deref(&dropoutDesc) = new miopen::DropoutDescriptor();
}

wenobug avatar Jan 14 '22 01:01 wenobug

@zpwenjh https://github.com/ROCmSoftwarePlatform/MIOpen/issues/1368#issuecomment-1011532409

ce1adon avatar Jan 14 '22 03:01 ce1adon

@zpwenjh #1368 (comment)

You should not simply delete dropoutDesc in ~RNNDescriptor() because the user might have gotten it earlier via miopenGetRNNDescriptor_V2.

What I can propose:

  • (1) Delete dropoutDesc in ~RNNDescriptor(). Track the use of miopenGetRNNDescriptor_V2 in RNNDescriptor. After miopenGetRNNDescriptor_V2, we shall assume that ownership of dropoutDesc is passed to the user and avoid deletion of it in the destructor.
  • (2) When ownership of dropoutDesc is passed to the user, memleaks will continue to happen unless the user calls miopenDestroyDropoutDescriptor explicitly. This should be clearly documented in the API.
  • (3) Update the driver and tests, if necessary.

atamazov avatar Jan 16 '22 19:01 atamazov

@atamazov could you recommend someone from API to fix this? Email could work. Thanks!

junliume avatar Jan 21 '22 03:01 junliume

@junliume The best candidate is someone who will take over Jing's works.

atamazov avatar Jan 30 '22 21:01 atamazov

@junliume Has this issue been resolved? Thanks!

ppanchad-amd avatar Apr 16 '24 14:04 ppanchad-amd