
Large GRU Segmentation Fault

MattBeton opened this issue · 3 comments

A GRU larger than a certain size causes a segmentation fault. This is not specific to any one backend; the result has been replicated with all backends.

Minimal Replication

#include "RTNeural/RTNeural/RTNeural.h"
#include "RTNeural/tests/functional/load_csv.hpp"
#include <filesystem>
#include <iostream>

namespace fs = std::filesystem;

constexpr int vocab_size = 27;
constexpr int hidden_size = 512;

using ModelType = RTNeural::ModelT<float, vocab_size, vocab_size,
    RTNeural::DenseT<float, vocab_size, hidden_size>,
    RTNeural::GRULayerT<float, hidden_size, hidden_size>,
    RTNeural::GRULayerT<float, hidden_size, hidden_size>,
    RTNeural::DenseT<float, hidden_size, vocab_size>>;

int main([[maybe_unused]] int argc, [[maybe_unused]] char* argv[])
{
    ModelType model;

    return 0;
}

Build Environment

MacBook Pro with M2 Pro processor. CMakeLists.txt is as follows:

cmake_minimum_required(VERSION 3.10)
project(GenerativeGRU VERSION 1.0 LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

add_executable(GenerativeGRU main.cpp)

set(RTNEURAL_STL ON CACHE BOOL "Use RTNeural with this backend" FORCE)
add_subdirectory(RTNeural)
target_link_libraries(GenerativeGRU PUBLIC RTNeural)

MattBeton · Nov 21 '24 09:11

Thanks for the report! Do you have any more information about the root cause of the seg fault? My guess is that it's just a stack overflow, since the model might be too large to be allocated on the stack.

jatinchowdhury18 · Nov 21 '24 14:11

Hi! Yes, I think it's due to the weights being stored on the stack. How practical would it be to reorganize some of the code so that the layer weights are stored on the heap, e.g. in a std::vector?

MattBeton · Nov 21 '24 18:11

It would be possible to store the layer weights on the heap, but I'd rather not do that in the "compile-time" implementations of the layers, for performance reasons.

I would suggest trying one of two options:

  • Using the "run-time" API rather than the compile-time API. With the run-time API, the weights are stored on the heap.
  • Store the entire model on the heap, e.g. auto model = std::make_unique<ModelType>();.

jatinchowdhury18 · Nov 21 '24 21:11