
Several crashes found in MXNet versions 1.5.x, 1.4.x, and 1.3.x

Open rubbberrabbit opened this issue 3 years ago • 1 comments

Description

Hello, we are using Keras as the front end to run MXNet and have found several MXNet crashes. We are not sure whether these models trigger a real bug, so we collected the execution stack information at the time of each crash. Most of the stacks point into libmxnet.so, which is hard to compile in debug mode. The MXNet versions and the execution stack information for our models are listed below.

In addition, the crash-triggering models and the replay script are available at https://drive.google.com/drive/folders/1he3I-1PKGI01t09E2FAin0_mUnmu2-oz?usp=sharing

Error Message

Some of the stack traces are shown below: image-20220504220358522 image-20220504220358524 image-20220504222003296

To Reproduce

Steps to reproduce

(Paste the commands you ran that produced the error.)

  1. Download the scripts and models from the cloud links
  2. Change the path in /scripts/bugs_replay.conf to the path of the model folder
  3. Run mxnet_test.py in the corresponding environment
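
As a possible aid for step 3, here is a minimal sketch of running the replay script in a child process with Python's standard `faulthandler` enabled. It assumes a POSIX system. The helper name `run_with_faulthandler` is hypothetical and is not part of the provided scripts. This only reports the Python-level stack at the moment of a fatal signal; it does not replace a native backtrace from libmxnet.so.

```python
import signal
import subprocess
import sys


def run_with_faulthandler(script_args, timeout=None):
    """Run a Python script in a child process with faulthandler enabled.

    Hypothetical helper for illustration only.
    Returns (returncode, stderr_text, signal_name_or_None).
    """
    # "-X faulthandler" makes the child interpreter dump the Python
    # traceback to stderr when it receives a fatal signal such as SIGSEGV.
    cmd = [sys.executable, "-X", "faulthandler"] + list(script_args)
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=timeout,
    )

    sig_name = None
    if proc.returncode < 0:
        # On POSIX, a negative return code means the child was
        # terminated by the signal with that number.
        try:
            sig_name = signal.Signals(-proc.returncode).name
        except ValueError:
            sig_name = "signal %d" % -proc.returncode
    return proc.returncode, proc.stderr, sig_name
```

For example, `run_with_faulthandler(["mxnet_test.py"])` would return the exit status, the captured stderr (including any traceback that faulthandler printed), and the name of the terminating signal, if any.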

What have you tried to solve it?

To analyze the execution stack information in libmxnet.so, we tried to compile MXNet with the option DEBUG=1 in config.mk, but the build failed with a "relocation truncated to fit" error in several different environments. We assume this is because compiling in debug mode adds too much extra code.

Environment

Environment Information

MXNet 1.5.1, Keras-MXNet 2.2.4.2, CUDA 10.1, Python 3.6.12

MXNet 1.4.1, Keras-MXNet 2.2.4.2, CUDA 10.0, Python 3.6.12

MXNet 1.3.1, Keras-MXNet 2.2.4.2, CUDA 9.0, Python 3.6.12

rubbberrabbit, May 07 '22 16:05

Welcome to Apache MXNet (incubating)! We are on a mission to democratize AI, and we are glad that you are contributing to it by opening this issue. Please make sure to include all the relevant context, and one of the @apache/mxnet-committers will be here shortly. If you are interested in contributing to our project, let us know! Also, be sure to check out our guide on contributing to MXNet and our development guides wiki.

github-actions[bot], May 07 '22 16:05