inference_results_v0.5

make fails to produce the int4_offline executable

Open XiaotaoChen opened this issue 5 years ago • 5 comments

I am trying to build int4_offline by following the instructions below. Building loadgen appears to succeed:

git clone --recurse-submodules https://github.com/mlperf/inference.git mlperf_inference

LOADGEN_DIR=<Path> (e.g. ${PWD}/mlperf_inference/inference/loadgen)
CUDA_PATH=<Path for CUDA toolkit e.g. /usr/local/cuda-10.1>
cd $LOADGEN_DIR
CFLAGS="-std=c++14 -O3" python setup.py bdist_wheel

I then try to build int4_offline with the following script:

#!/bin/bash
LOADGEN_DIR='/mnt/truenas/upload/xiaotao.chen/Repositories/mlperf_inference/loadgen'
CUDA_PATH='/usr/local/cuda'
make -j CUDA=${CUDA_PATH} LOADGEN_PATH=${LOADGEN_DIR} clean
make -j CUDA=${CUDA_PATH} LOADGEN_PATH=${LOADGEN_DIR} all

The error output:

if [ ! -d ../../../code/resnet/int4 ] ; then mkdir ../../../code/resnet/int4    ; fi
/usr/local/cuda/bin/nvcc -o int4_offline ../../../code/resnet/int4/int4_offline.a -lpython2.7 --library :mlperf_loadgen.so --library-path /mnt/truenas/upload/xiaotao.chen/Repositories/mlperf_inference/loadgen/build/lib.linux-x86_64-2.7 -L/usr/lib/x86-64-linux-gnu -I/usr/include -lcudnn -lcublas
../../../code/resnet/int4/int4_offline.a(int4_offline.o): In function `Stream::launchThread(SyncWorkQueue*)':
tmpxft_00000099_00000000-5_int4_offline.cudafe1.cpp:(.text+0x18ea): undefined reference to `std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)())'
../../../code/resnet/int4/int4_offline.a(int4_offline.o): In function `Server::Setup(command_line_args&, mlperf::TestSettings&, std::shared_ptr<Server>)':
tmpxft_00000099_00000000-5_int4_offline.cudafe1.cpp:(.text+0x26d7): undefined reference to `std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)())'
../../../code/resnet/int4/int4_offline.a(int4_offline.o): In function `std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (Stream::*)(), Stream*> > >::~_State_impl()':
tmpxft_00000099_00000000-5_int4_offline.cudafe1.cpp:(.text._ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJM6StreamFvvEPS3_EEEEED2Ev[_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJM6StreamFvvEPS3_EEEEED5Ev]+0xb): undefined reference to `std::thread::_State::~_State()'
../../../code/resnet/int4/int4_offline.a(int4_offline.o): In function `std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (Stream::*)(), Stream*> > >::~_State_impl()':
tmpxft_00000099_00000000-5_int4_offline.cudafe1.cpp:(.text._ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJM6StreamFvvEPS3_EEEEED0Ev[_ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJM6StreamFvvEPS3_EEEEED5Ev]+0xf): undefined reference to `std::thread::_State::~_State()'
../../../code/resnet/int4/int4_offline.a(int4_offline.o):(.data.rel.ro._ZTINSt6thread11_State_implINS_8_InvokerISt5tupleIJM6StreamFvvEPS3_EEEEEE[_ZTINSt6thread11_State_implINS_8_InvokerISt5tupleIJM6StreamFvvEPS3_EEEEEE]+0x10): undefined reference to `typeinfo for std::thread::_State'
collect2: error: ld returned 1 exit status
Makefile:38: recipe for target 'int4_offline' failed
make: *** [int4_offline] Error 1

The linker reports an undefined reference to std::thread::_M_start_thread, which looks like a gcc version mismatch between my host and the toolchain used to build int4_offline.a. My versions are as follows:

gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

g++ (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Could you tell me exactly which gcc version int4_offline.a was built with? How can I build int4_offline? Thanks, @psyhtest
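A mismatch of this kind can be narrowed down by comparing the newest libstdc++ symbol version available on the host with the symbols the prebuilt archive leaves undefined. The following is a minimal sketch, not part of the submission: `max_glibcxx` is a hypothetical helper name, and the idea that the missing `std::thread::_State` symbols belong to a newer libstdc++ (believed to be GLIBCXX_3.4.22, shipped with GCC 6) is an assumption that the thread does not confirm.

```shell
#!/bin/bash
# Hypothetical helper: prints the highest GLIBCXX_x.y.z tag found in the
# given file(s), or in stdin when no file is given.
max_glibcxx() {
    # -a treats binary files as text, -o prints only the matching part.
    grep -ao 'GLIBCXX_[0-9][0-9.]*' "$@" | sort -uV | tail -n 1
}

# Report the libstdc++ that the host g++ links against, if g++ is installed.
if command -v g++ >/dev/null 2>&1; then
    lib="$(g++ -print-file-name=libstdc++.so.6)"
    if [ -f "$lib" ]; then
        echo "host libstdc++: $lib"
        echo "highest symbol version: $(max_glibcxx "$lib")"
    fi
fi
```

To see what the archive expects, `nm -C --undefined-only int4_offline.a | grep 'std::thread'` lists the thread-related symbols it needs from the host library. If the host's highest symbol version is older than what those symbols require, the link step fails as shown in the log above.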

XiaotaoChen • Feb 10 '20 09:02

It builds successfully after I switch gcc to version 5.5, as shown below:

gcc (Ubuntu 5.5.0-12ubuntu1~16.04) 5.5.0 20171010
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

XiaotaoChen • Feb 10 '20 12:02

@XiaotaoChen Sorry, I'm not familiar with int4_offline. I assume it's a build product from NVIDIA's submission?

/cc @nvpohanh

psyhtest • Feb 10 '20 12:02

Hi @XiaotaoChen, we only tested with the gcc version installed in the Docker container. As you found, the prebuilt library does not work with older versions of gcc.
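Since only one toolchain was tested, a build script can check the host compiler before invoking make. This is a minimal sketch under stated assumptions: `version_ge` is a hypothetical helper, and the `MIN_GCC` threshold of 5.5.0 is taken from the version reported to work earlier in this thread, not from an official requirement.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to $2.
version_ge() {
    # sort -V orders version strings; if $2 sorts first (or ties), $1 >= $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed threshold: the gcc version reported to work in this thread.
MIN_GCC="5.5.0"

if command -v gcc >/dev/null 2>&1; then
    HOST_GCC="$(gcc -dumpversion)"
    if version_ge "$HOST_GCC" "$MIN_GCC"; then
        echo "gcc $HOST_GCC is at least $MIN_GCC"
    else
        echo "warning: gcc $HOST_GCC is older than $MIN_GCC; linking int4_offline may fail" >&2
    fi
fi
```

The check only warns instead of aborting, because the compiler version is a proxy: what the link step depends on is the libstdc++ installed alongside it.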

nvpohanh • Feb 11 '20 02:02

@nvpohanh Thanks for the reminder.

XiaotaoChen • Feb 11 '20 02:02

@XiaotaoChen Did switching the gcc version fix your problem? If so, please close this issue. Thanks.

nvpohanh • Apr 28 '20 16:04