RLTrader

Memory Leak: std::bad_alloc error during optimize

Open n8henrie opened this issue 5 years ago • 5 comments

I've gotten this a couple of times now after running python ./optimize.py:

Arch Linux, Python 3.7.3

...
[I 2019-06-11 20:16:49,493] Setting status of trial#51 as TrialState.PRUNED. 
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
[computer:74848] *** Process received signal ***
[computer:74848] Signal: Aborted (6)
[computer:74848] Signal code:  (-6)
[computer:74848] [ 0] /usr/lib/libpthread.so.0(+0x124d0)[0x7ff4e9dce4d0]
[computer:74848] [ 1] /usr/lib/libc.so.6(gsignal+0x10f)[0x7ff4e9c2e82f]
[computer:74848] [ 2] /usr/lib/libc.so.6(abort+0x125)[0x7ff4e9c19672]
[computer:74848] [ 3] /usr/lib/libstdc++.so.6(+0x8a58e)[0x7ff4a95f358e]
[computer:74848] [ 4] /usr/lib/libstdc++.so.6(+0x90e0a)[0x7ff4a95f9e0a]
[computer:74848] [ 5] /usr/lib/libstdc++.so.6(+0x90e67)[0x7ff4a95f9e67]
[computer:74848] [ 6] /usr/lib/libstdc++.so.6(+0x910bc)[0x7ff4a95fa0bc]
[computer:74848] [ 7] /usr/lib/libstdc++.so.6(+0x91647)[0x7ff4a95fa647]
[computer:74848] [ 8] /usr/lib/libstdc++.so.6(_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_assignERKS4_+0xb0)[0x7ff4a9692760]
[computer:74848] [ 9] /usr/lib/libprotobuf.so.18(_ZNK6google8protobuf8internal26GeneratedMessageReflection9SetStringEPNS0_7MessageEPKNS0_15FieldDescriptorERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x1be)[0x7ff41fa99e6e]
[computer:74848] [10] /usr/lib/python3.7/site-packages/google/protobuf/pyext/_message.cpython-37m-x86_64-linux-gnu.so(_ZN6google8protobuf6python17CheckAndSetStringEP7_objectPNS0_7MessageEPKNS0_15FieldDescriptorEPKNS0_10ReflectionEbi+0x13a)[0x7ff41fc178da]
[computer:74848] [11] /usr/lib/python3.7/site-packages/google/protobuf/pyext/_message.cpython-37m-x86_64-linux-gnu.so(_ZN6google8protobuf6python8cmessage25InternalSetNonOneofScalarEPNS0_7MessageEPKNS0_15FieldDescriptorEP7_object+0xeb)[0x7ff41fc179eb]
[computer:74848] [12] /usr/lib/python3.7/site-packages/google/protobuf/pyext/_message.cpython-37m-x86_64-linux-gnu.so(_ZN6google8protobuf6python8cmessage13SetFieldValueEPNS1_8CMessageEPKNS0_15FieldDescriptorEP7_object+0x91)[0x7ff41fc17de1]
[computer:74848] [13] /usr/lib/python3.7/site-packages/google/protobuf/pyext/_message.cpython-37m-x86_64-linux-gnu.so(_ZN6google8protobuf6python8cmessage14InitAttributesEPNS1_8CMessageEP7_objectS6_+0x229)[0x7ff41fc18aa9]
[computer:74848] [14] /usr/lib/libpython3.7m.so.1.0(_PyObject_FastCallKeywords+0x11c)[0x7ff4e9a0039c]
[computer:74848] [15] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x5951)[0x7ff4e9a454b1]
[computer:74848] [16] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalCodeWithName+0x2f9)[0x7ff4e998cd09]
[computer:74848] [17] /usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallKeywords+0x2b2)[0x7ff4e99d3882]
[computer:74848] [18] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x4d2)[0x7ff4e9a40032]
[computer:74848] [19] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalCodeWithName+0x2f9)[0x7ff4e998cd09]
[computer:74848] [20] /usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallKeywords+0x2b2)[0x7ff4e99d3882]
[computer:74848] [21] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x4b8a)[0x7ff4e9a446ea]
[computer:74848] [22] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalCodeWithName+0x2f9)[0x7ff4e998cd09]
[computer:74848] [23] /usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallDict+0x2ec)[0x7ff4e998df8c]
[computer:74848] [24] /usr/lib/libpython3.7m.so.1.0(_PyObject_Call_Prepend+0x68)[0x7ff4e999d818]
[computer:74848] [25] /usr/lib/libpython3.7m.so.1.0(+0x16d0e3)[0x7ff4e99ec0e3]
[computer:74848] [26] /usr/lib/libpython3.7m.so.1.0(_PyObject_FastCallKeywords+0x11c)[0x7ff4e9a0039c]
[computer:74848] [27] /usr/lib/libpython3.7m.so.1.0(_PyEval_EvalFrameDefault+0x5951)[0x7ff4e9a454b1]
[computer:74848] [28] /usr/lib/libpython3.7m.so.1.0(_PyFunction_FastCallDict+0x11b)[0x7ff4e998ddbb]
[computer:74848] [29] /usr/lib/libpython3.7m.so.1.0(_PyObject_Call_Prepend+0x68)[0x7ff4e999d818]
[computer:74848] *** End of error message ***
Aborted (core dumped)

n8henrie avatar Jun 12 '19 23:06 n8henrie

I wasn't able to get the code to run on 3.7. I think it was written for 3.5.7.

silentrob avatar Jun 12 '19 23:06 silentrob

There is some kind of memory leak: I have 64 GB of RAM, and over about an hour of running python ./optimize.py usage climbs steadily from 4.6% to 96.5% before the process crashes. Here is the relevant TensorFlow issue.
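To confirm the growth described above, one option is to log the process's peak resident memory between trials. This is a minimal sketch using only the standard library's resource module (Unix only). The names peak_rss_mb and log_memory are hypothetical helpers, not part of optimize.py. It measures resident memory rather than using tracemalloc because the crash comes from native C++ code, which tracemalloc does not track.

```python
import resource
import sys


def peak_rss_mb() -> float:
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(label: str) -> float:
    """Print and return the peak RSS so successive calls can be compared."""
    mb = peak_rss_mb()
    print(f"[{label}] peak RSS: {mb:.1f} MiB")
    return mb
```

Calling log_memory at the end of each trial would show whether the peak keeps climbing. Since this is a peak value it never decreases, so it shows growth but not memory that was later released.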

n8henrie avatar Jun 20 '19 01:06 n8henrie

You can change one line of optimize.py to: opt_pool.imap(optimize_code, [250]). It trains much faster and uses only 10% of RAM (I have 32 GB).
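A related, general way to contain a leak in pooled workers is to recycle each worker process after a single task, so whatever it leaked is returned to the OS when it exits. This is a sketch under assumptions, not the actual optimize.py code: run_trial_batch is a hypothetical stand-in for the real optimization function, and the batch sizes are placeholders. maxtasksperchild is a standard multiprocessing.Pool argument.

```python
import multiprocessing
import os


def run_trial_batch(n_trials: int) -> int:
    """Hypothetical stand-in for the optimization function; returns the worker PID."""
    _scratch = [0] * n_trials  # placeholder for the real work
    return os.getpid()


def run_batches(batches, n_workers=2):
    """Run each batch in a pool whose workers are replaced after one task."""
    # The "fork" start method is Unix only; it is chosen here to match the
    # Linux setup in the report.
    ctx = multiprocessing.get_context("fork")
    # maxtasksperchild=1 makes each worker exit after one task, so memory
    # leaked during that task is freed when the process ends.
    with ctx.Pool(processes=n_workers, maxtasksperchild=1) as pool:
        return list(pool.imap(run_trial_batch, batches))


if __name__ == "__main__":
    pids = run_batches([250, 250, 250, 250])
    print(len(set(pids)), "distinct worker processes")
```

This does not fix the leak itself. It only limits how long any one process can accumulate memory, at the cost of restarting workers more often.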

g0lemXIV avatar Jul 09 '19 16:07 g0lemXIV

I managed to reduce the memory consumption greatly by adjusting line 22 of optimize.py:

params = {'n_envs': 1, 'reward_strategy': WeightedUnrealizedProfit}

You can try an n_envs value between 1 and 4.

We usually specify a large number of environments to increase sampling efficiency, but here it will cause memory pressure since we already run the optimization in multiple processes.
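The arithmetic behind that memory pressure can be sketched as follows. The helper names and the process count in the usage note are hypothetical and only illustrate how environments multiply across processes. Nothing here is measured from RLTrader.

```python
def total_environments(n_processes: int, n_envs: int) -> int:
    """Environments alive at once when every process creates n_envs of them."""
    return n_processes * n_envs


def n_envs_for_budget(env_budget: int, n_processes: int) -> int:
    """Largest n_envs per process that keeps the total within env_budget (minimum 1)."""
    return max(1, env_budget // n_processes)
```

For example, with 8 optimization processes, n_envs = 4 keeps 32 environments in memory at once, while n_envs = 1 keeps 8.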

xiaohu557 avatar Mar 29 '20 15:03 xiaohu557

I tried n_envs = 1 as suggested above, but I still got the memory leak error. Any suggestions? Thanks.

coolsober avatar Jun 03 '20 07:06 coolsober