DeepSpeech
Memory Leak using Java Native Client
For support and discussions, please use our Discourse forums.
If you've found a bug, or have a feature request, then please create an issue with the following information:
- Have I written custom code (as opposed to running examples on an unmodified clone of the repository):
- OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Android R (API 30)
- Exact command to reproduce:
I modified the STT-example in Kotlin for use in a Mozilla-STT React Native module I'm building. However, running streaming transcription uses more and more memory until the app crashes. See a fully working example by cloning this repo, running yarn bootstrap, then yarn example android. Attach to it using the Android Studio profiler to watch the memory usage creep upwards.
Implementation: https://github.com/zaptrem/react-native-transcription/blob/master/android/src/main/java/com/reactnativetranscription/TranscriptionModule.kt
Please describe the problem clearly. Be sure to convey here why it's a bug or a feature request.
Include any logs or source code that would be helpful to diagnose the problem. For larger logs, link to a Gist, not a screenshot. If including tracebacks, please include the full traceback. Try to provide a reproducible test case.
However, running streaming transcription uses more and more memory until the app crashes.
I know nothing about Kotlin and its memory model, can you ensure this is 100% NOT related to your usage of the library?
However, running streaming transcription uses more and more memory until the app crashes.
I know nothing about Kotlin and its memory model, can you ensure this is 100% NOT related to your usage of the library? @lissyx
I, too, know nothing about Kotlin and its memory model apart from what I learned building this module, which I started only a few days ago. However, my usage is nearly line-for-line identical to the STT-examples Android demo. The only exception is my packaging of the Metadata object and Native Event dispatching. However, the leak remains even when those are commented out.
However, running streaming transcription uses more and more memory until the app crashes.
I know nothing about Kotlin and its memory model, can you ensure this is 100% NOT related to your usage of the library? @lissyx
I, too, know nothing about Kotlin and its memory model apart from what I learned building this module, which I started only a few days ago. However, my usage is nearly line-for-line identical to the STT-examples Android demo. The only exception is my packaging of the Metadata object and Native Event dispatching. However, the leak remains even when those are commented out.
My point being: I have no idea whether the STT-examples code is leak-free, and given the current state of things, I can't spend time investigating that on the JNI/Java side.
@lissyx Are there other contributors with experience in this area who might be good to cc?
The Kotlin example was contributed by @erksch
@lissyx @erksch the Android Studio debugger reports that the memory leak is in the "Native Code" section. Doesn't this indicate that it's a library issue? I'm running multi-hour recognition jobs, so maybe the issue doesn't show up in your shorter test cases?
@lissyx @erksch the Android Studio debugger reports that the memory leak is in the "Native Code" section. Doesn't this indicate that it's a library issue? I'm running multi-hour recognition jobs, so maybe the issue doesn't show up in your shorter test cases?
Again, I know nothing about that tooling. If that's the case, can you write C code reproducing it? How can we distinguish "native" memory coming from our library from the SWIG wrapper?
@lissyx Doesn't the SWIG wrapper count as part of DeepSpeech's repo/code? The RN module I was building is basically ready for its first release, but I don't want to move forward until we can resolve this issue. I have an example above that's extremely easy to reproduce (one command and two clicks in the emulator). Is writing a whole C interface (while closely studying the SWIG wrapper so that they match) with no prior experience absolutely necessary to move forward on fixing this bug?
Also of importance, commenting out model.feedAudioContent() in the AudioRecorder loop seems to solve the memory leak.
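For reference, the isolation boiled down to something like the following. This is a rough sketch rather than the exact module code; the AudioRecord read loop and the isRecording callback here are stand-ins for the module's own recording logic.

import android.media.AudioRecord
import android.util.Log
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechModel
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechStreamingState

// Hypothetical minimal loop used to narrow down where the leak comes from.
fun recordingLoop(recorder: AudioRecord, model: DeepSpeechModel,
                  ctx: DeepSpeechStreamingState, isRecording: () -> Boolean) {
    val buffer = ShortArray(1024)
    while (isRecording()) {
        val read = recorder.read(buffer, 0, buffer.size)
        if (read <= 0) continue
        // Commenting out this single call (everything else unchanged) keeps the
        // "Native" series in the Android Studio profiler flat.
        model.feedAudioContent(ctx, buffer, read)
        Log.d("transcription", model.intermediateDecode(ctx))
    }
}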
@erksch (and lissyx) I can confirm that the exact same issue is present on the android-streaming-demo in STT-examples. This is more evidence that the problem isn't just on my end, right?
You seem not to understand that I really have no time right now to track a memory leak from the top down.
I have an example above that's extremely easy to reproduce (one command and two clicks in the emulator). Is writing a whole C interface (while closely studying the SWIG wrapper so that they match) with no prior experience absolutely necessary to move forward on fixing this bug?
Share and we'll see.
I have an example above that's extremely easy to reproduce (one command and two clicks in the emulator). Is writing a whole C interface (while closely studying the SWIG wrapper so that they match) with no prior experience absolutely necessary to move forward on fixing this bug?
Share and we'll see.
See OP:
See a fully working example by cloning this repo, running yarn bootstrap, then yarn example android. Attach to it using the Android Studio profiler to watch the memory usage creep upwards.
The STT-examples implementation might be easier to follow, though.
https://github.com/mozilla/DeepSpeech-examples/tree/r0.8/android_mic_streaming
If nobody can fix it now could it be marked as a bug so it doesn’t get forgotten about later?
See a fully working example by cloning this repo, running yarn bootstrap, then yarn example android. Attach to it using the Android Studio profiler to watch the memory usage creep upwards.
The STT-examples implementation might be easier to follow, though.
Sorry but this is way too much.
The bug could be:
- in the C code itself
- in the C code generated by SWIG
- in the Java libdeepspeech part
- in the Java code using it
Please try to repro so we can reduce the search space.
Also, please understand we have had a very stressful month and it's the weekend; I might want to just rest for a few more hours?
If nobody can fix it now could it be marked as a bug so it doesn’t get forgotten about later?
It needs to be investigated to know if the bug is really on our side...
@erksch (and lissyx) I can confirm that the exact same issue is present on the android-streaming-demo in STT-examples. This is more evidence that the problem isn't just on my end, right?
Again, I have no idea; you are both using Kotlin and I have not thoroughly verified your logic.
Also, please understand we have had a very stressful month and it's the weekend; I might want to just rest for a few more hours?
I totally understand. It’s 5am where I am and I’m just trying to get a few tasks (this was the last) checked off before I can sleep. It’s an issue forum and not text messages, so I don’t expect instant responses. I just want to make the information I have available to all of you as it comes to me.
There's not much more digging I can do on this front for now, as there's a big launch coming up on Tuesday.
There's not much more digging I can do on this front for now, as there's a big launch coming up on Tuesday.
Thanks for your understanding. Please note that if you can find a repro in plain C on linux/amd64, it can be investigated much more quickly.
I also hope we can add a debug build to our CI now that we have done some work on it; that would help in cases like this.
@zaptrem Hey, sorry for being absent; maybe I can look at it soon. If it's deep down in DeepSpeech I will not be able to help either, but I could at least reproduce it and get my own picture.
But I guess a memory leak could easily be in the Android example code, because maybe some buffers are not cleaned up properly. That said, the fact that commenting out only feedAudioContent solves the memory leak makes me worry that the problem is further down.
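To make the first point concrete, the kind of cleanup I would look for at the end of the example's recording loop is roughly this. Just a sketch, assuming the 0.8 bindings' finishStream and a plain AudioRecord; the example's actual structure may differ.

import android.media.AudioRecord
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechModel
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechStreamingState

// Hypothetical teardown helper: stop capturing, release the recorder's native
// buffers, and close the stream so its native state is freed.
fun stopTranscription(recorder: AudioRecord, model: DeepSpeechModel,
                      ctx: DeepSpeechStreamingState): String {
    recorder.stop()
    recorder.release()
    // finishStream() consumes the stream context; it must not be reused afterwards.
    return model.finishStream(ctx)
}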
That said, the fact that commenting out only feedAudioContent solves the memory leak makes me worry that the problem is further down.
FTR, we have tracked down other memory leaks recently and this code path never showed up, so it'd be great if we could confirm whether it is on the Java, SWIG, or C side.
As I feared, lots of things break on the TensorFlow side when trying to build a debug version on CI, and on setups where it is more complicated to investigate (Windows CPU and CUDA debug builds)...
@zaptrem Even getting an Android debug build requires a non-trivial amount of work; there are several TensorFlow r2.3 debug-build-specific breakages reported upstream, some not yet fixed, some only fixed on master. This, added to the complex stack up to Kotlin, makes me unable to investigate quickly: we really need your help here in trying to reproduce the bogus behavior with pure linux/amd64 builds.
@zaptrem @erksch Have you been able to get more information on this matter?
@zaptrem Ping? Have you been able to better qualify the issue? I have been able to have a look, comparing non-streaming vs. streaming use via the C++ client
(this is on current master, c7d58d628e6f701ce3575a44ef9282730ec02ab3)
Non Stream:
==1401849== LEAK SUMMARY:
==1401849== definitely lost: 0 bytes in 0 blocks
==1401849== indirectly lost: 0 bytes in 0 blocks
==1401849== possibly lost: 0 bytes in 0 blocks
==1401849== still reachable: 8,260 bytes in 103 blocks
==1401849== suppressed: 0 bytes in 0 blocks
The still reachable blocks are all from libsox, according to what I could analyze.
Stream:
==2125011== LEAK SUMMARY:
==2125011== definitely lost: 20 bytes in 3 blocks
==2125011== indirectly lost: 0 bytes in 0 blocks
==2125011== possibly lost: 0 bytes in 0 blocks
==2125011== still reachable: 8,260 bytes in 103 blocks
==2125011== suppressed: 0 bytes in 0 blocks
And it would come from the intermediate decode:
==2125011== 20 bytes in 3 blocks are definitely lost in loss record 17 of 104
==2125011==    at 0x483877F: malloc (vg_replace_malloc.c:307)
==2125011==    by 0x5E93F0A: strdup (strdup.c:42)
==2125011==    by 0x4A12003: ModelState::decode(DecoderState const&) const (modelstate.cc:36)
==2125011==    by 0x4A067AD: StreamingState::intermediateDecode() const (deepspeech.cc:137)
==2125011==    by 0x4A0776B: DS_IntermediateDecode (deepspeech.cc:458)
==2125011==    by 0x11ADBF: LocalDsSTT(ModelState*, short const*, unsigned long, bool, bool) (in /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/DeepSpeech-lissyx/native_client/deepspeech)
==2125011==    by 0x11B3A7: ProcessFile(ModelState*, char const*, bool) (in /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/DeepSpeech-lissyx/native_client/deepspeech)
==2125011==    by 0x11BB05: main (in /home/alexandre/Documents/codaz/Mozilla/DeepSpeech/DeepSpeech-lissyx/native_client/deepspeech)
The definitely lost seems to be a hint, but the leak is small (and only one file was analyzed).
Does this match your experience? Do you see a higher volume of leaking?
One can verify the JNI part:
$ less native_client/java/jni/deepspeech_wrap.cpp
[...]
SWIGEXPORT jstring JNICALL Java_org_mozilla_deepspeech_libdeepspeech_implJNI_IntermediateDecode(JNIEnv *jenv, jclass jcls, jlong jarg1) {
  jstring jresult = 0 ;
  StreamingState *arg1 = (StreamingState *) 0 ;
  char *result = 0 ;

  (void)jenv;
  (void)jcls;
  arg1 = *(StreamingState **)&jarg1;
  result = (char *)DS_IntermediateDecode((StreamingState const *)arg1);
  if (result) jresult = jenv->NewStringUTF((const char *)result);
  DS_FreeString(result);
  return jresult;
}
[...]
So there is an explicit call to DS_FreeString.
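In Kotlin/Java terms, that means the intermediate transcription is just a managed String and the caller has nothing to free for that call. A minimal sketch of what that looks like on the binding side:

import org.mozilla.deepspeech.libdeepspeech.DeepSpeechModel
import org.mozilla.deepspeech.libdeepspeech.DeepSpeechStreamingState

fun partialResult(model: DeepSpeechModel, ctx: DeepSpeechStreamingState): String {
    // The JNI wrapper copies the native C string into a Java String (NewStringUTF)
    // and then calls DS_FreeString on the native buffer, so the caller only ever
    // sees a garbage-collected String here.
    return model.intermediateDecode(ctx)
}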
The definitely lost seems to be a hint, but the leak is small (and only one file was analyzed).
And I can confirm this is because of the implementation in client.cc, JNI part seems to be okay.
That’s great news, right?
That depends on whether or not you confirm my analysis...
Unfortunately:
- tensorflow does not seem to care much about debug builds, given how `-c dbg` fails mostly everywhere on CI
- tensorflow or bazel does not respect `--strip=never` nor anything
- so there's mostly no way to produce debug builds / symbols on CI (I've already burnt a few days on that, with no success)
- there's also no way to separately generate debug symbols, so it's mostly useless to run `valgrind` on our CI
I might have found a middle-ground to provide debug builds + valgrind, with the following limitations:
- valgrind only on linux/amd64
- debug builds on all linux flavors, as much as I can hope, but still to be confirmed
- this should also cover android
- this might work on macOS and Windows, but I will not spend too much time on that
- CUDA might be too painful for that (`libdeepspeech.so` for CUDA is > 2GB, which is triggering infra issues)
- you would have to use those `libdeepspeech.so` debug builds by swapping the `libdeepspeech.so` inside packages; building two full-blown flavors efficiently would require too much CI work
@zaptrem I know nothing about Kotlin or your code, but there is something I don't get here:
model?.let { model ->
    streamContext?.let { streamContext ->
        val shortArray = ShortArray(audioRecordData.size / 2) {
            (audioRecordData[it * 2].toUByte().toInt() + (audioRecordData[(it * 2) + 1].toInt() shl 8)).toShort()
        }
        model.feedAudioContent(streamContext, shortArray, shortArray.size)
        val decoded = model.intermediateDecodeWithMetadata(streamContext, 1)
        val decodedString = model.intermediateDecode(streamContext)
        if (decodedString != lastTranscription) {
            lastTranscription = decodedString
            val map = packageTranscription(decoded)
            emitDeviceEvent("onRecordingChange", map)
            Log.d("transcription", decodedString)
        }
    }
}
As an aside, it took me half an hour of navigating your code to find where the calls were. We would have saved everybody a lot of time if you had directly linked to the "streaming transcription uses" you referred to, instead of just linking to the repo.
First, why do you call both intermediateDecodeWithMetadata and intermediateDecode?
Second, I don't see any finalize() or delete() call on the Metadata object; those are how the bindings end up calling DS_FreeMetadata(), and as far as we know that is the canonical way to handle it in Java. I have no idea whether it is up to you or the VM's job to call delete().
$ cat native_client/java/libdeepspeech/src/main/java/org/mozilla/deepspeech/libdeepspeech/Metadata.java
/* ----------------------------------------------------------------------------
* This file was automatically generated by SWIG (http://www.swig.org).
* Version 4.0.2
*
* Do not make changes to this file unless you know what you are doing--modify
* the SWIG interface file instead.
* ----------------------------------------------------------------------------- */
package org.mozilla.deepspeech.libdeepspeech;
public class Metadata {
  private transient long swigCPtr;
  protected transient boolean swigCMemOwn;

  protected Metadata(long cPtr, boolean cMemoryOwn) {
    swigCMemOwn = cMemoryOwn;
    swigCPtr = cPtr;
  }

  protected static long getCPtr(Metadata obj) {
    return (obj == null) ? 0 : obj.swigCPtr;
  }

  @SuppressWarnings("deprecation")
  protected void finalize() {
    delete();
  }

  public synchronized void delete() {
    if (swigCPtr != 0) {
      if (swigCMemOwn) {
        swigCMemOwn = false;
        implJNI.delete_Metadata(swigCPtr);
      }
      swigCPtr = 0;
    }
  }

  public long getNumTranscripts() {
    return implJNI.Metadata_NumTranscripts_get(swigCPtr, this);
  }

  public CandidateTranscript getTranscript(int i) {
    return new CandidateTranscript(implJNI.Metadata_getTranscript(swigCPtr, this, i), false);
  }
}
So if DS_FreeMetadata() is never called, it would not be surprising that you observe a leak, but it is obviously not a bug on our side if that is the case.
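A minimal sketch of what that would look like in the Kotlin loop quoted above, assuming the fix is simply to release the Metadata explicitly once the event has been emitted (model, streamContext, packageTranscription, and emitDeviceEvent are the module's own names from the quoted code):

val decoded = model.intermediateDecodeWithMetadata(streamContext, 1)
try {
    val map = packageTranscription(decoded)
    emitDeviceEvent("onRecordingChange", map)
} finally {
    // delete() goes through implJNI.delete_Metadata(), which is how the bindings
    // end up calling DS_FreeMetadata(); relying on finalize() leaves that to the
    // GC, which may never run it during a multi-hour recording session.
    decoded.delete()
}

If the plain string is still needed for the comparison against lastTranscription, it could presumably be taken from the Metadata's transcript instead of calling both intermediateDecodeWithMetadata and intermediateDecode on every buffer.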