mediapipe
Android x86 tasks_genai - LLM inference runtime error: `String field 'mediapipe.tasks.core.jni.LlmResponseContext.responses' contains invalid UTF-8 data when serializing a protocol buffer`
Have I written custom code (as opposed to using a stock example script provided in MediaPipe)
No
OS Platform and Distribution
Android 11 X86_64
Mobile device if the issue happens on mobile device
Intel Nuc
Browser and version if the issue happens on browser
No response
Programming Language and version
Java
MediaPipe version
No response
Bazel version
No response
Solution
LLMInference
Android Studio, NDK, SDK versions (if issue is related to building in Android environment)
No response
Xcode & Tulsi version (if issue is related to building for iOS)
No response
Describe the actual behavior
After the chat launches, if I enter "Hi", I see "Hello" come back in the response, but the app crashes at that moment.
Describe the expected behaviour
The chat should work with the Gemma 2b model
Standalone code/steps you may have used to try to get what you need
I am trying to run the MediaPipe GenAI LLM inference example on x86 Android, but Google's published tasks_genai release doesn't contain x86_64 JNI libs.
I decided to build it from source, but then found that MediaPipe only supports NDK r21 and below, which resulted in the build errors below.
r21 = clang 9.0 -> lacks support for the following
clang: error: unknown argument: '-mamx-int8'
clang: error: unknown argument: '-mavxvnni'
So I built a new LLVM locally by cherry-picking the necessary commits and rebuilt NDK r21 with it. With that toolchain I was able to obtain tasks_core.aar and tasks_genai.aar.
Commits cherry-picked (might be missing one or two which I forgot to note down) -
https://github.com/llvm/llvm-project/commit/e0fa3dc3738a8ac311d277fc946976c6644e9096
https://github.com/llvm/llvm-project/commit/1d93be29def9690e8cd74f767bea5870b32f6c2f
https://github.com/llvm/llvm-project/commit/14942f4e02ef10f7a94d3c2b989c11dbe7709a02
https://github.com/llvm/llvm-project/commit/180548c5c7848f82ceac5d6a3528a8cb14c20fed
https://github.com/llvm/llvm-project/commit/903f8fd2635f2b136b643e619f7bd96242e577ae
https://github.com/llvm/llvm-project/commit/bc7f6c6dd8252370e6b485b8193093004644a16d
https://github.com/llvm/llvm-project/commit/3ad09fd03c51823aeb0bcbd7898aada33e9228d6
https://github.com/llvm/llvm-project/commit/a7e45ea30d4c9c3f66f44f0e69e31eac3a22db42
https://github.com/llvm/llvm-project/commit/adccc0bfa301005367d6b89a3aacc07ef0166e64
https://github.com/llvm/llvm-project/commit/a8a91533dd65041ced68ed5b9348b5d023837488
Build commands -
bazel build -c opt mediapipe/tasks/java/com/google/mediapipe/tasks/core:tasks_core.aar --linkopt="-s" --cpu=x86_64 --config=android_x86_64 --fat_apk_cpu=x86_64
bazel build -c opt mediapipe/tasks/java/com/google/mediapipe/tasks/genai:tasks_genai.aar --linkopt="-s" --cpu=x86_64 --config=android_x86_64 --fat_apk_cpu=x86_64
### Other info / Complete Logs
```shell
04-06 15:07:49.511 4754 4806 I native : I0000 00:00:1712416069.511620 4806 jni_util.cc:41] GetEnv: not attached
04-06 15:07:49.912 4754 4806 E libprotobuf-native: [libprotobuf ERROR external/com_google_protobuf/src/google/protobuf/wire_format_lite.cc:581] String field 'mediapipe.tasks.core.jni.LlmResponseContext.responses' contains invalid UTF-8 data when serializing a protocol buffer. Use the 'bytes' type if you intend to send raw bytes.
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] JNI DETECTED ERROR IN APPLICATION: JNI GetObjectClass called with pending exception java.lang.IllegalStateException: Failed to parse response
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at com.google.mediapipe.tasks.core.jni.proto.LlmResponseContextProto$LlmResponseContext com.google.mediapipe.tasks.core.LlmTaskRunner.parseResponse(byte[]) (LlmTaskRunner.java:78)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at void com.google.mediapipe.tasks.core.LlmTaskRunner.onAsyncResponse(byte[]) (LlmTaskRunner.java:83)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol message had invalid UTF-8.
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at int com.google.protobuf.ArrayDecoders.decodeStringListRequireUtf8(int, byte[], int, int, com.google.protobuf.Internal$ProtobufList, com.google.protobuf.ArrayDecoders$Registers) (ArrayDecoders.java:622)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at int com.google.protobuf.MessageSchema.parseRepeatedField(java.lang.Object, byte[], int, int, int, int, int, int, long, int, long, com.google.protobuf.ArrayDecoders$Registers) (MessageSchema.java:3651)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at int com.google.protobuf.MessageSchema.parseMessage(java.lang.Object, byte[], int, int, int, com.google.protobuf.ArrayDecoders$Registers) (MessageSchema.java:4153)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at void com.google.protobuf.MessageSchema.mergeFrom(java.lang.Object, byte[], int, int, com.google.protobuf.ArrayDecoders$Registers) (MessageSchema.java:4300)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at com.google.protobuf.GeneratedMessageLite com.google.protobuf.GeneratedMessageLite.parsePartialFrom(com.google.protobuf.GeneratedMessageLite, byte[], int, int, com.google.protobuf.ExtensionRegistryLite) (GeneratedMessageLite.java:1630)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at com.google.protobuf.GeneratedMessageLite com.google.protobuf.GeneratedMessageLite.parseFrom(com.google.protobuf.GeneratedMessageLite, byte[]) (GeneratedMessageLite.java:1719)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at com.google.mediapipe.tasks.core.jni.proto.LlmResponseContextProto$LlmResponseContext com.google.mediapipe.tasks.core.jni.proto.LlmResponseContextProto$LlmResponseContext.parseFrom(byte[]) (LlmResponseContextProto.java:276)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at com.google.mediapipe.tasks.core.jni.proto.LlmResponseContextProto$LlmResponseContext com.google.mediapipe.tasks.core.LlmTaskRunner.parseResponse(byte[]) (LlmTaskRunner.java:76)
04-06 15:07:49.975 4754 4806 F es.llminferenc: java_vm_ext.cc:577] at void com.google.mediapipe.tasks.core.LlmTaskRunner.onAsyncResponse(byte[]) (LlmTaskRunner.java:83)
```
Full logs are available here: https://gist.github.com/keyur-esper/18ff4fab0ba937ca5de48c8c5bf056d4
I was able to get past this error by switching the field type from `string` to `bytes` (`repeated bytes responses = 1;`) in the protobuf definition at mediapipe/mediapipe/tasks/java/com/google/mediapipe/tasks/core/jni/proto/llm_response_context.proto
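For context, the error quoted above says protobuf rejects a `string` field whose bytes are not valid UTF-8, while a `bytes` field carries them unchecked. A minimal sketch of that kind of check, using only the JDK's strict `CharsetDecoder`, is below. `Utf8Check` and `isWellFormedUtf8` are hypothetical names for illustration and are not part of MediaPipe or protobuf. A chunk that ends partway through a multi-byte character such as `E2 96 81` (U+2581, `▁`) fails the check even though the full stream is valid.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not a MediaPipe or protobuf API. */
final class Utf8Check {
    /** Returns true only if the whole array is well-formed UTF-8. */
    static boolean isWellFormedUtf8(byte[] bytes) {
        try {
            // REPORT makes the decoder throw instead of substituting U+FFFD.
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // Thrown for malformed bytes, including a sequence truncated at the end.
            return false;
        }
    }
}
```

A check like this only reports whether one chunk is valid on its own; it does not recover text split across chunks.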
After that change, async chat delivered unreadable, invalid UTF-8 bytes, which I realized were incomplete sequences: once concatenated, they parsed correctly as valid UTF-8 strings. I implemented a buffer, which resolved most of the issue, but the final word or last few characters are always missing from the output. Synchronous chat works without any problems. The final sequence of bytes is already missing at the JNI layer. I am attaching debug logs to visualize the flow of the data.
02:45:31.161 D Final sanitized responses: [ks.▁Wh]
02:45:31.245 I I0000 00:00:1712569531.245441 32190 llm.cc:66] Response: at▁w
02:45:31.245 I I0000 00:00:1712569531.245486 32190 llm.cc:69] Output: responses: "at\342\226\201w"
02:45:31.245 I I0000 00:00:1712569531.245508 32190 llm.cc:73] Serialized response context size: 8
02:45:31.245 I I0000 00:00:1712569531.245520 32190 llm.cc:99] response_context_bytes: 0x4d5
02:45:31.245 D Response bytes size: 8
02:45:31.245 D Processing ByteString: 61 74 e2 96 81 77
02:45:31.245 D Combined ByteString for processing: 61 74 e2 96 81 77
02:45:31.245 D ByteString is well-formed
02:45:31.245 D Final sanitized responses: [at▁w]
02:45:31.328 I I0000 00:00:1712569531.328953 32190 llm.cc:66] Response: ould▁
02:45:31.329 I I0000 00:00:1712569531.328998 32190 llm.cc:69] Output: responses: "ould\342\226\201"
02:45:31.329 I I0000 00:00:1712569531.329024 32190 llm.cc:73] Serialized response context size: 9
02:45:31.329 I I0000 00:00:1712569531.329034 32190 llm.cc:99] response_context_bytes: 0x4f5
02:45:31.329 D Response bytes size: 9
02:45:31.329 D Processing ByteString: 6f 75 6c 64 e2 96 81
02:45:31.329 D Combined ByteString for processing: 6f 75 6c 64 e2 96 81
02:45:31.329 D ByteString is well-formed
02:45:31.329 D Final sanitized responses: [ould▁]
02:45:31.413 I I0000 00:00:1712569531.413001 32190 llm.cc:66] Response: you?
02:45:31.413 I I0000 00:00:1712569531.413049 32190 llm.cc:69] Output: responses: "you\342\226"
02:45:31.413 I I0000 00:00:1712569531.413069 32190 llm.cc:73] Serialized response context size: 7
02:45:31.413 I I0000 00:00:1712569531.413080 32190 llm.cc:99] response_context_bytes: 0x515
02:45:31.413 D Response bytes size: 7
02:45:31.413 D Processing ByteString: 79 6f 75 e2 96
02:45:31.413 D Combined ByteString for processing: 79 6f 75 e2 96
02:45:31.413 D ByteString is incomplete, storing for next processing
02:45:31.476 I I0000 00:00:1712569531.476527 32190 llm.cc:66] Response: ?like?
02:45:31.476 I I0000 00:00:1712569531.476568 32190 llm.cc:69] Output: responses: "\201like\342\226"
02:45:31.476 I I0000 00:00:1712569531.476590 32190 llm.cc:73] Serialized response context size: 9
02:45:31.476 I I0000 00:00:1712569531.476600 32190 llm.cc:99] response_context_bytes: 0x535
02:45:31.476 D Response bytes size: 9
02:45:31.476 D Processing ByteString: 81 6c 69 6b 65 e2 96
02:45:31.476 D Combined ByteString for processing: 79 6f 75 e2 96 81 6c 69 6b 65 e2 96
02:45:31.476 D ByteString is incomplete, storing for next processing
02:45:31.545 I I0000 00:00:1712569531.545858 32190 llm.cc:66] Response: ?to?
02:45:31.545 I I0000 00:00:1712569531.545899 32190 llm.cc:69] Output: responses: "\201to\342\226"
02:45:31.545 I I0000 00:00:1712569531.545920 32190 llm.cc:73] Serialized response context size: 7
02:45:31.545 I I0000 00:00:1712569531.545932 32190 llm.cc:99] response_context_bytes: 0x555
02:45:31.546 D Response bytes size: 7
02:45:31.546 D Processing ByteString: 81 74 6f e2 96
02:45:31.546 D Combined ByteString for processing: 79 6f 75 e2 96 81 6c 69 6b 65 e2 96 81 74 6f e2 96
02:45:31.546 D ByteString is incomplete, storing for next processing
02:45:31.610 I I0000 00:00:1712569531.610408 32190 llm.cc:66] Response: ?tell▁
02:45:31.610 I I0000 00:00:1712569531.610449 32190 llm.cc:69] Output: responses: "\201tell\342\226\201"
02:45:31.610 I I0000 00:00:1712569531.610473 32190 llm.cc:73] Serialized response context size: 10
02:45:31.610 I I0000 00:00:1712569531.610483 32190 llm.cc:99] response_context_bytes: 0x575
02:45:31.610 D Response bytes size: 10
02:45:31.610 D Processing ByteString: 81 74 65 6c 6c e2 96 81
02:45:31.610 D Combined ByteString for processing: 79 6f 75 e2 96 81 6c 69 6b 65 e2 96 81 74 6f e2 96 81 74 65 6c 6c e2 96 81
02:45:31.610 D ByteString is well-formed
02:45:31.610 D Final sanitized responses: [you▁like▁to▁tell▁]
02:45:31.693 I I0000 00:00:1712569531.693338 32190 llm.cc:66] Response: m
02:45:31.693 I I0000 00:00:1712569531.693385 32190 llm.cc:69] Output: responses: "m"
02:45:31.693 I I0000 00:00:1712569531.693407 32190 llm.cc:73] Serialized response context size: 3
02:45:31.693 I I0000 00:00:1712569531.693419 32190 llm.cc:99] response_context_bytes: 0x595
02:45:31.693 D Response bytes size: 3
02:45:31.693 D Processing ByteString: 6d
02:45:31.693 D Combined ByteString for processing: 6d
02:45:31.693 D ByteString is well-formed
02:45:31.693 D Final sanitized responses: [m]
02:45:31.776 I I0000 00:00:1712569531.776074 32190 llm.cc:66] Response: e▁tod
02:45:31.776 I I0000 00:00:1712569531.776125 32190 llm.cc:69] Output: responses: "e\342\226\201tod"
02:45:31.776 I I0000 00:00:1712569531.776144 32190 llm.cc:73] Serialized response context size: 9
02:45:31.776 I I0000 00:00:1712569531.776154 32190 llm.cc:99] response_context_bytes: 0x5b5
02:45:31.776 D Response bytes size: 9
02:45:31.776 D Processing ByteString: 65 e2 96 81 74 6f 64
02:45:31.776 D Combined ByteString for processing: 65 e2 96 81 74 6f 64
02:45:31.776 D ByteString is well-formed
02:45:31.776 D Final sanitized responses: [e▁tod]
02:45:31.853 I I0000 00:00:1712569531.853847 32190 llm.cc:66] Response:
02:45:31.853 I I0000 00:00:1712569531.853892 32190 llm.cc:69] Output: responses: ""
done: true
02:45:31.853 I I0000 00:00:1712569531.853916 32190 llm.cc:73] Serialized response context size: 4
02:45:31.853 I I0000 00:00:1712569531.853927 32190 llm.cc:99] response_context_bytes: 0x5d5
02:45:31.854 D Response bytes size: 4
02:45:31.854 D Processing ByteString:
02:45:31.854 D Combined ByteString for processing:
02:45:31.854 D ByteString is well-formed
02:45:31.854 D Message marked as done, processing any remaining bytes
02:45:31.854 D No remaining bytes to force process
02:45:31.854 D Final sanitized responses: []
02:45:31.854 I I0000 00:00:1712569531.854326 32190 jni_util.cc:86] Exiting thread. Detach thread.
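The buffering shown in the log holds back an entire chunk (for example `you`) until a later chunk completes the trailing `e2 96` sequence. One possible alternative, sketched below under assumptions, is to let the JDK's `CharsetDecoder` run in streaming mode so that only the incomplete trailing bytes are carried over and everything decodable is emitted immediately. `Utf8ChunkDecoder` and `feed` are hypothetical names, not the code used in this issue or any MediaPipe API. The sketch does not address the final bytes that go missing at the JNI layer.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration: decodes UTF-8 that arrives in arbitrary byte chunks. */
final class Utf8ChunkDecoder {
    private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPLACE)
            .onUnmappableCharacter(CodingErrorAction.REPLACE);

    // Bytes of a multi-byte sequence that was cut off at the end of the previous chunk.
    private byte[] pending = new byte[0];

    /**
     * Returns the text that can be decoded so far. If done is false, an incomplete
     * trailing sequence is kept for the next call; if done is true, any leftover
     * bytes are replaced with U+FFFD and the decoder is reset.
     */
    String feed(byte[] chunk, boolean done) {
        ByteBuffer in = ByteBuffer.allocate(pending.length + chunk.length);
        in.put(pending).put(chunk).flip();

        // UTF-8 never yields more than one char per input byte, so this is large enough.
        CharBuffer out = CharBuffer.allocate(in.remaining() + 1);
        decoder.decode(in, out, done);
        if (done) {
            decoder.flush(out);
            decoder.reset();
        }

        // Whatever the decoder left unread is the start of an unfinished character.
        pending = new byte[in.remaining()];
        in.get(pending);

        out.flip();
        return out.toString();
    }
}
```

Fed the chunks `79 6f 75 e2 96` and then `81 6c 69 6b 65 e2 96` from the log, this sketch would return `you` for the first call and `▁like` for the second, holding only `e2 96` between calls.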
Let us know if you are still having trouble with this. I don't see an obvious culprit, but we haven't tested our CPU codepath on Android x86.
Hi @keyur2maru,
Could you kindly review the previous comment and provide us with the current status?
Thank you!!
This issue has been marked stale because it has no recent activity since 7 days. It will be closed if no further activity occurs. Thank you.
This issue was closed due to lack of activity after being marked stale for past 7 days.