GPULlama3.java

TornadoTaskRuntimeException when using Phi-3-mini-4k-instruct-fp16.gguf during TornadoVM initialization

Open yrq0208 opened this issue 1 month ago • 5 comments

Describe the bug

Tokenizer: Phi3Tokenizer
Loading model weights in TornadoVM format (loading F16)

Starting TornadoVM initialization...
TornadoVM GPU execution plan creation: 619.22 ms
Java to GPU JIT compiler warmup: 6147.02 ms
Exception in thread "main" uk.ac.manchester.tornado.api.exceptions.TornadoTaskRuntimeException: Parameter #4 uk.ac.manchester.tornado.api.types.arrays.HalfFloatArray@ebaa6cb from task not specified either in transferToDevice or transferToHost functions
    at [email protected]/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.checkAllArgumentsPerTask(TornadoTaskGraph.java:1516)
    at [email protected]/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.execute(TornadoTaskGraph.java:1599)
    at [email protected]/uk.ac.manchester.tornado.runtime.tasks.TornadoTaskGraph.execute(TornadoTaskGraph.java:1626)
    at [email protected]/uk.ac.manchester.tornado.api.TaskGraph.execute(TaskGraph.java:804)
    at [email protected]/uk.ac.manchester.tornado.api.ImmutableTaskGraph.execute(ImmutableTaskGraph.java:50)
    at [email protected]/uk.ac.manchester.tornado.api.TornadoExecutor.lambda$execute$0(TornadoExecutor.java:49)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at [email protected]/uk.ac.manchester.tornado.api.TornadoExecutor.execute(TornadoExecutor.java:49)
    at [email protected]/uk.ac.manchester.tornado.api.TornadoExecutionPlan.execute(TornadoExecutionPlan.java:181)
    at org.beehive.gpullama3.tornadovm.TornadoVMMasterPlan.forceCopyInReadOnlyDataLayered(TornadoVMMasterPlan.java:190)
    at org.beehive.gpullama3.tornadovm.TornadoVMMasterPlan.initializeTornadoVMPlan(TornadoVMMasterPlan.java:67)
    at org.beehive.gpullama3.model.Model.runInstructOnce(Model.java:205)
    at org.beehive.gpullama3.LlamaApp.runSingleInstruction(LlamaApp.java:18)
    at org.beehive.gpullama3.LlamaApp.main(LlamaApp.java:44)
Error: Command failed with return code 1
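Editor's note: the exception message says that a task received a parameter (here a HalfFloatArray at position #4) that was never listed in a transferToDevice or transferToHost call on the task graph. The sketch below is a toy model of that kind of check in plain Java. Every type and method in it (ToyTask, ToyGraph, findUndeclared) is hypothetical and written only for illustration; it is not TornadoVM code, the 0-based parameter index is an assumption, and the real checkAllArgumentsPerTask may work differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types are TornadoVM classes.
final class TransferCheckSketch {

    /** Hypothetical stand-in for a task: a name plus the objects passed to it. */
    record ToyTask(String name, List<Object> params) {}

    /** Hypothetical stand-in for a task graph that records declared transfers. */
    static final class ToyGraph {
        // Identity-based set: two arrays with equal contents are still different buffers.
        private final Set<Object> declared = Collections.newSetFromMap(new IdentityHashMap<>());
        private final List<ToyTask> tasks = new ArrayList<>();

        ToyGraph transferToDevice(Object... buffers) {
            declared.addAll(Arrays.asList(buffers));
            return this;
        }

        ToyGraph transferToHost(Object... buffers) {
            declared.addAll(Arrays.asList(buffers));
            return this;
        }

        ToyGraph task(String name, Object... params) {
            tasks.add(new ToyTask(name, Arrays.asList(params)));
            return this;
        }

        /** Returns one message per task parameter that no transfer call declared. */
        List<String> findUndeclared() {
            List<String> problems = new ArrayList<>();
            for (ToyTask t : tasks) {
                for (int i = 0; i < t.params().size(); i++) {
                    if (!declared.contains(t.params().get(i))) {
                        // Index is 0-based here; that numbering is an assumption of this sketch.
                        problems.add("Parameter #" + i + " of task '" + t.name()
                                + "' not declared in transferToDevice or transferToHost");
                    }
                }
            }
            return problems;
        }
    }

    public static void main(String[] args) {
        float[] x = new float[8];
        float[] wq = new float[8];
        float[] wk = new float[8];
        float[] wv = new float[8];
        float[] wo = new float[8]; // deliberately left out of the transfers below

        ToyGraph graph = new ToyGraph()
                .transferToDevice(x, wq, wk, wv)
                .task("attention", x, wq, wk, wv, wo);

        graph.findUndeclared().forEach(System.out::println);
    }
}
```

In this toy model the fix is to add the missing buffer to one of the transfer calls. The maintainer's reply further down indicates that the real issue was patched in the project itself.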

To Reproduce

./llama-tornado --gpu --verbose-init --opencl --model Phi-3-mini-4k-instruct-fp16.gguf --prompt "tell me a joke"

Expected behavior

Initialize successfully and tell me a joke.

Desktop (please complete the following information):

  • OS: Ubuntu
  • Version: 24.04.3 LTS

Additional context

Using the latest build of TornadoVM and GPULlama3.java. Other models such as beehive-llama-3.2-1b-instruct-fp16.gguf, beehive-llama-3.2-3b-instruct-fp16.gguf, DeepSeek-R1-Distill-Qwen-1.5B-F16.gguf, Qwen2.5-0.5B-Instruct-f16.gguf, qwen2.5-1.5b-instruct-fp16.gguf, and Qwen3-0.6B-f16.gguf all work for me.

yrq0208, Nov 25 '25 15:11

@yrq0208 thank you for the issue! It was indeed a bug, and I just pushed a patch for it. Can you test the latest main? Thanks.

mikepapadim, Nov 27 '25 09:11

Thanks for the patch. I pulled the latest changes from main and ran into this build error (I also tried a clean make):

[ERROR] COMPILATION ERROR :
[INFO] -------------------------------------------------------------
[ERROR] /home/ruiqi/GPULlama3.java/src/main/java/org/beehive/gpullama3/tensor/tornado/FP32TornadoTensor.java:[16,48] cannot find symbol
  symbol:   method fromSegmentShallow(java.lang.foreign.MemorySegment)
  location: class uk.ac.manchester.tornado.api.types.arrays.FloatArray
[ERROR] /home/ruiqi/GPULlama3.java/src/main/java/org/beehive/gpullama3/tensor/tornado/FP16TornadoTensor.java:[17,52] cannot find symbol
  symbol:   method fromSegmentShallow(java.lang.foreign.MemorySegment)
  location: class uk.ac.manchester.tornado.api.types.arrays.HalfFloatArray

yrq0208, Nov 27 '25 11:11

You are not using the latest TornadoVM: fromSegmentShallow is in the latest TornadoVM master, so you need to update and rebuild TornadoVM as well.
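Editor's note: a "cannot find symbol" error like the one above usually means the library on the build path predates the method being called. As a rough sketch, a reflective probe can tell you whether the classes actually on your classpath expose a given public method. The helper name hasPublicMethod is hypothetical. The TornadoVM class and method names in main are copied from the compiler error above and only resolve if a TornadoVM build is on the classpath; otherwise the probe simply reports false.

```java
// Sketch: probe the runtime classpath for a public method by name, using only reflection.
final class ApiProbeSketch {

    /**
     * Returns true if className can be loaded and declares or inherits a public
     * method methodName taking exactly the given parameter types (by class name).
     */
    static boolean hasPublicMethod(String className, String methodName, String... paramTypeNames) {
        try {
            Class<?> owner = Class.forName(className);
            Class<?>[] paramTypes = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < paramTypeNames.length; i++) {
                paramTypes[i] = Class.forName(paramTypeNames[i]);
            }
            owner.getMethod(methodName, paramTypes); // throws NoSuchMethodException if absent
            return true;
        } catch (ReflectiveOperationException | LinkageError e) {
            // Class missing, method missing, or class failed to link: treat all as "not available".
            return false;
        }
    }

    public static void main(String[] args) {
        // Names taken from the compiler error in this thread.
        boolean present = hasPublicMethod(
                "uk.ac.manchester.tornado.api.types.arrays.HalfFloatArray",
                "fromSegmentShallow",
                "java.lang.foreign.MemorySegment");
        System.out.println("HalfFloatArray.fromSegmentShallow available: " + present);
    }
}
```

Run it with the same classpath or module path the GPULlama3.java build uses; a false result is consistent with an outdated TornadoVM installation.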

mikepapadim, Nov 27 '25 11:11

OK, I can now compile and run Phi-3-mini-4k-instruct-fp16.gguf, but it currently produces gibberish output such as:

WARNING: Using incubator modules: jdk.incubator.vector
Tokenizer: Phi3Tokenizer
Loading model weights in TornadoVM format (loading F16)

Starting TornadoVM initialization...
TornadoVM GPU execution plan creation: 550.98 ms
Java to GPU JIT compiler warmup: 1812.78 ms
Transfer read-only weights to GPU: 928.03 ms
Finished TornadoVM initialization...

To --> "*:",

",

",",

",",", ", ",

", "",", ",",", ",", ",

",",",",",", ",",",",",",",",",",",",",",",",",", ",",",",",",",",",",",""",""",", ",",",",",",",","",

"",",", "",",",", ", ",",",",",",

",", " ""," ",",",",", ",",","", "",",",

""",

",",",

"", ", ",

"",

"

"

",

"

" "

"
""""

" "

" " "" "" " " " """ "

"

""",""" """ " """ "" " "

" "

" "

"

"

" " " " "

""

"" "

" "", " "" """"" "

" "" " "

"" "

", ","

","",""",""," ",""

","

"

"

" ""

"","

achieved tok/s: 21.76. Tokens: 512, seconds: 23.53

yrq0208, Nov 27 '25 13:11

@orionpapadakis

mikepapadim, Nov 27 '25 14:11