Ruiqi Ye issues

Results: 12 issues of Ruiqi Ye

Unable to convert custom tensorflow model using hls4ml

I am trying to convert a custom TensorFlow model using hls4ml, but I have encountered this error:
Traceback (most recent call last):
  File "/home/ruiqi/.local/lib/python3.7/site-packages/hls4ml/converters/tf_to_hls.py", line 136, in tf_to_hls
    graph_def.ParseFromString(f.read())
google.protobuf.message.DecodeError:...

question

Unable to make FLaME, errors occurred during linking

Hi all, I have encountered the following errors when trying to build FLaME with make:
[100%] Linking CXX executable ../bin/flame_test
CMakeFiles/flame_test.dir/stereo/epipolar_geometry_test.cc.o: In function `__static_initialization_and_destruction_0(int, int) [clone .constprop.643]':
epipolar_geometry_test.cc:(.text.startup+0x17d): undefined reference to `testing::internal::MakeAndRegisterTestInfo(char...

Unable to load the pre-trained model using tensorflow

Hi, I have been trying to load the pre-trained model provided in the pretrained_models folder, but each time I try to load the saved_model.pb file, I encounter the "Check...

How to allow SVO pipeline to spend more time on processing each frame?

Hi all, I was wondering what I can do to allow SVO to spend more time on processing one frame. I was told to change the value of timestamp, which...

JVM optimization flag -Dtornado.enable.fma=true doesn't seem to work with PTX backend

### Describe the bug
-Dtornado.enable.fma=true does not force the use of the fma instruction in the generated PTX code.
### How To Reproduce
tornado-test -V --jvm="-Ds0.t0.device=1:0 -Dtornado.enable.fma=true" --enableProfiler console -pk...

PTX

JVM optimization flag -Dtornado.experimental.partial.unroll=true provides minimum performance improvement with PTX backend

### Describe the bug
When using the unit test TestMatrixMultiplicationKernelContext#mxm1DKernelContext, for a 512x512 matrix, grid size 1x1, block size 512x1, RTX Titan GPU, the performance improvement in terms of...

PTX

Unable to control loop unrolling factor with JVM optimization flag -Dtornado.partial.unroll.factor=FACTOR with PTX backend

### Describe the bug
The JVM optimization flag -Dtornado.partial.unroll.factor=FACTOR does not seem to be able to control the unrolling factor and generate different PTX assembly.
### How To Reproduce...

bug

PTX

RMS normalization kernel optimization by fusing the reduction kernel and context mapping kernel

The main optimization is done by using all the threads, instead of only the first global thread, to calculate the scaling factor. This avoids thread divergence and the need to...
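The idea in the description above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the code from the issue: the snippet is truncated, so the function names (`rmsnorm_two_pass`, `rmsnorm_fused`), the `eps` value, the `group_size` parameter, and the use of per-work-group partial sums are all hypothetical. GPU threads are simulated with a plain Python loop over thread IDs.

```python
import math

def rmsnorm_two_pass(x, weight, eps=1e-5):
    """Baseline sketch: only 'thread 0' computes the scaling factor."""
    n = len(x)
    scale = 0.0
    for tid in range(n):
        if tid == 0:  # divergent branch: every other thread idles here
            scale = 1.0 / math.sqrt(sum(v * v for v in x) / n + eps)
    # A second, separate mapping step applies the scale to every element.
    return [weight[i] * x[i] * scale for i in range(n)]

def rmsnorm_fused(x, weight, group_size=4, eps=1e-5):
    """Fused sketch: every thread derives the scale from shared partial sums."""
    n = len(x)
    # Assumed layout: one partial sum of squares per work group.
    partials = [sum(v * v for v in x[s:s + group_size])
                for s in range(0, n, group_size)]
    out = [0.0] * n
    for tid in range(n):
        # No 'tid == 0' branch: each thread combines the partials itself
        # and immediately normalizes its own element.
        scale = 1.0 / math.sqrt(sum(partials) / n + eps)
        out[tid] = weight[tid] * x[tid] * scale
    return out
```

Both functions compute the same normalization; the fused variant trades a small amount of redundant arithmetic per thread for the removal of the single-thread branch and the separate mapping step.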

Wrong download link for Qwen3 (1.7B) - FP16, Qwen3 (4B) - FP16 and Qwen3 (8B) - FP16

Correct links should be:
Qwen3 (1.7B) - FP16: https://huggingface.co/ggml-org/Qwen3-1.7B-GGUF/resolve/main/Qwen3-1.7B-f16.gguf
Qwen3 (4B) - FP16: https://huggingface.co/ggml-org/Qwen3-4B-GGUF/resolve/main/Qwen3-4B-f16.gguf
Qwen3 (8B) - FP16: https://huggingface.co/ggml-org/Qwen3-8B-GGUF/resolve/main/Qwen3-8B-f16.gguf
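The three corrected links follow a single URL pattern, which can be captured in a short helper. This is a minimal sketch: the template is inferred only from the three links listed in this issue, and the names `URL_TEMPLATE`, `qwen3_fp16_url`, and `QWEN3_FP16_LINKS` are hypothetical. It should not be assumed to hold for other model sizes or quantizations.

```python
# URL pattern inferred from the three links in the issue (an assumption,
# not a documented Hugging Face naming rule).
URL_TEMPLATE = (
    "https://huggingface.co/ggml-org/Qwen3-{size}-GGUF"
    "/resolve/main/Qwen3-{size}-f16.gguf"
)

def qwen3_fp16_url(size: str) -> str:
    """Return the FP16 GGUF download URL for a size tag such as '1.7B'."""
    return URL_TEMPLATE.format(size=size)

# Only the three sizes named in the issue are covered.
QWEN3_FP16_LINKS = {size: qwen3_fp16_url(size) for size in ("1.7B", "4B", "8B")}
```

Generating the links from one template makes it harder for a single entry to drift out of sync with the others.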

TornadoTaskRuntimeException when using Phi-3-mini-4k-instruct-fp16.gguf during TornadoVM initialization

**Describe the bug**
Tokenizer: Phi3Tokenizer
Loading model weights in TornadoVM format (loading F16)
Starting TornadoVM initialization...
TornadoVM GPU execution plan creation: 619.22 ms
Java to GPU JIT compiler warmup: 6147.02...

bug