How to run Qwen3-VL-4B on Jetson Orin Nano Super?
Hi, I’d like to deploy Qwen3-VL-4B on a Jetson Orin Nano Super.
Could you please advise:
- Which inference framework works best (e.g., TensorRT-LLM, ONNX, Transformers + accelerate)?
- Is jetson-inference compatible with Qwen-VL models?
- What environment setup (JetPack version, CUDA, quantization, etc.) is recommended to fit the model in memory?

Any example or guidance would be very helpful. Thanks!
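For context on the memory question, here is the back-of-envelope math I've been working from: a rough estimate of weight memory for a 4B-parameter model at common precisions, against the Orin Nano Super's 8 GB of unified memory. This is only a sketch and ignores the KV cache, vision-encoder activations, and runtime overhead, so the real footprint will be higher.

```python
# Back-of-envelope weight-memory estimate for a ~4B-parameter model.
# Ignores KV cache, activations, and framework overhead (assumption).

PARAMS = 4e9  # ~4 billion parameters (Qwen3-VL-4B)

bytes_per_param = {
    "fp16/bf16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}

for precision, nbytes in bytes_per_param.items():
    gib = PARAMS * nbytes / 2**30  # convert bytes to GiB
    print(f"{precision:>9}: ~{gib:.1f} GiB of weights")

# fp16/bf16: ~7.5 GiB  -> barely fits in 8 GB with nothing left over
#      int8: ~3.7 GiB
#      int4: ~1.9 GiB
```

Based on this, it looks like fp16 weights alone would nearly exhaust the 8 GB, which is why I'm asking about int8/int4 quantization in particular.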