Simon
Hey, thank you for the info. I'll try that out.
Hey, I'm running into the same issue: ``` 23.34 Collecting DracoPy==1.4.0 (from -r /app/requirements.txt (line 34)) 23.34 Downloading DracoPy-1.4.0.tar.gz (158 kB) 23.34 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 159.0/159.0 kB 14.1 MB/s eta 0:00:00 23.34...
That's weird, because I can't find any aarch64 binaries here: https://pypi.org/project/DracoPy/1.4.0/#files Also, it's not really an issue for me since I can also build my docker with the...
Update: `conv.cc` in `providers/cuda/nn` does have `cached_benchmark_results`, and it works. I've verified that after the first run, the execution plan is always found in `cached_benchmark_results`.
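To illustrate what I mean by "found in the cache after the first run", here is a minimal Python sketch of a benchmark cache. It is not the actual code in `conv.cc`: the class `BenchmarkCache`, the `benchmark_fn` callback, and keying by input and weight shape are all assumptions made for illustration.

```python
class BenchmarkCache:
    """Hypothetical sketch: remember the chosen plan per shape combination."""

    def __init__(self, benchmark_fn):
        # benchmark_fn is a stand-in for the expensive algorithm search.
        self._benchmark_fn = benchmark_fn
        self._results = {}
        self.misses = 0  # how often the expensive search actually ran

    def get_plan(self, input_shape, weight_shape):
        # Assumption: the plan only depends on the two shapes.
        key = (tuple(input_shape), tuple(weight_shape))
        if key not in self._results:
            self.misses += 1
            self._results[key] = self._benchmark_fn(*key)
        return self._results[key]


def fake_benchmark(input_shape, weight_shape):
    # Placeholder for a real search; returns a made-up plan label.
    return f"plan-for-{input_shape}-{weight_shape}"


cache = BenchmarkCache(fake_benchmark)
for _ in range(3):
    cache.get_plan([1, 128, 48, 48, 48], [128, 128, 3, 3, 3])
print(cache.misses)
```

With identical shapes on every run, only the first call should trigger the search, which is the behaviour I observed.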
I've verified that all models are running on CUDA. I've also profiled the model and the execution stack looks like this: I'm viewing it with chrome://tracing/. Does this look...
So I have to set `CUDA_LAUNCH_BLOCKING=1` since otherwise the profiler doesn't properly track CUDA execution times. With that I get this: The slow performance thus comes from the `Resize` op....
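In case it helps anyone reading along, this is roughly how the profile can be summed up per operator instead of eyeballing it in the trace viewer. It is a sketch under assumptions: I assume the profile JSON is a list of Chrome trace events where kernel events have `cat == "Node"`, a `name` ending in `_kernel_time`, a `dur` in microseconds, and the operator type under `args.op_name`. The helper names `total_time_per_op` and `load_events` are my own; check the field names against your own profile file.

```python
import json
from collections import defaultdict


def total_time_per_op(events):
    """Sum kernel durations per operator type, slowest first."""
    totals = defaultdict(int)
    for event in events:
        if not isinstance(event, dict):
            continue
        args = event.get("args") or {}
        op_name = args.get("op_name")
        name = event.get("name", "")
        # Assumed layout: only "*_kernel_time" Node events carry kernel time.
        if (
            event.get("cat") == "Node"
            and name.endswith("_kernel_time")
            and op_name
            and "dur" in event
        ):
            totals[op_name] += event["dur"]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def load_events(path):
    """Read the profile JSON written by the profiler (path is up to you)."""
    with open(path) as f:
        return json.load(f)
```

Calling `total_time_per_op(load_events(path))` on a profile recorded with `CUDA_LAUNCH_BLOCKING=1` should put the dominant op at the top of the list.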
Yes, the input is:

- data: Tensor[1, 128, 48, 48, 48]
- scale: [1, 1, 2, 2, 2]
- roi: empty
- mode: "nearest"
- nearest_mode: "floor"

Thank you for looking into this.
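For a sense of scale, here is a quick sketch of the output size this `Resize` produces. The output dimensions follow the `floor(input_dim * scale)` rule from the ONNX `Resize` spec; the 4 bytes per element is an assumption (float32), since I haven't stated the dtype above.

```python
import math


def resize_output_shape(input_shape, scales):
    """Output dims computed as floor(input_dim * scale) per axis."""
    return [math.floor(d * s) for d, s in zip(input_shape, scales)]


def num_elements(shape):
    return math.prod(shape)


input_shape = [1, 128, 48, 48, 48]
scales = [1, 1, 2, 2, 2]
output_shape = resize_output_shape(input_shape, scales)

BYTES_PER_ELEMENT = 4  # assumption: float32 tensors

print("output shape:", output_shape)
print("input elements:", num_elements(input_shape))
print("output elements:", num_elements(output_shape))
print("output MiB:", num_elements(output_shape) * BYTES_PER_ELEMENT / 2**20)
```

The output has 8 times as many elements as the input (about 113 million), so the op writes a lot of memory even though nearest-neighbour upsampling itself is cheap per element.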
Hey @tianleiwu, I was wondering if you managed to reproduce the issue and if there is anything you still need from me?
Hey, I understand that this might not be the highest priority but I was wondering when you'd be able to look into this. This issue is currently blocking us from...
That sounds good. Thank you for looking into this :)