Haifeng Li
Thanks for reporting. The fix is in the master branch now.
Please let us know if it addresses your issues. If so, we will make a new release. Thanks!
v3.1.1 is released.
I cannot reproduce such a 10x latency jump. Here is one of my run outputs: ``` size = 90 time = 487590219 size = 91 time = 1182113 size =...
Setting `OMP_NUM_THREADS=12` on Linux helps a lot. The training speed is now on par with Mac (4 threads). Without it, `torch.get_num_threads()` returns 48. So the slowness may be caused...
The default is 48 with JavaCPP build, which is too high. It should be 24 for this case.
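For anyone who wants to cap the thread count from a launcher script, here is a minimal sketch. It assumes the 48 reported above are logical CPUs with two hardware threads per core, which is a guess about this machine, and both helper names are hypothetical. It only sets the environment variable, so it has to run before the native library is loaded.

```python
import os

def suggest_omp_threads(logical_cpus, threads_per_core=2):
    """Hypothetical helper: estimate physical cores from logical CPUs.

    threads_per_core=2 is an assumption (typical with hyper-threading).
    """
    return max(1, logical_cpus // threads_per_core)

def apply_omp_threads(n, env=os.environ):
    """Set OMP_NUM_THREADS unless the user already set it; return the value in effect."""
    env.setdefault("OMP_NUM_THREADS", str(n))
    return env["OMP_NUM_THREADS"]

if __name__ == "__main__":
    n = suggest_omp_threads(os.cpu_count() or 1)
    print("OMP_NUM_THREADS =", apply_omp_threads(n))
```

An explicit `OMP_NUM_THREADS` in the environment still wins, so a setting such as the `12` mentioned above is left untouched.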
libtorch sets it to 24 by default on my box. And it works well. Why does JavaCPP build libtorch from source? Why not package the precompiled libtorch library from pytorch.org?
2.3.0 doesn't work on Windows. It fails to load `jnitorch.dll`. I think it is the same issue as #1500.
See the screenshot. PyTorch 2.3.0 cannot find `libomp.dll`.
Thanks for your hard work!