lietorch
Not available on win10
Hello, I can successfully compile and install lietorch on Windows, but I cannot pass the GPU-related tests. Even when running the simplest multiplication, such as X1 = Ts * X0,
it gets stuck and exits.
I'm having the same issue. Did you find the solution?
I'm still not sure why it should be, but I managed to fix this on Windows by making all the CUDA kernels in lietorch_gpu.cu accept regular pointers where currently they accept const pointers. Hopefully someone who knows more about CUDA and Windows can make a proper pull request.
Hello @yclicc, I have almost no knowledge of this type of programming. Can you tell me how to change the const pointers to regular ones, as you described?
Or could you share your lietorch_gpu.cu?
thanks
@carlosedubarreto well you just get rid of the word "const" in the function arguments. So for example https://github.com/princeton-vl/lietorch/blob/0fa9ce8ffca86d985eca9e189a99690d6f3d4df6/lietorch/src/lietorch_gpu.cu#L21
becomes __global__ void exp_forward_kernel(scalar_t* a_ptr, scalar_t* X_ptr, int num_threads) {
In practice, if you want to maintain the old behaviour (which presumably has some benefit) on non-Windows platforms, you can add the following before the first template:
#ifdef _WIN32
#define NON_WINDOWS_CONST
#else
#define NON_WINDOWS_CONST const
#endif
and then replace all the consts in function parameter lists with NON_WINDOWS_CONST. Then on Windows it will remove the const at compile time but keep it in on other platforms. So for our example above you would end up with __global__ void exp_forward_kernel(NON_WINDOWS_CONST scalar_t* a_ptr, scalar_t* X_ptr, int num_threads) {
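To make the pattern concrete, here is a minimal sketch of how the macro could sit at the top of a .cu file. Everything except NON_WINDOWS_CONST and the parameter list is an assumption for illustration: copy_kernel_sketch and its copy-only body are hypothetical stand-ins, not lietorch's real exp_forward_kernel, and the `__global__` shim is there only so the snippet also builds as plain C++ for a quick host-side check.

```cpp
// Sketch only: shows where NON_WINDOWS_CONST goes, not lietorch's actual code.
#ifdef _WIN32
#define NON_WINDOWS_CONST          // Windows: the const qualifier is dropped
#else
#define NON_WINDOWS_CONST const    // other platforms: const is kept as before
#endif

// Assumption: when this is not compiled by nvcc, make __global__ a no-op
// so the same function can be called directly on the host.
#ifndef __CUDACC__
#define __global__
#endif

// Hypothetical kernel with the same parameter list as the example above.
// It just copies a_ptr into X_ptr.
template <typename scalar_t>
__global__ void copy_kernel_sketch(NON_WINDOWS_CONST scalar_t* a_ptr,
                                   scalar_t* X_ptr,
                                   int num_threads) {
#ifdef __CUDACC__
    // Grid-stride loop when built as real CUDA device code.
    int start = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;
#else
    // Plain C++ build: a single "thread" walks every element.
    int start = 0;
    int stride = 1;
#endif
    for (int i = start; i < num_threads; i += stride) {
        X_ptr[i] = a_ptr[i];  // read-only input, writable output
    }
}
```

Only the input pointer gets the macro. The output pointer was never const, so it stays as it is.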
I hope that helps!
@yclicc thanks a lot for the detailed info.
I gave it a shot by just replacing "const" with nothing and it didn't work. Maybe I did something wrong, so I'll try again.
thanks a lot!!!!
Did you replace all the consts in every function in lietorch_gpu.cu?
@yclicc after rereading your earlier comment, I replaced the text "const " with "" and compiled again. I had no success, but it was a quick test.
Now I'm going to redo it all from the start, more carefully.
I tested again doing a clean install, and it seems to be working, thanks A LOT @yclicc !!!!! I'm sharing here the final file I changed. lietorch_gpu.zip
Awesome, glad it worked for you as well as for me! Now we just need to ask someone with more knowledge of windows and cuda why it works!
I'm actually more curious to know how you figured out that removing the "const" would make it work 😊
Lots of trial and error, and the Nsight debugger
Oh, the Nsight debugger. Now I know what it's good for 😀, thanks a lot for the info 🦾
Thanks for sharing! I solved the same problem with this solution.