Running on NVIDIA Jetson Nano 2GB

Open Dreamkeeper66666 opened this issue 3 years ago • 15 comments

Hi, is it able to run on NVIDIA Jetson Nano 2GB? Thanks!

Dreamkeeper66666 avatar Jun 18 '21 12:06 Dreamkeeper66666

I am optimistic about that even though I don't own one...

On Fri, Jun 18, 2021, 08:51 Jonathan_Go @.***> wrote:

> Is it able to run on NVIDIA Jetson Nano 2GB? Thanks!

View it on GitHub: https://github.com/Terkwood/BUGOUT/issues/511

Terkwood avatar Jun 18 '21 17:06 Terkwood

I would be more skeptical about BUGOUT's fragile build process (tinybrain/KataGo subsystem) than I would about NVIDIA's new Jetson product ☺

Terkwood avatar Jun 18 '21 17:06 Terkwood

> I am optimistic about that even though I don't own one...

Thanks! Btw, what size of model are you running KataGo with? Is it 20b or 40b?

Dreamkeeper66666 avatar Jun 19 '21 03:06 Dreamkeeper66666

The 20b! Model is a little bit outdated. I had tried an upgrade but encountered some instability for whatever reason

Terkwood avatar Jun 19 '21 11:06 Terkwood

https://github.com/Terkwood/BUGOUT/blob/527943911159f0ab32e7cd2cfb2db9da83192a89/tinybrain/src/env.rs#L8

The model file is specified here

Terkwood avatar Jun 19 '21 11:06 Terkwood

> The 20b! Model is a little bit outdated. I had tried an upgrade but encountered some instability for whatever reason

So what's the benchmark for 20b?

Dreamkeeper66666 avatar Jun 19 '21 12:06 Dreamkeeper66666

???

Terkwood avatar Jun 19 '21 12:06 Terkwood

Oh, hi, sorry: I haven't tried benchmarking KataGo on any of the NVIDIA Jetson products. Suffice it to say it's pretty slow overall, just from a human-interaction perspective. BUGOUT offers a "KataGo One-Star" mode, which gives you a weak play that considers the minimum possible number of positions but cuts down on the time taken to compute moves.

Terkwood avatar Jun 19 '21 17:06 Terkwood

> Oh, hi, sorry: I haven't tried benchmarking KataGo on any of the NVIDIA Jetson products. Suffice it to say it's pretty slow overall, just from a human-interaction perspective. BUGOUT offers a "KataGo One-Star" mode, which gives you a weak play that considers the minimum possible number of positions but cuts down on the time taken to compute moves.

Cool, thanks! I am having some problems compiling KataGo. Could you possibly share the executable binary and its dependencies (except the CUDA libraries)?

Dreamkeeper66666 avatar Jun 20 '21 17:06 Dreamkeeper66666

Here is the katago binary. I'll also post the specific neural net that I'm currently using (my uploads are slow, please bear with me).

https://github.com/Terkwood/BUGOUT/releases/download/v1.4.1/katago

Terkwood avatar Jun 21 '21 01:06 Terkwood

And here's the net: https://github.com/Terkwood/BUGOUT/releases/download/v1.4.1/g170e-b20c256x2-s2430231552-d525879064.bin.gz

Hopefully you can compile tinybrain without major difficulty. When you start that application, it'll look for the katago bin and the net in its current working directory.

You can experiment with downloading other, more recent nets from the KataGo GitHub, but I have no idea if they'll behave within BUGOUT. If you do that, you need to update a .env file in the same directory as the tinybrain binary to point to a different net:

MODEL_FILE="g170e-b20--snip--.bin.gz"

There are some notes in https://github.com/Terkwood/BUGOUT/blob/unstable/tinybrain/README.md which may be of use during this effort (or which may just confuse you)
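
The layout described above (katago binary and net in the working directory, optional .env override) can be checked before starting tinybrain. This is only a rough sketch: the function name check_tinybrain_dir and its fallback argument are hypothetical and not part of BUGOUT, and it assumes the .env file holds a single MODEL_FILE="..." line as shown above.

```shell
# Hypothetical helper (not part of BUGOUT): check that a directory holds the
# katago binary and the net named by MODEL_FILE in .env.
# $1 = directory to check (default: current directory)
# $2 = fallback net file name, used when .env does not set MODEL_FILE
check_tinybrain_dir() {
    dir="${1:-.}"
    model=""
    if [ -f "$dir/.env" ]; then
        # Take the first MODEL_FILE line and strip the surrounding quotes.
        model=$(sed -n 's/^MODEL_FILE=//p' "$dir/.env" | head -n 1 | tr -d '"')
    fi
    if [ -z "$model" ]; then
        model="$2"
    fi
    if [ -z "$model" ]; then
        echo "no MODEL_FILE in $dir/.env and no fallback given"
        return 1
    fi
    if [ ! -f "$dir/katago" ]; then
        echo "missing katago binary in $dir"
        return 1
    fi
    if [ ! -f "$dir/$model" ]; then
        echo "missing net file: $model"
        return 1
    fi
    echo "ok: katago and $model found"
}
```

Run it as check_tinybrain_dir followed by the directory that tinybrain will be started from.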

Terkwood avatar Jun 21 '21 01:06 Terkwood

> Here is the katago binary. I'll post the specific neural net that I'm currently using also (my uploads are slow please bear with me)
>
> https://github.com/Terkwood/BUGOUT/releases/download/v1.4.1/katago

Thanks! Actually, I am having problems installing the dependencies. Could you please also include the dependent libraries (e.g. libssl.so, libstdc++.so, libc.so, ...), which can be seen via ldd katago?

Dreamkeeper66666 avatar Jun 21 '21 03:06 Dreamkeeper66666

Some of those should be installable via, e.g.:

 sudo apt install zlib1g-dev libzip-dev libboost-filesystem-dev
 sudo apt install libgoogle-perftools-dev  # for TCMALLOC

See https://github.com/Terkwood/BUGOUT/blob/unstable/tinybrain/README.md#build-steps

But you're on your own for tracking down the rest of the deps that aren't covered in those notes. Good luck!
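
For tracking down the remaining deps, the ldd output mentioned earlier in the thread can be filtered for entries the dynamic linker could not resolve. A small sketch: the function name missing_libs is hypothetical, and it relies on ldd marking unresolved libraries with "not found", as the glibc ldd does.

```shell
# Hypothetical helper: read ldd output on stdin and print the names of the
# shared libraries that could not be resolved.
missing_libs() {
    # Unresolved entries look like: "libfoo.so.1 => not found"
    awk '/=> not found/ { print $1 }'
}

# Usage (assumes the katago binary is in the current directory):
#   ldd ./katago | missing_libs
```

Each name it prints still has to be matched to a package by hand; which apt package provides a given library depends on the distribution release.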

Terkwood avatar Jun 21 '21 09:06 Terkwood

@Terkwood Thanks for your instructions! I finally compiled KataGo successfully. I tried the latest 40b model, and it worked. But after I set cudaUseFP16 to true, it gives errors like this:

terminate called after throwing an instance of 'StringError'
  what():  CUDA Error, for getOutput file /home/Dreamkeeper/KataGo/cpp/neuralnet/cudabackend.cpp, func cudaMemcpy(inputBuffers->policyResults, buffers->policyBuf, inputBuffers->singlePolicyResultBytes*batchSize, cudaMemcpyDeviceToHost), line 2921, error the launch timed out and was terminated

Maybe it's just a memory issue or something when running the 40b model with FP16. Anyway, it works perfectly fine. Thanks again!
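
Since the crash only showed up after turning on cudaUseFP16, going back to the working setup means switching that key off again in the KataGo config. A rough sketch, assuming the config is a plain file of "key = value" lines; the helper name set_cfg_key and the gtp.cfg path in the usage line are hypothetical, and this does nothing about the cause of the timeout itself.

```shell
# Hypothetical helper: force a key in a "key = value" style config file by
# dropping any existing uncommented assignment and appending a new one.
# $1 = config file, $2 = key, $3 = value
set_cfg_key() {
    cfg="$1"
    key="$2"
    value="$3"
    tmp="$cfg.tmp"
    # Keep every line that is not an assignment of this key.
    # grep exits non-zero when nothing is left, which is fine here.
    grep -v "^[[:space:]]*$key[[:space:]]*=" "$cfg" > "$tmp" || true
    echo "$key = $value" >> "$tmp"
    mv "$tmp" "$cfg"
}

# Usage (the config path is an assumption):
#   set_cfg_key gtp.cfg cudaUseFP16 false
```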

Dreamkeeper66666 avatar Jun 26 '21 05:06 Dreamkeeper66666

Very nice! I'm glad you were able to succeed. At some point I might look into purchasing one of these and setting up a way to distribute move requests across multiple Nanos. Having the new model in the mix would be fun. 😁

Terkwood avatar Jun 26 '21 11:06 Terkwood