Bernardo Ramos
I suspect that currently only `clang` compiles to Apple Silicon, requiring Xcode 12.2 on Mac. `gcc` has a [suspended ticket](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96168) for its support, but there is a work-in-progress implementation [here](https://github.com/iains/gcc-darwin-arm64#readme).
Can someone review this? The current README is badly formatted and hard to understand.
@andpor Done. I think the README is still confusing because it mixes instructions for many versions of RN. In my opinion the README should contain instructions for only the...
Let's do it in steps. You can merge this one first. Maybe someone who understands React Native could do it better than me. My world is C :P
It is not possible to run OpenDevin as root. Check #969 and #887
Good job! I suspect this would be better as a separate repo, as it may have different instructions to run it, and other people may create different implementations. Suggestions: *...
@ankan-ban Would it not be better to do the computations in FP16 as well? Currently there are lots of conversions. BTW, I am learning a lot from your code. Thank you!
Same for me. It is failing somewhere:
```
[email protected]:~/llama2.cu$ ./llama2 stories110M.bin
Model params:-
dim: 768
hidden_dim: 2048
n_heads: 12
n_kv_heads: 12
n_layers: 12
seq_len: 1024
vocab_size: 32000
[email protected]:~/llama2.cu$ echo $?...
```
It works when we use a `tokenizer.bin` from a previous commit like [this](568a651c45038712859b606b51176b33061fd353), but the output is gibberish:
```
hidden_dim: 2048
n_heads: 12
n_kv_heads: 12
n_layers: 12
seq_len: 1024
vocab_size: 32000...
```
The instructions I followed:
```
git clone https://github.com/ankan-ban/llama2.cu
cd llama2.cu/
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.bin
nvcc llama2.cu -o llama2
./llama2 stories110M.bin
```