grok-1
Grok open release
Add double quotes around `huggingface_hub[hf_transfer]` and `ckpt-0/*`.

```shell
git clone https://github.com/xai-org/grok-1.git && cd grok-1
pip install "huggingface_hub[hf_transfer]"
huggingface-cli download xai-org/grok-1 --repo-type model --include "ckpt-0/*" --local-dir checkpoints --local-dir-use-symlinks False
```
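The same download can be sketched from Python with `huggingface_hub.snapshot_download`. The pattern is then passed as a plain string, so shell quoting does not come into play. This is a minimal sketch rather than part of the repository; the helper names `download_kwargs` and `download_checkpoint` are made up for illustration.

```python
def download_kwargs(local_dir="checkpoints"):
    """Keyword arguments mirroring the CLI flags above (hypothetical helper)."""
    return {
        "repo_id": "xai-org/grok-1",
        "repo_type": "model",
        # Same glob as --include "ckpt-0/*"; no shell is involved here.
        "allow_patterns": ["ckpt-0/*"],
        "local_dir": local_dir,
    }


def download_checkpoint(local_dir="checkpoints"):
    """Download the checkpoint files and return the local directory path."""
    # Imported lazily so the module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```

Calling `download_checkpoint()` starts the transfer, which is very large, so it is left out of the block on purpose. To use the `hf_transfer` extra, set the environment variable `HF_HUB_ENABLE_HF_TRANSFER=1` before running.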
I am trying to serve the model here (https://www.fedml.ai/models/941?owner=xai-org). I used a PyTorch model that was converted from JAX, so I can use the Hugging Face pipeline. I used the llama chat...
https://huggingface.co/hpcai-tech/grok-1
Should I create the file at that path?
The command line does not advance beyond that - Is everything okay? Is it supposed to be like this?
D:\grok-1>python run.py
INFO:jax._src.xla_bridge:Unable to initialize backend 'cuda':
INFO:jax._src.xla_bridge:Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
INFO:jax._src.xla_bridge:Unable to initialize backend 'tpu': UNIMPLEMENTED: LoadPjrtPlugin is not implemented on windows...
For me, Grok inference needed 8×A800 GPUs, with each GPU using 65 GB of memory. From my experiments, the deployment environment has to be Python 3.11 + CUDA 12.3 + cuDNN 8.9 + jax[cuda12_pip]==0.4.23; otherwise there will be many problems,...
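A quick way to compare a local setup against the versions reported in the comment above is sketched below. The target versions are one user's working configuration, not an official requirement, and the names `installed_version` and `environment_report` are hypothetical helpers. The sketch only checks Python and the `jax` package; the CUDA and cuDNN versions are not inspected.

```python
import sys
from importlib import metadata

# Versions taken from the comment above (a single user's report, assumed here).
REPORTED_PYTHON = (3, 11)
REPORTED_JAX = "0.4.23"


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report():
    """Compare the running interpreter and jax install with the reported setup."""
    python_version = sys.version_info[:2]
    jax_version = installed_version("jax")
    return {
        "python": python_version,
        "python_matches": python_version == REPORTED_PYTHON,
        "jax": jax_version,
        "jax_matches": jax_version == REPORTED_JAX,
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

For scale, the reported footprint works out to 8 × 65 GB = 520 GB of GPU memory across the node.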
There are use cases where one might want to shard the quantized model across fewer devices than the number of experts. However, doing so would result in a shape mismatch...
I got it running, but stuck here for like 8-10 hours.

Hardware:
- 8xA800 80G
- 1T SSD Freespace
- 500G RAM
- Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz...
INFO:jax._src.xla_bridge:Unable to initialize backend 'cuda':
python3.11 and
https://storage.googleapis.com/jax-releases/cuda12/jaxlib-0.4.25+cuda12.cudnn89-cp311-cp311-manylinux2014_x86_64.whl and
https://developer.nvidia.com/downloads/compute/cudnn/secure/8.9.7/local_installers/12.x/cudnn-local-repo-ubuntu2204-8.9.7.29_1.0-1_amd64.deb
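When the log shows `Unable to initialize backend 'cuda'`, it can help to confirm which backend JAX actually selected before launching `run.py`. The sketch below uses `jax.default_backend()` and `jax.devices()`; the function name `describe_backend` is a hypothetical helper, and the check is only a first diagnostic, not a fix for the issue above.

```python
def describe_backend():
    """Report whether jax is importable and which backend and devices it sees."""
    try:
        import jax
    except ImportError:
        return {"jax_installed": False, "backend": None, "devices": []}
    try:
        backend = jax.default_backend()  # e.g. "cpu", "gpu" or "tpu"
        devices = [str(device) for device in jax.devices()]
    except RuntimeError as err:
        # Raised when no usable backend could be initialized.
        return {"jax_installed": True, "backend": None, "devices": [], "error": str(err)}
    return {"jax_installed": True, "backend": backend, "devices": devices}


info = describe_backend()
print(info)
if info["backend"] == "cpu":
    print("JAX fell back to the CPU backend; no GPU devices are visible to it.")
```

If the backend is reported as `cpu` on a GPU machine, the installed `jaxlib` wheel and the CUDA/cuDNN versions are the first things to compare.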