grok-1
[SUCCESS ENV] python3.11 + cuda12.3 + cudnn8.9 + jax[cuda12_pip]==0.4.23
For me, inference with grok-1 required 8×A800 GPUs, with each GPU using about 65 GB of memory.
In my experiments, the deployment environment has to be python3.11 + cuda12.3 + cudnn8.9 + jax[cuda12_pip]==0.4.23; otherwise you will run into problems such as:
- Unable to initialize backend 'cuda': Found cuSPARSE version 12103, but JAX was built against version 12200, which is newer
- 'type' object is not subscriptable
- jaxlib.xla_extension.XlaRuntimeError: INTERNAL: external/xla/xla/service/gpu/nccl_api.cc:395
Also, the program has to be launched with `JAX_TRACEBACK_FILTERING=off python run.py`; with that, it runs successfully.
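The setup above can be sketched as a shell script. This is a minimal sketch, assuming CUDA 12.3 and cuDNN 8.9 are already installed on the machine, that you have cloned the official grok-1 repository (which provides `run.py`), and that `python3.11` is on your PATH; adjust paths and repo location for your system.

```shell
# Create an isolated environment with the Python version that worked (assumption: python3.11 is installed).
python3.11 -m venv grok-env
source grok-env/bin/activate

# Install the exact JAX version that worked for me, with CUDA 12 support.
# jax[cuda12_pip] pulls in CUDA wheels from Google's package index.
pip install "jax[cuda12_pip]==0.4.23" \
    -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

# Install the rest of the repo's dependencies (assumption: run from the grok-1 repo root).
pip install -r requirements.txt

# Run with traceback filtering disabled -- required for the run to succeed in my tests.
JAX_TRACEBACK_FILTERING=off python run.py
```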
But I found that the model does not perform well and cannot hold a chat.
Is there any plan to make a chat version of the model public?