
【SUCCESS ENV】python3.11+cuda12.3+cudnn8.9+jax[cuda12_pip]==0.4.23

chenyzh28 opened this issue 1 year ago • 9 comments

For me, grok-1 inference requires 8×A800 GPUs, with each GPU using about 65 GB of memory.

In my experiments, the deployment environment has to be python3.11 + cuda12.3 + cudnn8.9 + jax[cuda12_pip]==0.4.23; otherwise you will hit many problems, such as:

  • Unable to initialize backend 'cuda': Found cuSPARSE version 12103, but JAX was built against version 12200, which is newer
  • 'type' object is not subscriptable
  • jaxlib.xla_extension.XlaRuntimeError: INTERNAL: external/xla/xla/service/gpu/nccl_api.cc:395

Also, the program has to be launched as `JAX_TRACEBACK_FILTERING=off python run.py`. With that, it runs successfully.
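For reference, a minimal setup sketch based on the environment above. The versions are as reported in this issue; the wheel index URL is the standard Google-hosted JAX CUDA release index, and `run.py` refers to the inference script in the grok-1 repo:

```shell
# Reported working environment: Python 3.11, CUDA 12.3, cuDNN 8.9.
# Install the JAX build that bundles CUDA 12 libraries via pip.
pip install "jax[cuda12_pip]==0.4.23" \
    -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

# Disable JAX's traceback filtering so errors show the full stack trace,
# then launch inference (expects all 8 GPUs to be visible).
JAX_TRACEBACK_FILTERING=off python run.py
```

Pinning the exact jax version matters here: a jaxlib built against a newer CUDA toolkit than the one installed triggers errors like the cuSPARSE version mismatch listed above.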

But I found that the model does not perform well and cannot hold a chat; for example: (see attached screenshot)

Is there any plan to make the chat version model public?

chenyzh28 — Mar 21 '24