Huang Xin
I'm getting this error after pulling the Docker image with CUDA 11.8 and installing vLLM:

```python
from vllm import LLM, SamplingParams
```

```
Traceback (most recent call last):
  File "", line 1,...
```
> ```shell
> pip install vllm
> ```

Thanks, it resolved my issue.
Simply add the following code after the optimizer is created in `optimizers.py` to support gradient accumulation:

```python
if config.accumulate_gradient_steps > 1:
    optimizer = optax.MultiSteps(optimizer, config.accumulate_gradient_steps)
```
Hi @A9isha , I found two bugs in your conversion code. I have fixed them and validated that the weights converted from the MaxText version of Llama3-8B match the HF ones....