Yoonsoo Kim
Official b6 and b7 weights are released. Any updates? Thanks ;)
Thank you for integrating our Solar model into vLLM, @shing100! It will be very helpful for users exploring Solar. Just to clarify: **`BSKCN` actually stands for `Block level SKip CoNnection`**,...
I found that the few-shot prompt is also wrong. I checked the model input with the following command. ``` lm_eval --model hf --batch_size 16 --model_args pretrained=meta-llama/Llama-3.2-3B,trust_remote_code=True --write_out --output_path results --log_samples --tasks agieval_en...
Thanks for the review, but I still see the same issue after upgrading to 5.0.0b6.
- Browser: Chrome / Edge
- Python: 3.10
- Node: v20.18.0

Yes, I'm on Linux. I'm testing on an AWS Lightsail instance.
I found that the error does not appear in a minimal forward/backward pass on `torch._grouped_mm` with the specified tensor shapes. The error occurred when using torchtitan. Also when I chunked the...
It works fine without compile. (To avoid the OOM that happens in the CE loss on a single node without compile, I set seq_len=1024 and top_k=32. With this setting: compile -> error, no compile -> no error.)