Yili Hong
Same question. The [sciworld_test.json](https://huggingface.co/datasets/AgentGym/AgentEval/blob/main/sciworld_test.json) is even in the format of the training set. Could you please update it to the correct version?
I have the same error. Have you solved the issue?
@juliaparedesq I tried to remove `--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'` and the exception was gone. But after finetuning, the model's ability declined significantly. It seems that fastchat can only be used to deploy...
So does anyone have a solution for CUDA 12.8 + torch 2.7 with pip?
Same issue
```bash
ValueError: vllm version 0.6.3.post1 not supported. Currently supported versions are 0.3.1, 0.4.2, 0.5.4, 0.6.3 and 0.7.0+
```
How to set rule-based rewards? I only find model-based reward examples.
```python
import torch

def reward_func(queries, prompts, labels):
    # queries is prompts + responses
    # labels is answers
    print(queries)
    return torch.randn(len(queries))
```

@dubanx Could you give me an example of `prompts`, `queries` and `labels`?...
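For what it's worth, here is a minimal sketch of what a rule-based variant might look like. It assumes, as the comments in the snippet above suggest, that each query is a prompt followed by the response and that each label is a reference answer string. The helper names (`extract_response`, `rule_based_scores`) and the sample data are hypothetical and not taken from the library; the real inputs may be formatted differently (e.g. with chat templates).

```python
def extract_response(query: str, prompt: str) -> str:
    """Strip the prompt prefix from a query to recover the response text."""
    # Assumption: the query starts with the exact prompt string.
    return query[len(prompt):] if query.startswith(prompt) else query


def rule_based_scores(queries, prompts, labels):
    """Return 1.0 when the reference answer appears in the response, else 0.0."""
    scores = []
    for query, prompt, label in zip(queries, prompts, labels):
        response = extract_response(query, prompt)
        scores.append(1.0 if label.strip() in response else 0.0)
    return scores


# Hypothetical sample data, made up purely for illustration.
prompts = ["What is 2 + 3?", "What is the capital of France?"]
queries = [prompts[0] + " The answer is 5.", prompts[1] + " It is Berlin."]
labels = ["5", "Paris"]

scores = rule_based_scores(queries, prompts, labels)
print(scores)  # [1.0, 0.0]
```

Inside `reward_func`, the list could presumably be returned as `torch.tensor(scores)` to keep the tensor return type of the snippet above. Substring matching is only one possible rule; an exact-match or regex check would slot into the same loop.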
Same issue. Has anyone found a solution?