Yili Hong

9 comments by Yili Hong

Same question. The [sciworld_test.json](https://huggingface.co/datasets/AgentGym/AgentEval/blob/main/sciworld_test.json) file is even in the format of the training set. Could you please update it to the correct version?

I have the same error. Have you solved the issue?

@juliaparedesq I tried to remove `--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer'` and the exception was gone. But after finetuning, the model's ability declined significantly. It seems that fastchat can only be used to deploy...

```bash
ValueError: vllm version 0.6.3.post1 not supported. Currently supported versions are 0.3.1, 0.4.2, 0.5.4, 0.6.3 and 0.7.0+
```

How to set rule-based rewards? I only find model-based reward examples.

```python
import torch


def reward_func(queries, prompts, labels):
    # queries is prompts + responses
    # labels is answers
    print(queries)
    return torch.randn(len(queries))
```

@dubanx Could you give me an example of `prompts`, `queries` and `labels`?...
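For the rule-based reward question above, here is a minimal sketch of what such a function might look like with the same `reward_func(queries, prompts, labels)` signature. It assumes, as the comments in the snippet say, that each query is the prompt followed by the response and that each label is the reference answer as a string. The helper `exact_match_scores` and the substring-match rule are hypothetical choices for illustration, not something taken from the library.

```python
def exact_match_scores(queries, prompts, labels):
    """Hypothetical rule: 1.0 if the response contains the label, else 0.0."""
    scores = []
    for query, prompt, label in zip(queries, prompts, labels):
        # Assumption: each query is its prompt followed by the response,
        # so stripping the prompt prefix leaves only the response text.
        response = query[len(prompt):] if query.startswith(prompt) else query
        answer = label.strip()
        scores.append(1.0 if answer and answer in response else 0.0)
    return scores


def reward_func(queries, prompts, labels):
    # torch is imported lazily so the scoring rule can be checked without it.
    import torch

    # One float reward per query, in the same order as the inputs.
    return torch.tensor(exact_match_scores(queries, prompts, labels))
```

Checking only the response, not the whole query, keeps an answer that already appears in the prompt from being rewarded by accident. A real task would likely need a stricter rule, such as parsing a final answer field, than a plain substring match.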