kartik
Hi, how do I specify the different settings (baseline, retrieval, retrieval w/ ref) in the code generation script? I see that the dataset only contains three different retrievers: bm25, UniXCoder and...
Hello, the LiveCodeBench dataset is Python-only. Would you consider supporting multi-language evaluation, especially Java? Thanks.
verify: `python -m lcb_runner.runner.main --model "qwen/Qwen2-72B-Instruct-GPTQ-Int4" --model_type http_api --api_url http://XXX:8000/v1 --api_key sk-123456 --scenario codegeneration --evaluate`
Could you share the evaluation scripts for the bge-code-v1 model on the CoIR and CodeRAG benchmarks?
I noticed that the official evaluation results of the bge-code-v1 model on the CoIR and CodeRAG benchmarks are relatively high. I'm curious about some of the test configurations and steps....
When evaluating the bge-code-v1 model on the CoIR dataset, why is the result on the Apps task so poor, only around 20? Below are the main configurations. The results are...