[Roadmap] BigCodeBench Q3 2024 Roadmap

Open terryyz opened this issue 1 year ago • 0 comments

This document lists the features planned for BigCodeBench in Q3 2024. Please feel free to discuss and contribute, as this roadmap is shaped by the BigCodeBench community.

Help Wanted

  • [x] Lingering processes #42
  • [ ] Better documentation #40, #41
  • [ ] Dataset Repair, e.g., #33
  • [ ] Tests & CI/CD
  • [x] Flexible Pass@k Support #50

Feature

  • [x] #46, #36
  • [x] Customized direct completion setup (to be released)
  • [x] Catch up on the progress of EvalPlus

Dataset

  • [ ] More investigations on the BigCodeBench tasks

Ongoing Research

  • [ ] Benchmarking more languages -- Verilog & R, cc @shailja-thakur @ThreeCirclesK @marianna13
  • [ ] Agentic Evaluation (proof-of-concept infra to be scaled up), cc @JoshuaPurtell
  • [ ] Grounded Zero-Shot Tool Use, cc @terryyz @siviltaram

terryyz, Sep 10 '24 17:09