Huanzhi Mao
It's not an urgent request, but can we remove those emoji from the leaderboard? They are distracting, and I cannot really tell what that column is about by just looking at...
As mentioned in #426, this PR adds 4 new models to the leaderboard. The model costs are also updated accordingly. This PR **DOES** change the leaderboard ranking. This PR **DOES...
The installer is the `Gorilla-CLI.exe` in the `dist_exe` folder. The scripts that are used to generate the installer are also attached in the `dist_exe/scripts` folder.
The following models are intended to be evaluated using `bfloat16` precision instead of `float16` according to their model card on HuggingFace. We should change the default precision setting for their...
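A minimal sketch of how a per-model precision override could look, assuming precision is chosen by model name. The names `DEFAULT_PRECISION`, `BFLOAT16_MODELS`, `precision_for`, and the model identifier are hypothetical placeholders, not taken from the repository.

```python
# Hypothetical default used for models without an explicit override.
DEFAULT_PRECISION = "float16"

# Hypothetical placeholder; the real list would come from each model card.
BFLOAT16_MODELS = {"example-org/example-model"}


def precision_for(model_name: str) -> str:
    """Return the precision string to use when loading the given model."""
    if model_name in BFLOAT16_MODELS:
        return "bfloat16"
    return DEFAULT_PRECISION


if __name__ == "__main__":
    for name in ("example-org/example-model", "another-org/another-model"):
        print(name, precision_for(name))
```

Keeping the overrides in one set means a new `bfloat16` model only needs a single entry rather than a change to the loading logic.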
This PR introduces multi-threading to parallelize the API calls to the hosted model endpoints, which significantly speeds up the model response generation process. Users can specify the number of threads...
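The general idea can be sketched with Python's standard `concurrent.futures` module. This is an illustration under assumptions, not the PR's actual code: `call_endpoint`, `generate_responses`, and the `num_threads` parameter are hypothetical names, and the endpoint call is replaced by a local stand-in.

```python
from concurrent.futures import ThreadPoolExecutor


def call_endpoint(prompt: str) -> str:
    """Hypothetical stand-in for a blocking API call to a hosted model."""
    return f"response to {prompt}"


def generate_responses(prompts, num_threads=4):
    """Issue one call per prompt using a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        # executor.map returns results in the same order as the inputs,
        # even if the underlying calls finish out of order.
        return list(executor.map(call_endpoint, prompts))


if __name__ == "__main__":
    print(generate_responses(["a", "b", "c"], num_threads=2))
```

Threads suit this workload because each call spends most of its time waiting on the network, so the workers can overlap that waiting.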
This PR updates the leaderboard to reflect the changes in score due to the following PR merges:
- #557
- #568

and the addition of the following models:
- #569...
We need to be consistent in our metrics to determine the cost for OSS models. If a model is hosted locally and has `OSS_LATENCY`, then it should not belong to...
The current BFCL leaderboard table is built using basic HTML, which has made it increasingly difficult to add new functionalities. To address this, the leaderboard table is overhauled to use...
The mapping from test category name to test file path is repeated in three places, which makes the copies easy to let drift out of sync:
- `test_files` in `eval_data_compilation.py`
- `test_categories` in `openfunctions_evaluation.py`
- `TEST_CATEGORIES` in `model_handler/constant.py`
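One way to remove this kind of duplication is to define the mapping once and have every caller import it. The sketch below assumes that approach; `TEST_FILE_MAPPING`, `resolve_test_files`, and the category and file names are hypothetical placeholders rather than the repository's actual values.

```python
# Hypothetical single source of truth: category name -> test file path.
TEST_FILE_MAPPING = {
    "example_category_a": "example_category_a.json",
    "example_category_b": "example_category_b.json",
}


def resolve_test_files(categories=None):
    """Return the category-to-file mapping for the requested categories.

    With no argument, every known category is returned.
    """
    if categories is None:
        return dict(TEST_FILE_MAPPING)
    unknown = [c for c in categories if c not in TEST_FILE_MAPPING]
    if unknown:
        raise ValueError(f"Unknown test categories: {unknown}")
    return {c: TEST_FILE_MAPPING[c] for c in categories}


if __name__ == "__main__":
    print(resolve_test_files(["example_category_a"]))
```

Each script would then import the shared mapping instead of keeping its own copy, so adding a category becomes a one-line change.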
This PR adds a sticky effect for the first three columns (rank, overall accuracy and model name) on the leaderboard. This feature is handy when viewing the leaderboard on small...