deepsparse
Changes to support pass@k evaluation on the HumanEval dataset
Example:

```shell
numactl -C0-15 python deepsparse/src/deepsparse/transformers/eval_downstream.py \
  <model_path> \
  --num-cores 16 \
  --dataset openai_humaneval \
  --humaneval-method pass_at_k \
  --engine deepsparse \
  --start 0 \
  --max-samples 2
```
- This will create a subset of the HumanEval dataset starting at index 0 (`--start`) and pick 2 samples (`--max-samples`) to run the evaluation on.
- If the `--benchmark-humaneval` argument is supplied, the evaluation will run on a pre-selected smaller subset of the dataset that contains 11 samples, and will ignore `--start` and `--max-samples`.
- Set `--humaneval-method` to `perplexity` to evaluate perplexity instead of pass@k.
- Add `--n-solutions <n>` to specify the number of solutions required per task. Default is 1.
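For reference, pass@k is conventionally computed with the unbiased estimator introduced alongside HumanEval. The sketch below illustrates that formula only; it is not DeepSparse's internal implementation, and the `per_task` input shape is a hypothetical choice for the example.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k for a single task: the probability
    that at least one of k solutions drawn from n generated solutions
    (c of which passed the task's unit tests) is correct."""
    if n - c < k:
        return 1.0  # every size-k draw must contain a correct solution
    return 1.0 - math.prod((n - c - i) / (n - i) for i in range(k))

def dataset_pass_at_k(per_task, k):
    """Average the per-task estimates over the evaluated subset.
    per_task is a hypothetical list of (n, c) pairs, one per task."""
    return sum(pass_at_k(n, c, k) for n, c in per_task) / len(per_task)
```

Note that with `--n-solutions 1` and k = 1 the estimator reduces to plain accuracy: pass@1 = c/n.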
Note: Remove `numactl -C0-15` if you don't need to specify which cores to run on.