
Number of sample runs for Llama-2

Open mahmoodn opened this issue 1 year ago • 0 comments

For Llama-2 inference, I set the total sample count to 1024 to see how long the run takes and what the final output looks like. The log reports multiple sample runs:

Samples run: 2
	BatchMaker time: 0.556433916091919
	Inference time: 9502.616274356842
	Postprocess time: 0.36942481994628906
	==== Total time: 9503.54213309288
Saving outputs to run_outputs/q182_77.pkl
Samples run: 4
	BatchMaker time: 0.00017786026000976562
	Inference time: 13552.655136823654
	Postprocess time: 0.18449878692626953
	==== Total time: 13552.83981347084
Saving outputs to run_outputs/q257_571.pkl
Samples run: 6
	BatchMaker time: 0.0001952648162841797
	Inference time: 9649.980134010315
	Postprocess time: 0.10514378547668457
	==== Total time: 9650.085473060608
...

My questions are:

1. Why does the "Samples run" count only take even values (2, 4, 6, ...)? What does that mean?
2. How can I limit the total number of runs to 1? I didn't see a command-line option for that.
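To make the first question concrete, here is a minimal sketch that parses the log excerpt above and prints how much the "Samples run" counter advances between reports. It assumes the counter is cumulative and that each report ends with a "==== Total time" line; the helper name `parse_reports` is made up for illustration and is not part of the benchmark code. It only measures the step size and does not explain its cause.

```python
import re

# Log excerpt copied from the output above (the trailing "..." is omitted).
LOG = """\
Samples run: 2
    BatchMaker time: 0.556433916091919
    Inference time: 9502.616274356842
    Postprocess time: 0.36942481994628906
    ==== Total time: 9503.54213309288
Saving outputs to run_outputs/q182_77.pkl
Samples run: 4
    BatchMaker time: 0.00017786026000976562
    Inference time: 13552.655136823654
    Postprocess time: 0.18449878692626953
    ==== Total time: 13552.83981347084
Saving outputs to run_outputs/q257_571.pkl
Samples run: 6
    BatchMaker time: 0.0001952648162841797
    Inference time: 9649.980134010315
    Postprocess time: 0.10514378547668457
    ==== Total time: 9650.085473060608
"""


def parse_reports(text):
    """Hypothetical helper: return (samples_run, total_time) per report."""
    reports = []
    samples = None
    for line in text.splitlines():
        m = re.match(r"\s*Samples run:\s*(\d+)", line)
        if m:
            samples = int(m.group(1))
            continue
        m = re.match(r"\s*==== Total time:\s*([0-9.]+)", line)
        if m and samples is not None:
            # Pair the most recent counter value with this report's total.
            reports.append((samples, float(m.group(1))))
            samples = None
    return reports


reports = parse_reports(LOG)
counts = [s for s, _ in reports]
# Difference between consecutive counter values (assumes a cumulative counter).
steps = [b - a for a, b in zip(counts, counts[1:])]
print("counter values:", counts)
print("increments:", steps)
```

For this excerpt the counter advances by the same amount between every pair of reports, which is why only even values appear.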

mahmoodn, Oct 05 '24 16:10