GLiNER

Advice for inference speedup

Open yishusong opened this issue 9 months ago • 14 comments

Hi team,

I'm running inference on a g5.24xlarge GPU instance. The data is in a Pandas DataFrame, and I use the DataFrame's apply method to call predict_entities on each row. When the DataFrame gets fairly large (~1.5M rows), inference takes days to run.

Is there a way to increase GPU utilization? I suppose a Pandas DataFrame is not the most efficient data structure for this... or maybe there is a parameter I missed that would boost GPU utilization?
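To make the question concrete, here is a minimal sketch of what batching the calls could look like, as opposed to one model call per row. It is an illustration under stated assumptions, not a confirmed fix: `iter_batches`, `predict_in_batches`, and `fake_predict_batch` are hypothetical names, and the stand-in predictor only mimics a model so the snippet runs without a GPU. With GLiNER, the `predict_batch` argument would presumably wrap a batched call such as `model.batch_predict_entities(texts, labels)`; treat that as an assumption to check against the installed version.

```python
from typing import Callable, Dict, Iterator, List, Sequence

Entity = Dict[str, str]


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def predict_in_batches(
    texts: Sequence[str],
    predict_batch: Callable[[Sequence[str]], List[List[Entity]]],
    batch_size: int = 32,
) -> List[List[Entity]]:
    """Call `predict_batch` once per batch and return one result list per text."""
    results: List[List[Entity]] = []
    for batch in iter_batches(texts, batch_size):
        results.extend(predict_batch(batch))
    return results


def fake_predict_batch(batch: Sequence[str]) -> List[List[Entity]]:
    """Stand-in for a real model (assumption): tags capitalized words."""
    return [
        [{"text": word, "label": "CAPITALIZED"} for word in text.split() if word[:1].isupper()]
        for text in batch
    ]


texts = ["Alice met Bob", "no entities here", "Paris in spring"]
entities = predict_in_batches(texts, fake_predict_batch, batch_size=2)
```

With a DataFrame, the texts could be pulled out once with something like `df["text"].tolist()` (the column name is hypothetical) and the returned list assigned back as a new column, so the model sees whole batches and not single rows.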

Any advice is much appreciated!

yishusong, May 14 '24 17:05