Julia Longtin

Results: 105 comments by Julia Longtin

> > goes from a token every 0.18 seconds on mistral 7B instruct to a token every 0.82 seconds. > > Are you showing a performance regression? Or are the...

> > > goes from a token every 0.18 seconds on mistral 7B instruct to a token every 0.82 seconds. > > > > > > Are you showing a...

> > > > goes from a token every 0.18 seconds on mistral 7B instruct to a token every 0.82 seconds. > > > > > > > > >...

> > goes from 0.18 tokens per second on mistral 7B instruct (Q5K) to 0.82 tokens per second. > > How many threads is that with? Since Xeon Phi has...

now runs at 1.2 tokens per second.