David MacLeod

6 comments by David MacLeod

Agree that this would be very useful

> Can you share a code snippet you used for loading GPT? Also, DS-inference currently uses special fp16 CUDA kernels for inference, which is not the case for int8. int8...
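
For context, a minimal sketch of what loading GPT with DS-inference in fp16 might look like (assuming a Hugging Face `transformers` GPT-2 checkpoint and DeepSpeed's `init_inference` API; the actual snippet referred to in the thread is not shown, and the model name and generation settings below are illustrative):

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumed checkpoint; the thread does not specify which GPT model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# init_inference injects DeepSpeed's fused fp16 CUDA kernels when
# replace_with_kernel_inject=True; the int8 path does not have these
# kernels, which is the limitation mentioned in the quote above.
ds_engine = deepspeed.init_inference(
    model,
    mp_size=1,                       # no tensor parallelism
    dtype=torch.half,                # fp16 inference path
    replace_with_kernel_inject=True,
)
model = ds_engine.module

inputs = tokenizer("DeepSpeed inference is", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```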

@yaozhewei any news on this?

Thanks @yaozhewei! Do you know whether there is a rough timeline for this (e.g. 1 month, 6 months, 1 year)? It would be very useful to know as we'd like...

Are there any developments here? If I were to contribute this change, would it be considered? Would an environment variable or a CLI arg be more appropriate here for disabling...