BitNet-Transformers
How long does inference take on CPU?
Training may be done on GPU, but deployment has to be on CPU for high scalability.