Remove smaller limit for legacy bfloat16 serialization
Revert #251 since it's not needed after #311. This may improve fine-tuning efficiency for medium-sized batches.
TODO:
- [ ] Test it with increasingly larger batches. Watch that we switch from `rpc_forward` to `rpc_forward_stream` (can be distinguished using server logs) without errors.
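
For the test above, a rough sketch like the following may help pick batch sizes that straddle the switch point. It is only an illustration: the threshold value, the helper names `payload_size` and `choose_rpc`, and the 2-bytes-per-element assumption for bfloat16 are all hypothetical and not taken from the codebase. The real limit comes from the underlying RPC library and should be read from there.

```python
# Hypothetical limit in bytes (assumption for illustration only);
# the real value is defined by the RPC library the client uses.
ASSUMED_UNARY_LIMIT = 2 * 1024 * 1024


def payload_size(batch_size: int, seq_len: int, hidden_size: int,
                 bytes_per_elem: int = 2) -> int:
    """Estimate the serialized size of a [batch, seq, hidden] activation tensor.

    bytes_per_elem=2 assumes bfloat16 is sent as-is (no upcast to float32).
    """
    return batch_size * seq_len * hidden_size * bytes_per_elem


def choose_rpc(size_bytes: int, limit: int = ASSUMED_UNARY_LIMIT) -> str:
    """Return the handler name expected for a payload of this size."""
    return "rpc_forward" if size_bytes <= limit else "rpc_forward_stream"


if __name__ == "__main__":
    # Sweep batch sizes and print which handler each one should hit,
    # to compare against what the server logs actually report.
    for batch_size in (1, 2, 4, 16, 64):
        size = payload_size(batch_size, seq_len=128, hidden_size=4096)
        print(f"batch={batch_size:3d} size={size:>10d} B -> {choose_rpc(size)}")
```

Comparing this expected handler with the one reported in the server logs for each batch size gives a quick check that the switch happens where intended.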