Sujan Kumar Gonugondla
It looks like the inference kernels for ZeroQuant have not been released.
Try the following change around line 161 of `deepspeed/runtime/weight_quantizer.py`:

```
else:
    for plcy in replace_policies:
        _ = plcy(None)  # line added
        policy.update({plcy._orig_layer_class: (quantize_fn, plcy)})
```
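A possible reason this helps (an assumption, not confirmed in this thread): if a policy class only fills in `_orig_layer_class` inside `__init__`, the attribute is still `None` until the policy has been instantiated once, so the dictionary ends up keyed by `None` instead of by the layer class. A minimal sketch of that pattern follows. `FakeLayer`, `LazyPolicy` and `build_policy_map` are hypothetical stand-ins for illustration, not DeepSpeed names.

```python
class FakeLayer:
    """Hypothetical stand-in for a transformer layer class."""


class LazyPolicy:
    """Hypothetical policy that resolves its layer class lazily, in __init__."""

    _orig_layer_class = None  # unresolved until the first instantiation

    def __init__(self, client_module):
        # Assumption: the real policies may do something similar, e.g. import
        # the layer class here and store it on the class itself.
        LazyPolicy._orig_layer_class = FakeLayer
        self.client_module = client_module


def build_policy_map(policies, instantiate_first):
    """Map each policy's layer class to the policy, as in the loop above."""
    policy_map = {}
    for plcy in policies:
        if instantiate_first:
            _ = plcy(None)  # same trick as the added line
        policy_map[plcy._orig_layer_class] = plcy
    return policy_map
```

Calling `build_policy_map([LazyPolicy], instantiate_first=False)` before any instance exists gives a map keyed by `None`. With `instantiate_first=True` the key is `FakeLayer`, which is what a lookup by layer type would need.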