tensor_parallel
tensor_parallel int4 LLM is not working since release v2.0.0
It works fine on v1.3.2. On v2.0.0, however, calling

tp.TensorParallelPreTrainedModel(...)

raises

RuntimeError: Trying to shard a model containing 'meta' parameters. Please set `sharded=False` during model creation and call `.apply_sharding()` only after dispatch
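For reference, a minimal sketch of the workaround the error message itself suggests: create the wrapper with `sharded=False`, dispatch the weights, then call `.apply_sharding()`. The model name, the assumption that the constructor accepts a `sharded` keyword, and the dispatch step are all illustrative, not confirmed against the v2.0.0 API:

```python
# Hypothetical workaround sketch following the error message's own hint.
# "model-name" is a placeholder; the exact constructor signature is assumed.
import tensor_parallel as tp
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("model-name")

# Step 1: build the tensor-parallel wrapper without sharding,
# so 'meta' parameters are tolerated at creation time.
model_tp = tp.TensorParallelPreTrainedModel(base_model, sharded=False)

# Step 2: only after the parameters have been dispatched to real
# devices (i.e. no longer on the 'meta' device), apply sharding.
model_tp.apply_sharding()
```

Whether this restores the int4 path that worked on v1.3.2, or the int4 regression is a separate bug, would need confirming against the v2.0.0 changelog.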