HongHuang
I found that a model not wrapped in DataParallel fails to prune, i.e. ``--load-serialized`` disables pruning. When I run ``` python compress_classifier.py -a=resnet20_cifar -p=50 ../../../data/cifar10/ -j=22 --epochs=1 --lr=0.001 --masks-sparsity --compress=../agp-pruning/resnet18.schedule_agp.yaml...
### System Info CUDA version: 11.8 torch version: 2.0.0 ### Reproduction ```python from bitsandbytes.functional import igemm def test_igemm(): inner_dim = 10 X = torch.randint(0,10, (1024, inner_dim), dtype=torch.int8).cuda() W = torch.randint(0,10,...
### Feature request Currently, BitCPM only provides the quantized ternary weights. Could the pre-quantization weights be open-sourced as well?
I downloaded the ```1bitLLM/bitnet_b1_58-large``` and ```1bitLLM/bitnet_b1_58-3B``` models. When I run the inference commands, the model does not raise an error but produces incorrect outputs. ``` python setup_env.py -md models/bitnet_b1_58-3B -q tl1...