zzc
> Add tests. added 388fd80af56469302e2dac907cf13d973582df93
> Let's call it `replace_tokens`. > > Also in the tests, you are changing `tree2` in-place, which could break other tests if run in a different order. done 545b8087ff521e536100a080f9b663c2175f6e7c, caea7808abdc4dfff34d22b411ff736deb025279
> *deepcopy 5ab7313081a8d7cc004796c7038591d7937ad17c
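The in-place mutation concern raised above, and the `deepcopy` fix, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: `Node`, `TREE2`, `leaves`, and this body of `replace_tokens` are hypothetical stand-ins, not the project's actual tree class or function signature.

```python
import copy


class Node:
    """Hypothetical stand-in for a parse tree node (not the real class)."""

    def __init__(self, data, children):
        self.data = data
        self.children = children


def replace_tokens(node, old, new):
    """Replace every leaf equal to `old` with `new`, mutating `node` in place.

    Assumed behavior for illustration only.
    """
    for i, child in enumerate(node.children):
        if isinstance(child, Node):
            replace_tokens(child, old, new)
        elif child == old:
            node.children[i] = new
    return node


def leaves(node):
    """Collect the leaves of the tree from left to right."""
    out = []
    for child in node.children:
        out.extend(leaves(child) if isinstance(child, Node) else [child])
    return out


# Shared module-level fixture that several tests may read.
TREE2 = Node("start", [Node("expr", ["a", "b"]), "a"])


def test_replace_tokens():
    # Work on a private deep copy so the shared fixture is never mutated.
    tree = copy.deepcopy(TREE2)
    replace_tokens(tree, "a", "x")
    assert leaves(tree) == ["x", "b", "x"]
    # The fixture is unchanged, so test order no longer matters.
    assert leaves(TREE2) == ["a", "b", "a"]
```

A shallow `copy.copy` would not be enough in this sketch, because the nested `children` lists would still be shared with the fixture.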
> True, probably not needed. > > To get the type checking to work: > > * replace `Token` in the signature with the Leaf generic parameter. > * instead...
> Lambda Web Adapter will repeatedly send HTTP GET requests to your web app during cold start to check if the app is ready. By default, the GET request is...
> try `--device_map cpu` > > Will only use cuda:0 for quantization. @Jintao-Huang VRAM OOM When using Single GPU ### script ```shell OMP_NUM_THREADS=14 \ swift export \ --model ${MODEL} \...
> CUDA_VISIBLE_DEVICES=2,3,4,5 MAX_PIXELS=117600 swift export --model Qwen2.5-VL-7B --dataset 'listwise_sft_0923-1_2.2w.sampled1000.jsonl' --quant_n_samples 256 --quant_batch_size -1 --max_length 16384 --quant_method awq --quant_bits 4 --output_dir /media/Qwen2.5-VL-7B-1009-4-AWQ > > I have the same problem here: quantizing qwen2.5-vl-7b on 4 H800 GPUs > > > I switched to --device_map...
> > CUDA_VISIBLE_DEVICES=2,3,4,5 MAX_PIXELS=117600 swift export --model Qwen2.5-VL-7B --dataset 'listwise_sft_0923-1_2.2w.sampled1000.jsonl' --quant_n_samples 256 --quant_batch_size -1 --max_length 16384 --quant_method awq --quant_bits 4 --output_dir /media/Qwen2.5-VL-7B-1009-4-AWQ > > I have the same problem here: quantizing qwen2.5-vl-7b on 4 H800 GPUs > > >...