liuzhiyong01
The code in wgan_conv has a small error: the class G_conv generates outputs of shape [-1, 32, 32, 1], but the input to class D_conv is [-1, 28, 28, 1], so the dimensions...
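One way to illustrate the mismatch is to reshape the generator output to what the discriminator expects. The sketch below uses NumPy and a center-crop helper; this is an assumption about how one might reconcile the shapes, not the repo's actual fix (adjusting G_conv's deconvolution layers to emit 28x28 directly is the other option).

```python
import numpy as np

# G_conv emits [-1, 32, 32, 1] but D_conv expects [-1, 28, 28, 1].
# center_crop (a hypothetical helper) cuts the spatial dims down
# to the discriminator's expected size.
def center_crop(images, target_h, target_w):
    _, h, w, _ = images.shape
    top = (h - target_h) // 2
    left = (w - target_w) // 2
    return images[:, top:top + target_h, left:left + target_w, :]

fake = np.zeros((8, 32, 32, 1))       # stand-in for G_conv output
cropped = center_crop(fake, 28, 28)   # now matches D_conv's input
assert cropped.shape == (8, 28, 28, 1)
```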
hparams.logger.info("split connect")
if idx != len(hparams.cross_layer_sizes) - 1:
    next_hidden, direct_connect = tf.split(curr_out, 2 * [int(layer_size / 2)], 1)
    final_len += int(layer_size / 2)
else:
    direct_connect = curr_out
    next_hidden = 0
...
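The split step above can be sketched without TensorFlow: the layer output of width layer_size is cut in half along axis 1, one half feeding the next hidden state and the other becoming a direct connection. A minimal NumPy analogue (the variable names mirror the snippet; the data is made up):

```python
import numpy as np

# Toy stand-in for the "split connect" step: split curr_out
# (batch x layer_size) into two equal halves along axis 1.
layer_size = 8
curr_out = np.arange(3 * layer_size).reshape(3, layer_size)

half = layer_size // 2
next_hidden, direct_connect = np.split(curr_out, [half], axis=1)

assert next_hidden.shape == (3, half)
assert direct_connect.shape == (3, half)
```

Note that `tf.split(curr_out, 2 * [int(layer_size / 2)], 1)` passes a list of sizes, while `np.split` takes split indices; both produce two halves here.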
My environment settings: deepspeed==0.9.0, torch==2.0.0+cu117, CUDA Version: 11.0. The pretrained model is facebook/opt-350m. Can anyone help me solve this problem? Thanks.
Hello, I'd like to ask: what is the difference between fine-tuning the bge-m3 model and fine-tuning the bge series models? Are there any requirements on the amount of data? How many fine-tuning steps are appropriate? Is the script for fine-tuning bge-m3 the same as the script for fine-tuning the bge models?
python script:
selected_mixture = seqio.get_mixture_or_task('ag_news_subset_template_0_five_shot')
INPUT_SEQ_LEN = 2056
TARGET_SEQ_LEN = 512
dataset = selected_mixture.get_dataset(
    sequence_length={"inputs": INPUT_SEQ_LEN, "targets": TARGET_SEQ_LEN},
    # split="train",
    shuffle=True,
    num_epochs=1,
    # shard_info=seqio.ShardInfo(index=0, num_shards=10),
    use_cached=False,
    seed=42
)
for i, ...
I am using the trained adapter file below, but the inference results differ considerably from those shown on the GitHub project page. I used the generate.py file without modifying the generate parameters. The result in the red box: The result on GitHub: Could you tell me where this difference comes from? Thanks very much for your help!
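One common source of run-to-run differences with unchanged generate parameters is sampling randomness: if generation samples from the token distribution, two runs only match when the random seed is fixed. The toy sketch below (pure Python, not the repo's generate.py) shows why a seed matters; `sample_tokens` is a hypothetical stand-in for sampling-based decoding.

```python
import random

def sample_tokens(vocab, weights, n, seed=None):
    # Toy stand-in for sampling-based text generation: draws n
    # tokens from a weighted vocabulary distribution.
    rng = random.Random(seed)
    return [rng.choices(vocab, weights=weights, k=1)[0] for _ in range(n)]

vocab = ["a", "b", "c", "d"]
weights = [0.4, 0.3, 0.2, 0.1]

# Same seed -> identical output; without a seed, two runs can
# legitimately disagree, which is one reason local results differ
# from results posted elsewhere.
run1 = sample_tokens(vocab, weights, 10, seed=0)
run2 = sample_tokens(vocab, weights, 10, seed=0)
assert run1 == run2
```

Other candidates worth checking are the adapter/base-model version pairing and greedy vs. sampling decode settings.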
## Bug Description
When I compile a transformer module with torch_tensorrt.compile using dynamic_shapes, the following error occurs.
## To Reproduce
def test_compile_v1():
    model = AutoModel.from_pretrained('bert-base-case', use_cache=False)
    # Enabled precision for TensorRT
...