Flish Wang
Hi, thanks for your work. It's wonderful. But I noticed that the bias arg in all nn.Linear layers in models/simsiam.py is set to its default value, True. To my knowledge,...
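For context: a Linear layer feeding directly into BatchNorm gains nothing from a bias term, since BN subtracts the per-feature batch mean, which cancels any constant offset. A minimal sketch in plain PyTorch (illustrative modules, not the actual simsiam.py code):

```python
import torch
from torch import nn

torch.manual_seed(0)

# Two projector blocks that differ only in the Linear bias; in training
# mode BatchNorm1d normalizes by batch statistics, so the bias cancels.
proj_with_bias = nn.Sequential(nn.Linear(16, 16, bias=True), nn.BatchNorm1d(16))
proj_no_bias = nn.Sequential(nn.Linear(16, 16, bias=False), nn.BatchNorm1d(16))

# Copy the weight so the two blocks share everything except the bias term.
with torch.no_grad():
    proj_no_bias[0].weight.copy_(proj_with_bias[0].weight)

x = torch.randn(8, 16)
y1 = proj_with_bias(x)
y2 = proj_no_bias(x)
print(torch.allclose(y1, y2, atol=1e-5))  # the bias has no effect here
```

This is why bias=False is often used for Linear layers followed by BN: the parameter is redundant, though keeping it at the default True is harmless for correctness.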
Hi, I'm new to Triton. I noticed that we use tl.make_block_ptr in the forward kernel:

```
Q_block_ptr = tl.make_block_ptr(
    base=Q + qvk_offset,
    shape=(N_CTX, BLOCK_DMODEL),
    strides=(stride_qm, stride_qk),
    offsets=(start_m * BLOCK_M, 0),
...
```
### 🐛 Describe the bug

Complete code uploaded to: [minifer.py](https://gist.github.com/flishwang/9e561371966ab12c7f3709f7315aea14)

Key code (lines 990 to 1068):

```
use_block_attn = sys.argv[-1] == '1'
print(f'use_block_attn = {use_block_attn} {sys.argv[-1]}')
model = ViTA(
...
```
In models/build.py:41, the keyword passed to partial(SwinTransformer, ...) is norm_befor_mlp, while the keyword in SwinTransformer (models/swin_transformer.py:497) is norm_before_mlp. The former is missing the letter 'e' compared with the latter. Therefore, the 'bn'...
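For anyone hitting a similar issue: functools.partial does not validate keyword names when it is created, and if the callee accepts **kwargs, a misspelled key is swallowed silently instead of raising a TypeError. A small illustrative sketch (toy function, not the actual Swin builder):

```python
from functools import partial

def build_model(norm_before_mlp='ln', **kwargs):
    # Unknown keys such as a misspelled 'norm_befor_mlp' land in kwargs
    # and are silently ignored, so the intended value never takes effect.
    return norm_before_mlp

good = partial(build_model, norm_before_mlp='bn')
typo = partial(build_model, norm_befor_mlp='bn')  # missing 'e': no error raised

print(good())  # 'bn'
print(typo())  # 'ln' -- the misspelled setting is dropped, the default wins
```

A catch-all **kwargs makes typos like this fail silently, which is why the wrong norm could ship unnoticed.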
### 🐛 Describe the bug

The following code may fail:

```
import torch
from torch import nn

class A(nn.Module):
    def __init__(self):
        super().__init__()
        self.p = nn.Parameter(torch.zeros((1, 8, 1, 1, 256)))

a = A().to(memory_format=torch.channels_last)
...
```
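A likely cause (hedged: based on the rank requirements of PyTorch memory formats) is that torch.channels_last is a rank-4 layout, while rank-5 tensors need torch.channels_last_3d; a module-level .to(memory_format=...) can then trip over a 5-d parameter. A minimal per-tensor sketch:

```python
import torch

t4 = torch.zeros(2, 8, 4, 4)     # rank-4: channels_last applies
t5 = torch.zeros(2, 8, 3, 4, 5)  # rank-5: needs channels_last_3d

ok4 = t4.to(memory_format=torch.channels_last)
ok5 = t5.to(memory_format=torch.channels_last_3d)

try:
    # Applying the rank-4 format to a rank-5 tensor is expected to fail.
    t5.to(memory_format=torch.channels_last)
    raised = False
except RuntimeError:
    raised = True

print(ok4.is_contiguous(memory_format=torch.channels_last))
print(ok5.is_contiguous(memory_format=torch.channels_last_3d))
print(raised)
```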
I want to compile a Triton kernel and use it on different machines. Are there any argument options in tools/compile.py or environment variables (like the TORCH_CUDA_ARCH_LIST variable when building...