HollrayChan
> What's your n?

n is `embed_dim_ = 768`, s is `seq_len = 193`.
I call ViT in TensorRT, so the following modifications are also required:

```cpp
// vit.cc
FT_CHECK(output_tensors->at(0).shape.size() == 2);

// ViTPlugin.cpp
std::vector<Tensor> output_tensors = std::vector<Tensor>{
    Tensor{MEMORY_GPU,
           getTensorType<T>(),
           // std::vector<size_t>{(size_t)batch_size, (size_t)settings_.seq_len, (size_t)settings_.embed_dim},
           std::vector<size_t>{(size_t)batch_size, ...
```
I modified it to the following form, but the result is nan:

```cpp
template
__global__ void splitout(const half* in, half* out, const int m, const int n, const int s)...
```
OK, I set `const int data_type_factor = 1`, but the result is still nan.
Thanks. There was something wrong in my splitout kernel; I have fixed it.
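For anyone hitting the same nan: the kernel body is truncated above, so I can only infer splitout's intent from its signature. Assuming the flat `(m, n)` GEMM output holds `batch = m / s` sequences of `s` tokens each and the kernel splits out the class token of every sequence, a NumPy reference to check the CUDA result against would look like this (an assumption, not the actual kernel semantics):

```python
import numpy as np

# Hypothetical NumPy analogue of the splitout kernel; the real semantics
# depend on the truncated kernel body. Assumed here: the flat (m, n)
# buffer holds batch sequences of s tokens each, and we split out
# token 0 (the class token) of every sequence.
def splitout_ref(flat, m, n, s):
    batch = m // s
    seqs = flat.reshape(batch, s, n)  # (batch, s, n)
    return seqs[:, 0, :]              # (batch, n) class-token features
```

Comparing the device buffer against this reference on a tiny input makes indexing bugs (the usual source of nan/garbage here) easy to spot.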
My ViT has been changed to accept a custom input size; the image input in the code is 384x128.
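The `seq_len = 193` mentioned earlier follows directly from this input size, assuming 16x16 patches and a prepended class token (the helper below is illustrative, not part of the repo):

```python
# Illustrative helper (not FasterTransformer code): derive the ViT
# sequence length from the image size, patch size, and class token.
def vit_seq_len(height, width, patch=16, cls_token=True):
    assert height % patch == 0 and width % patch == 0
    return (height // patch) * (width // patch) + (1 if cls_token else 0)

print(vit_seq_len(384, 128))  # 24 * 8 = 192 patches + 1 class token = 193
```

The same formula gives the standard 197 for a 224x224 ViT-B/16 input.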
Thanks, I have referred to the above two examples, and the shape of the torch weights is consistent with ViT-B_16.npz, but avg diff: 1.1434973, max diff: 4.220703. I wonder...
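For context, avg/max numbers like the ones above come from an element-wise comparison of the two outputs; a hypothetical helper (not the script actually used) would be:

```python
import numpy as np

# Hypothetical diff report between a reference output and the converted
# model's output, matching the "avg diff / max diff" style above.
def report_diff(ref, out):
    d = np.abs(np.asarray(ref, dtype=np.float64) - np.asarray(out, dtype=np.float64))
    print("avg diff :", d.mean(), " max diff :", d.max())
    return d.mean(), d.max()
```

Diffs of ~1 on unnormalized ViT features usually point to a weight-layout mismatch rather than precision loss, which is consistent with the transpose bugs found below.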
Yes, `.contiguous()` is required, and there were some bugs in MultiHeadDotProductAttention: some places don't need to be transposed. Now it works. Here is my code :)

```python
def th2np(weights, conv=False, tp=False):...
```
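The full converter is truncated above; as a rough sketch of what `conv`/`tp` flags typically mean when matching torch layouts to the ViT-B_16.npz layouts (an assumption for illustration, not the exact code):

```python
import numpy as np

# Sketch of a th2np-style converter. Assumptions: torch conv weights are
# (out, in, kh, kw) while the .npz stores (kh, kw, in, out); torch linear
# weights are (out, in) while the .npz stores (in, out).
def th2np_sketch(weight, conv=False, tp=False):
    w = np.ascontiguousarray(weight)  # mirrors the .contiguous() requirement
    if conv:
        return w.transpose(2, 3, 1, 0)
    if tp:
        return w.T
    return w
```

Checking every converted tensor's shape against the corresponding entry in the .npz catches most transpose mistakes before they show up as large output diffs.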
If I want to implement some custom layers, such as BatchNorm or IBN, but I haven't seen the relevant content in FasterTransformer, what projects could I refer to in order to implement...
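As a starting point before writing a CUDA kernel: inference-time BatchNorm is just a per-channel affine transform using the stored running statistics, so a reference implementation is small. A NumPy sketch (my own, not FasterTransformer code):

```python
import numpy as np

# Reference inference-time BatchNorm:
#   y = gamma * (x - running_mean) / sqrt(running_var + eps) + beta
# Folding the statistics into one scale/shift pair per channel is exactly
# what a fused CUDA kernel (or a weight-folding pass) would do.
def batchnorm_infer(x, gamma, beta, running_mean, running_var, eps=1e-5):
    scale = gamma / np.sqrt(running_var + eps)
    shift = beta - running_mean * scale
    return x * scale + shift
```

Since the op reduces to `x * scale + shift`, it can often be folded into an adjacent GEMM or conv instead of being implemented as a standalone layer.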
Hello, I am trying to use the public pedestrian dataset LUPerson (about 2.5 million images) to train a ViT-Base from a DINOv2 pretrain. These are my training script, configuration file, and loss curves, ...