Alexander issues

Results: 7 issues of Alexander

✨[Feature] Support compiling methods other than 'forward'

TRTorch can currently compile and keep only the forward method of a TorchScript module. This is not transparent to users. If users export other methods in TorchScript, and their program is based on...

feature request

component: core

Why is the fan architecture different from the official one?

I reviewed fan.py and found that the architecture of fan differs from the official one. What is the reason for this?

num_priors_ += (aspect_ratios_.size()+1) * (pow(densitys_[i],2)-1)

Is (aspect_ratios_.size()+1) a typo? I think the correct equation is: num_priors_ += aspect_ratios_.size() * (pow(densitys_[i],2)-1)
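To make the gap between the two expressions concrete, here is a minimal Python sketch that evaluates both. The values of `num_aspect_ratios` and `densities` are hypothetical and chosen only for illustration, and the helper names are not from the original code. The sketch computes only the accumulated increment, not any starting value of `num_priors_`, and it does not settle which expression is the intended one.

```python
def num_priors_original(num_aspect_ratios, densities):
    """Total increment using the expression quoted in the issue title."""
    return sum((num_aspect_ratios + 1) * (d ** 2 - 1) for d in densities)


def num_priors_proposed(num_aspect_ratios, densities):
    """Total increment using the expression proposed in the issue body."""
    return sum(num_aspect_ratios * (d ** 2 - 1) for d in densities)


# Hypothetical values, not taken from the issue.
num_aspect_ratios = 2
densities = [4, 2, 1]

original = num_priors_original(num_aspect_ratios, densities)
proposed = num_priors_proposed(num_aspect_ratios, densities)

# The two expressions differ by exactly sum(d**2 - 1) over all densities.
difference = original - proposed
print(original, proposed, difference)
```

With these values the two totals differ by the sum of `d**2 - 1` over all densities, so the choice changes the prior count whenever any density is greater than 1.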

Distillation degrades the accuracy

I added distillation when training resnet18, but the Top-1 accuracy dropped from 68.150% to 67.364%. Hyperparameters are as follows: 4 GPUs; epochs: 90; learning_rate: 0.01; momentum: 0.9; weight_decay: 0.0001; mode: step...

[bug] group_size = 64 has bug

After modifying only these two lines of code, the LLaMa-2-7B model no longer produces correct output. `Generate(kernels_); group_sizes_.push_back(64);` FP16 output: @Input: The first time I saw the movie, I was like, 'Oh my God, _Output: this is so cool.' I was like, INT4...

backlog

channel_wise group or filter_wise group?

In losses.py, `grouped_sum = tf.sqrt(tf.reduce_sum(tf.pow(W,2),axis=[0,1,2]))` looks like a filter-wise group to me, not an input-channel-wise group, but the comment says it is a channel-wise group.
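The point about the reduction axes can be checked with a small sketch. It assumes `W` is a convolution kernel in TensorFlow's usual layout `[filter_height, filter_width, in_channels, out_channels]`; the layout actually used in losses.py is not shown in the issue, so this is an assumption. The helper `grouped_l2_norms` is hypothetical and uses plain Python lists to mirror a reduction over axes 0, 1 and 2.

```python
import math


def grouped_l2_norms(W):
    """Reduce a nested list W[h][w][i][o] over axes 0, 1, 2.

    Returns one L2 norm per index o of the last axis, which under the
    assumed layout is the output-filter axis.
    """
    kh, kw = len(W), len(W[0])
    cin, cout = len(W[0][0]), len(W[0][0][0])
    return [
        math.sqrt(sum(W[h][w][i][o] ** 2
                      for h in range(kh)
                      for w in range(kw)
                      for i in range(cin)))
        for o in range(cout)
    ]


# A 1x1 kernel with 2 input channels and 3 output filters (made-up values).
W = [[[[3.0, 0.0, 1.0],
       [4.0, 0.0, 2.0]]]]

norms = grouped_l2_norms(W)
print(len(norms), norms)
```

Under that layout the result has one entry per output filter (3 here, not 2), which is consistent with reading the expression as a filter-wise group.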

Pruning crash at iteration 592.

@xiamengzhou [batch=592/3200] Train time/batch: 591 Train time/sample: 18912 Train time/batch_in_epoch: 591 Train time/sample_in_epoch: 18912 Train time/token: 77463552 Train time/token_in_epoch: 77463552 Train metrics/train/cc_weight: 0.2292 Train metrics/train/github_weight: 0.0121 Train metrics/train/book_weight: 0.0220 Train...