cccpr
@fmassa Hi, after I trained a model from scratch, I found that the number of epochs seems insufficient, so I reloaded the model and tried to finetune it for a bit...
@chenyuntc is a 4 GB GPU enough?
@zchrissirhcz @longcw So what does the author mean by "tune the loss by ourselves in faster_rcnn.py" in order to reach 0.699 mAP?
@zchrissirhcz https://github.com/ruotianluo/pytorch-faster-rcnn seems to be more consistent with rbg's original Caffe implementation, but I prefer longcw's implementation. Still, I have no idea what to modify...
@longcw @JeffCHEN2017 Does this project support a batch size larger than 1?
@HantingChen If eta is not set to 0.2, for example if we set it to 0.1 or 0.4, will the result be quite different from the 0.2 case?
@HantingChen The paper reports 91.84 for resnet20-cifar10, with no multiplications at all, but in your code the first and last layers are ordinary convolution layers. If the first and last layers were also adder layers, would resnet20-cifar10 still reach 91.84? Or is this a typo in the paper?
@HantingChen Also, on ImageNet and CIFAR, is the effect of eta the same as on MNIST, i.e. only about a 0.2-point fluctuation? Do you have comparison experiments for this?
@HantingChen If you have no plans to open-source a CUDA version of AdderNet for now, could you provide your resnet-imagenet and resnet-cifar training logs for reference?
@StudyingShao There are three `forward` calls in the file you mentioned.
First: https://github.com/NVIDIA/TensorRT-LLM/blob/main/tests/model/test_llama.py#L114 -- this one only builds the TRT network; it is not a real model.forward.
The other two: https://github.com/NVIDIA/TensorRT-LLM/blob/main/tests/model/test_llama.py#L301 and https://github.com/NVIDIA/TensorRT-LLM/blob/main/tests/model/test_llama.py#L381 -- these two...
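To illustrate the build-time vs runtime distinction in a generic way (this is a hedged sketch, not TensorRT-LLM's actual API; `GraphBuilder`, `TinyModel`, and all op names are hypothetical): a graph-building `forward` only records symbolic operations without touching real data, while a runtime forward executes them on concrete inputs.

```python
# Generic sketch of "build-time forward" vs "runtime forward".
# All names here are hypothetical, NOT TensorRT-LLM API.

class GraphBuilder:
    """Records operations symbolically instead of computing them."""
    def __init__(self):
        self.ops = []

    def add(self, name, fn):
        self.ops.append((name, fn))


class TinyModel:
    def forward(self, x, builder=None):
        # Two toy "layers": scale then shift.
        steps = [("scale", lambda v: v * 2), ("shift", lambda v: v + 1)]
        if builder is not None:
            # Build-time forward: only record the ops, no real compute.
            for name, fn in steps:
                builder.add(name, fn)
            return builder
        # Runtime forward: actually execute on the input.
        for _, fn in steps:
            x = fn(x)
        return x


model = TinyModel()

# "Forward" #1: builds the graph, touches no real data.
graph = model.forward(None, builder=GraphBuilder())
print([name for name, _ in graph.ops])  # ['scale', 'shift']

# Forwards #2 and #3: real executions on concrete inputs.
print(model.forward(3))   # 7
print(model.forward(10))  # 21
```

In this toy, the first call returns only a recorded op list, mirroring how the first `forward` in the test constructs the TRT network, while the later calls produce actual outputs.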