BK-SDM
why training ...?
In the paper, you identify the unimportant SD blocks/layers. In that case, retraining may not be necessary, because removing an unimportant block/layer should leave the performance almost unchanged.
Can you explain why you train the model again after removing the unimportant blocks?
Thanks!