Zero-Shot-Detection-via-Vision-and-Language-Knowledge-Distillation about training time

about training time

Open doublejtoh opened this issue 2 years ago • 0 comments

Hi @llrtt, it seems that you have implemented ViLD image distillation by cropping proposals from the original image and forwarding them to the CLIP image encoder. Since every proposal is resized to 224x224, this might be burdensome in terms of training time. How did you deal with it? How long did it take to fully train?
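To make the concern concrete, here is a minimal pure-Python sketch of the crop-and-resize step as I understand it. It is not the repository's code: the function names, the nearest-neighbour resize, and the proposal count of 1000 are all assumptions for illustration only.

```python
def crop_and_resize(image, box, size=224):
    """Crop box=(x0, y0, x1, y1) from `image` (a list of pixel rows) and
    resize the crop to size x size with nearest-neighbour sampling.
    Hypothetical helper, not taken from the repository."""
    x0, y0, x1, y1 = box
    crop_w, crop_h = x1 - x0, y1 - y0
    if crop_w <= 0 or crop_h <= 0:
        raise ValueError("box must have positive width and height")
    out = []
    for i in range(size):
        # Map output row i back to a source row inside the box.
        row = image[y0 + i * crop_h // size]
        # Map each output column j back to a source column inside the box.
        out.append([row[x0 + j * crop_w // size] for j in range(size)])
    return out


def build_proposal_batch(image, boxes, size=224):
    """One resized crop per proposal box; each crop would be one encoder input."""
    return [crop_and_resize(image, box, size) for box in boxes]


def encoder_input_values(num_proposals, size=224, channels=3):
    """Number of input values the image encoder sees for one image's proposals."""
    return num_proposals * channels * size * size


# Assumed figure: 1000 proposals per image (illustrative, not from the repo).
print(encoder_input_values(1000))
```

Under these assumptions the encoder input grows linearly with the number of proposals, regardless of how small each proposal was in the original image, which is why I am asking about training time.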

Sep 08 '22 16:09 doublejtoh
