BUI Van Tuan

2 issues by BUI Van Tuan

1. To match the shape of the image features, change `units` from 256 to 576:

   `model = Captioner(tokenizer, feature_extractor=mobilenet, output_layer=output_layer, units=256, dropout_rate=0.5, num_layers=2, num_heads=2)`

   -->

   `model = Captioner(tokenizer, feature_extractor=mobilenet, output_layer=output_layer, units=576, dropout_rate=0.5, num_layers=2, num_heads=2)`

2. Utilize...

# What does this PR do?

Fixes #38061

When `use_pretrained_backbone=True`, the backbone is initialized only once, which means `_is_hf_initialized=True` is already set on it. The `_initialize_weights` function should therefore do nothing when `_is_hf_initialized=True`....