BRB
Did you use both VOC2007 and VOC2012? Could you share the corresponding results?
`unzip captions_train-val2014.zip -d data/` should be `unzip ./data/captions_train-val2014.zip -d data/`. Not a big deal, anyway.
Change the main training script configuration to: `model = Xnet(backbone_name=config.backbone, input_shape=(config.input_deps, config.input_rows, config.input_cols), n_upsample_blocks=4, decoder_filters=(64,64,128,256,512), encoder_weights=config.weights, decoder_block_type=config.decoder_block_type, classes=config.nb_class, activation=config.activation)` and builder.py in the Xnet model to: `if downterm[i+1] is not None:`...