chainer-caption

Missing JSON in download.sh

Open dljzx opened this issue 5 years ago • 6 comments

The captions file ./data/MSCOCO/MSCOCO/mscoco_caption_train2014_cn_processed.json is not downloaded by the shell script. Or could captions_train2014_cn_translation.json be used instead?

dljzx avatar Mar 28 '19 14:03 dljzx

Also, there's no ..._processed_dic.json in download.sh, which makes my training produce lots of TypeErrors.

dljzx avatar Mar 30 '19 11:03 dljzx

You can generate these files with the preprocessing script in the code directory:

cd ./code/
python preprocess_MSCOCO_captions.py \
--input ../data/MSCOCO/captions_train2014_cn_translation.json \
--output ../data/MSCOCO/captions_train2014_cn_translation_processed.json \
--outdic ../data/MSCOCO/captions_train2014_cn_translation_processed_dic.json \
--outfreq ../data/MSCOCO/captions_train2014_cn_translation_processed_freq.json \
--cut 5 \
--char True
cd ../

Then

python train_caption_model.py --savedir ./experiment1cn --epoch 50 --batch 120 --gpu 0 \
--vocab ./data/MSCOCO/captions_train2014_cn_translation_processed_dic.json \
--captions ./data/MSCOCO/captions_train2014_cn_translation_processed.json
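
Since a missing JSON tends to surface only as a confusing error in the middle of training, it may help to check for the files first. This is a minimal sketch, not part of the repository: `missing_files` is a hypothetical helper, and the two paths are simply the ones passed to `--vocab` and `--captions` above.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not os.path.isfile(p)]


# Paths taken from the training command above; adjust if your layout differs.
required = [
    "./data/MSCOCO/captions_train2014_cn_translation_processed_dic.json",
    "./data/MSCOCO/captions_train2014_cn_translation_processed.json",
]

missing = missing_files(required)
if missing:
    print("Missing files; run the preprocessing step first:")
    for path in missing:
        print("  " + path)
else:
    print("All required files are present.")
```

Run it from the repository root, so the relative paths resolve the same way they do for the training command.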

apple2373 avatar Mar 30 '19 23:03 apple2373

Well, in general, this codebase is outdated and I don't recommend it for training. I put a note in the README.

apple2373 avatar Mar 30 '19 23:03 apple2373

Oh, thank you so much for your reply. I am just a little confused that many of the JSON files you used were not downloaded by download.sh. For example, there are no JSON files suffixed with _dic. Could you tell me where you got them?

dljzx avatar Mar 31 '19 03:03 dljzx

The JSONs suffixed with _dic are at data/MSCOCO. I didn't get them from somewhere else; I generated them myself. You can generate them with code/preprocess_MSCOCO_captions.py.
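
For a rough idea of what such a generation step involves, here is a minimal sketch of building a character-level vocabulary with a frequency cutoff. It is not the code from preprocess_MSCOCO_captions.py. The function `build_vocab`, the `<unk>` token, and the reading of `--cut 5` as "keep tokens seen at least 5 times" and `--char True` as "split captions into characters" are all assumptions made for illustration.

```python
from collections import Counter


def build_vocab(captions, cut=5, char_level=True):
    """Count tokens and keep those seen at least `cut` times (assumed rule)."""
    counts = Counter()
    for caption in captions:
        if char_level:
            # Character-level tokens: drop spaces, one token per character.
            tokens = list(caption.replace(" ", ""))
        else:
            tokens = caption.split()
        counts.update(tokens)

    # "<unk>" is a hypothetical placeholder for tokens below the cutoff.
    vocab = {"<unk>": 0}
    # Sort by descending frequency, then by token, so ids are deterministic.
    for token, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= cut:
            vocab[token] = len(vocab)
    return vocab, dict(counts)


# Toy captions, invented for this example.
captions = ["一只猫", "一只狗", "一只鸟"]
vocab, freq = build_vocab(captions, cut=2)
print(vocab)
print(freq)
```

In this sketch the two returned dictionaries play the roles of the _dic and _freq files, which could then be written out with `json.dump`.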

apple2373 avatar Apr 01 '19 00:04 apple2373

Oh, that's my fault. Thank you so much.

dljzx avatar Apr 04 '19 14:04 dljzx