coco-caption
Python-3.x support
Hi all,
I'd like to know whether you have plans to port the codebase to Python 3. Since most people have switched to Python 3, it would be nice to have Python 3 support so that other projects that depend on coco-caption (e.g. ImageCaptioning PyTorch) can also be implemented in Python 3.
Thanks!
I have ported it to Python 3, but the METEOR metric doesn't work. You can have a look: coco-caption
I have implemented Python 3 support for the evaluation metrics. Have a look at my comment here: https://github.com/ruotianluo/ImageCaptioning.pytorch/issues/36#issuecomment-363442083
I am using my version of the eval tools together with the pycocotools from here: https://github.com/cocodataset/cocoapi
I have created a fork that is both Python 3 compatible and that uses the new Word Mover's Distance metric. It would be nice to merge with this repository.
https://github.com/mtanti/coco-caption
I just modified the code to support Python 3, with support for Chinese. https://github.com/entalent/coco-caption-py3/blob/master/README.md It was created in a hurry...so there might be bugs.
What's the status on this? :)
@rubencart They said "We are currently focusing on more of the object detection / segmentation challenges, and have decided to leave the captioning leaderboard open but not make additional updates to it."
Another pure Python 3.x fork with no support for Python 2 with some tiny bugs fixed as well --> https://github.com/ozancaglayan/coco-caption
Thanks for your contribution. Based on @mtanti's implementation, I modified two places to support METEOR evaluation for both py2 and py3.
- It seems that the code

      score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str))
      self.meteor_p.stdin.write(score_line+'\n')

  does not work on py2, so I changed it to

      if sys.version_info[0] == 2:  # python2
          score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).encode('utf-8').strip()
          self.meteor_p.stdin.write(str(score_line+b'\n'))
      else:  # assume python3+
          score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).strip()
          self.meteor_p.stdin.write(score_line+'\n')

- Add a check in compute_score:

      # The prediction may consist only of punctuation
      # (see the definition of PUNCTUATIONS in pycocoevalcap/tokenizer/ptbtokenizer.py).
      # It then becomes [''] after tokenization, which means res[i][0] == '',
      # and self._stat fails on this input.
      if len(res[i][0]) == 0:
          res[i][0] = 'a'
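To make the second change easier to try in isolation, here is a minimal sketch of the same guard as a standalone function. The name replace_empty_predictions and the placeholder parameter are hypothetical and not part of coco-caption. Unlike the in-place check above, this version returns a new dict and also treats whitespace-only captions as empty, which is an assumption on my side.

```python
def replace_empty_predictions(res, placeholder='a'):
    """Return a copy of res in which empty predictions become placeholder.

    res maps an image id to a one-element list holding the tokenized caption,
    the same shape that compute_score expects.
    """
    fixed = {}
    for img_id, captions in res.items():
        assert len(captions) == 1
        caption = captions[0]
        # An all-punctuation caption is '' after tokenization; swap it for
        # the placeholder so the scorer always receives at least one token.
        fixed[img_id] = [caption if caption.strip() else placeholder]
    return fixed


res = {1: ['a dog runs'], 2: ['']}
cleaned = replace_empty_predictions(res)
```

The input dict is left untouched, so the original predictions can still be inspected after scoring.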
The complete code of meteor.py is as follows:
    #!/usr/bin/env python

    # Python wrapper for METEOR implementation, by Xinlei Chen
    # Acknowledge Michael Denkowski for the generous discussion and help
    from __future__ import absolute_import
    from __future__ import division
    from __future__ import print_function

    import os
    import sys
    import subprocess
    import threading

    # Assumes meteor-1.5.jar is in the same directory as meteor.py. Change as needed.
    METEOR_JAR = 'meteor-1.5.jar'
    # print METEOR_JAR


    class Meteor:

        def __init__(self):
            self.env = os.environ
            self.env['LC_ALL'] = 'en_US.UTF_8'
            self.meteor_cmd = ['java', '-jar', '-Xmx2G', METEOR_JAR,
                               '-', '-', '-stdio', '-l', 'en', '-norm']
            self.meteor_p = subprocess.Popen(self.meteor_cmd,
                                             cwd=os.path.dirname(os.path.abspath(__file__)),
                                             stdin=subprocess.PIPE,
                                             stdout=subprocess.PIPE,
                                             stderr=subprocess.PIPE,
                                             env=self.env, universal_newlines=True, bufsize=1)
            # Used to guarantee thread safety
            self.lock = threading.Lock()

        def compute_score(self, gts, res):
            assert(gts.keys() == res.keys())
            imgIds = sorted(list(gts.keys()))
            scores = []

            eval_line = 'EVAL'
            self.lock.acquire()
            for i in imgIds:
                assert(len(res[i]) == 1)
                # The prediction may consist only of punctuation
                # (see the definition of PUNCTUATIONS in pycocoevalcap/tokenizer/ptbtokenizer.py).
                # It then becomes [''] after tokenization, which means res[i][0] == '',
                # and self._stat fails on this input.
                if len(res[i][0]) == 0:
                    res[i][0] = 'a'
                stat = self._stat(res[i][0], gts[i])
                eval_line += ' ||| {}'.format(stat)

            # Send to METEOR
            self.meteor_p.stdin.write(eval_line + '\n')

            # Collect segment scores
            for i in range(len(imgIds)):
                score = float(self.meteor_p.stdout.readline().strip())
                scores.append(score)

            # Final score
            final_score = float(self.meteor_p.stdout.readline().strip())
            self.lock.release()

            return final_score, scores

        def method(self):
            return "METEOR"

        def _stat(self, hypothesis_str, reference_list):
            # SCORE ||| reference 1 words ||| reference n words ||| hypothesis words
            # Remove the field separator and collapse double spaces in the hypothesis
            hypothesis_str = hypothesis_str.replace('|||', '').replace('  ', ' ')
            if sys.version_info[0] == 2:  # python2
                score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).encode('utf-8').strip()
                self.meteor_p.stdin.write(str(score_line+b'\n'))
            else:  # assume python3+
                score_line = ' ||| '.join(('SCORE', ' ||| '.join(reference_list), hypothesis_str)).strip()
                self.meteor_p.stdin.write(score_line+'\n')
            return self.meteor_p.stdout.readline().strip()

        def __del__(self):
            self.lock.acquire()
            self.meteor_p.stdin.close()
            self.meteor_p.kill()
            self.meteor_p.wait()
            self.lock.release()
Your code assumes that there will only ever be Python versions 2 and 3. Don't assume that if the version is not 3 then it is 2. Instead, check whether it is 2, and if it is not, assume that the code for version 3 will keep working in future versions as well. So switch your if/else around to 'if sys.version_info[0] == 2: ... else: ...'.
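A minimal sketch of that check, for anyone who wants to see the pattern on its own. The helper names is_python2 and prepare_line are hypothetical, and the optional version_info argument exists only so that the branch can be exercised without switching interpreters.

```python
import sys


def is_python2(version_info=None):
    """Return True only for Python 2; every other major version is 'new'."""
    if version_info is None:
        version_info = sys.version_info
    return version_info[0] == 2


def prepare_line(line, version_info=None):
    """Return the line in the form the legacy or the modern branch writes."""
    if is_python2(version_info):
        # Legacy branch: encode the text to UTF-8 bytes before writing.
        return line.encode('utf-8')
    # Default branch: Python 3 and any later major version pass text through.
    return line


modern = prepare_line('SCORE ||| a dog ||| a dog runs')
```

Because the special case is the old version, a hypothetical future major version falls into the default branch instead of being mistaken for Python 2.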
Python 2 will be end-of-life next year. Why do you bother supporting it still?
Thanks @mtanti for pointing it out! I've modified the code. @ozancaglayan Some of the repositories I use originally supported only Python 2, so I am moving to Python 3 and switching between the two versions to test the performance.
Thanks, your solution helped me solve the problem of proc.stdout.readline() hanging!
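For readers who hit the same hang: one common cause, which may or may not be the one encountered here, is that the line written to the child process stays in a buffer or is written in the wrong string type, so the child never answers and readline() blocks. The sketch below shows the request/response pattern with a small Python child standing in for the METEOR jar. The child script and the ask helper are purely illustrative.

```python
import subprocess
import sys

# Stand-in child: echoes every input line in upper case and flushes at once.
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)

proc = subprocess.Popen(
    [sys.executable, '-u', '-c', CHILD],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,  # text-mode pipes, as in meteor.py
    bufsize=1,                # line buffered
)


def ask(line):
    """Send one line and block until the child answers with one line."""
    proc.stdin.write(line + '\n')
    proc.stdin.flush()  # without a flush the child may never see the line
    return proc.stdout.readline().strip()


reply = ask('score ||| a dog ||| a dog runs')

# Closing stdin lets the child reach end-of-file and exit cleanly.
proc.stdin.close()
proc.stdout.close()
proc.wait()
```

Every write must be terminated by a newline and matched by exactly one readline(), otherwise the two sides end up waiting for each other.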
I just stumbled across this and our https://github.com/Maluuba/nlg-eval supports Python 3
It has been 3 years since I first commented and a lot has changed in the meantime. So, I'm now working with a much more elegant toolkit, facebookresearch/vizseq, which supports visualization with extension to multiple modalities (video, audio) and more recent embedding-based metrics.