label-studio-converter
url-encode file names in order to create valid file URLs?
First of all, I would like to thank everyone involved in creating this utility and such a cool tool as label-studio :)
Everything works great, apart from one minor issue.
I tried to run `label-studio-converter import yolo` on a dataset with an image whose name contained a `+` sign. It seems that, to create file URLs, the utility leaves file names as-is and just prepends `image-root-url`. So for that file it produced a URL which also contained a `+` sign, and the URL turned out to be invalid. Manually changing `+` to `%2B` in this URL works.
Hence my suggestion: URL-encode all file names, either always or via a CLI flag if there are use cases where encoding is not needed.
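A minimal sketch of the suggestion, assuming the root URL is already valid and only the file name needs encoding. The helper name `encode_file_name` and the example root URL are hypothetical and are not part of label-studio-converter:

```python
from urllib.parse import quote


def encode_file_name(image_root_url: str, file_name: str) -> str:
    """Hypothetical helper: percent-encode only the file name, then prepend the root."""
    # safe="" also encodes "/", so the file name cannot introduce extra path segments.
    encoded_name = quote(file_name, safe="")
    return image_root_url.rstrip("/") + "/" + encoded_name


# Example root URL is an assumption made for illustration only.
print(encode_file_name("http://localhost:8080/images", "foo+you.jpg"))
# prints: http://localhost:8080/images/foo%2Byou.jpg
```

Encoding only the file name leaves the scheme and host of the root URL untouched.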
Thanks!
@nicky1038 Great thanks! Could you contribute it and make a PR? It's here: https://github.com/heartexlabs/label-studio-converter/blob/master/label_studio_converter/imports/yolo.py#L60
@makseq This change introduced a regression when the root URL has the form `s3://bucket_name`: the `:` gets encoded as well.
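The regression can be reproduced in isolation. This sketch uses `urllib.parse.quote` as a stand-in, on the assumption that `pathname2url` percent-encodes a POSIX path in much the same way; the file name is made up for illustration:

```python
from urllib.parse import quote

root = "s3://bucket_name"
file_name = "foo+you.jpg"  # hypothetical file name

# Encoding the joined string treats the scheme separator as ordinary path text.
naive = quote(root + "/" + file_name)
print(naive)
# prints: s3%3A//bucket_name/foo%2Byou.jpg
```

The `+` is encoded as intended, but the `:` of `s3://` becomes `%3A`, so the result is no longer an `s3://` URL.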
@ligaz thank you for pointing me to this issue. Can you create a PR with a fix? It seems like we need something like
"image": pathname2url(os.path.join(image_root_url, image_file_base))  # e.g. '../../foo+you.py' -> '../../foo%2Byou.py'
=>
if '://' in image_root_url:
    scheme, root_url = image_root_url.split('://', 1)
    prefix = scheme + '://'  # keep the separator, otherwise 's3://bucket' becomes 's3bucket'
else:
    prefix = ''
    root_url = image_root_url
"image": prefix + pathname2url(os.path.join(root_url, image_file_base))  # e.g. '../../foo+you.py' -> '../../foo%2Byou.py'
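The idea above can be written as a self-contained function. This is only a sketch under stated assumptions: `build_image_url` is a hypothetical name, `urllib.parse.quote` stands in for `pathname2url`, and `posixpath.join` is used so the separator is always `/` regardless of platform:

```python
import posixpath
from urllib.parse import quote


def build_image_url(image_root_url: str, image_file_base: str) -> str:
    """Hypothetical helper: encode the path but leave any 'scheme://' prefix alone."""
    scheme, sep, rest = image_root_url.partition("://")
    if sep:
        # Root URL has a scheme (e.g. 's3://bucket_name'): keep 'scheme://' verbatim.
        prefix, root = scheme + sep, rest
    else:
        # Plain or relative path: nothing to protect.
        prefix, root = "", image_root_url
    return prefix + quote(posixpath.join(root, image_file_base))


print(build_image_url("s3://bucket_name", "foo+you.jpg"))
# prints: s3://bucket_name/foo%2Byou.jpg
print(build_image_url("../..", "foo+you.py"))
# prints: ../../foo%2Byou.py
```

One limitation of this approach: everything after `://` is still encoded, so a root URL with a port, such as `http://localhost:8080/data`, would have its `:` turned into `%3A`. Passing `safe="/:"` to `quote` is one possible way around that, if it suits the use case.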