label-studio-converter

url-encode file names in order to create valid file URLs?

Open nicky1038 opened this issue 2 years ago • 3 comments

First of all, I would like to thank everyone involved in creating this utility, and label-studio as a whole, which is such a cool tool :)

Everything works great, apart from this minor issue.

I tried to run label-studio-converter import yolo on a dataset containing an image whose name included a + sign. It seems that, to create file URLs, the utility leaves file names as-is and simply prepends image-root-url. For that file it therefore produced a URL that also contained a + sign, and the URL turned out to be invalid. Manually changing + to %2B in the URL makes it work.

Hence my suggestion: URL-encode all file names, either always or behind a CLI flag if there are use cases where encoding is not wanted.
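A minimal sketch of the encoding being asked for, assuming plain string concatenation of the root URL and the file name. The helper name `build_image_url` and the example root URL are hypothetical and not part of label-studio-converter; only `urllib.parse.quote` is a real standard-library call.

```python
from urllib.parse import quote


def build_image_url(image_root_url: str, image_file_base: str) -> str:
    """Hypothetical helper: prepend the root URL to a percent-encoded file name."""
    # quote() leaves letters, digits and '_.-~/' untouched and
    # percent-encodes everything else, so '+' becomes '%2B'.
    return image_root_url.rstrip("/") + "/" + quote(image_file_base)


# Assumed example values, for illustration only.
print(build_image_url("http://localhost:8080/images", "foo+bar.jpg"))
# http://localhost:8080/images/foo%2Bbar.jpg
```

Only the file name is encoded here, so the root URL passes through unchanged.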

Thanks!

nicky1038, Nov 08 '22 01:11

@nicky1038 Great thanks! Could you contribute it and make a PR? It's here: https://github.com/heartexlabs/label-studio-converter/blob/master/label_studio_converter/imports/yolo.py#L60

makseq, Nov 10 '22 02:11

@makseq This change introduced a regression if the root URL is in the form of s3://bucket_name - the : gets encoded as well.
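The regression can be reproduced in isolation with a short sketch. It uses `urllib.parse.quote` as a stand-in, on the assumption that it percent-encodes the same way `pathname2url` does on POSIX systems. The bucket and file names are made up for illustration.

```python
from urllib.parse import quote

# Assumed example values, for illustration only.
image_root_url = "s3://bucket_name"
image_file_base = "foo+bar.jpg"

# Encoding the whole joined string also encodes the ':' of the scheme.
encoded = quote(image_root_url + "/" + image_file_base)
print(encoded)
# s3%3A//bucket_name/foo%2Bbar.jpg
```

The + in the file name is encoded as intended, but the scheme separator is no longer `://`, so the result is not a usable s3 URL.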

ligaz, Jan 18 '23 17:01

@ligaz thank you for pointing me to this issue. Can you create a PR with a fix? It seems we need to change something like

"image": pathname2url(os.path.join(image_root_url, image_file_base))  #eg. '../../foo+you.py' -> '../../foo%2Byou.py'

=>

if '://' in image_root_url:
  scheme, root_url = image_root_url.split('://', 1)  # split only on the first '://'
  prefix = scheme + '://'  # put the separator back, otherwise 's3://' becomes 's3'
else:
  prefix = ''
  root_url = image_root_url
"image": prefix + pathname2url(os.path.join(root_url, image_file_base))  # e.g. '../../foo+you.py' -> '../../foo%2Byou.py'

makseq, Jan 24 '23 02:11
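The scheme-preserving idea in the last comment can be sketched as a standalone function. This is only an illustration under stated assumptions, not the fix that was merged: the name `encode_image_url` is hypothetical, and `urllib.parse.quote` is used in place of `pathname2url` so that the behavior does not depend on the operating system.

```python
from urllib.parse import quote


def encode_image_url(image_root_url: str, image_file_base: str) -> str:
    """Hypothetical helper: percent-encode the path but keep 'scheme://' intact."""
    # partition() splits on the first '://' only; sep is '' when there is no scheme.
    scheme, sep, rest = image_root_url.partition("://")
    if not sep:
        # No scheme present: treat the whole root as a path.
        scheme, rest = "", image_root_url
    path = rest.rstrip("/") + "/" + image_file_base
    # quote() keeps '/' and encodes characters such as '+' and ':'.
    return scheme + sep + quote(path)


# Assumed example values, for illustration only.
print(encode_image_url("s3://bucket_name", "foo+bar.jpg"))
# s3://bucket_name/foo%2Bbar.jpg
print(encode_image_url("/data/images", "foo+bar.jpg"))
# /data/images/foo%2Bbar.jpg
```

Note that everything after the scheme is still encoded, so a root URL with a port (for example `host:8080`) or a query string would still be mangled. Encoding only the file name may be the safer choice in those cases.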