Need File.from_uri() method

Open volkfox opened this issue 1 year ago • 0 comments

Description

There are multiple situations in which a dataset needs a new file object introduced.

For example, reading a dataset like Google Open Images produces a URI column that needs to be converted to File. As another example, a bounding-box crop extracted from an image can be saved locally and then added to the dataset.

In all these cases, the dataset needs a new File/ImageFile/TextFile object created from a URI, and a method to automate this, similar to:

TextFile.from_uri(uri, resolve=True)

where uri can be any supported locator (local, cloud, https), and the resolve argument determines whether the URI is checked and additional information is gathered (e.g. file size).
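To make the request concrete, here is a minimal sketch of how such a constructor could behave. It is not datachain's implementation: the classes SketchFile and SketchTextFile and the fields source, path and size are hypothetical stand-ins, POSIX-style local paths are assumed, and only local files are resolved. A real version would need a storage client to check cloud and https locations.

```python
import os
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass
class SketchFile:
    """Hypothetical stand-in for a file record; not datachain's real File class."""

    source: str                 # storage root, e.g. "s3://bucket" or "file://"
    path: str                   # path within that source
    size: Optional[int] = None  # filled in only when resolve=True

    @classmethod
    def from_uri(cls, uri: str, resolve: bool = False) -> "SketchFile":
        parsed = urlparse(uri)

        # Plain paths and file:// URIs are treated as local files.
        if parsed.scheme in ("", "file"):
            local_path = os.path.abspath(parsed.path if parsed.scheme else uri)
            obj = cls(source="file://", path=local_path)
            if resolve:
                # os.stat raises FileNotFoundError if the file does not exist,
                # which doubles as the "is the URI valid" check.
                obj.size = os.stat(local_path).st_size
            return obj

        # Remote locators (s3://, gs://, https://, ...) are only split into
        # source and path here; checking them needs a storage client.
        obj = cls(
            source=f"{parsed.scheme}://{parsed.netloc}",
            path=parsed.path.lstrip("/"),
        )
        if resolve:
            raise NotImplementedError(
                "resolving remote URIs is outside the scope of this sketch"
            )
        return obj


class SketchTextFile(SketchFile):
    """Hypothetical subclass; from_uri returns an instance of the calling class."""
```

Because from_uri is a classmethod that builds the object through cls, calling SketchTextFile.from_uri("s3://bucket/dir/a.txt") yields a SketchTextFile, which mirrors the File/ImageFile/TextFile variants mentioned above.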

volkfox, Aug 14 '24 21:08