chroma icon indicating copy to clipboard operation
chroma copied to clipboard

ENH: updated the ImageLoader data loader to work with local URIs and remote URLs

Open ahron1 opened this issue 1 year ago • 4 comments

Updated ImageLoader in data_loaders.py to accept images from both local URIs and remote URLs. Raised as feature request

Description of changes

  • Improvements & Bug fixes
    • No bug fixes. 1 improvement (new functionality, described below)
  • New functionality
    • Currently, the ImageLoader data loader only accepts local URIs. When working with multimodal embeddings, many users need to pass remote images as URLs. This PR makes a small update to load_image in class ImageLoader in utils/data_loaders.py to detect whether the passed URI is a remote URL. If it is, then it fetches the remote URL's data and then processes it. If it is not a remote URL, it works the same as before (currently).

Test plan

How are these changes tested?

  • By running pytest locally.

Documentation Changes

Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs repository? Happy to update the documentation myself after the PR is merged.

ahron1 avatar Jun 22 '24 14:06 ahron1

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • [ ] Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • [ ] Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • [ ] If appropriate, are there adequate property based tests?
  • [ ] If appropriate, are there adequate unit tests?
  • [ ] Should any logging, debugging, tracing information be added or removed?
  • [ ] Are error messages user-friendly?
  • [ ] Have all documentation changes needed been made?
  • [ ] Have all non-obvious changes been commented?

System Compatibility

  • [ ] Are there any potential impacts on other parts of the system or backward compatibility?
  • [ ] Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • [ ] Is this code of a unexpectedly high quality (Readability, Modularity, Intuitiveness)

github-actions[bot] avatar Jun 22 '24 14:06 github-actions[bot]

@tazarov Ok, done.

There's now a new ImageURLoader class.

It works without auth data_loaders.ImageURLoader("https://images.pexels.com/photos/10981514/pexels-photo-10981514.jpeg")

and also with auth: data_loaders.ImageURLoader("https://url_to_image", "username", "password")

Please check.

ahron1 avatar Jun 25 '24 16:06 ahron1

@tazarov Hi, any update on this yet?

ahron1 avatar Jul 09 '24 11:07 ahron1

@tazarov Hi Trayan, any remaining concerns before merging this?

ahron1 avatar Aug 08 '24 09:08 ahron1

@ahron1 why are there certs checked into this PR?

jeffchuber avatar Sep 16 '24 00:09 jeffchuber

@ahron1 why are there certs checked into this PR?

@jeffchuber Not sure tbh, must have been by mistake. Just removed the certs and pushed a new commit.

ahron1 avatar Sep 16 '24 04:09 ahron1