[Feature] Google Drive Folder support as a data source

Open JoeSL opened this issue 1 year ago • 2 comments

Description

This change lets embedchain users use a Google Drive folder as a data source. It does so by wrapping LangChain's GoogleDriveLoader.
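
For readers who want a feel for the shape of such a wrapper, here is a minimal sketch. It is not the code in this PR. The class name `DriveFolderLoaderSketch`, the helper `extract_folder_id`, the URL pattern, and the returned dictionary layout are all assumptions made for illustration. The only real API used is LangChain's `GoogleDriveLoader` with its `folder_id` argument, and the import path may differ between LangChain versions.

```python
import re
from typing import Any, Callable

# Assumed URL shape for a Drive folder link; other link forms are not handled.
_FOLDER_URL = re.compile(
    r"^https://drive\.google\.com/drive/(?:u/\d+/)?folders/([A-Za-z0-9_-]+)"
)


def extract_folder_id(url: str) -> str:
    """Return the folder ID embedded in a Google Drive folder URL."""
    match = _FOLDER_URL.match(url)
    if match is None:
        raise ValueError(f"Not a Google Drive folder URL: {url!r}")
    return match.group(1)


def _default_loader_factory(folder_id: str) -> Any:
    # Imported lazily so the sketch can be exercised without LangChain installed.
    # The import path is an assumption and depends on the LangChain version.
    from langchain_community.document_loaders import GoogleDriveLoader

    return GoogleDriveLoader(folder_id=folder_id)


class DriveFolderLoaderSketch:
    """Hypothetical wrapper that turns a folder URL into plain text records."""

    def __init__(
        self, loader_factory: Callable[[str], Any] = _default_loader_factory
    ) -> None:
        # The factory is injectable so tests can pass a fake loader.
        self._loader_factory = loader_factory

    def load_data(self, url: str) -> dict:
        folder_id = extract_folder_id(url)
        documents = self._loader_factory(folder_id).load()
        # Each LangChain document exposes page_content and metadata.
        data = [
            {"content": doc.page_content, "meta_data": {**doc.metadata, "url": url}}
            for doc in documents
        ]
        return {"folder_id": folder_id, "data": data}
```

Injecting the loader factory keeps the wrapper testable without Google credentials, since a unit test can substitute a stub that returns canned documents.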

Fixes #525

Type of change

Please delete options that are not relevant.

  • [x] New feature (non-breaking change which adds functionality)

How Has This Been Tested?

This feature was tested by adding unit tests and by running the client code from the documentation with data_type set to "google_drive_folder". The Google API credentials must be set up correctly for this feature to work.
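
Since the credentials setup is the step most likely to go wrong, a small pre-flight check can help before adding the data source. The sketch below is illustrative only: `find_google_credentials` is a hypothetical helper, and the default path `~/.credentials/credentials.json` is an assumption about where the credentials file lives. `GOOGLE_APPLICATION_CREDENTIALS` is the standard Google environment variable for pointing at a credentials file.

```python
import os
from pathlib import Path
from typing import Iterable, Optional

# Assumed default location; adjust to wherever your credentials file is stored.
DEFAULT_CANDIDATES = ("~/.credentials/credentials.json",)


def find_google_credentials(
    candidates: Optional[Iterable[str]] = None,
) -> Optional[Path]:
    """Return the first credentials file that exists, or None if none is found."""
    paths = []
    env_path = os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if env_path:
        # The environment variable takes precedence over the candidate paths.
        paths.append(Path(env_path))
    chosen = DEFAULT_CANDIDATES if candidates is None else candidates
    paths.extend(Path(p).expanduser() for p in chosen)
    for path in paths:
        if path.is_file():
            return path
    return None
```

This only checks that a file is present. It does not validate the file's contents or the OAuth scopes granted to it.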

Please delete options that are not relevant.

  • Unit Test

Checklist:

  • [x] My code follows the style guidelines of this project
  • [x] I have performed a self-review of my own code
  • [x] I have commented my code, particularly in hard-to-understand areas
  • [x] I have made corresponding changes to the documentation
  • [x] My changes generate no new warnings
  • [x] I have added tests that prove my fix is effective or that my feature works
  • [x] New and existing unit tests pass locally with my changes
  • [ ] Any dependent changes have been merged and published in downstream modules
  • [x] I have checked my code and corrected any misspellings

Maintainer Checklist

  • [x] closes #525 (Replace xxxx with the GitHub issue number)
  • [x] Made sure Checks passed

JoeSL avatar Jan 02 '24 20:01 JoeSL

Thanks @JoeSL for the PR. This is going to be really useful for the community.

Can you please do the following before we can review the PR:

  • Resolve conflicts
  • Rename the files, classes and other things from Google Drive Folder to Google Drive? We want to keep the effort low for the users when adding data sources. So, google_drive_folder will become google_drive, GoogleDriveFolderChunker will become GoogleDriveChunker and so on.

Thanks again for this PR.

deshraj avatar Jan 02 '24 20:01 deshraj

Hello @deshraj, kindly find the PR updated with the requested changes.

JoeSL avatar Jan 03 '24 13:01 JoeSL

Codecov Report

Attention: 18 lines in your changes are missing coverage. Please review.

Comparison is base (ae2e9cb) 57.81% compared to head (2b00e3c) 57.89%. Report is 3 commits behind head on main.

Files                                Patch %   Lines
embedchain/loaders/google_drive.py   50.00%    16 Missing :warning:
embedchain/utils.py                  71.42%    2 Missing :warning:

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1106      +/-   ##
==========================================
+ Coverage   57.81%   57.89%   +0.07%     
==========================================
  Files         131      133       +2     
  Lines        5149     5201      +52     
==========================================
+ Hits         2977     3011      +34     
- Misses       2172     2190      +18     

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jan 05 '24 06:01 codecov[bot]

Looks good. Thanks for adding this feature.

The pleasure is mine. Embedchain is a great framework! Please let me know if you have any other issues I can help with.

JoeSL avatar Jan 05 '24 11:01 JoeSL