
Project goals

EricTheMagician opened this issue on Jun 01 '18 · 22 comments

The goal of this project is to provide a quality FUSE filesystem experience with Google Drive: one that supports both upload and download, supports Team Drives, works well for home users with asymmetric internet connections, and, most importantly, bears the seal of approval from the Wife.

What currently works: file upload and download (writing to / reading from GDrive).

  • Upload supports resuming, and files can be "streamed" while in the upload queue. Writing in the middle of a file is currently not supported, which is normally fine if you plan on using it for backups and storage rather than as a general-purpose filesystem.

  • Downloads can be cached to disk. Currently, the disk cache is only checked once, when the program first starts. This is the feature currently being worked on.

  • Updating the filesystem when an external change has occurred on GDrive.

  • If you would like to see a feature, please post in this issue for now.

Currently, the list of features I plan on implementing (probably in this order):

  • [x] Delete items when the download cache is full, not only on start up.
  • [ ] Headless login. This is technically already supported, but not cleanly: you can open the link on a different computer, then wget or curl the redirected localhost link from the headless machine.
  • [x] Write documentation for building DriveFS.
  • [x] Link mongocxx statically with DriveFS, since it's a pain to compile and most OSes don't package it
  • [x] Write a README
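The headless-login item above can be sketched as follows. The port number, query parameters, and redirect URL are invented for illustration; only the shape of the workflow comes from the description above.

```shell
# Hypothetical sketch of the headless-login workaround:
# 1. Open the OAuth link in a browser on any machine. Google then
#    redirects that browser to a localhost URL the headless box never sees.
# 2. Copy that redirect URL and replay it on the headless machine.
# The URL below is an invented example; only its shape matters.
redirect='http://localhost:7878/?code=4/abc123&scope=drive'

# Pull the one-time auth code out of the query string.
code="${redirect#*code=}"
code="${code%%&*}"
echo "$code"

# On the headless machine, replay the redirect so DriveFS receives it:
# curl "http://localhost:7878/?code=${code}"
```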

Long term goals (no particular order, but rclone support has higher priority):

  • [ ] Support rclone encryption natively. (This means you should be able to decrypt your filesystem without running a separate rclone process on top of DriveFS.) This is a very long-term goal and comes with its own set of challenges, particularly since I'm not fluent in Go and it's difficult for me to go through the rclone code.
  • [ ] Encfs support
  • [ ] Define a folder id as root
  • [ ] Support resizing of the downloaded cache files.
  • [ ] Support pinning files to drive.
  • [ ] Adding DriveFS to the Arch User Repository

EricTheMagician avatar Jun 01 '18 03:06 EricTheMagician

This is awesome! I am currently still running your GDriveF4JS as my main mount for GDrive. In my opinion it is still the best available option for mounting GDrive on Linux.

I love the idea of native rclone encryption, but could I request you also look into native encfs support? That is the encryption method I am using, and I know others do as well.

Thanks for all your work.

somerandom48 avatar Jun 01 '18 05:06 somerandom48

Good to know that people still use my node-gdrive project. I found it had way too many issues for me and was unstable, so this one should be much more stable and run much more smoothly.

For encfs, it should be much simpler. The current incarnation of DriveFS is based on my acdfs + encfs, which I never released because it was also unstable: I had a bug in the encfs portion. The name encoding/decoding worked, but the reading/writing was slightly different, which made it incompatible.

~~I'll probably just commit the encfs code as is at some point, and if someone wants to look into it, a pull request would be more than appreciated.~~ Actually, the code is lost. It would have been a good reference for both encfs and rclone. So I might do encfs first, since that's already in C/C++, but man, it was not fun to work through the SSL library.

EricTheMagician avatar Jun 01 '18 12:06 EricTheMagician

Really looking forward to this! I also found that your GDriveF4JS has been by far the best. Plus, it was the easiest for me to make changes to; after making a few small adjustments it is stable as a rock! My only problem with your old solution was the ls speeds.

I assume that is because of the FUSE libraries you had to use? Will this version be faster in that regard?

hjone72 avatar Jun 02 '18 02:06 hjone72

Thanks @hjone72!

Can you elaborate on the speed of ls? I haven't used it in a while.

I don't recall it being slow, so if you could give a more concrete example, I can compare on my current setup. I'm using rclone to decrypt, which slows it down, but natively it's quite fast. I'm also running it in debug mode under gdb.

Finally, since people are still using the node version: what write throughput are you getting when copying a file to node-gdrive? On my test laptop it's very fast since I have an SSD, but on my VPS, which has a mechanical disk and runs unRAID, it's very slow, around 10 MB/s.


EricTheMagician avatar Jun 02 '18 02:06 EricTheMagician

Hmm, not sure how best to explain it. I have a fairly large GDrive with lots of files and folders (200,000+). When software scans over those files, it can take quite a significant amount of time, and running ls -R can halt for various periods. This could also be down to the nature of node (being single threaded). I guess the slowness mostly occurs when trying to perform many actions at once.

Download speed is the most important for me, and the download speed with the current node version has been fantastic! I don't use the upload feature so can't comment.

hjone72 avatar Jun 02 '18 03:06 hjone72

@hjone72 OK, I see. I don't have such a large library on GDrive (~10k files). With gdb attached for debugging and the code compiled in debug mode (not reldeb), it takes about 3 seconds for find to list all the files, and ~0.5 seconds if I pipe the output to wc (word count).
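The timing comparison above can be reproduced on any mount; `MOUNTPOINT` is a placeholder for wherever the filesystem is mounted, and piping to `wc` avoids the terminal-output cost of printing every path.

```shell
# Reproduce the listing benchmark described above.
# MOUNTPOINT is a placeholder; point it at the DriveFS mount.
MOUNTPOINT=${MOUNTPOINT:-.}

time find "$MOUNTPOINT" > /dev/null   # walk every entry, discard output
find "$MOUNTPOINT" | wc -l            # walk and count entries instead
```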

In the past I had ~100k files on ACD, and with a similar filesystem I got it close to ~5 seconds to go through all those files with code compiled in release. I think I got it down to about 1 second once I stopped using smart pointers.

So I'll go right ahead and assume that when you say a significant amount of time, it's way more than 10 seconds. Maybe 10-15 minutes, if not more?

EricTheMagician avatar Jun 02 '18 05:06 EricTheMagician

I have never actually let it finish, because it has taken so long. The very fast times you are talking about sound amazing! Happy to run tests for you on my larger drive when you get to that stage.

As I said before, the most important things for me are having a cache and high download speed.

Really looking forward to trying this out!

hjone72 avatar Jun 02 '18 06:06 hjone72

I found that when using the high-level libfuse API from node, listing files is orders of magnitude faster than with the low-level API. I put it down to the fact that the high-level API only calls into node a few times, whereas the low-level API is constantly calling hooks within node. However, the high-level API being synchronous is terrible for reads/writes due to all the IO blocking, so I found neither really ideal. :(

Just my two cents.

Gawthorne avatar Jul 03 '18 10:07 Gawthorne

It's not using node. It's pure C++, so there shouldn't be any performance problems with the low-level API here.

I wrote some build instructions on the wiki: https://github.com/thejinx0r/DriveFS/wiki/Compiling

I'm still testing it out. Since I'm mostly testing uploads right now, I'm not sure how it will behave once the cache is full while downloading.

It should clear it, but I haven't tested that yet. Just note that it will show a message that the cache is full and start deleting, though it will only update the size once it has cleared up ~10% of the cache. So if you download or upload while the cache is clearing, it will look like the cache is growing, when it really isn't.
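The clearing behaviour described above (evict until roughly 10% of the cache budget is free again, oldest files first) could be sketched like this. The cache directory, file sizes, and byte limit are all invented for the demo; DriveFS tracks its cache through MongoDB rather than by scanning disk, so this only illustrates the policy, not the implementation.

```shell
# Sketch of the eviction policy described above: when the cache exceeds
# its limit, delete least-recently-accessed files until ~10% of the
# limit is free again. The directory and sizes are made up for the demo.
CACHE_DIR=$(mktemp -d)
for i in 1 2 3 4 5; do head -c 10000 /dev/zero > "$CACHE_DIR/f$i"; done

LIMIT=30000                   # pretend cache budget, in bytes
TARGET=$((LIMIT * 9 / 10))    # evict down to 90% of the budget

used=$(du -sb "$CACHE_DIR" | cut -f1)
if [ "$used" -gt "$LIMIT" ]; then
    echo "cache full (${used}B > ${LIMIT}B), evicting"
    # Oldest-accessed files first; demo paths contain no spaces.
    find "$CACHE_DIR" -type f -printf '%A@ %s %p\n' | sort -n |
    while read -r _atime size path; do
        [ "$used" -le "$TARGET" ] && break
        rm -f "$path"
        used=$((used - size))
    done
fi
ls "$CACHE_DIR" | wc -l       # files surviving eviction
```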

EricTheMagician avatar Jul 03 '18 15:07 EricTheMagician

I was referring to the slow-downs hjone72 was facing when using your node-gdrive project.

But awesome! I attempted to get this to build a little while ago but couldn't find the special sauce to get it to compile. Thanks for the instructions.


Gawthorne avatar Jul 03 '18 21:07 Gawthorne

Wondering if there is any chance you could post a sample config file?

hjone72 avatar Jul 10 '18 11:07 hjone72

@hjone72 https://github.com/thejinx0r/DriveFS/wiki/Sample-Config

EricTheMagician avatar Jul 10 '18 15:07 EricTheMagician

Maybe you could add "Adding DriveFS to the AUR" as a long term goal?

BerriJ avatar Aug 19 '18 12:08 BerriJ

Hello, looking forward to using this project. I used your node version before switching to rclone a while back, but it was not the best regarding stability. I just had a quick question I could not find answered in the README or in the code: does this mount refresh the list of dirs/files automatically if files were uploaded to GDrive from a different machine?

Thanks

mcadam avatar Aug 25 '18 15:08 mcadam

Yes. It's every five minutes.


EricTheMagician avatar Aug 25 '18 15:08 EricTheMagician

Hello, another quick question about design and implementation. I am currently running a cluster and mounting GDrive on each node. Could I use the same MongoDB for all of them, or would that pose issues, for example because the cache information is also in the database? Or could the way the updates work let the nodes get out of sync?

Thanks

mcadam avatar Sep 06 '18 10:09 mcadam

@mcadam It's mostly the cache. I put the location of the cached files in the db and only check that, since I was worried that scanning disks would be slow. You could just edit the code to have each node store its cache in a different collection.

Everything else should be OK, since the values used for updating are stored in memory. Each node will do about the same work updating the database, but I don't think they should get out of sync beyond the 5-minute refresh rate.
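The per-node workaround suggested above could be as simple as deriving the collection name from each node's hostname; the `cache_` prefix here is an invented convention, not something DriveFS actually uses.

```shell
# Hypothetical sketch: give each cluster node its own cache collection
# by deriving the name from its hostname, so every member tracks only
# its own cached files. The "cache_" prefix is made up for illustration.
NODE=$(hostname)
CACHE_COLLECTION="cache_${NODE}"
echo "$CACHE_COLLECTION"
```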


EricTheMagician avatar Sep 06 '18 11:09 EricTheMagician

Thanks for the quick answer :)

mcadam avatar Sep 06 '18 11:09 mcadam

I am trying to ship the executable without having to install everything, so I am trying to statically link the libs, but I am getting some errors. I don't usually work in C++, so any help would be appreciated :)

I tried something like this:

```
cmake -DUSE_FUSE3=1 -DCMAKE_EXE_LINKER_FLAGS=-static .
```

and got these errors:

```
[ 58%] Linking CXX executable DriveFS
/usr/sbin/ld: cannot find -lfuse3
/usr/sbin/ld: attempted static link of dynamic object `/usr/lib/libssl.so'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/DriveFS.dir/build.make:262: DriveFS] Error 1
make[1]: *** [CMakeFiles/Makefile2:106: CMakeFiles/DriveFS.dir/all] Error 2
make: *** [Makefile:84: all] Error 2
```

mcadam avatar Sep 06 '18 15:09 mcadam

@mcadam Do you have fuse3 installed, or just fuse2? Most distributions only ship fuse2.

If you have it, it's not being found for some reason. Open a new issue and post the CMakeCache.txt file from your build folder.

EricTheMagician avatar Sep 06 '18 20:09 EricTheMagician

I installed fuse3 specifically. If I compile using

```
cmake -DUSE_FUSE3=1 .
make -j 8
```

then it compiles fine. I just tried the static-linking option so I can move the executable around without reinstalling everything, but it didn't work when doing

```
cmake -DUSE_FUSE3=1 -DCMAKE_EXE_LINKER_FLAGS=-static .
make -j 8
```

I am pretty much a noob in C++, so I'm pretty sure I am doing something wrong, or it's missing some things and is not that easy :)

But this can be done later or added to your todo list. For now I got a Docker image to work; it could be lighter, but I can work on that in the future.

mcadam avatar Sep 07 '18 08:09 mcadam

Hello, maybe as a future goal, to speed up the first startup time for generating the drive tree: instead of fetching changes from day one of the drive (which for an old drive can take quite long), you could, on first startup, build the list using the files API to get only the files present at that moment, then switch to changes, first calling getStartPageToken so that from then on you only retrieve new changes.
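The proposed startup sequence maps onto the Drive v3 REST paths like this. The endpoints (`files`, `changes/startPageToken`, `changes`) are the real v3 API paths, but `ACCESS_TOKEN` and the curl call are placeholders, and nothing here is taken from the DriveFS code.

```shell
# Sketch of the proposed startup flow, using the Drive v3 REST paths.
API="https://www.googleapis.com/drive/v3"

# 1. First start: enumerate the files that exist right now, instead of
#    replaying the drive's change history from day one.
LIST_URL="${API}/files?pageSize=1000&fields=files(id,name,parents)"

# 2. At the same time, ask for a change token that marks "now".
TOKEN_URL="${API}/changes/startPageToken"

# 3. Every refresh afterwards: fetch only changes since that token
#    (START_PAGE_TOKEN stands in for the value returned by step 2).
CHANGES_URL="${API}/changes?pageToken=START_PAGE_TOKEN"

printf '%s\n' "$LIST_URL" "$TOKEN_URL" "$CHANGES_URL"
# e.g. curl -H "Authorization: Bearer $ACCESS_TOKEN" "$LIST_URL"
```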

Still trying it out for now, but it looks like I will be adopting this one. Thanks for sharing your work 👍

mcadam avatar Sep 09 '18 08:09 mcadam