GDriveFS

Corrupted downloads when GD_DEBUG=0

Open arniotis opened this issue 10 years ago • 8 comments

Hello. I'm new to gdrivefs, and the initial tests with moderately sized files went great. Now I need to sync a set of very large files (1GB to 10GB) from GDrive to a Linux machine. The problem is that the files get corrupted on download. This happens on both machines I've tried, a Linode VM and a Raspberry Pi, with the same results.

What I've been able to determine so far is:

(1) The file downloads OK into the cache. The corruption happens when passing the data from the cache to the program trying to read it (I've tried with "cp", "rsync" and "md5sum")

(2) The corruption takes the form of 64KB blocks of data from the original file that are duplicated at random locations in the corrupt data. In this example, block a20000-a2ffff from the source file has also been copied over block 9e0000-9effff in the target file:

io:[/mnt/tmp/Z] xxd GOOD | grep "92a5 8a45 59eb"
0a20000: 92a5 8a45 59eb 3c78 8aff 1d16 6e94 8177 ...EY.<x....n..w

io:[/mnt/tmp/Z] xxd BAD | grep "92a5 8a45 59eb"
09e0000: 92a5 8a45 59eb 3c78 8aff 1d16 6e94 8177 ...EY.<x....n..w
0a20000: 92a5 8a45 59eb 3c78 8aff 1d16 6e94 8177 ...EY.<x....n..w
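The xxd/grep comparison above finds one duplicated block by hand. A minimal sketch of how the same check could be automated for every block is shown here; it is not part of gdrivefs, the function names are hypothetical, and it assumes the corruption is aligned to 64KB blocks as reported.

```python
import hashlib

BLOCK = 64 * 1024  # 64KB, the block size observed in this report


def block_digests(path, block=BLOCK):
    """Yield (offset, digest) for each block of the file at `path`."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            yield offset, hashlib.md5(chunk).digest()
            offset += len(chunk)


def find_misplaced_blocks(good_path, bad_path, block=BLOCK):
    """Return (bad_offset, source_offset) for every block of the bad file
    that differs from the good file at the same offset. source_offset is
    the first offset in the good file holding the same content, or None
    if the content does not appear in the good file at all."""
    good = dict(block_digests(good_path, block))  # offset -> digest
    first_offset = {}
    for off, digest in good.items():
        first_offset.setdefault(digest, off)
    mismatches = []
    for off, digest in block_digests(bad_path, block):
        if good.get(off) != digest:
            mismatches.append((off, first_offset.get(digest)))
    return mismatches
```

Run against the GOOD and BAD files from this example, it would be expected to report the pair (0x9e0000, 0xa20000), for instance by printing each result with `"%07x <- %07x"`.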

(3) The corruption is different every time

io:[/mnt/tmp/Z] md5sum ~/GoogleDrive/MECA/2005-04.rar
11d68bf7067077aa54f81726a2a975b5 /home/michael/GoogleDrive/MECA/2005-04.rar

io:[/mnt/tmp/Z] md5sum ~/GoogleDrive/MECA/2005-04.rar
a5e810ac55ef44ce30b6e3b7dc2f0fb7 /home/michael/GoogleDrive/MECA/2005-04.rar

io:[/mnt/tmp/Z] md5sum ~/GoogleDrive/MECA/2005-04.rar
5c567e5e20a4b1d5c6483d998a6ddc33 /home/michael/GoogleDrive/MECA/2005-04.rar

io:[/mnt/tmp/Z] getfattr --only-values -n user.original.md5Checksum ~/GoogleDrive/MECA/2005-04.rar
getfattr: Removing leading '/' from absolute path names
0c45711387083359d39bc859e6cddcde
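The repeated md5sum runs above can be scripted when many files need checking. The following is a minimal sketch, not part of gdrivefs; `md5_of_file` and `matches_expected` are hypothetical helper names. It reads the file in chunks so that multi-gigabyte files do not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare a file's MD5 with an expected hex digest (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

The expected value would be the one reported by the user.original.md5Checksum attribute shown above. On Linux it could presumably also be read from Python with `os.getxattr(path, "user.original.md5Checksum")`, although that has not been tried in this thread.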

(4) If I enable GD_DEBUG=1, everything works fine! I know I should include a log with this problem, but it does not happen when I'm logging... As a workaround, I now have debug permanently enabled. So far I've successfully downloaded about 80GB worth of files this way.

(5) The failure rate is about one 64K block out of every 1200 blocks (average from a limited number of tests)

(6) Enabling or disabling big_writes makes no difference

(7) This happens on two different machines, thousands of miles apart:

RPi2 running Raspbian: Linux sigma 3.18.7-v7+ #755 SMP PREEMPT Thu Feb 12 17:20:48 GMT 2015 armv7l GNU/Linux
Linode VM running Ubuntu 14.04: Linux io 3.15.2-x86_64-linode43 #1 SMP Mon Jun 30 10:08:39 EDT 2014 x86_64 x86_64 x86_64 GNU/Linux

Please let me know if there is any other information I can provide

Thanks,

Michael

arniotis avatar May 29 '15 12:05 arniotis

I hit this same issue. With GDrive auto-mounted through autofs+gdrivefs, I uploaded some directories, then ran a recursive diff between the uploaded directory and the local one. The diff fails, and not always in the same way.

I suspected the reads were wrong, and I was also a bit concerned that the writes might be wrong as well. But with GD_DEBUG=1, the diff succeeded.

I'm not sure this means the write/upload was OK, but it confirms the reads are getting corrupted. If you upload an ogg file and try to listen to it, you'll find the sound is not OK, which is of course related to the corrupted reads.

This sounds like an important issue to fix, and it would be nice if the writes were confirmed to be OK while doing so.

je-vv avatar Oct 11 '15 02:10 je-vv

One more note: my files are not huge, on the order of MB...

je-vv avatar Oct 13 '15 15:10 je-vv

Hmm, looking into what is affected by GD_DEBUG, I just found in both:

GDriveFS-0.14.3/gdrivefs/resources/scripts/gdfs
GDriveFS-0.14.3/gdrivefs/resources/scripts/gdfstool

The following:

gdrivefs.gdfs.gdfuse.mount(
    auth_storage_filepath=args.auth_storage_file, 
    mountpoint=args.mountpoint, 
    debug=gdrivefs.config.IS_DEBUG, 
    nothreads=gdrivefs.config.IS_DEBUG, 
    option_string=option_string)

The only difference I see is that when it is set, besides the debugging and monitoring stuff, threading seems to be disabled.

Then fuse is called:

fuse = FUSE(
        GDriveFS(), 
        mountpoint, 
        debug=debug, 
        foreground=debug, 
        nothreads=nothreads, 
        fsname=name, 
        **fuse_opts)

Notice fuse is also called with "debug" enabled, but an easy trial would be to edit both calls to "gdrivefs.gdfs.gdfuse.mount", and use in them:

nothreads=0,

Rather than:

nothreads=gdrivefs.config.IS_DEBUG,

If that works, then one could also generate logs with nothreads unset and debug enabled at the same time.

Also, I would make the threading option independent of the debug one, since one should not necessarily imply the other. I'd suggest creating a separate environment variable for threading, decoupled from the debug one.

Can't try right now, but I'll try later. If someone tries it first, not a problem, :-)
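A minimal sketch of what such a decoupling could look like follows. GD_DEBUG is the existing variable; GD_NOTHREADS and the helper names are hypothetical and chosen only for illustration, and defaulting to no threads is an assumption made here because threading is the suspect.

```python
import os


def _env_flag(name, default, environ=os.environ):
    """Interpret an environment variable as a boolean flag."""
    value = environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in ("1", "true", "yes", "on")


def read_mount_flags(environ=os.environ):
    """Return (debug, nothreads), each read from its own variable."""
    debug = _env_flag("GD_DEBUG", False, environ)
    # GD_NOTHREADS is a hypothetical name, not an existing gdrivefs setting.
    # Assumed default: threading disabled, since threading is the suspect.
    nothreads = _env_flag("GD_NOTHREADS", True, environ)
    return debug, nothreads
```

The two values would then be passed as the debug= and nothreads= arguments of the mount call shown earlier, so that GD_DEBUG=1 together with GD_NOTHREADS=0 gives logging with threading still enabled.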

Thanks.

je-vv avatar Oct 13 '15 15:10 je-vv

One more thing: if it is the threading, what would disabling it imply, lower performance on reads? If not, what else? Also, if that is the case, would the fault be in python FUSE (the only component making use of the "nothreads" option), or would it be that enabling threading requires something else to be enabled as well for it to work (in which case it is still a gdrivefs problem)?
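For context on the question above: with threading disabled, FUSE requests are handled one at a time, so concurrent reads queue up rather than run in parallel. The cause of the corruption has not been identified in this thread. One common way for blocks to turn up at the wrong offset under threading is a file handle shared between threads, where one thread's seek lands between another thread's seek and read. The sketch below only illustrates two generic safe patterns for that situation; it is not taken from gdrivefs, and the names are hypothetical.

```python
import os
import threading


class LockedReader:
    """Serialize each seek+read pair on one shared file object."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._lock = threading.Lock()

    def read(self, offset, length):
        # Without the lock, another thread could move the file position
        # between seek() and read(), returning data from the wrong offset.
        with self._lock:
            self._file.seek(offset)
            return self._file.read(length)

    def close(self):
        self._file.close()


def pread_block(fd, offset, length):
    """Positional read that does not use the shared file position.
    os.pread is available on POSIX systems only."""
    return os.pread(fd, length, offset)
```

Either pattern keeps reads correct when several threads serve requests for the same open file; whether this applies to the gdrivefs cache code would need to be checked against the source.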

je-vv avatar Oct 13 '15 15:10 je-vv

OK, confirmed. When disabling threads on the FUSE call, everything works OK.

I made pull request 147, which sets "no threads" to 1 by default but still allows it to be set at will through an environment variable. It also decouples the threading setting from the debugging setting. The old behavior of running with no threads while in debug mode remains, given that the default is no threads, but with the pull request there's more freedom to test, :-)

Eventually it could be studied how to enable FUSE with threading. But at least with the pull request there won't be any more data corruption.

je-vv avatar Oct 18 '15 23:10 je-vv

Thanks for tracking this down, @je-vv, I'm building this tonight and testing it.

arniotis avatar Oct 19 '15 18:10 arniotis

My old PR to work around this by disabling multithreading could no longer be merged... I created a new PR, which can be merged:

#202

je-vv avatar Jan 22 '19 02:01 je-vv

Somehow I missed this issue. We need to actually investigate why this is happening, so that we fix multithreading rather than disable it.

dsoprea avatar Jan 27 '19 22:01 dsoprea