docker-osx-dev
sync init is taking too long on projects with lots of files
I think a listening approach is preferable in this situation. Any thoughts?
I'd be curious as to the source of the slowness. Is it caused by a large number of small files? The use of ssh? Something else entirely? Is it any faster if you tar it all up and rsync just the tar ball?
I believe there is an outstanding PR for a `sync_only` command. I could see a `watch_only` command being useful too.
Yes, due to the large number of files. Good to know that there's improvement for this on the way.
The PR is only for a `sync_only` command. Someone else would need to do a PR for the `watch_only` command. I'm in the middle of a different project and can't switch to it at the moment, but it shouldn't take more than a minute or two to do the change.
I created a PR for `sync-only` and `watch-only`. I opened a new one because the other PR had unaddressed comments from July.
https://github.com/brikis98/docker-osx-dev/pull/105
Unfortunately, `watch-only` does not help: the files still need an initial sync, there are a lot of them, and that sync takes several minutes to finish.
Are there other alternatives to this?
You'd have to profile it to see where the time is being spent. If we're bottlenecked by the number of files, the only solution I can think of is something that tars it up, rsyncs a single tarball, and then untars it on the other end. If we're bottlenecked by the amount of data, then you may have to search for some sort of mountable file system alternative to vboxsf. Some people use nfs, maybe you'd have more luck with that.
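The tar-based idea above can be sketched roughly as follows. This is a minimal illustration, not code from docker-osx-dev: `tar_sync` is a hypothetical helper, and the transfer step is a local `cp` so the sketch runs anywhere. In a real setup that step would be an `rsync` or `scp` of the single tarball to the Docker VM.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of docker-osx-dev): pack, move, unpack.
tar_sync() {
  local src="$1" dest="$2" archive
  archive="$(mktemp "${TMPDIR:-/tmp}/tar_sync.XXXXXX")"
  tar -cf "$archive" -C "$src" .             # 1. pack the whole tree into one file
  mkdir -p "$dest"
  cp "$archive" "$dest/.incoming.tar"        # 2. local stand-in for rsync/scp of the tarball
  tar -xf "$dest/.incoming.tar" -C "$dest"   # 3. unpack on the receiving side
  rm -f "$archive" "$dest/.incoming.tar"
}

# Tiny demo tree; the paths are made up for illustration.
demo_src="$(mktemp -d "${TMPDIR:-/tmp}/tar_sync_src.XXXXXX")"
demo_dest="$(mktemp -d "${TMPDIR:-/tmp}/tar_sync_dest.XXXXXX")"
trap 'rm -rf "$demo_src" "$demo_dest"' EXIT

mkdir -p "$demo_src/node_modules/pkg"
echo "hello" > "$demo_src/node_modules/pkg/index.js"

tar_sync "$demo_src" "$demo_dest"
cat "$demo_dest/node_modules/pkg/index.js"   # prints "hello"
```

The point of the design is that only one file crosses the wire, so per-file overhead is paid by `tar` locally rather than by the network transfer.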
@brikis98 can you add an option like --use-gzip or something similar to the initial sync? I have 113793 files (1.6GB) in the initial sync :-) It takes about 45min to sync :(
Can you benchmark it and see if it actually helps? I'd try all files separate, all files in a tarball, and all files in a tarball + gzip.
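A benchmark along those lines could look something like the sketch below. It is only a local approximation: `time_step` is a hypothetical helper, the file tree is generated, and `cp -R` stands in for transferring the files separately. To measure a real project, point `src` at it and replace the commands with the actual `rsync`/`scp` calls.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: run a command and print how many whole seconds it took.
time_step() {
  local label="$1"; shift
  local start=$SECONDS
  "$@"
  echo "$label: $((SECONDS - start))s"
}

# Small generated tree so the sketch runs anywhere.
src="$(mktemp -d "${TMPDIR:-/tmp}/bench_src.XXXXXX")"
out="$(mktemp -d "${TMPDIR:-/tmp}/bench_out.XXXXXX")"
trap 'rm -rf "$src" "$out"' EXIT
for i in $(seq 1 200); do echo "file $i" > "$src/file_$i.txt"; done

time_step "separate files" cp -R "$src" "$out/copy"
time_step "tar"            tar -cf  "$out/plain.tar"   -C "$src" .
time_step "tar + gzip"     tar -czf "$out/gzip.tar.gz" -C "$src" .
```

Bash's `SECONDS` only has one-second resolution, which is coarse for a tiny tree but adequate for syncs that take minutes.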
I excluded a bunch of files (logs, compiled templates, git etc):
| Commands | Partial time | Total time |
|---|---|---|
| docker-osx-dev sync-only | -- | 10min 5sec |
| tar czf .. + scp + untar | 6min 42sec + 24sec + 56sec | 8min 2sec |
| tar cf + scp + untar | 5min 48sec + 56sec + 33sec | 7min 20sec |
Nice research! Looks like for a large number of files, using `tar` can lead to a ~30% speed up. It's not obvious how that would scale up to an even larger project, but let's assume that it reduced the initial sync time from 45 min to ~30 min. Would that still be too slow to be useful?
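As a rough check of that estimate, the arithmetic can be worked through from the total times reported in the table. The helper names are made up for illustration, and the 45 min projection assumes the savings scale linearly, which nothing in this thread confirms.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helpers for the arithmetic; integer math, so results round down.
to_seconds()    { echo $(( $1 * 60 + $2 )); }
percent_saved() { echo $(( ($1 - $2) * 100 / $1 )); }

baseline=$(to_seconds 10 5)    # docker-osx-dev sync-only: 605s
tar_gzip=$(to_seconds 8 2)     # tar czf + scp + untar:    482s
tar_plain=$(to_seconds 7 20)   # tar cf + scp + untar:     440s

gzip_saved=$(percent_saved "$baseline" "$tar_gzip")
plain_saved=$(percent_saved "$baseline" "$tar_plain")
# Assumption: a 45 min sync shrinks by the same ratio as the plain tar run.
projected_min=$(( 45 * tar_plain / baseline ))

echo "tar + gzip saves ${gzip_saved}%"             # prints "tar + gzip saves 20%"
echo "plain tar saves ${plain_saved}%"             # prints "plain tar saves 27%"
echo "projected 45 min sync: ${projected_min} min" # prints "projected 45 min sync: 32 min"
```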
:+1:
A 30% speed up is huge! It would still be slow, but usable (right now it's slow & unusable).
Fair enough. I don't have time at the moment to add that functionality, but would definitely be open to a PR that adds a `--tar` style flag that does the initial sync via tar & untar. I suspect it would only take a few lines of shell script to do it.
Yep, if you use NodeJS with a `node_modules/` folder full of subfolders, it takes too much time. Is there no way to watch only, without needing to sync first? Thanks! @brikis98
@thalesfsp: use the `watch-only` command.
@brikis98 When I ran it without syncing first, this happens:
2015-12-28 12:39:45 [INFO] Warning: Identity file -o not accessible: No such file or directory.
2015-12-28 12:39:45 [INFO] ssh: Could not resolve hostname IdentitiesOnly=yes: nodename nor servname provided, or not known
2015-12-28 12:39:45 [INFO] rsync: connection unexpectedly closed (0 bytes received so far) [sender]
2015-12-28 12:39:45 [INFO] rsync error: unexplained error (code 255) at /SourceCache/rsync/rsync-45/rsync/io.c(453) [sender=2.6.9]
2015-12-28 12:39:45 [INFO] Warning: Identity file -o not accessible: No such file or directory.
2015-12-28 12:39:45 [INFO] ssh: Could not resolve hostname IdentitiesOnly=yes: nodename nor servname provided, or not known
2015-12-28 12:39:45 [INFO] rsync: connection unexpectedly closed (0 bytes received so far) [sender]
2015-12-28 12:39:45 [INFO] rsync error: unexplained error (code 255) at /SourceCache/rsync/rsync-45/rsync/io.c(453) [sender=2.6.9]
2015-12-28 12:39:45 [INFO] Warning: Identity file -o not accessible: No such file or directory.
2015-12-28 12:39:45 [INFO] ssh: Could not resolve hostname IdentitiesOnly=yes: nodename nor servname provided, or not known
2015-12-28 12:39:45 [INFO] rsync: connection unexpectedly closed (0 bytes received so far) [sender]
2015-12-28 12:39:45 [INFO] rsync error: unexplained error (code 255) at /SourceCache/rsync/rsync-45/rsync/io.c(453) [sender=2.6.9]
2015-12-28 12:39:45 [INFO] Warning: Identity file -o not accessible: No such file or directory.
2015-12-28 12:39:45 [INFO] ssh: Could not resolve hostname IdentitiesOnly=yes: nodename nor servname provided, or not known
2015-12-28 12:39:45 [INFO] rsync: connection unexpectedly closed (0 bytes received so far) [sender]
2015-12-28 12:39:45 [INFO] rsync error: unexplained error (code 255) at /SourceCache/rsync/rsync-45/rsync/io.c(453) [sender=2.6.9]
2015-12-28 12:39:45 [INFO] Warning: Identity file -o not accessible: No such file or directory.
2015-12-28 12:39:45 [INFO] ssh: Could not resolve hostname IdentitiesOnly=yes: nodename nor servname provided, or not known
2015-12-28 12:39:45 [INFO] rsync: connection unexpectedly closed (0 bytes received so far) [sender]
2015-12-28 12:39:45 [INFO] rsync error: error in rsync protocol data stream (code 12) at /SourceCache/rsync/rsync-45/rsync/io.c(453) [sender=2.6.9]
2015-12-28 12:39:45 [INFO] Warning: Identity file -o not accessible: No such file or directory.
2015-12-28 12:39:45 [INFO] ssh: Could not resolve hostname IdentitiesOnly=yes: nodename nor servname provided, or not known
@thalesfsp: Does it work if you do the sync first? That error seems to indicate a messed up SSH configuration, not sure what it would have to do with the initial sync.
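One plausible reading of that log, not confirmed anywhere in this thread: `ssh` was given `-i` with nothing after it, so it took the next word (`-o`) as the identity file and then treated `IdentitiesOnly=yes` as the hostname. That is what happens when an identity path variable is empty and unquoted. The sketch below mimics that parsing with `getopts` and does not run `ssh`; `parse_like_ssh`, `key_path`, the key file and the host name are all hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Mimics ssh-style option parsing with getopts; it never invokes ssh.
parse_like_ssh() {
  local OPTIND=1 opt identity="" host=""
  while getopts "i:o:" opt; do
    case "$opt" in
      i) identity="$OPTARG" ;;
      o) ;;              # -o values are ignored in this sketch
      *) return 1 ;;
    esac
  done
  shift $((OPTIND - 1))
  host="${1:-}"          # first non-option word is taken as the host
  echo "identity=$identity host=$host"
}

# Assumption: the identity file lookup came back empty.
key_path=""
# Unquoted on purpose: an empty unquoted variable vanishes from the argument list.
broken=$(parse_like_ssh -i $key_path -o IdentitiesOnly=yes docker@vm.example)
echo "$broken"   # prints "identity=-o host=IdentitiesOnly=yes"

# With a non-empty (made-up) path the arguments line up as intended.
key_path="/tmp/example_key"
fixed=$(parse_like_ssh -i "$key_path" -o IdentitiesOnly=yes docker@vm.example)
echo "$fixed"    # prints "identity=/tmp/example_key host=docker@vm.example"
```

If this reading is right, the thing to check would be where the identity file path is determined, and whether that step only happens as part of the initial sync.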
@brikis98 For it to work, I need to sync first :( And that takes too much time. I would like to adopt docker-osx-dev in our company, but the other developers will not accept this much time lost to syncing. Is there no way to watch only, without needing to sync?
But if you start syncing, does it work or do you get a similar error?