B2 Sync Connection Pool Is Full
Hello,
I hope you are having a wonderful day.
I am having a problem with the B2 sync command. I have been encountering this error with increasing frequency, especially when uploading or downloading millions of files.
Message:
WARNING:requests.packages.urllib3.connectionpool:Connection pool is full, discarding connection: f001.backblazeb2.com
dnload working3/999/1030.jpg
I have tried reducing my connection threads to as few as 30, but I still encounter the issue. Additionally, I have noticed that the more files being synced, the worse the issue becomes. I eventually start encountering these errors to the point that they hold up the sync process for tens of minutes at a time. The errors appear, then the sync continues... then the errors appear again.
For anyone who needs to sync large numbers of files, this makes the sync command borderline useless.
I will appreciate any and all assistance you can offer.
Blessings, Steven Dunn
PS: I have submitted this support request here as per Backblaze support's recommendation.
It may be that the message you are seeing is an indirect indicator of another issue.
Please run the sync with logs enabled. If you know how to do it, please also collect data about the network transfer during an upload using tcpdump or Wireshark, and then use Wireshark to plot throughput over time. Please also keep a ping running to a host in the LA area while collecting the network trace.
Ppolewicz,
I have enabled logging and started a new sync. I will update you when it is complete.
Blessings, Steven Dunn
Ppolewicz,
I reviewed the log and noticed that these are the errors that are stopping the sync process. It happens more often the longer the sync progresses and more quickly if I use more than the default 10 threads.
2017-02-14 21:17:54,922 20766 139658897905408 b2.bucket ERROR error when uploading, upload_url was https://pod-000-1033-18.backblaze.com/b2api/v1/b2_upload_file/(censored)
Traceback (most recent call last):
File "build/bdist.linux-x86_64/egg/b2/bucket.py", line 392, in _upload_small_file
sha1_sum, file_info, input_stream
File "build/bdist.linux-x86_64/egg/b2/raw_api.py", line 455, in upload_file
return self.b2_http.post_content_return_json(upload_url, headers, data_stream)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 259, in post_content_return_json
response = _translate_and_retry(do_post, try_count, post_params)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 103, in _translate_and_retry
return _translate_errors(fcn, post_params)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 52, in _translate_errors
int(error['status']), error['code'], error['message'], post_params
ServiceError: 503 service_unavailable c001_v0001033_t0020 is too busy
Another error I discovered...
2017-02-14 21:17:49,414 20766 139658864334592 b2.bucket ERROR error when uploading, upload_url was https://pod-000-1032-07.backblaze.com/b2api/v1/b2_upload_file/(censored)
Traceback (most recent call last):
File "build/bdist.linux-x86_64/egg/b2/bucket.py", line 392, in _upload_small_file
sha1_sum, file_info, input_stream
File "build/bdist.linux-x86_64/egg/b2/raw_api.py", line 455, in upload_file
return self.b2_http.post_content_return_json(upload_url, headers, data_stream)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 259, in post_content_return_json
response = _translate_and_retry(do_post, try_count, post_params)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 103, in _translate_and_retry
return _translate_errors(fcn, post_params)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 75, in _translate_errors
raise BrokenPipe()
BrokenPipe: Broken pipe: unable to send entire request
Please advise. I really want to use this service to quickly backup and, if need be, restore millions of files.
Blessings, Steven Dunn
These errors are both normal, and the command-line tool should be retrying. If it's not retrying enough, we can fix it.
Did the sync give up on those files?
Hello,
Thank you for the reply.
It appears to retry the files, but it eventually jams the sync process until I have to restart it. It goes dormant for tens of minutes at a time, and the issue gets worse the longer it runs.
That said, if I run with only 10 threads (the default), it does not occur nearly as often... though that few threads slows the sync down significantly when transferring large numbers of small files.
I will appreciate your continued assistance.
Blessings, Steven Dunn
Looking through the code, I think that the retry code could be improved. I'll try to get to it today or tomorrow.
Some errors can be retried by getting a new upload URL to upload to a different host, and then uploading there. In those cases, there's no point in sleeping and retrying with the same URL. The 503 error above is one such case.
The other factor that probably hit you today is that Tuesday is the day we deploy updated software to our storage pods, which causes a slightly higher number of 503 status codes. The deployment is all done for today, so it may work better if you try now.
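Roughly, the change I have in mind is something like this. This is a sketch only, not the actual CLI code: get_upload_url and upload_to are placeholder names, the exception classes are stand-ins for the ones visible in your traceback, and the retry counts are illustrative.

import time

class ServiceError(Exception):        # placeholder for the b2 ServiceError in the traceback
    def __init__(self, status, message):
        super().__init__(message)
        self.status = status

class BrokenPipe(Exception):          # placeholder for the b2 BrokenPipe in the traceback
    pass

def upload_with_retries(api, bucket_id, payload, max_tries=5):
    # Sketch: on a 503 from an upload pod, fetch a fresh upload URL
    # (which points at a different pod) and retry right away, instead
    # of sleeping and retrying the same URL. Back off only for
    # network-level errors.
    for attempt in range(max_tries):
        upload = api.get_upload_url(bucket_id)       # placeholder call
        try:
            return api.upload_to(upload, payload)    # placeholder call
        except ServiceError as e:
            if e.status == 503 and attempt + 1 < max_tries:
                continue        # new URL on the next iteration, no sleep needed
            raise
        except BrokenPipe:
            if attempt + 1 >= max_tries:
                raise
            time.sleep(2 ** attempt)                 # simple backoff for network hiccups
    raise RuntimeError('upload failed after %d tries' % max_tries)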
Steven, I think you have found a bug, but in order to confirm, I need to see the network trace. The data will be encrypted, but that's OK - I just need to see the flow of packets over time to understand why the sync jams up. Can you please collect the network trace?
Hello,
Thank you for reaching out.
I am running CentOS on this server. If you send me the command you need me to run, I will be glad to help.
Blessings, Steven Dunn
Brian,
Thank you for the reply.
Also, it might help to know that these issues have been occurring for the last several days.
Any resolutions that can be applied will be sincerely appreciated.
Blessings, Steven Dunn
tcpdump -n -C20 -s 68 -W 10 -w issue_323.pcap net 162.244.56.0/21
It needs to be run as root, and it will consume about 200 MB of disk space. Please stop tcpdump with Ctrl+C at least a few minutes after the sync has hung. If tcpdump fails with an error immediately after starting, please remove the -s 68 parameter and try again.
Pawel,
Thank you for the reply.
Will that command only log the traffic for B2 or all server traffic?
Blessings, Steven Dunn
The net 162.244.56.0/21 part makes it log only the traffic to Backblaze's IP address space. B2 traffic is all encrypted. Even if it weren't, -s 68 cuts each packet short enough for the analysis to be done while not saving the packet payload.
I really, really don't want to see your stuff.
Ppolewicz,
I appreciate the clarification.
The command failed with the following errors.
[root@localhost ~]# tcpdump -n -C20 -s 68 -W 10 -w issue_323.pcap net 162.244.56.0/21
tcpdump: WARNING: SIOCGIFADDR: nflog: No such device
tcpdump: NFLOG link-layer type filtering not implemented
Please advise.
Blessings, Steven Dunn
Try to change the command to
tcpdump -n -i any -C20 -s 68 -W 10 -w issue_323.pcap net 162.244.56.0/21
Pawel,
I entered the command and received the following error:
tcpdump: issue_323.pcap0: Permission denied
Please advise.
Blessings,
Steven Dunn
It looks like you are running tcpdump from a directory to which you cannot write, or you are not running it as root.
@Myflag, is there anything I could do to help you collect the trace?
Hello,
Thank you for reaching out.
The last time I ran the sync at 30 threads it did not jam up with the errors. It has now finished syncing all 3.8 million files and I don't have any others I need to push to B2 at the moment.
If I discover the issue occurring again, I will update this thread.
I appreciate all of your assistance.
Blessings, Steven Dunn
Hi!
I'm running an AWS EC2 instance doing some backups to B2.
Last night I got:
ERROR:b2.bucket:error when uploading, upload_url was https://pod-000-1042-07.backblaze.com/b2api/v1/b2_upload_file/************************/c001_v0001042_t0049
Traceback (most recent call last):
File "build/bdist.linux-x86_64/egg/b2/bucket.py", line 394, in _upload_small_file
HEX_DIGITS_AT_END, file_info, hashing_stream
File "build/bdist.linux-x86_64/egg/b2/raw_api.py", line 458, in upload_file
return self.b2_http.post_content_return_json(upload_url, headers, data_stream)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 259, in post_content_return_json
response = _translate_and_retry(do_post, try_count, post_params)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 103, in _translate_and_retry
return _translate_errors(fcn, post_params)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 52, in _translate_errors
int(error['status']), error['code'], error['message'], post_params
ServiceError: 503 service_unavailable c001_v0001042_t0049 is too busy
ERROR:b2.bucket:error when uploading, upload_url was https://pod-000-1038-14.backblaze.com/b2api/v1/b2_upload_file/*********************/c001_v0001038_t0001
Traceback (most recent call last):
File "build/bdist.linux-x86_64/egg/b2/bucket.py", line 394, in _upload_small_file
HEX_DIGITS_AT_END, file_info, hashing_stream
File "build/bdist.linux-x86_64/egg/b2/raw_api.py", line 458, in upload_file
return self.b2_http.post_content_return_json(upload_url, headers, data_stream)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 259, in post_content_return_json
response = _translate_and_retry(do_post, try_count, post_params)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 103, in _translate_and_retry
return _translate_errors(fcn, post_params)
File "build/bdist.linux-x86_64/egg/b2/b2http.py", line 52, in _translate_errors
int(error['status']), error['code'], error['message'], post_params
ServiceError: 503 service_unavailable Tome is overly busy
There are a lot more errors like this; some mention "Tome is overly busy", some c001_v0001042_t0049, c001_v0001041_t0043, c001_v0001042_t0033, etc.
I believe this issue applies to my case. Is there anything we can do to fix this? I believe this simply shouldn't happen.
Command line:
b2 sync --noProgress --delete --threads 1 /etc/nginx/ b2://MY_BUCKET_NAME/backup-nginx/etc/
Uname:
Linux cloud 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u1 (2017-06-18) x86_64 GNU/Linux
Thanks.
The "Tome is overly busy" message is a sign that the load balancing among the B2 servers at Backblaze is not working right. We'll do our best to get it fixed.
This issue is still a problem when deleting a large number of files in the bucket. See output:
delete <file 1>
delete <file 2>
...
delete <file n>
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: api001.backblazeb2.com
delete <more files>
upload <some file>
ERROR:b2.console_tool:ConsoleTool command error
Traceback (most recent call last):
File "/usr/local/Cellar/b2-tools/1.1.0/libexec/lib/python2.7/site-packages/b2/console_tool.py", line 1074, in run_command
return command.run(args)
File "/usr/local/Cellar/b2-tools/1.1.0/libexec/lib/python2.7/site-packages/b2/console_tool.py", line 857, in run
allow_empty_source=allow_empty_source
File "/usr/local/Cellar/b2-tools/1.1.0/libexec/lib/python2.7/site-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
return function(*wrapee_args, **wrapee_kwargs)
File "/usr/local/Cellar/b2-tools/1.1.0/libexec/lib/python2.7/site-packages/b2/sync/sync.py", line 273, in sync_folders
raise CommandError('sync is incomplete')
CommandError: sync is incomplete
ERROR: sync is incomplete
Looks like an issue with that library - perhaps a handle isn't being released?
Perhaps we should increase the size of the connection pool so that it is as large as our thread pool. @bwbeach, what do you think?
I think that would make sense.
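For reference, sizing the pool with requests would look roughly like this. This is just a sketch, assuming the tool keeps a shared requests.Session (presumably somewhere around b2http.py); the pool sizes are illustrative and would come from the --threads setting.

import requests
from requests.adapters import HTTPAdapter

def make_session(thread_count):
    # Sketch: size urllib3's per-host connection pool to match the number
    # of worker threads, so connections are reused instead of being
    # discarded with "Connection pool is full" warnings.
    session = requests.Session()
    adapter = HTTPAdapter(
        pool_connections=thread_count,  # number of distinct host pools to cache
        pool_maxsize=thread_count,      # connections kept alive per host
    )
    session.mount('https://', adapter)
    session.mount('http://', adapter)
    return session

A session built this way keeps one reusable connection per worker thread, so the warning should disappear even though, as noted below, the warning itself is probably not what breaks the sync.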
It's not clear to me that the connection pool warning is related to the sync failure. I think it's just a warning that the connection could not be returned to the pool because the pool was full.
This happens after deleting more than 100k files, and this is the only warning just before a failure that terminates the process.
I'm also running into this issue.
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: f000.backblazeb2.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: f000.backblazeb2.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: f000.backblazeb2.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: f000.backblazeb2.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: f000.backblazeb2.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: f000.backblazeb2.com
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: f000.backblazeb2.com
dnload myfile.sql.gz
ERROR:b2.console_tool:ConsoleTool command error
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 1476, in run_command
return command.run(args)
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 1304, in run
allow_empty_source=allow_empty_source
File "/home/debian/.local/lib/python3.5/site-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
return function(*wrapee_args, **wrapee_kwargs)
File "/home/debian/.local/lib/python3.5/site-packages/b2sdk/sync/sync.py", line 273, in sync_folders
raise CommandError('sync is incomplete')
b2sdk.exception.CommandError: sync is incomplete
ERROR: sync is incomplete
@rafatwork is that the full output? It looks like there was an error before the warnings that you didn't show. Can you run the sync with logs enabled?
@ppolewicz @bwbeach Hello!
I am having the same problem with Connection Pools. I did not want to open up a new issue for this, so I am adding my comment here.
When I run b2 sync --threads 10 <bucket> <destination>, I also receive the Connection Pool error.
I then followed @ppolewicz's instructions:
Can you run the sync with logs enabled?
When I run the sync with the --debugLogs flag, I receive these errors, which are 'new' compared to running the sync without --debugLogs.
It will not let me attach the file here, so here is a link to the log file with the errors which occur during indexing.
Here is a link to the log file (this is the same 'log' as the other shared file; it is larger because it ran longer) with the errors that occur only when I include the --debugLogs flag. It does not include the 'Connection Pool is full' error.
Error Image that corresponds with the log file:
I have just run the sync command again without the --debugLogs flag, and as you can see, the 'connection pool is full' error now appears:
Conclusion
I do not think that I can generate the log file for the 'Connection Pool is Full' error until the errors above are fixed...
What do you think?
PS: All the sync runs above were run with 10 threads active, using the --threads flag.
I am currently running the sync with 1 thread active and it is very, very slow (a screenshot of the current process with 1 thread is below). I will report the results if/when it finishes. I say 'if' because it is very slow. See screenshot:
@jeffward01 the connection pool being full warning (not an error) doesn't have anything to do with your sync failure, which I think is actually caused by a filename decoding error. You have a few files with really weird names. The operating system told Python those filenames can be decoded using the cp1252 encoding, but that didn't really work, and before anything is sent to B2, each filename needs to be translated to utf-8 (something the CLI could not do here). I think the number of files in question is very limited; you might want to just try renaming them.
Thanks for getting back to me so quickly! The files are already hosted on B2 and I am pulling them down.
Is it possible to rename them remotely while they are on B2, so that I can then pull down the 'cleaned' files?
Thanks for the info!
Also, it would be nice to have a flag so that, when a file is named strangely like this, Backblaze replaces it with some 'temp-name' and then creates a file called renamed.txt with a legend mapping the new temp-names to the original file names.
Something like this would be great.
Ok, then it looks like you have a utf-8 encoded filename in B2 that cannot be translated to cp1252.
There is no atomic remote rename operation, but you can copy an object and remove the old one. Please see the manual for b2 copy-file-by-id and delete-file-version.
As for the feature you suggested, I agree that some sort of solution would be nice, but I'm not sure the one you suggested is right. renamed.txt would need to be at least a csv file, but then processing it with a script would be hard, so you'd need to do some programming to fix the situation. Ideally we'd have something that would allow the user to recover from such situations without any programming.
If you'd like to contribute a solution for this problem, please create a new github issue so that we can discuss it without polluting the "connection is full" issue any further :)
I think at the very least a detector that doesn't actually do anything would be nice - something that looks at your current filesystem encoding, looks at the other end, and figures out which files and file versions will cause trouble when we try to transfer them. It wouldn't actually transfer anything and would require a scan, but it would allow the user to troubleshoot the situation with a script. It could even return a csv that could then be parsed reliably with csvformat, considering that B2 doesn't allow BELL or NEWLINE characters but bash allows them in $IFS. It's a bit of a stretch though; a python solution would be way cleaner.
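To make that slightly more concrete, the local half of such a detector could be sketched like this. This is illustrative only, not a committed design; it just checks whether each local path survives a round trip through the filesystem encoding and utf-8, which is roughly the condition that tripped up the sync above.

import os
import sys

def find_troublesome_names(root):
    # Sketch: walk a local tree and report paths that don't survive a
    # round trip through the filesystem encoding and utf-8 - roughly
    # the files that would break a sync before anything is uploaded.
    fs_encoding = sys.getfilesystemencoding()
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                path.encode(fs_encoding).decode(fs_encoding)
                path.encode('utf-8')
            except (UnicodeEncodeError, UnicodeDecodeError):
                problems.append(path)
    return problems

if __name__ == '__main__':
    for bad_path in find_troublesome_names('.'):
        print(repr(bad_path))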