sync of a lot of small files fails
I tried to sync about 1 million small files (Combined size of ~40GB) to backblaze and got a lot of 408 request_timeout errors:
b2_upload(/local/path/to/file, b2/path/to/file, 1499162591000): UnknownError('408 request_timeout Error reading uploaded data: SocketTimeoutException(Read timed out)',) Unknown error: 408 request_timeout Error reading uploaded data: SocketTimeoutException(Read timed out)
And also some other errors:
b2_upload(/local/path/to/file, b2/path/to/file, 1499161555000): IndexError('tuple index out of range',) tuple index out of range
The command I used is:
b2 sync --delete --threads 100 /path/to/local/directory b2://bucketname/foldername
So all of the old file versions were deleted, while all the files that got an error were not uploaded. I would expect that it would retry all the failed uploads.
Did you rerun the command after the failed uploads or was this from a single run? If it was from a single run, does executing the same command again upload the missing files? The B2 CLI does retry uploading after some errors which indicate temporary problems, which should include the timeout errors. However, the IndexError seems strange and probably shoudn't happen.
Everything in my last commant was from just a singel run.
Running it two more times got it to upload all files, on the first rerun it exited wit some strange error, that I unfortunatly only have the last part of:
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 992, in run_command
return command.run(args)
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 781, in run
dry_run=args.dryRun,
File "/usr/local/lib/python3.5/dist-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
return function(*wrapee_args, **wrapee_kwargs)
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 263, in sync_folders
raise CommandError('sync is incomplete')
b2.exception.CommandError: sync is incomplete
ERROR: sync is incomplete
The second time it ran perfectly.
If you can get the rest of that error, it would be very helpful.
When I try with 100 threads, I exceed my ulimit and get this error:
B2ConnectionError: Connection error: [Errno 24] Too many open files
With 10 threads, things work for me. Are you set up to allow lots of file descriptors?
Apperently I am set up to use a lot of file descriptors. (I'm using the windows subsystem for linux, and didn't get that error ever). Unfortunatly I don't have the rest of the error message, but that was a second diffrent error. Sorry for that :(
I actually got the same error again, this time I got all of it:
ERROR:b2.console_tool:ConsoleTool command error
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 992, in run_command
return command.run(args)
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 781, in run
dry_run=args.dryRun,
File "/usr/local/lib/python3.5/dist-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
return function(*wrapee_args, **wrapee_kwargs)
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 263, in sync_folders
raise CommandError('sync is incomplete')
b2.exception.CommandError: sync is incomplete
ERROR: sync is incomplete
I also only used 25 Threads not 100 for this sync.
I also got a few warnings:
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: api001.backblazeb2.com
The "sync is incomplete" is just to signal that not all files from the sync finished successfully. It would be interesting so know why some files did not finish. Do you see any more errors before this last one?
I got this:
ERROR:b2.bucket:error when uploading, upload_url was https://pod-000-1041-10.backblaze.com/b2api/v1/b2_upload_file/<file_id>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/b2/bucket.py", line 392, in _upload_small_file
sha1_sum, file_info, input_stream
File "/usr/local/lib/python3.5/dist-packages/b2/raw_api.py", line 455, in upload_file
return self.b2_http.post_content_return_json(upload_url, headers, data_stream)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 259, in post_content_return_json
response = _translate_and_retry(do_post, try_count, post_params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 103, in _translate_and_retry
return _translate_errors(fcn, post_params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 52, in _translate_errors
int(error['status']), error['code'], error['message'], post_params
b2.exception.ServiceError: 503 service_unavailable c001_v0001041_t0037 is too busy
with diffrent files and diffrent upload urls at least 17 times, but thats all the Errors I can see. I will try to get all of the errors into a file next time I run it.
Edit: Also sorry for my bad Englisch :(
I got another error just now, I think this time my pipe broke if I understand it correctly:
ERROR:b2.console_tool:ConsoleTool unexpected exception
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 386, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 382, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/requests/adapters.py", line 440, in send
timeout=timeout
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 649, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/local/lib/python3.5/dist-packages/urllib3/util/retry.py", line 357, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.5/dist-packages/urllib3/packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 386, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 382, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 47, in _translate_errors
response = fcn()
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 255, in do_post
response = self.session.post(url, headers=headers, data=data)
File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 549, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 502, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 612, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/requests/adapters.py", line 490, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 992, in run_command
return command.run(args)
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 781, in run
dry_run=args.dryRun,
File "/usr/local/lib/python3.5/dist-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
return function(*wrapee_args, **wrapee_kwargs)
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 251, in sync_folders
source_folder, dest_folder, args, now_millis, reporter
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 136, in make_folder_sync_actions
dest_file) in zip_folders(source_folder, dest_folder, reporter, exclusions, inclusions):
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 93, in zip_folders
current_b = next_or_none(iter_b)
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 36, in next_or_none
return six.advance_iterator(iterator)
File "/usr/local/lib/python3.5/dist-packages/b2/sync/folder.py", line 192, in all_files
self.folder_name, show_versions=True, recursive=True, fetch_count=1000
File "/usr/local/lib/python3.5/dist-packages/b2/bucket.py", line 207, in ls
self.id_, start_file_name, start_file_id, fetch_count
File "/usr/local/lib/python3.5/dist-packages/b2/session.py", line 38, in wrapper
return f(api_url, account_auth_token, *args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/b2/raw_api.py", line 358, in list_file_versions
maxFileCount=max_file_count
File "/usr/local/lib/python3.5/dist-packages/b2/raw_api.py", line 135, in _post_json
return self.b2_http.post_json_return_json(url, headers, params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 286, in post_json_return_json
return self.post_content_return_json(url, headers, data, try_count, params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 259, in post_content_return_json
response = _translate_and_retry(do_post, try_count, post_params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 103, in _translate_and_retry
return _translate_errors(fcn, post_params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 71, in _translate_errors
if e2.args[1] == 'Broken pipe':
IndexError: tuple index out of range
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 386, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 382, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/requests/adapters.py", line 440, in send
timeout=timeout
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 649, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/local/lib/python3.5/dist-packages/urllib3/util/retry.py", line 357, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.5/dist-packages/urllib3/packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 386, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/usr/local/lib/python3.5/dist-packages/urllib3/connectionpool.py", line 382, in _make_request
httplib_response = conn.getresponse()
File "/usr/lib/python3.5/http/client.py", line 1197, in getresponse
response.begin()
File "/usr/lib/python3.5/http/client.py", line 297, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.5/http/client.py", line 266, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 47, in _translate_errors
response = fcn()
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 255, in do_post
response = self.session.post(url, headers=headers, data=data)
File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 549, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 502, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.5/dist-packages/requests/sessions.py", line 612, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/requests/adapters.py", line 490, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/b2", line 9, in <module>
load_entry_point('b2==0.7.2', 'console_scripts', 'b2')()
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 1104, in main
exit_status = ct.run_command(decoded_argv)
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 992, in run_command
return command.run(args)
File "/usr/local/lib/python3.5/dist-packages/b2/console_tool.py", line 781, in run
dry_run=args.dryRun,
File "/usr/local/lib/python3.5/dist-packages/logfury/v0_1/trace_call.py", line 84, in wrapper
return function(*wrapee_args, **wrapee_kwargs)
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 251, in sync_folders
source_folder, dest_folder, args, now_millis, reporter
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 136, in make_folder_sync_actions
dest_file) in zip_folders(source_folder, dest_folder, reporter, exclusions, inclusions):
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 93, in zip_folders
current_b = next_or_none(iter_b)
File "/usr/local/lib/python3.5/dist-packages/b2/sync/sync.py", line 36, in next_or_none
return six.advance_iterator(iterator)
File "/usr/local/lib/python3.5/dist-packages/b2/sync/folder.py", line 192, in all_files
self.folder_name, show_versions=True, recursive=True, fetch_count=1000
File "/usr/local/lib/python3.5/dist-packages/b2/bucket.py", line 207, in ls
self.id_, start_file_name, start_file_id, fetch_count
File "/usr/local/lib/python3.5/dist-packages/b2/session.py", line 38, in wrapper
return f(api_url, account_auth_token, *args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/b2/raw_api.py", line 358, in list_file_versions
maxFileCount=max_file_count
File "/usr/local/lib/python3.5/dist-packages/b2/raw_api.py", line 135, in _post_json
return self.b2_http.post_json_return_json(url, headers, params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 286, in post_json_return_json
return self.post_content_return_json(url, headers, data, try_count, params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 259, in post_content_return_json
response = _translate_and_retry(do_post, try_count, post_params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 103, in _translate_and_retry
return _translate_errors(fcn, post_params)
File "/usr/local/lib/python3.5/dist-packages/b2/b2http.py", line 71, in _translate_errors
if e2.args[1] == 'Broken pipe':
IndexError: tuple index out of range
Hello, I'm having issues too with uploading LOTS of small files. Right now we would like to use B2 to backup around 1TB of really small files. Given the amount of data we'll have to use a lot of threads to speed-up the upload (we're testing around 200 threads).
With this configuration, we get 2 kind of errors that I think are related to this issue.
One is "Pool is closed" and the other is that " is too busy"
Traceback (most recent call last): File "/usr/lib/python3.4/site-packages/b2/bucket.py", line 415, in _upload_small_file content_type, HEX_DIGITS_AT_END, file_info, hashing_stream File "/usr/lib/python3.4/site-packages/b2/raw_api.py", line 497, in upload_file return self.b2_http.post_content_return_json(upload_url, headers, data_stream) File "/usr/lib/python3.4/site-packages/b2/b2http.py", line 275, in post_content_return_json response = _translate_and_retry(do_post, try_count, post_params) File "/usr/lib/python3.4/site-packages/b2/b2http.py", line 119, in _translate_and_retry return _translate_errors(fcn, post_params) File "/usr/lib/python3.4/site-packages/b2/b2http.py", line 79, in _translate_errors raise B2ConnectionError(str(e0)) b2.exception.B2ConnectionError: Connection error: HTTPSConnectionPool(host='xxxx.backblaze.com', port=443): Pool is closed.
and the other one
ERROR:b2.bucket:error when uploading, upload_url was https://XXXXX Traceback (most recent call last): File "/usr/lib/python3.4/site-packages/b2/bucket.py", line 415, in _upload_small_file content_type, HEX_DIGITS_AT_END, file_info, hashing_stream File "/usr/lib/python3.4/site-packages/b2/raw_api.py", line 497, in upload_file return self.b2_http.post_content_return_json(upload_url, headers, data_stream) File "/usr/lib/python3.4/site-packages/b2/b2http.py", line 275, in post_content_return_json response = _translate_and_retry(do_post, try_count, post_params) File "/usr/lib/python3.4/site-packages/b2/b2http.py", line 119, in _translate_and_retry return _translate_errors(fcn, post_params) File "/usr/lib/python3.4/site-packages/b2/b2http.py", line 55, in _translate_errors int(error['status']), error['code'], error['message'], post_params b2.exception.ServiceError: 503 service_unavailable XXXXXX is too busy
Dunno if this helps debugging the problem, I'm available to do tests of any kind if it helps fixing the problem ;)
I don't have any obvious answers. I do have some questions...
How long does it run before failing?
Have you tried with different numbers of threads?
What's the XXX in 503 service_unavailable XXXXXX is too busy? (That might help track down a problem on the service side.)
Hi @bwbeach thanks for the reply! I redid some test, so here are the answers to your question
- Before failing it waits a couple of minutes
- Tested with 200 thread first, then tried with 100 threads, the error appeared around minute 2'40s, with 50 threads it appears around 2'40' again ... with 25 threads the first error come up around 2'50s, (before the error a warning I copied below gets printed)
- the full error is "503 service_unavailable c002_v0001111_t0056 is too busy"
Some extra info Before the errors what gets printed is "WARNING:requests.packages.urllib3.connectionpool:Connection pool is full, discarding connection: api002.backblazeb2.com"
in case of the 200/100 threads it gets printed lots of times (20+), in case of the 50 only once before starting outputting some errros.
Another error printed is b2.exception.B2ConnectionError: Connection error: HTTPSConnectionPool(host='pod-000-1111-11.backblaze.com', port=443): Pool is closed.
Lemme know if you need some extra data.
WARNING:requests.packages.urllib3.connectionpool:Connection pool is full, discarding connection: api002.backblazeb2.com is mostly harmless, it just means you are using lots of connections. Might affect performance a bit, but I doubt it will.
Pool is closed looks like a consequence of the 503 errors. I think we need a better retry policy or something.
@ppolewicz got it, but isn't the 503 error also something server side? A better retry policy would make it more resistant to failure, but if the server doesn't stop throwing the errors we'll probably need something else, or am I missing something?
if you have lots of small files to upload, you'd expect sync to take reasonable measures to complete the operation even if something fails for a few minutes. As it is now, the retries are greedy and per-file, so if one file cannot be uploaded a few times (5 iirc), then no further attempts at uploading it are made.
I think the retry policy should be more global. If we are uploading thousands of files and one of them would have to be retried 6 times, it should still be ok. To be blunt, the current code accepts retrying every file a few times, for thousands of retries total, but does not accept one file that needs to be retried more than 5 times.
An elegant retry policy is difficult to design, but I feel the current state has some room for improvement.
for future references: With Restic the problem doesn't seem to come up. I suppose it's because it's packaging the files so therefore we're putting less stress on the B2 services.
Right now I'm going with that, but I would really hope to see the B2 client have some improvements to allow a classic copy of files for those users that don't need a full fledged backup/snapshot solution like restic/etc.
Edit: I couldn't uses rclone because it uses too much Class C operations. (adding just in case someone is searching for alternatives to fix the upload issue)