
panic during copy

Open buengese opened this issue 1 year ago • 6 comments

What is the problem you are having with rclone?

I've noticed some panic errors caught by pacer when looking at the log of an rclone copy from onedrive to ulozto. I'm not sure where they originate. I'll try to reproduce this.

2024/09/16 10:22:06 DEBUG : Too many requests. Trying again in 1268 seconds.
2024/09/16 10:22:06 DEBUG : pacer: low level retry 1/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/16 10:22:06 DEBUG : pacer: Rate limited, increasing sleep to 21m8s
2024/09/16 10:22:06 DEBUG : Too many requests. Trying again in 1413 seconds.
2024/09/16 10:22:06 DEBUG : pacer: low level retry 1/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/16 10:22:06 DEBUG : pacer: Rate limited, increasing sleep to 23m33s

What is your rclone version (output from rclone version)

rclone v1.68.0
- os/version: ubuntu 22.04 (64 bit)
- os/kernel: 6.5.0-1027-oracle (aarch64)
- os/type: linux
- os/arch: arm64 (ARMv8 compatible)
- go/version: go1.23.1
- go/linking: static
- go/tags: none

Which OS you are using and how many bits (e.g. Windows 7, 64 bit)

Ubuntu 22.04 aarch64

Which cloud storage system are you using? (e.g. Google Drive)

OneDrive and uloz.to

The command you were trying to run (e.g. rclone copy /tmp remote:tmp)

rclone copy onedrive:/path/to/dir/ uloz:/path/to/dir/ --progress -vv

buengese avatar Sep 16 '24 11:09 buengese

If I had to bet, it's OneDrive: the throttling there is real, and if you're running defaults, I'd imagine that's what's doing it.

animosity022 avatar Sep 16 '24 12:09 animosity022

I didn't think we caught panics anywhere, but we must be... Or something is...

ncw avatar Sep 16 '24 12:09 ncw

Grepping the rclone source for recover(), I don't think a plain rclone copy can catch panics.

Neither onedrive nor ulozto uses any external modules.

If it was run via the rc then yes it could.
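
For reference, that catching is just the usual defer/recover wrapper around the job function; a minimal sketch of the pattern (not the actual fs/rc/jobs code) looks like this:

package main

import "fmt"

// runRecovered is a minimal sketch of the defer/recover wrapper the rc job
// machinery could use: a panic inside fn is turned into an ordinary error.
func runRecovered(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("panic received: %v", r)
		}
	}()
	return fn()
}

func main() {
	err := runRecovered(func() error {
		var p *string
		_ = *p // nil pointer dereference -> panic
		return nil
	})
	fmt.Println(err) // panic received: runtime error: invalid memory address or nil pointer dereference
}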

@buengese are you using a patched rclone? Struggling to understand what is going on here!

ncw avatar Sep 17 '24 12:09 ncw

Nope, this rclone version is neither patched nor was it running via rc. I've also not been able to reproduce that.

A detail I forgot in the initial post is that both the onedrive and ulozto backends were wrapped by a crypt backend.

The exact command being run was: rclone copy onedrive-crypt:<redacted> uloz-crypt:<redacted> --transfers 8 --retries 10 --low-level-retries 10 -vv --log-file=/log/rclone-copy-1726176156.log

buengese avatar Sep 20 '24 13:09 buengese

The only places we recover panics are here

$ git grep "recover()" | grep -v test
backend/qingstor/qingstor.go:		if r := recover(); r != nil {
cmd/cmount/mount.go:			if r := recover(); r != nil {
cmd/serve/dlna/dlna_util.go:			err := recover()
fs/rc/jobs/job.go:		if r := recover(); r != nil {
librclone/librclone/librclone.go:		if r := recover(); r != nil {
vfs/dir.go:		_ = recover()

For a plain rclone copy onedrive-crypt:<redacted> uloz-crypt:<redacted> I don't think any of those should be running...

So maybe it's in a dependency...

Unless you can reproduce it I don't think we'll track it down like this!

ncw avatar Sep 20 '24 15:09 ncw

I've seen the error yet again in a new transfer. Once again it was deep into a large transfer, always twice in a row, and it very much seems to be related to the rate limiting. Still no clue where it's coming from. I guess I'm going to retry that transfer with a debugger attached.

2024/09/21 01:11:00 DEBUG : Too many requests. Trying again in 61 seconds.
2024/09/21 01:11:00 DEBUG : pacer: low level retry 1/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 01:11:00 DEBUG : pacer: Rate limited, increasing sleep to 1m1s
2024/09/21 01:11:00 DEBUG : Too many requests. Trying again in 78 seconds.
2024/09/21 01:11:00 DEBUG : pacer: low level retry 2/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 01:11:00 DEBUG : pacer: Rate limited, increasing sleep to 1m18s
...
2024/09/21 01:21:00 DEBUG : Too many requests. Trying again in 101 seconds.
2024/09/21 01:21:00 DEBUG : pacer: low level retry 1/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 01:21:00 DEBUG : pacer: Rate limited, increasing sleep to 1m41s
2024/09/21 01:21:00 DEBUG : Too many requests. Trying again in 103 seconds.
2024/09/21 01:21:00 DEBUG : pacer: low level retry 2/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 01:21:00 DEBUG : pacer: Rate limited, increasing sleep to 1m43s
...
2024/09/21 01:33:30 DEBUG : Too many requests. Trying again in 111 seconds.
2024/09/21 01:33:30 DEBUG : pacer: low level retry 1/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 01:33:30 DEBUG : pacer: Rate limited, increasing sleep to 1m51s
2024/09/21 01:33:30 DEBUG : pacer: Reducing sleep to 1m23.25s
2024/09/21 01:33:30 DEBUG : Too many requests. Trying again in 96 seconds.
2024/09/21 01:33:30 DEBUG : pacer: low level retry 2/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 01:33:30 DEBUG : pacer: Rate limited, increasing sleep to 1m36s
2024/09/21 01:33:30 DEBUG : pacer: Reducing sleep to 1m12s
...
2024/09/21 01:44:00 DEBUG : Too many requests. Trying again in 102 seconds.
2024/09/21 01:44:00 DEBUG : pacer: low level retry 1/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 01:44:00 DEBUG : pacer: Rate limited, increasing sleep to 1m42s
2024/09/21 01:44:00 DEBUG : Couldn't decode error response: EOF
2024/09/21 01:44:00 DEBUG : pacer: Reducing sleep to 1m16.5s
2024/09/21 01:44:00 DEBUG : Too many requests. Trying again in 102 seconds.
2024/09/21 01:44:00 DEBUG : pacer: low level retry 2/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 01:44:00 DEBUG : pacer: Rate limited, increasing sleep to 1m42s
...
2024/09/21 02:15:00 DEBUG : Too many requests. Trying again in 74 seconds.
2024/09/21 02:15:00 DEBUG : pacer: low level retry 1/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 02:15:00 DEBUG : pacer: Rate limited, increasing sleep to 1m14s
2024/09/21 02:15:00 DEBUG : Couldn't decode error response: EOF
2024/09/21 02:15:00 DEBUG : pacer: Reducing sleep to 55.5s
2024/09/21 02:15:00 DEBUG : Too many requests. Trying again in 78 seconds.
2024/09/21 02:15:00 DEBUG : pacer: low level retry 2/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
2024/09/21 02:15:00 DEBUG : pacer: Rate limited, increasing sleep to 1m18s

buengese avatar Sep 21 '24 12:09 buengese

facing pretty much the same issue with v1.70.1

  • os/version: debian 12.11 (64 bit)
  • os/kernel: 6.1.0-37-amd64 (x86_64)
  • os/type: linux
  • os/arch: amd64
  • go/version: go1.24.4
  • go/linking: static
  • go/tags: none

...when trying to copy ~2 TB from OneDrive to Hetzner StorageBox on a VPS (specs above) with the following command:

rclone copy [source] [target] -vv -u --progress --bwlimit 50M --checkers=2 --transfers=2

which occasionally results in

Too many requests
pacer: low level retry 2/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))

I played around with connections (checkers/transfers) and bwlimits. I got it running rather stably now with bwlimit < 20M, which is not really satisfying, given that both ends, the server and the connection, should be able to handle 5x that workload.

degudejung avatar Jun 25 '25 18:06 degudejung

I think I've finally worked out what that error is telling us

It is from a fmt.Printf or similar statement that tried to call the Error() method but got a nil pointer deref

Here is a repro

package main

import "fmt"

type E struct{}

// Error always panics with a nil pointer dereference.
func (e E) Error() string {
	var p *string
	return *p
}

func main() {
	f := E{}
	err := fmt.Errorf("e = %w\n", f)
	// fmt recovers the panic from the Error method and reports it inline.
	fmt.Println(err)
}

Prints this rather than panicking with a nil pointer dereference, because fmt recovers panics raised by Error and String methods and reports them inline

e = %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference)

So what this means is that an Error method, most likely called as part of the pacer, is panicking with a nil pointer dereference.

Given that there is an associated "Too many requests" error, this is almost certainly down to pacer.RetryAfterError being called with a nil error. Those calls are in turn made from shouldRetry functions, which can themselves be called with a nil error.
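
To illustrate the mechanism, here is a sketch (the type and field names are made up for illustration, this is not the actual lib/pacer code): a retry-after wrapper that embeds the original error forwards Error() to it, so wrapping a nil error and later formatting it reproduces the message above exactly.

package main

import (
	"fmt"
	"time"
)

// retryAfterSketch mimics the shape of a retry-after error wrapper: it embeds
// the original error, so its Error() method is forwarded to the wrapped value.
type retryAfterSketch struct {
	error
	retryAfter time.Duration
}

func main() {
	// A shouldRetry-style caller hands the wrapper a nil error (the embedded
	// error field is simply left at its zero value)...
	var err error = &retryAfterSketch{retryAfter: 61 * time.Second}
	// ...and when the pacer later logs it, calling Error() through the nil
	// embedded interface panics. fmt recovers that panic and prints:
	// pacer: low level retry 1/10 (error %!v(PANIC=Error method: runtime error: invalid memory address or nil pointer dereference))
	fmt.Printf("pacer: low level retry 1/10 (error %v)\n", err)
}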

Please give this a try

v1.71.0-beta.8819.4df974ccc.fix-8077-retry-panic on branch fix-8077-retry-panic (uploaded in 15-30 mins)

ncw avatar Jun 25 '25 20:06 ncw

Thank you @ncw for addressing this! Immediately installed it.

It's been running stably with no warnings or errors, even without bwlimit, for about 10 minutes / 40 GB now, which is very promising. I'll keep the job running overnight and update my feedback after that!

Update: Now, when the pacer takes over, it no longer panics for me:

2025/06/25 21:11:45 DEBUG : Too many requests. Trying again in 538 seconds.
2025/06/25 21:11:45 DEBUG : pacer: low level retry 1/10 (error too many requests: trying again in 8m58s)
2025/06/25 21:11:45 DEBUG : pacer: Rate limited, increasing sleep to 8m58s
2025/06/25 21:11:45 DEBUG : duplicati-b916f0fd98c2c486bb8af1efee5a52d09.dblock.zip.aes.fdb8fe9.partial: renamed to: duplicati-b916f0fd98c2c486bb8af1efee5a52d09.dblock.zip.aes
2025/06/25 21:11:45 INFO  : duplicati-b916f0fd98c2c486bb8af1efee5a52d09.dblock.zip.aes: Copied (new)
2025/06/25 21:11:45 DEBUG : duplicati-i27a7139bd8f640d4ae6de4a7b7e66842.dindex.zip.aes: Need to transfer - File not found at Destination
2025/06/25 21:11:45 DEBUG : pacer: Reducing sleep to 6m43.5s
2025/06/25 21:11:45 DEBUG : duplicati-b916ffada2cf14f98abeec54f82eaee5f.dblock.zip.aes.ffa9fc1f.partial: renamed to: duplicati-b916ffada2cf14f98abeec54f82eaee5f.dblock.zip.aes
2025/06/25 21:11:45 INFO  : duplicati-b916ffada2cf14f98abeec54f82eaee5f.dblock.zip.aes: Copied (new)
2025/06/25 21:11:45 DEBUG : duplicati-i27b119dbdcb9434ebb47f23b4ba45959.dindex.zip.aes: Need to transfer - File not found at Destination
2025/06/25 21:11:46 DEBUG : duplicati-b9170e1a7f09e44d4b92657659f3d4a86.dblock.zip.aes.53193b0d.partial: renamed to: duplicati-b9170e1a7f09e44d4b92657659f3d4a86.dblock.zip.aes
2025/06/25 21:11:46 INFO  : duplicati-b9170e1a7f09e44d4b92657659f3d4a86.dblock.zip.aes: Copied (new)
2025/06/25 21:11:46 DEBUG : duplicati-i27b55067c25540eaadb654969cc07168.dindex.zip.aes: Need to transfer - File not found at Destination
2025/06/25 21:11:46 DEBUG : duplicati-b9172ddbbef4d40ef868900b858f4f40e.dblock.zip.aes.ec4ec2f5.partial: renamed to: duplicati-b9172ddbbef4d40ef868900b858f4f40e.dblock.zip.aes
2025/06/25 21:11:46 INFO  : duplicati-b9172ddbbef4d40ef868900b858f4f40e.dblock.zip.aes: Copied (new)
2025/06/25 21:11:46 DEBUG : duplicati-i27b96f8b9093489888a0ccefced73621.dindex.zip.aes: Need to transfer - File not found at Destination
2025/06/25 21:11:47 DEBUG : duplicati-b91751ad8a2cd47fb9b04bac02d2a9566.dblock.zip.aes.25a12e35.partial: renamed to: duplicati-b91751ad8a2cd47fb9b04bac02d2a9566.dblock.zip.aes
2025/06/25 21:11:47 INFO  : duplicati-b91751ad8a2cd47fb9b04bac02d2a9566.dblock.zip.aes: Copied (new)
2025/06/25 21:11:47 DEBUG : duplicati-i27be5535b7ec4aa9b597c32e0c30dcbf.dindex.zip.aes: Need to transfer - File not found at Destination
2025/06/25 21:12:47 DEBUG : sftp://[email protected]:23/bak2/: Closing 5 unused connections
2025/06/25 21:20:43 DEBUG : pacer: Reducing sleep to 5m2.625s
Transferred:       55.032 GiB / 412.677 GiB, 13%, 0 B/s, ETA -
Checks:              8653 / 8653, 100%, Listed 42861
Transferred:         1129 / 11135, 10%
Elapsed time:     26m43.5s
Transferring:
 * duplicati-b9176b50b3e9…46a2f44.dblock.zip.aes: transferring
 * duplicati-b9178446f18e…35466fe.dblock.zip.aes: transferring
 * duplicati-b917cb1bda58…fad857c.dblock.zip.aes: transferring
 * duplicati-b917daf69e18…b0958b1.dblock.zip.aes: transferring
 * duplicati-b917f40aea33…74f1bcc.dblock.zip.aes: transferring

However, it does not seem to recover after the sleep.

degudejung avatar Jun 25 '25 21:06 degudejung

That is the new code working - great!

However, it does not seem to recover after the sleep.

Looks like it is reducing the retry time on each success, which I don't think is correct, but it should get going eventually (after 16m).

ncw avatar Jun 25 '25 22:06 ncw

Yep, the old error definitely disappeared, thank you! I am still fiddling around with the "Too many requests" problem. Hetzner (the target) responded that they do not limit the storage on the writing end. And, of course, MS does not respond at all as to whether they limit OneDrive on the reading end.

What's surprising is that all bwlimits, tpslimits, transfers and checkers seem to have very little effect (if any at all) on the stability of the transfer. Is there a way to dig a little deeper into which party (server, source or target) is causing the error? Also, I searched through your older posts regarding OneDrive and its throttling. Most people complain about how many seconds the sleep should be. I encounter sleep times of ~30 min and more, and I observe how they come down over the course of a day of copying. But honestly, the majority of the time the process appears to be asleep, and I wonder if it has to be that way?

Here's my most recent:

2025/06/26 11:15:06 DEBUG : Too many requests. Trying again in 1893 seconds.
2025/06/26 11:15:06 DEBUG : pacer: low level retry 1/10 (error too many requests: trying again in 31m33s)
2025/06/26 11:15:06 DEBUG : pacer: Rate limited, increasing sleep to 31m33s
2025/06/26 11:15:07 DEBUG : duplicati-b4e1c8f42de544b72ab156f2b5e5959fa.dblock.zip.aes.c985fba6.partial: renamed to: duplicati-b4e1c8f42de544b72ab156f2b5e5959fa.dblock.zip.aes
2025/06/26 11:15:07 INFO  : duplicati-b4e1c8f42de544b72ab156f2b5e5959fa.dblock.zip.aes: Copied (new)
2025/06/26 11:15:07 DEBUG : duplicati-b94a9a22c43594a4ea03a888f5d740fce.dblock.zip.aes: Need to transfer - File not found at Destination
2025/06/26 11:15:07 DEBUG : Too many requests. Trying again in 1828 seconds.
2025/06/26 11:15:07 DEBUG : pacer: low level retry 2/10 (error too many requests: trying again in 30m28s)
2025/06/26 11:15:07 DEBUG : pacer: Rate limited, increasing sleep to 30m28s
Transferred:      213.811 GiB / 701.638 GiB, 30%, 0 B/s, ETA -
Checks:              6850 / 6850, 100%, Listed 80210
Transferred:         4385 / 14389, 30%
Elapsed time:    4h39m2.5s

degudejung avatar Jun 26 '25 12:06 degudejung

Thanks for testing @degudejung

I've merged this to master now, which means it will be in the latest beta in 15-30 minutes and released in v1.71 and v1.70.2.

You didn't say what protocol you are using for the storage box, but I'd say this is OneDrive throttling. People have reported improvements using --user-agent - see https://rclone.org/onedrive/#excessive-throttling-or-blocked-on-sharepoint

--tpslimit is usually quite effective, but it might need to be quite low, e.g. --tpslimit 1
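
For example, combining both (remote names as in your command above; the user-agent string below just follows the decoration format described in the linked docs and is only illustrative):

rclone copy [source] [target] --tpslimit 1 --user-agent "ISV|rclone.org|rclone/v1.70.2" -vv --progress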

ncw avatar Jun 26 '25 15:06 ncw