
Slow backups because of low QPS and burst values

Open dariodsa opened this issue 9 months ago • 5 comments

What steps did you take and what happened: When a user passes the client-burst or client-qps parameters, the main Velero process stores them in its Factory settings, so that any KubeClient later requested from that Factory instance is created with the intended burst and QPS values. https://github.com/vmware-tanzu/velero/blob/f654188243950d129d001783e58af5af7a1fbd8c/pkg/cmd/server/server.go#L298-L304

Velero also keeps a 5-minute cache of discovered resources, but none of this is used in the plugin code. pkg/cmd/server/plugin/plugin.go creates a discoveryHelper instance in several places (for example in the newServiceAccountBackupItemAction and newRemapCRDVersionAction functions), so a running velero-plugin process performs discovery multiple times.
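As a rough illustration of "run discovery only once", here is a minimal sketch using sync.Once. The discoveryHelper type, runDiscovery and sharedDiscovery below are hypothetical stand-ins written for this example; they are not Velero's actual types or API.

```go
package main

import (
	"fmt"
	"sync"
)

// discoveryHelper is a hypothetical stand-in for Velero's discovery helper.
type discoveryHelper struct {
	resources []string
}

// discoveryCalls counts how often the simulated discovery runs.
var discoveryCalls int

// runDiscovery simulates the expensive round trips to the API server.
func runDiscovery() *discoveryHelper {
	discoveryCalls++
	return &discoveryHelper{resources: []string{"pods", "serviceaccounts"}}
}

// sharedDiscovery builds one helper lazily and returns the same
// instance to every caller afterwards.
type sharedDiscovery struct {
	once   sync.Once
	helper *discoveryHelper
}

func (s *sharedDiscovery) get() *discoveryHelper {
	s.once.Do(func() { s.helper = runDiscovery() })
	return s.helper
}

func main() {
	shared := &sharedDiscovery{}
	// Two plugin constructors asking for a helper share one discovery run.
	a := shared.get()
	b := shared.get()
	fmt.Println(a == b, discoveryCalls)
}
```

With something along these lines, each plugin constructor would call get() instead of building its own helper, so the discovery cost is paid once per plugin process.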

https://github.com/vmware-tanzu/velero/blob/f654188243950d129d001783e58af5af7a1fbd8c/pkg/cmd/server/plugin/plugin.go#L202-L207

https://github.com/vmware-tanzu/velero/blob/f654188243950d129d001783e58af5af7a1fbd8c/pkg/cmd/server/plugin/plugin.go#L236-L243

In addition, velero-plugin runs in a separate process, so Factory settings such as clientQPS and clientBurst are not passed to it. Any KubeClient created in the plugin code therefore uses the client-go library defaults, which are low (QPS 5 and burst 10). By running discovery only once in the plugin code and setting default values in the Factory instance, Velero backups can finish in a couple of seconds instead of 10-40 seconds.
To verify this, we added default values to the Factory instance as shown below, and our backups became around 10 times faster. We suggest either setting default clientBurst and clientQPS values directly in the Factory creation function (NewFactory), or passing the client-burst and client-qps parameters to the velero-plugin process as flags, the same way the main Velero process receives them.

Suggestion

pkg/client/factory.go
func NewFactory(baseName string, config VeleroConfig) Factory {
	f := &factory{
		flags:    pflag.NewFlagSet("", pflag.ContinueOnError),
		baseName: baseName,
	}
	f.clientQPS = 100   // this value is already the default in the main Velero process
	f.clientBurst = 100 // this value is already the default in the main Velero process
....
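
The second option, passing the values to the plugin process as flags, could look roughly like the sketch below. It uses the standard library flag package to stay self-contained, whereas Velero itself uses pflag. The clientSettings type and both helper functions are hypothetical names for this example, and the fallback values 5 and 10 are the client-go defaults.

```go
package main

import (
	"flag"
	"fmt"
)

// clientSettings is a hypothetical container for the rate-limit settings.
type clientSettings struct {
	qps   float64
	burst int
}

// pluginArgs renders the settings as command-line arguments that the
// main process could append when it starts the plugin process.
func pluginArgs(s clientSettings) []string {
	return []string{
		fmt.Sprintf("--client-qps=%g", s.qps),
		fmt.Sprintf("--client-burst=%d", s.burst),
	}
}

// parsePluginArgs is what the plugin process could do on startup.
// Without flags it falls back to the client-go defaults.
func parsePluginArgs(args []string) (clientSettings, error) {
	var s clientSettings
	fs := flag.NewFlagSet("plugin", flag.ContinueOnError)
	fs.Float64Var(&s.qps, "client-qps", 5, "maximum requests per second to the API server")
	fs.IntVar(&s.burst, "client-burst", 10, "maximum burst of requests to the API server")
	err := fs.Parse(args)
	return s, err
}

func main() {
	args := pluginArgs(clientSettings{qps: 100, burst: 100})
	got, err := parsePluginArgs(args)
	fmt.Println(args, got, err)
}
```

The upside of this variant is that a user-supplied client-qps or client-burst value would reach the plugins as well, instead of a hard-coded default.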

I gave a presentation about this at the DORS/CLUC conference in Zagreb. You can check the details here.

  • :+1: for "I would like to see this bug fixed as soon as possible"
  • :-1: for "There are more important bugs to focus on right now"

dariodsa · May 18 '24 09:05