Docs contradict each other on what `save_for_backward` is permitted to save
The tutorial at https://pytorch.org/tutorials/beginner/examples_autograd/two_layer_net_custom_function.html says: "ctx is a context object that can be used to stash information for backward computation. You can cache arbitrary objects for use in the backward pass using the ctx.save_for_backward method."
The API reference at https://pytorch.org/docs/stable/generated/torch.autograd.function.FunctionCtx.save_for_backward.html says: "save_for_backward should be called at most once, only from inside the forward() method, and only with tensors."
Most likely the second statement is correct and the first is not, but I haven't verified this.
Suggestion: "You can cache tensors for use in the backward pass using the ctx.save_for_backward method. Other miscellaneous objects can be cached using ctx.my_object_name = object."
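To illustrate the suggested wording, here is a minimal sketch of a custom function that saves a tensor with `ctx.save_for_backward` and stashes a plain Python float as an attribute on `ctx`. The class name `ScaledReLU` and the `scale` argument are hypothetical, made up for this example and not taken from the tutorial.

```python
import torch


class ScaledReLU(torch.autograd.Function):
    """Hypothetical example: ReLU followed by multiplication with a Python float."""

    @staticmethod
    def forward(ctx, x, scale):
        # Tensors needed in backward go through save_for_backward.
        ctx.save_for_backward(x)
        # Non-tensor objects are stored as plain attributes on ctx.
        ctx.scale = scale
        return x.clamp(min=0) * scale

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Gradient is `scale` where x >= 0 and zero where x < 0.
        grad_x = (grad_output * ctx.scale).masked_fill(x < 0, 0.0)
        # One return value per forward input; `scale` is not differentiable.
        return grad_x, None


x = torch.tensor([-1.0, 2.0, 3.0], requires_grad=True)
y = ScaledReLU.apply(x, 2.0)
y.sum().backward()
print(x.grad)
```

With these inputs the gradient comes out as `scale` for the positive entries and zero for the negative one, which is what you would expect if both ways of caching work as described in the suggestion.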
cc @albanD