FrankWolfe.jl
Relax assumption that gradient is in-place
In some cases the gradient is computed out-of-place, for example when an AD system does not support in-place operations or when the gradient is computed by an external program.
function grad!(storage, x)
    g = grad_func(x)
    # unnecessary copy
    @. storage = g
end
Because the gradient interface currently assumes in-place modification, a copy into the storage argument is always necessary.
We should instead assume that the gradient is returned from the grad! function. That way, the example above could simply ignore the storage argument.
Hi @matbesancon! So, change each definition of grad!, removing storage as an argument, and return the value that would have been assigned to storage?
Hi, no. In fact we would keep the storage argument, I believe, but not assume that the value grad! returns is storage itself, so you would have something like:
# out of place
function g_out(storage, x)
    return 3x
end
# inplace
function g_in(storage, x)
    @. storage = 3x
    return storage
end
The second already works, but the first would not.
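A minimal sketch of how a caller could support both variants, assuming it always works with the value the gradient function returns rather than reading storage afterwards. The helper `gradient_step` is hypothetical and is not part of FrankWolfe.jl:

```julia
# out of place: ignores storage and returns a fresh array
g_out(storage, x) = 3x

# in place: writes into storage and returns it
function g_in(storage, x)
    @. storage = 3x
    return storage
end

# Hypothetical caller: uses the returned value instead of reading `storage`.
function gradient_step(grad, storage, x, step)
    g = grad(storage, x)
    return x .- step .* g
end

x = [1.0, 2.0]
step_out = gradient_step(g_out, zeros(2), x, 0.1)
step_in = gradient_step(g_in, zeros(2), x, 0.1)
```

Under this assumption both calls produce the same step; only g_in writes into the buffer it is given.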
The other option would be to make everything out-of-place, but then users would need to set up a closure and a workspace to make it efficient, and that is too much of a hassle.
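To illustrate the hassle, here is a rough sketch of what a user would have to write under a purely out-of-place API to avoid allocating on every call. The name `make_gradient` and the objective (squared distance to a point `xp`) are only illustrative assumptions:

```julia
# Hypothetical factory: builds a one-argument gradient that reuses a buffer.
function make_gradient(xp)
    workspace = similar(xp)  # preallocated buffer captured by the closure
    function grad(x)
        @. workspace = 2 * (x - xp)
        return workspace     # the same array is returned on every call
    end
    return grad
end

grad = make_gradient([1.0, 1.0])
g1 = grad([2.0, 3.0])
```

Note that every call returns the same array, so a later call overwrites `g1`. The user has to manage that aliasing, which is part of why this option is less convenient.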
So, for example, instead of:
function grad!(storage, x)
    @. storage = 2 * (x - xp)
end
Something along the lines of:
function grad_iip!(storage, x)
    @. storage = 2 * (x - xp)
    return storage
end
function grad_oop(storage, x)
    return 2 * (x - xp)
end
Just making sure I understand what this is about
Yes, exactly. The point is that some procedures will compute the gradient out-of-place anyway, so forcing the API to be in-place only doesn't make sense.
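As a small sketch of that situation, an existing one-argument gradient (from an AD tool or an external program, say) could be adapted to the two-argument signature without any copy, assuming the caller uses the returned value. The adapter name `wrap_oop` is hypothetical:

```julia
# Hypothetical adapter: turns f(x) into a function with the (storage, x) signature.
# The storage argument is accepted but deliberately left untouched.
wrap_oop(f) = (storage, x) -> f(x)

external_grad(x) = 2 .* x  # stand-in for a gradient computed elsewhere
grad = wrap_oop(external_grad)

storage = zeros(2)
g = grad(storage, [1.0, 2.0])
```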
I have a lot of questions. I've opened #373 to facilitate communication and will drop some comments there