FrankWolfe.jl
FrankWolfe.jl copied to clipboard
In-place & out-of-place grad refactor
Closes #238
a general comment, I would not rename the grad! functions everywhere, just making sure that all gradients return their storage
Consistency
There are definitions like
grad!(storage, p) = storage .= 2p
and
function grad!(storage, p)
@. storage = 2p
return nothing
end
or
function grad!(storage, p)
@. storage = 2p
end
or
function grad!(storage, p)
@. storage = 2 * p
end
etc. Should these be written in the same manner? If yes, which one?
There's also some differences in capitalization, etc., for example using storage, x vs storage, X as arguments. Is this intended?
Implementation
There are grad! implementations that begin with storage .= 0, e.g.:
function grad!(storage, coefficients)
storage .= 0
for (x, y) in extended_training_data
p_i = dot(coefficients, x) - y
@. storage += x * p_i
end
storage ./= length(training_data)
return nothing
end
How would their out-of-place implementation look like?
function grad_oop(storage, coefficients)
r = zeros(length(storage))
...
return r
end
What about TrackingGradients?
~~I assume you would want grad! to not be renamed to grad_iip!.~~ Should I name the oop versions as grad_oop? Should I duplicate tests for the oop versions?
Differences in capitalization should be let as-is, this is typically when the input is a matrix.
Both forms:
function grad!(storage, p)
@. storage = 2p
end
or
function grad!(storage, p)
@. storage = 2 * p
end
are okay, no need to change them.
The important thing is returning the storage on the cases where there is a return nothing
Hi @xlxs4 and than you for the contribution. For the sake of simplicity we will for now keep the current API, this would be a breaking change that would be fairly subtle in some cases and without major use cases. Thanks again, and happy to take in other contributions :)