Mike J Innes
Seems good to me; could you turn it into a PR? The `zero` issue has come up before. I have a planned fix but haven't put it together yet.
As in, `parent(x::TrackedArray) = data(x)`? The problem is that it conflicts with the normal meaning of `parent` when `x` is an `Adjoint` (that happens to be tracked). e.g. if I write `f(x)...
It does hold in the sense that if you pass a `SubArray` to that `f` it still behaves as the identity. I can use `parent` in generic code as long...
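As a small illustration of the "normal meaning" of `parent` discussed here, the sketch below uses only Base arrays (no tracked types, which are left out on purpose): for a wrapper such as a `SubArray` or an `Adjoint`, `parent` returns the array being wrapped, and for a plain `Array` it returns the array itself.

```julia
using LinearAlgebra  # provides the Adjoint wrapper behind A'

A = [1.0 2.0; 3.0 4.0]

s = view(A, 1:2, 1:1)   # SubArray wrapping A
a = A'                  # Adjoint wrapping A

# For wrappers, parent returns the wrapped array (same object, no copy).
sub_parent_is_A = parent(s) === A
adj_parent_is_A = parent(a) === A

# For a plain Array, parent falls back to returning the array itself.
plain_parent_is_A = parent(A) === A
```

An overload that returned the underlying data of a tracked array would give `parent` a second, different meaning for a tracked `Adjoint`, which is the conflict raised above.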
This doesn't convert Float32 to Float64:
```julia
julia> float(1.0f0)
1.0f0
```
I could see how you might expect `param(1) * param(1f0)` to be Float32 though. I'm not sure what we...
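A quick check of the Base behaviour involved, using plain numbers only. That `param` applies `float` to its argument is an assumption here, not something shown in this thread; if it does, the Float64 result would come from the integer `1` being converted to `1.0` and then promoted against the Float32.

```julia
# float leaves an existing floating-point type alone...
f32 = float(1.0f0)        # stays Float32

# ...but converts an integer to Float64.
f64 = float(1)            # becomes 1.0 (Float64)

# Mixing Float64 and Float32 promotes to Float64.
product = float(1) * float(1.0f0)
```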
It's designed to be essentially similar but with explicit state, rather than using IdDicts everywhere. I haven't really figured out how to make it convenient yet, though, so help is...
Looks good to me, but can you do a quick rebase / merge master so that the tests pass?
I guess this is due to us calling broadcast:
```
julia> float.(SharedArray(rand(5,5)))
5×5 Array{Float64,2}:
```
We could avoid this, but why do you need it? For parameter updates across processes?
This is also problematic –
```
julia> zero(SharedArray(rand(5)))
5-element Array{Float64,1}:
 0.0 ...
```
I don't know if it's valid to just create a new shared array for the gradients in...
I think you may as well just edit the original `mnist.jl` for this. Does this script work if you avoid the `xla` conversion? Presumably it needs to be modified to...
Ok, this needs to change to pass the optimiser state around explicitly though, rather than using a stateful IdDict internally; otherwise it can't work with XLA (or other immutable objects...
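To make the "explicit state" idea concrete, here is a minimal sketch of a momentum update written as a pure function. The names `Momentum`, `init` and `update` are hypothetical and are not taken from any library's API; the point is only that the velocity is passed in and returned, rather than looked up in an `IdDict` keyed on a mutable parameter.

```julia
# Hypothetical optimiser description: hyperparameters only, no hidden state.
struct Momentum
    eta::Float64   # learning rate
    rho::Float64   # momentum coefficient
end

# Initial state: a zero velocity with the same shape as the parameter.
init(::Momentum, x::AbstractArray) = zero(x)

# Pure update: takes (parameter, gradient, state) and returns
# (new parameter, new state) without mutating any of its inputs.
function update(o::Momentum, x, dx, v)
    v2 = o.rho .* v .- o.eta .* dx
    return x .+ v2, v2
end

opt = Momentum(0.1, 0.9)
x0 = [1.0, 2.0]
v0 = init(opt, x0)
x1, v1 = update(opt, x0, [0.5, 0.5], v0)
```

Because nothing is mutated and no object identity is needed to find the state, this style also works when the parameters are immutable values.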