Flux.jl
update Embedding layer
Updates the Embedding layer to use `gather` for `AbstractVector` input, which earlier had an issue with repeated indices. Also adds a special case for `outputsize`.
Needs tests.
ref. #1516
I rebased #1516 on master since there were some conflicts. So I guess you should do `git rebase -i origin/cl/embed` here.
Or maybe it is just easier if I merge #1516 and then this targets master
Yeah sure that works
Maybe labels other than integers starting at 1 deserve some thought. Flux's one-hot encoding lets you specify the set of labels, but they are not stored; `onecold` returns integers starting at 1:
julia> Flux.onehotbatch(collect("hello"), 'a':'z')
26×5 Flux.OneHotArray{26,2,Vector{UInt32}}:
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 1 0 0 0
0 0 0 0 0
0 0 0 0 0
1 0 0 0 0
0 0 0 0 0
0 0 0 0 0
⋮
julia> Flux.onecold(ans) # does not remember
5-element Vector{Int64}:
8
5
12
12
15
Should this do something similar? Then `m = Embedding(0:9 => 5)` would have to store something, so that `m(0)` can give the first vector. Or is this unnecessary complication? Better handled by composing `onehot` and `Embedding`?
Mapping to an index or one-hot space is a standard data transformation for using a layer like `Embedding`, so I would say it's not a necessary feature for the layer. And we would always need to support `1:N`, which could get tricky if the "labels" are also an integer range.
If we did want to include it, I would store a "transform" function within the layer (something like the sketch below).
PS: we do support `onecold(x, labels)`
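A minimal sketch of what storing a transform could look like; `LabelledEmbedding`, its field names, and the `findfirst`-based mapping are all hypothetical, not part of Flux:

using NNlib

# Hypothetical layer: a label-to-index transform stored next to the weights.
struct LabelledEmbedding{F,W}
  transform::F   # maps a label (e.g. 0 or 'a') to a 1-based column index
  weight::W
end

LabelledEmbedding(labels, dim::Integer) =
  LabelledEmbedding(l -> findfirst(isequal(l), labels),
                    randn(Float32, dim, length(labels)))

# Translate labels to indices, then gather the corresponding columns.
(m::LabelledEmbedding)(x::AbstractVector) = NNlib.gather(m.weight, map(m.transform, x))

m = LabelledEmbedding(0:9, 5)
size(m([0, 3, 9]))   # (5, 3); the first column of m.weight answers m([0])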
Ok, indeed this should probably be written something like `Chain(onehot('a':'z'), Embedding(26 => 5))` rather than duplicating this.
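For reference, a sketch of that composition; Flux does not ship an `onehot` layer as such, so an anonymous function wrapping `onehotbatch` stands in for it, and the `Embedding(in => out)` constructor is assumed:

using Flux

labels = 'a':'z'
# The closure plays the role of the onehot('a':'z') layer mentioned above.
m = Chain(x -> Flux.onehotbatch(x, labels), Embedding(26 => 5))

size(m(collect("hello")))   # (5, 5): one 5-dimensional vector per character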
Should this be re-targeted for master?
Yes, probably filing a new PR is easier.
I changed the base branch to master.
Looks like you need a rebase too
@manikyabard are you still up for rebasing this and moving forward? We ought to close up this loose end.
Yeah, I can continue working on this, although I am not sure about the approach we should take for `outputsize`. Maybe we can discuss this further in the next community call.
Yeah, sounds good!
Summarizing what was discussed on the call:
- the issue is that the paths for `AbstractVector{<:Integer}` that do `getindex` calls don't make sense, since `Vector{Nil}` is not a vector of indices
- Michael's PR to make `Nil` a subtype of `Real` can help avoid the cases where `Nil` should only be hitting the `AbstractArray{<:Real}` paths
- it still won't solve the issue when `outputsize` is used for a model utilizing the `AbstractVector{<:Integer}` path
  - this will need an `outputsize` override rule for `Embedding` + `AbstractVector{Nil}`
- (this wasn't brought up during the call, but I just thought of it) a better solution would be to define `NNlib.gather` for `Nil`, which will cover all indexing cases beyond just `Embedding`
You mean something like this?
# Fill the destination with nil; only the shape matters for outputsize.
NNlib.gather!(dst::AbstractArray, ::AbstractArray, ::AbstractArray{<:Nil}) = fill!(dst, nil)
(m::Embedding)(x::AbstractVector{<:Nil}) = NNlib.gather(m.weight, x)
That looks right, but you won't need to special case for `Embedding` anymore. It should go through the `NNlib.gather` rule automatically.
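If so, something like this should then work end to end with no `Embedding`-specific method (a sketch, assuming the `gather!` rule above is defined):

using Flux

m = Embedding(26 => 5)
# outputsize feeds a Vector{Nil} through the ordinary forward pass,
# which now bottoms out in the Nil gather! rule.
Flux.outputsize(m, (10,))   # expected (5, 10)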
This needs a rebase on master.
Codecov Report
Merging #1656 (de43bf5) into master (3cc9067) will decrease coverage by 0.07%. The diff coverage is 60.00%.
@@ Coverage Diff @@
## master #1656 +/- ##
==========================================
- Coverage 84.58% 84.51% -0.08%
==========================================
Files 21 21
Lines 1486 1485 -1
==========================================
- Hits 1257 1255 -2
- Misses 229 230 +1
Impacted Files | Coverage Δ
---|---
src/onehot.jl | 95.29% <ø> (-0.06%) ↓
src/outputsize.jl | 82.05% <0.00%> (-2.16%) ↓
src/layers/basic.jl | 80.99% <75.00%> (-0.16%) ↓
Continue to review full report at Codecov.
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3cc9067...de43bf5. Read the comment docs.
Can you add a test for `outputsize` of `Embedding`?
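Something along these lines, perhaps (a sketch; the expected shape follows `Embedding`'s usual `(out, length(x))` output):

using Flux, Test

m = Embedding(10 => 4)
# three indices in, one length-4 embedding per index out
@test Flux.outputsize(m, (3,)) == (4, 3)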