Significance of index offset in saveKNearestVocabs
Hello,
I have been looking through the code and porting parts of it to Python. In saveKNearestVocabs, the for loops over the vocab write to an offset index (vocab+1). At first I thought this was just because Julia is 1-indexed and Python is 0-indexed, but now I am not sure:
    function saveKNearestVocabs(region::SpatialRegion, datapath::String)
        V = zeros(Int, region.k, region.vocab_size)
        D = zeros(Float64, region.k, region.vocab_size)
        # Special tokens (0 .. vocab_start-1): each is its own neighbour at distance 0
        for vocab in 0:region.vocab_start-1
            V[:, vocab+1] .= vocab
            D[:, vocab+1] .= 0.0
        end
        # Hot-cell tokens: column vocab+1 holds the k nearest vocabs of `vocab`
        for vocab in region.vocab_start:region.vocab_size-1
            cell = region.vocab2hotcell[vocab]
            kcells, dists = knearestHotcells(region, cell, region.k)
            kvocabs = map(x->region.hotcell2vocab[x], kcells)
            V[:, vocab+1] .= kvocabs
            D[:, vocab+1] .= dists
        end
        # ... (rest of the function not shown in this excerpt)
The resulting file just has an empty first entry in the V and D arrays, since the PAD token is actually at index 1, and the vocab, which starts at 4, is now at index 5. Is there a downstream motivation for this, or is it just how it was first implemented?
That interface was kept from the original implementation in case we want to change vocab_start in the future. You can ignore it in your implementation.
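For anyone doing a similar port, here is a minimal sketch of how the two loops might look in 0-indexed Python, where vocab id v is simply stored at position v with no +1. It is not the repository's code: the function name k_nearest_vocabs, the knearest_hotcells callback, and the toy 1-D data are all hypothetical stand-ins, and it assumes vocab_size counts the special tokens, as the array shapes in the Julia snippet suggest. Plain lists are used to keep it dependency-free.

```python
from typing import Callable, Dict, List, Sequence, Tuple

KnnFn = Callable[[int, int], Tuple[Sequence[int], Sequence[float]]]


def k_nearest_vocabs(
    k: int,
    vocab_start: int,
    vocab_size: int,
    vocab2hotcell: Dict[int, int],
    hotcell2vocab: Dict[int, int],
    knearest_hotcells: KnnFn,  # hypothetical stand-in for knearestHotcells
) -> Tuple[List[List[int]], List[List[float]]]:
    # Row v holds the k nearest vocab ids of vocab v, so Julia's
    # column vocab+1 becomes Python's row vocab.
    V = [[0] * k for _ in range(vocab_size)]
    D = [[0.0] * k for _ in range(vocab_size)]

    # Special tokens 0 .. vocab_start-1: each is its own neighbour, distance 0.
    for vocab in range(vocab_start):
        V[vocab] = [vocab] * k

    # Hot-cell tokens vocab_start .. vocab_size-1.
    for vocab in range(vocab_start, vocab_size):
        cell = vocab2hotcell[vocab]
        kcells, dists = knearest_hotcells(cell, k)
        V[vocab] = [hotcell2vocab[c] for c in kcells]
        D[vocab] = list(dists)
    return V, D


# Toy data (assumption): three hot cells on a line, brute-force neighbour search.
cells = {10: 0.0, 11: 1.0, 12: 3.0}  # hot cell id -> 1-D coordinate


def toy_knn(cell: int, k: int) -> Tuple[List[int], List[float]]:
    ranked = sorted(cells, key=lambda c: abs(cells[c] - cells[cell]))[:k]
    return ranked, [abs(cells[c] - cells[cell]) for c in ranked]


vocab_start = 4
hotcell2vocab = {c: vocab_start + i for i, c in enumerate(sorted(cells))}
vocab2hotcell = {v: c for c, v in hotcell2vocab.items()}

V, D = k_nearest_vocabs(
    2, vocab_start, vocab_start + len(cells), vocab2hotcell, hotcell2vocab, toy_knn
)
print(V[0], V[4], D[6])
```

With this layout the special tokens occupy rows 0 to 3 and the first hot-cell vocab sits at row 4, so lookups use the vocab id directly.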