DIM
4 saved nets
Hi, when I load the .t7 file, I can see 4 nets stored as a dict in a variable: 'Controller.encoder', 'conv2.classifier', 'fc4.classifier', 'glob-1.classifier'. What is the role of each net in the code? Which one can I use to calculate the mutual information between 2 images? Best regards
Also, the output of Controller.encoder is a vector of size 64. Is this vector the encoding with maximum mutual information with the input?
So the `encoder` is where everything is happening, from encoding the MI to scoring pairs from the joint / product of marginals. Unfortunately it's not separable in a simple way because it was built to take advantage of dataparallel, but it's still relatively easy to use.
So the `controller` first draws some data, then passes it through the encoder:
https://github.com/rdevon/DIM/blob/master/cortex_DIM/models/controller.py#L201-L208
The `layer_outs` is what we want, which is actually a set of vectors in the space where we're going to compute the score (related to the PMI). This is a little opaque, but check out the forward function of the BigEncoder:
https://github.com/rdevon/DIM/blob/master/cortex_DIM/models/controller.py#L77
So what this forward does is it calls a function to extract some tensors needed for various models one might attach on top of an encoder. Once again, this was done this way to make dataparallel faster, but it made things a little less clear.
So what the encoder forward is doing after passing the data through the encoder when you attach DIM to it is this: https://github.com/rdevon/DIM/blob/master/cortex_DIM/models/dim.py#L244
Which computes `L` and `G`, which are the transformed versions of the local and global vectors respectively. We are going to use these to compute the score.
https://github.com/rdevon/DIM/blob/master/cortex_DIM/models/dim.py#L294-L301
So basically all you need to do is:
- pull data
- pass data through encoder like in the controller
- extract the local and global vectors from the `layer_outs` dict (keyword should be `local`) like in the `routine` method in DIM.
- score according to the function of your choice here, e.g.: https://github.com/rdevon/DIM/blob/master/cortex_DIM/functions/dim_losses.py#L13
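The last of the steps above can be sketched roughly as follows. This is a minimal NumPy illustration of a JSD-style score between paired local and global vectors, not the repository's loss function. The function name `jsd_score_objective`, the `(N, units)` layout with one local vector per image, and the toy inputs are all assumptions made for the example.

```python
import numpy as np

LOG2 = np.log(2.0)


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return np.logaddexp(0.0, x)


def jsd_score_objective(local_vecs, global_vecs):
    """Score every local/global pair and return a JSD-style objective.

    local_vecs, global_vecs: arrays of shape (N, units). Row i of each is
    assumed to come from the same image (hypothetical layout).
    """
    n = local_vecs.shape[0]
    # u[i, j] scores the global vector of image i against the local vector of image j.
    u = global_vecs @ local_vecs.T
    pos_mask = np.eye(n, dtype=bool)  # diagonal = pairs from the same image
    e_pos = (LOG2 - softplus(-u[pos_mask])).mean()   # samples from the joint
    e_neg = (softplus(u[~pos_mask]) - LOG2).mean()   # product of marginals
    return u, e_pos - e_neg


# Toy check: perfectly aligned, well-separated vectors.
vecs = 10.0 * np.eye(4)
u, objective = jsd_score_objective(vecs, vecs)
print(u.shape, objective)
```

In practice the two inputs would be the local and global vectors pulled from the encoder output, and the diagonal of `u` holds the scores for pairs that come from the same image.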
Keep in mind that the PMI should be derivable from the scores `u`. Let me know exactly which loss function you are using and I can help you out there.
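How the PMI comes out of the scores depends on the loss. As a hedged note, these are the standard optimal-critic results for three common estimators (JSD-style, Donsker-Varadhan, InfoNCE), assuming the score function is trained to optimality. They are not read off the repository code, and the symbols $c$ and $c(x)$ are introduced here for illustration:

$$
u^{*}_{\mathrm{JSD}}(x, y) = \log \frac{p(x, y)}{p(x)\,p(y)}, \qquad
u^{*}_{\mathrm{DV}}(x, y) = \log \frac{p(x, y)}{p(x)\,p(y)} + c, \qquad
u^{*}_{\mathrm{NCE}}(x, y) = \log \frac{p(x, y)}{p(x)\,p(y)} + c(x)
$$

So with a JSD-style loss the score approximates the PMI directly, while with the other two it is only determined up to an additive term (a constant $c$, or a term $c(x)$ that depends only on the anchor).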
Thanks a lot for your response. When I load the nets from the .t7 file, how can I do part 3? I could get a size-64 vector from nets.encoder. As I understood it, we compute mutual information on the encoded features of the images (a combination of local and global features), not strictly on the images themselves. But for which final result can we say "This is the vector that has the most mutual information with the input image, and we can use it as an alternative to the input image"? Is it possible to say this about the final output of nets.encoder (the size-64 output vector)?
Did you set `return_rkhs=True` and `return_all_activations=True`?
https://github.com/rdevon/DIM/blob/master/cortex_DIM/models/controller.py#L77
Yes, I did, and these are all the values saved in the .t7 file.
These are all the variables loaded from the .t7 file.
@rdevon
So when you pass data through the `Controller.encoder` network with those arguments, you should get a tuple of dictionaries. Is this the case?
Yes, I got a tuple: element 0 is a list of length 5 and element 1 is a dict of length 0. @rdevon
OK, so the list should be all activations, and the dict should have values that correspond to the local / global vectors from DIM.
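To check what actually came back, a small inspection helper can be useful. The sketch below is an assumption-laden illustration: `summarize_encoder_output` is a hypothetical name, the stand-in arrays and their shapes are made up, and the `"global"` key is a guess (only `local` is mentioned in this thread). It relies only on each value having a `.shape`, so it should work the same on tensors.

```python
import numpy as np


def summarize_encoder_output(output):
    """Return the shapes found in a (list_of_activations, dict_of_vectors) tuple."""
    activations, vectors = output
    act_shapes = [tuple(a.shape) for a in activations]
    vec_shapes = {key: tuple(value.shape) for key, value in vectors.items()}
    return act_shapes, vec_shapes


# Stand-in data with made-up shapes; replace with the real encoder output.
fake_output = (
    [np.zeros((2, 64, 8, 8)), np.zeros((2, 64))],
    {"local": np.zeros((2, 32, 64)), "global": np.zeros((2, 32, 1))},
)
act_shapes, vec_shapes = summarize_encoder_output(fake_output)
print(act_shapes)
print(vec_shapes)

# An empty dict means no local / global vectors were returned.
_, empty_shapes = summarize_encoder_output(([np.zeros((2, 64))], {}))
if not empty_shapes:
    print("dict is empty: no local / global vectors in this output")
```

If the dict comes back empty, as reported above, there are no local / global vectors to score yet.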