vector-quantize-pytorch
RQ-VAE: How can I get a list of all learned codebook vectors (as indexed in the "indices")?
Hi Lucid, I am working on quantizing CLIP image embeddings with your RQ-VAE. It works pretty well.
Next I want to take all learned codebook vectors and add them to the vocab of a GPT (as frozen token embeddings).
The idea is to train a GPT with CLIP image embeddings interleaved with text, e.g. IMAGE-CAPTION or TEXT-IMAGE-TEXT-IMAGE-... (Flamingo-style).
If this works, GPT could maybe also learn to generate quantized CLIP image embeddings token by token, and then e.g. show images through a) retrieval or b) a DALLE 2 decoder :)
So my question is: once the RQ-VAE is trained and I can get the quantized reconstructions and indices, how can I get a list or tensor of the actual codebook (all possible vectors from the RQ vocab)? :)
+1. I can reverse engineer the forward function, but it'd be nice if there were an easy function call that I'm missing.
Edit: ended up reverse engineering it anyways :-) You can get the codes from the indices like:
    quantizer.layers[i]._codebook.embed[0, tokens[:, i]]
for each layer i in the residual vector quantizer, where tokens holds the returned indices with shape (batch, num_layers). As a bonus, you can reconstruct the quantized input (image / audio / etc.) by summing the per-layer codes:
    decoded_vector = 0.0
    for i, layer in enumerate(quantizer.layers):
        decoded_vector = decoded_vector + layer._codebook.embed[0, tokens[:, i]]
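For anyone who wants to try the lookup without a trained model, here is a minimal self-contained sketch of the same idea in plain PyTorch. The `codebooks` list is a random stand-in for each layer's `_codebook.embed` tensor (assumed shape `(1, codebook_size, dim)`, matching the `[0, ...]` indexing above), and `codes_from_indices` is a hypothetical helper name, not a function from the library.

```python
import torch

torch.manual_seed(0)

num_layers, codebook_size, dim = 4, 16, 8
batch = 3

# Random stand-ins for layer._codebook.embed, one per residual layer.
# Assumed shape: (1, codebook_size, dim).
codebooks = [torch.randn(1, codebook_size, dim) for _ in range(num_layers)]

# Stand-in for the indices returned by the quantizer: (batch, num_layers).
tokens = torch.randint(0, codebook_size, (batch, num_layers))


def codes_from_indices(codebooks, tokens):
    """Look up one code vector per layer; returns (num_layers, batch, dim)."""
    return torch.stack(
        [embed[0, tokens[:, i]] for i, embed in enumerate(codebooks)]
    )


all_codes = codes_from_indices(codebooks, tokens)

# Residual quantization reconstructs the input as the sum over layers.
reconstruction = all_codes.sum(dim=0)  # (batch, dim)
```

Stacking the per-layer codes before summing keeps both views around: `all_codes[i]` is the contribution of layer i, and `reconstruction` is the quantized vector.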
@christophschuhmann @kradonneoh oh hey! nice to hear that the library is working well for your use case
I've added the feature to return all the codes across quantization layers here https://github.com/lucidrains/vector-quantize-pytorch/commit/ec2474608816a3752b13dc826a19b8966d98804e
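For the original goal of putting every codebook vector into a GPT vocab as frozen embeddings, one possible approach is to concatenate the per-layer codebooks into a single table and offset each layer's indices into it. This is only a sketch under assumptions: `codebooks` is again a random stand-in for the per-layer `_codebook.embed` tensors (assumed shape `(1, codebook_size, dim)`), and the names `vocab`, `image_token_embedding` and `flat_ids` are made up for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

num_layers, codebook_size, dim = 4, 16, 8

# Random stand-ins for layer._codebook.embed, one per residual layer.
codebooks = [torch.randn(1, codebook_size, dim) for _ in range(num_layers)]

# One table holding every learned code: (num_layers * codebook_size, dim).
vocab = torch.cat([embed[0] for embed in codebooks], dim=0)

# Frozen embedding layer built from the table.
image_token_embedding = nn.Embedding.from_pretrained(vocab, freeze=True)

# Per-layer indices as returned by the quantizer: (batch, num_layers).
tokens = torch.randint(0, codebook_size, (3, num_layers))

# Shift layer i's indices by i * codebook_size so they address the flat table.
offsets = torch.arange(num_layers) * codebook_size
flat_ids = tokens + offsets  # broadcasts (batch, num_layers) + (num_layers,)

embedded = image_token_embedding(flat_ids)  # (batch, num_layers, dim)
```

If the code dimension differs from the GPT hidden size, a learned projection on top of the frozen table would presumably be needed; in a real vocab the `flat_ids` would also have to be shifted past the existing text tokens.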