Fix Semantic Feature Extraction from Incorrect Layer
This PR addresses an inconsistency between the code and the paper regarding which layer is used for semantic feature extraction in W2v-BERT 2.0. The paper specifies:
"In detail, we utilize the hidden states from the 17th layer of W2v-BERT 2.0."
However, the current code extracts the features with `feat = vq_emb.hidden_states[17]`, which, due to Python's zero-based indexing, actually corresponds to the 18th layer. This PR changes the index to `vq_emb.hidden_states[16]`, aligning the code with the paper's description, assuming the paper indeed intended the 17th layer.
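For reference, the off-by-one relationship between the paper's 1-based layer numbering and Python's zero-based tuple indexing can be sketched with a toy stand-in (this is illustrative only, not the actual model call; `num_layers` and the labels are hypothetical):

```python
# Toy stand-in for the hidden-state tuple returned by W2v-BERT 2.0:
# one entry per layer, each labeled with its 1-based layer number.
num_layers = 24  # hypothetical; just needs to cover layer 17
hidden_states = tuple(f"layer_{i}" for i in range(1, num_layers + 1))

# Paper: "the 17th layer" -> 1-based layer 17 -> zero-based index 16.
feat_paper = hidden_states[17 - 1]  # what this PR selects
feat_old = hidden_states[17]        # what the previous code selected

print(feat_paper)  # layer_17
print(feat_old)    # layer_18
```

Under this reading, `hidden_states[17]` names the 18th layer, which is the discrepancy this PR fixes.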
Question: Could you clarify whether this is an error in the paper or in the code? This PR assumes the paper's statement is correct, but confirmation would be helpful.