Need help with accessing multi-batch trf_data
Hi,
First of all, thank you very much for the excellent course on NLP; it is very clearly written!
I got stuck on the Contextual embeddings from Transformers part of the Natural Language Processing for Linguists course. I tried to use your custom class tensor2attr on longer texts, but it starts raising errors, since the class is designed to access only the first batch. Could you please advise me on how to apply the tensor2attr class to Docs with multiple batches?
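For reference, here is a rough sketch of how the error shows up for me (only a sketch, assuming the en_core_web_trf pipeline and that your tensor2attr component is registered under that name; the long_text variable and the shapes in the comments are placeholders):

import spacy

# Load a transformer-based pipeline and add the custom component
# from the course materials.
nlp = spacy.load('en_core_web_trf')
nlp.add_pipe('tensor2attr')

# A text long enough for the transformer to split it into several batches.
long_text = '...'  # placeholder for a long input text
doc = nlp(long_text)

# The first axis of the tensor counts batches; for long inputs it is
# greater than 1, e.g. (2, 160, 768) instead of (1, n, 768).
print(doc._.trf_data.tensors[0].shape)

# Requesting a vector for a Token aligned to a later batch fails for me
# with an indexing error, because tensors[0][0] only holds the
# wordpieces of the first batch.
print(doc[-1].vector)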
I tried to change your code at the line under the comment '# Get Token tensors under tensors[0]; the second [0] accesses batch'. I replaced the original method:
def span_tensor(self, span):
    # Get alignment information for Span. This is achieved by using
    # the 'doc' attribute of Span that refers to the Doc that contains
    # this Span. We then use the 'start' and 'end' attributes of a Span
    # to retrieve the alignment information. Finally, we flatten the
    # resulting array to use it for indexing.
    tensor_ix = span.doc._.trf_data.align[span.start: span.end].data.flatten()

    # Get Token tensors under tensors[0]; the second [0] accesses batch
    tensor = span.doc._.trf_data.tensors[0][0][tensor_ix]

    # Sum vectors along axis 0 (columns). This yields a 768-dimensional
    # vector for each spaCy Token.
    return tensor.sum(axis=0)
with this version:
def span_tensor(self, span):
    # Get alignment information for Span. This is achieved by using
    # the 'doc' attribute of Span that refers to the Doc that contains
    # this Span. We then use the 'start' and 'end' attributes of a Span
    # to retrieve the alignment information. Finally, we flatten the
    # resulting array to use it for indexing.
    tensor_ix = span.doc._.trf_data.align[span.start: span.end].data.flatten()

    # Get Token tensors under tensors[0]; reshape to flatten all batches
    # into a single (n_wordpieces, 768) array before indexing.
    tensor = span.doc._.trf_data.tensors[0].reshape(-1, 768)[tensor_ix]

    # Sum vectors along axis 0 (columns). This yields a 768-dimensional
    # vector for each spaCy Token.
    return tensor.sum(axis=0)
But I am not sure whether this is the right way to do it, since I get some strange similarity scores. I would greatly appreciate any advice or help you could share.
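In case it is useful, this is roughly how I inspect the similarity scores (again only a sketch; the token indices are placeholders into the long multi-batch document from above):

# With tensor2attr in the pipeline, Token.vector returns contextual
# embeddings, and Token.similarity() compares those vectors.
token_a = doc[10]     # placeholder index into the long document
token_b = doc[500]    # placeholder index aligned to a later batch

# With the reshape-based version above, some of these scores look
# implausible to me, which makes me doubt the indexing is correct.
print(token_a.similarity(token_b))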
Thank you very much for your efforts!
Best, Jovan