Achyudh Ram


Across hedwig, the `--trained-model` arg is used to point to a snapshot. So, to fix this, should we add a flag that allows testing directly on pretrained models?
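
A minimal sketch of what that could look like, assuming an argparse-style CLI; the `--use-pretrained-model` flag name and the two loader helpers are hypothetical, not existing hedwig code:

```python
import argparse

parser = argparse.ArgumentParser()
# Existing behavior: --trained-model points to a saved snapshot on disk
parser.add_argument('--trained-model', type=str, default=None,
                    help='path to a saved model snapshot')
# Hypothetical new flag: evaluate the pretrained weights without a snapshot
parser.add_argument('--use-pretrained-model', action='store_true',
                    help='test directly on the pretrained model')
args = parser.parse_args()

if args.use_pretrained_model:
    model = load_pretrained_model()            # hypothetical helper
elif args.trained_model is not None:
    model = load_snapshot(args.trained_model)  # hypothetical helper
```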

I've seen this happen in cases where there isn't enough system memory. Can you please check if that's the issue by monitoring memory usage?
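
For a quick check, here is a small sketch using `psutil` (assuming it is available) to log system memory during training:

```python
import psutil

def log_memory_usage():
    # System-wide memory: if available memory approaches zero during
    # training, the process is likely being killed by the OOM killer
    vm = psutil.virtual_memory()
    print(f"used: {vm.used / 1e9:.2f} GB / total: {vm.total / 1e9:.2f} GB "
          f"({vm.percent}% used)")

# Call this periodically, e.g. once per epoch or every N batches
log_memory_usage()
```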

> I also removed the hugging-face flag, although it would be nice if the weights were already in `hedwig-data`. Should I rename this PR and make a separate one for...

@xdwang0726 For BERT, we do treat the entire document as a single sentence. For the hierarchical version of BERT (H-BERT), we split the document into its constituent sentences.
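
Roughly, the difference in preprocessing looks like the sketch below; I'm using NLTK's `sent_tokenize` for illustration, so the actual hedwig preprocessing may differ:

```python
# Requires the NLTK 'punkt' tokenizer data: nltk.download('punkt')
from nltk.tokenize import sent_tokenize

document = "First sentence. Second sentence. Third sentence."

# BERT: the entire document is fed in as one input sequence
bert_input = document

# H-BERT: the document is split into its constituent sentences,
# and each sentence is encoded separately before aggregation
hbert_inputs = sent_tokenize(document)
print(hbert_inputs)  # ['First sentence.', 'Second sentence.', 'Third sentence.']
```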

@tralfamadude I am not sure how you would be able to use the pre-trained models for more than a thousand tokens. Since the maximum sequence length of the pre-trained models...
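
For reference, the original pre-trained BERT checkpoints only have position embeddings for 512 positions, so longer inputs are typically truncated; a sketch of what that looks like (the helper below is illustrative, not hedwig's code):

```python
MAX_SEQ_LENGTH = 512  # position-embedding limit of the original BERT checkpoints

def truncate_for_bert(tokens):
    # Reserve two positions for the [CLS] and [SEP] special tokens,
    # then drop anything beyond the position-embedding table
    tokens = tokens[:MAX_SEQ_LENGTH - 2]
    return ['[CLS]'] + tokens + ['[SEP]']
```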

@xdwang0726 Yes, if I understand your question correctly, that is the case for BERT.

Yeah, https://github.com/castorini/hedwig/pull/38 adapts the model from https://arxiv.org/pdf/1607.01759.pdf for document classification, though you might have to dig into the implementation to see if there are differences between our model and Facebook's...
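
For anyone curious, the core of that paper's model is just averaged word embeddings feeding a linear classifier; a minimal PyTorch sketch of the idea (hedwig's version may differ, e.g. in n-gram features):

```python
import torch
import torch.nn as nn

class FastTextClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_classes):
        super().__init__()
        # EmbeddingBag with mean pooling averages the word embeddings,
        # which is the core of the fastText classification model
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim, mode='mean')
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, token_ids, offsets):
        pooled = self.embedding(token_ids, offsets)
        return self.fc(pooled)

# Example: two documents packed into one flat tensor of token ids
model = FastTextClassifier(vocab_size=10000, embed_dim=100, num_classes=4)
tokens = torch.tensor([1, 2, 3, 4, 5, 6, 7])  # two documents concatenated
offsets = torch.tensor([0, 3])                # start index of each document
logits = model(tokens, offsets)
```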

I see that the last three elements have values 0, 0, 0. Even though your input is of non-zero length, the length vector might not have been set properly. Could...
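
One way to sanity-check this is to recompute the lengths from the padded batch itself and compare them against the length vector you are passing in; a sketch, assuming zero is the padding index:

```python
import torch

PAD_IDX = 0  # assuming zero is the padding index

def compute_lengths(padded_batch):
    # padded_batch: (batch_size, max_len) tensor of token ids;
    # count the non-padding positions in each row
    return (padded_batch != PAD_IDX).sum(dim=1)

batch = torch.tensor([[5, 9, 2, 0, 0],
                      [7, 0, 0, 0, 0]])
print(compute_lengths(batch))  # tensor([3, 1])
```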

As suggested in the meeting today, let's split Castor and deal with cleaning up the code for the single-text-sequence tasks first. It would be nice if we could have...