Santosh Gupta

83 comments by Santosh Gupta

I would like to see a link to your notebook.

> I would also find it helpful if there is a suggested 'max_subtoken_length' value.
>
> I haven't yet received a memory issue, but t2tdatagen is taking a long time...

I was looking for a way to load the model into huggingface's model classes. These need a config.json file and a model.bin file. I was wondering what format these files are in, and...
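For reference, this is roughly what I'm trying to get working (a minimal sketch, assuming the weights are a standard PyTorch state dict; the directory name `converted_model/` and the weights filename are just placeholders):

```python
import torch
from transformers import AutoConfig, AutoModel

# Hypothetical directory containing config.json and the saved weights.
model_dir = "converted_model"

# Build the architecture from config.json, then load the weights manually
# (useful when the weights file is not named pytorch_model.bin).
config = AutoConfig.from_pretrained(model_dir)
model = AutoModel.from_config(config)

state_dict = torch.load(f"{model_dir}/model.bin", map_location="cpu")
model.load_state_dict(state_dict, strict=False)
model.eval()
```

If the weights file were named pytorch_model.bin, `AutoModel.from_pretrained(model_dir)` should load the config and weights in a single call.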

Has anyone been able to create a python object of the model? I made an attempt here: https://colab.research.google.com/drive/15Gak_LmwEWPbJo3w_EVG8FyfLMWPwoGh?usp=sharing but wasn't able to successfully create the model.

Thanks Sean, very much appreciated!

Any update on how this would affect performance? I would love this.

It's a transformer model, with some custom heads. Maybe there's some loss issue I'm overlooking? The command line is just `bash deepspeed/deepspeed_ss.sh 4` and the file is
```
NGPU_PER_NODE=$1
...
```
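In case it helps narrow this down, here is roughly how the model is wrapped (a minimal sketch, not the actual training script; the config path, the dataloader, and the assumption that the custom heads return the loss directly are all placeholders):

```python
import deepspeed

def train(model, dataloader, ds_config="deepspeed_config.json"):
    # Wrap the transformer + custom heads in a DeepSpeed engine; the engine
    # reads optimizer/fp16/distributed settings from the JSON config.
    model_engine, optimizer, _, _ = deepspeed.initialize(
        model=model,
        model_parameters=model.parameters(),
        config=ds_config,
    )

    for batch in dataloader:
        batch = {k: v.to(model_engine.device) for k, v in batch.items()}
        loss = model_engine(**batch)   # custom heads return the loss here
        model_engine.backward(loss)    # DeepSpeed-managed backward pass
        model_engine.step()            # optimizer step + gradient zeroing
```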

> Are you able to reproduce the hang? If so, we have had good luck debugging this sort of thing with py-spy:
>
> https://pypi.org/project/py-spy/
>
> py-spy dump --pid...

I am wondering if you could just leave the summaries blank, or with a simple token. I believe during inference, the summary is not used at all.

The text you're sending to format_to_bert doesn't seem to be in the right format. Check out the sample format here: https://github.com/nlpyang/BertSum/issues/61 I noticed your sample text didn't have a "@highlight...
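For reference, a toy example of the raw story format as I understand it (the article and summary text below are made up; the important part is the "@highlight" lines that mark each summary sentence):

```python
# Write a CNN/DailyMail-style .story file: article text first, then a
# "@highlight" line before each summary sentence. (Contents are invented.)
sample_story = """The quick brown fox jumped over the lazy dog near the river.
It then ran into the forest and was not seen again.

@highlight

A fox jumped over a dog

@highlight

The fox ran into the forest
"""

with open("sample.story", "w") as f:
    f.write(sample_story)
```

If you only need inference, the highlight sentences can just be short placeholders, since the summaries aren't used at inference time.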