leejason

13 issues by leejason

Thank you for the nice example, but I ran into the following error when executing "estimator.train(input_fn=train_input_fn, max_steps=num_train_steps)" in run_classifier_multi_labels_bert.py: RuntimeError: Attempted to use a closed Session. Any suggestions?
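
For context, here is a self-contained sketch of the Estimator training pattern I am using; the real model_fn and train_input_fn come from run_classifier_multi_labels_bert.py, and the toy stand-ins below are only there to make the snippet runnable (TensorFlow 1.x API).

```python
# Toy stand-ins for the repo's model_fn/train_input_fn, just to show the call
# pattern where the error appears (TensorFlow 1.x Estimator API).
import numpy as np
import tensorflow as tf

def train_input_fn():
    # The repo builds this from TFRecords; a tiny in-memory dataset here.
    x = np.random.rand(32, 4).astype(np.float32)
    y = np.random.randint(0, 2, size=(32,)).astype(np.int32)
    ds = tf.data.Dataset.from_tensor_slices(({"x": x}, y))
    return ds.repeat().batch(8)

def model_fn(features, labels, mode):
    logits = tf.layers.dense(features["x"], 2)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)
    train_op = tf.train.AdamOptimizer(1e-3).minimize(
        loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir="/tmp/toy_model")
# This is the call where the RuntimeError appears for me.
estimator.train(input_fn=train_input_fn, max_steps=100)
```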

Thank you for the great work. Appendix B of the [GPT-3 paper](https://arxiv.org/abs/2005.14165) mentions the following. I'm wondering whether the idea has been implemented in gpt2-ml. If not yet, what would...

Thank you for the great work. Is it possible to train with variable-length inputs and padding, like the following? > One last detail. GPT2 was pre-trained by OpenAI on large spans...
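
To make the question concrete, here is a minimal sketch of what I mean by variable length with padding, written against the Hugging Face transformers API (my assumption; the training code in this repo may look different):

```python
# Pad a variable-length batch and mask the padded positions out of the loss.
import torch
from transformers import GPT2TokenizerFast, GPT2LMHeadModel

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token   # GPT-2 has no pad token by default

texts = ["a short example", "a somewhat longer example sentence for padding"]
batch = tokenizer(texts, padding=True, return_tensors="pt")

labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100  # ignore padded positions in the loss

model = GPT2LMHeadModel.from_pretrained("gpt2")
loss = model(input_ids=batch["input_ids"],
             attention_mask=batch["attention_mask"],
             labels=labels).loss
loss.backward()
```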

It would be great if TPU support were possible.

Will the code for training from scratch be released after the 1.5B model?

Sorry for the newbie question. If I'd like to integrate OpenAI GPT-2 for autocompletion, which source code should I start from?
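
For reference, the kind of integration I have in mind looks roughly like the sketch below, which uses the Hugging Face transformers API rather than any particular repo's code (the model name and prompt are just placeholders):

```python
# Continue a text prefix by a few tokens, as an editor autocomplete would.
from transformers import GPT2TokenizerFast, GPT2LMHeadModel

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

prefix = "The quick brown fox"
input_ids = tokenizer(prefix, return_tensors="pt").input_ids

output_ids = model.generate(input_ids, max_new_tokens=10, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```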

Is the pretraining of GPT-J-6B based on CausalTransformerV2 or simply CausalTransformer? Why? Thanks for any advice.

I updated "6B_roto_256.json" with the following for trying a smaller model. > "d_model": 768 The pretraining works on one TPU v3-8, but the slimmed model after using "slim_model.py" produces gibberish...

Is "to_hf_weights.py" specific to "6B_roto_256.json" only? I was trying to make this codebase work for smaller models (e.g., "layers": 12, "d_model": 768, "n_heads": 16). However, the HF model produced by...

For making "to_hf_weights.py" work correctly, do I have to modify the following if I have my own tokenizer trained with vocab_size=50400? Or, can I assume that "GPT2Tokenizer" does not matter...