leejason
I'm experimenting with whether the "creative" nature of GPT-2 can be "innovative" in the patent sense. The "fake news" issue should be a lesser concern, since it's unreasonable to pay significant...
Does the WebText corpus include patent data, say data from the USPTO or Google Patents?
> we’ve been adding tests this week to fortify the TPU code.
> Expect the next release in a few days to stabilize TPU training more thoroughly.

Very impressive &...
Any chance of implementing the following from GPT-3? Or was it done somewhere else to enhance GPT-2? From [Appendix B](https://arxiv.org/pdf/2005.14165):

> .... During training we always train on sequences of the full...
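For reference, the packing scheme that Appendix B describes can be sketched roughly as below: tokenized documents are joined with a special end-of-text delimiter and the resulting stream is chopped into full context-length training sequences, so one sequence may span multiple documents. The token id and context size here are illustrative assumptions (GPT-2's values), not taken from this repo's code.

```python
from typing import List

END_OF_TEXT = 50256   # GPT-2's <|endoftext|> id; an assumption for illustration
CONTEXT_LEN = 1024    # GPT-2's context window

def pack_documents(docs: List[List[int]], context_len: int = CONTEXT_LEN) -> List[List[int]]:
    """Concatenate tokenized documents with a delimiter token, then split
    the stream into fixed-length sequences; sequences may cross document
    boundaries, as in the paper's description."""
    stream: List[int] = []
    for doc in docs:
        stream.extend(doc)
        stream.append(END_OF_TEXT)
    # Drop the trailing partial chunk so every training example is full length.
    n_full = len(stream) // context_len
    return [stream[i * context_len:(i + 1) * context_len] for i in range(n_full)]
```

With this packing, no padding is needed and every batch position carries a real training token; the delimiter is what tells the model that contexts are unrelated.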
Thanks for the update. Does "done implicitly" mean the following?

> `raw_text += start_token + row[0] + end_token + "\n"`

If so, how would the attention mechanism take start_token and...
You may consider running `device_serve.py` on the TPU, combined with the "streamlit" approach used in the following: https://github.com/vicgalle/gpt-j-api
I was trying Streamlit as a quick web app for testing model inference and found it convenient. Indeed, the floating IP of the TPU is another issue. As for stop_sequence, I...
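One simple way to handle stop_sequence on the client side, assuming the server returns the raw generated text, is to truncate the completion at the earliest occurrence of any stop string. This is only a sketch of that idea; the function name and signature are illustrative.

```python
from typing import Iterable

def apply_stop_sequences(text: str, stop_sequences: Iterable[str]) -> str:
    """Return text cut off before the earliest stop sequence, if any occurs."""
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

Doing this server-side during sampling would additionally save compute, since generation can halt as soon as a stop sequence appears.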
Me too. Any advice would be appreciated.
+1
very helpful & thanks