text2text
Text2Text: Crosslingual NLP/G toolkit
Two approaches to try:
1. Use cross-lingual embeddings as input to an MLP or tree-based model, in a transfer-learning fashion.
2. Fine-tune the cross-lingual translator with a softmax output.
Fine-tune the cross-lingual translator for text2text generation tasks (e.g. question generation, question answering, summarization) to demonstrate cross-lingual alignment, zero-shot generation, etc. For example, can we demonstrate question generation or question...
Perform a study similar to https://arxiv.org/pdf/1907.04307.pdf, but expanded to support 100 languages using the [embeddings from the translator](https://github.com/artitw/text2text#embedding--vectorization). Possibly start with the paper's [code sample](https://www.tensorflow.org/hub/tutorials/cross_lingual_similarity_with_tf_hub_multilingual_universal_encoder).
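As a rough starting point for such a study, the core measurement is the cosine similarity between sentence embeddings in different languages. The sketch below uses only the standard library. The vectors in `toy_embeddings` are made-up placeholders, not output from the translator; in a real experiment they would be replaced by the embeddings linked above.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat similarity as 0.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written vectors standing in for real sentence embeddings.
toy_embeddings: Dict[str, List[float]] = {
    "en: Hello, world": [0.9, 0.1, 0.0],
    "fr: Bonjour le monde": [0.8, 0.2, 0.1],
    "en: The stock market fell": [0.0, 0.1, 0.9],
}

keys = list(toy_embeddings)
for i, first in enumerate(keys):
    for second in keys[i + 1:]:
        score = cosine_similarity(toy_embeddings[first], toy_embeddings[second])
        print(f"{first!r} vs {second!r}: {score:.3f}")
```

With real embeddings, translation pairs would be expected to score higher than unrelated sentences; whether that holds across all 100 languages is the question the study would answer.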
I get this error when using `Handler()`: `json.decoder.JSONDecodeError: Unterminated string starting at: line 98979 column 3 (char 2833058)` Here is a screenshot of the error: Here's the test code that...
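For anyone triaging this: Python's `json` module raises this kind of "Unterminated string" error when the input ends in the middle of a string, which can happen if a JSON file was cut off (for example by an interrupted download). Whether that is the cause here is only an assumption. The sketch below reproduces the same class of error on a deliberately truncated payload; `try_parse` and the payload are hypothetical and not part of text2text.

```python
import json
from typing import Any, Optional, Tuple


def try_parse(payload: str) -> Tuple[Optional[Any], Optional[json.JSONDecodeError]]:
    """Parse JSON text and return (value, None), or (None, error) on failure."""
    try:
        return json.loads(payload), None
    except json.JSONDecodeError as err:
        return None, err


complete = '{"model": "text2text"}'
truncated = complete[:15]  # cut off in the middle of the string value

value, err = try_parse(complete)
print("complete payload parsed:", value)

value, err = try_parse(truncated)
if err is not None:
    # err.lineno, err.colno and err.pos locate where the broken string begins.
    print("truncated payload failed:", err.msg, "at char", err.pos)
```

If the reported position is close to the end of the file on disk, a truncated file is a plausible explanation and re-downloading it would be a reasonable first check.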
Hi, I tried to generate questions for Arabic and Urdu, and it seems that even the small model cannot fit into memory to generate questions. It runs for a long time and...
Training and inference performance could be better. Need to update and test https://github.com/artitw/apex
Currently, the documentation consists of the README, which is very brief. There is much more functionality in the text2text API that is not described. Such functionality can be better documented for...
There is currently no type checking, so we can follow practices from https://docs.python.org/3/library/typing.html
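A minimal sketch of what annotated code could look like, using only the `typing` module linked above. `normalize_inputs` is a hypothetical helper written for illustration; it is not an existing text2text function.

```python
from typing import List, Sequence, Union, get_type_hints


def normalize_inputs(texts: Union[str, Sequence[str]]) -> List[str]:
    """Accept a single string or a sequence of strings and return a list of strings.

    Hypothetical helper used only to illustrate annotations.
    """
    if isinstance(texts, str):
        return [texts]
    return [str(text) for text in texts]


# Annotations are available at runtime and can be checked statically by a
# type checker such as mypy.
print(get_type_hints(normalize_inputs))
print(normalize_inputs("Hello, world"))
print(normalize_inputs(["Hello", "Bonjour"]))
```

Annotations are not enforced at runtime, so a static type checker would need to be run separately, for instance as part of CI.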
Follow guidelines from official Python documentation for unit testing: https://docs.python.org/3/library/unittest.html
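A minimal sketch of a test case in the `unittest` style from that guide. The function under test, `truncate_tokens`, is a hypothetical stand-in defined here so the example is self-contained; real tests would import functions from the text2text package instead.

```python
import unittest
from typing import List


def truncate_tokens(tokens: List[str], max_len: int) -> List[str]:
    """Return at most max_len tokens (hypothetical function under test)."""
    if max_len < 0:
        raise ValueError("max_len must be non-negative")
    return tokens[:max_len]


class TruncateTokensTest(unittest.TestCase):
    def test_shorter_input_is_unchanged(self) -> None:
        self.assertEqual(truncate_tokens(["a", "b"], 5), ["a", "b"])

    def test_longer_input_is_cut(self) -> None:
        self.assertEqual(truncate_tokens(["a", "b", "c"], 2), ["a", "b"])

    def test_negative_length_raises(self) -> None:
        with self.assertRaises(ValueError):
            truncate_tokens(["a"], -1)
```

Saved in a file such as `test_truncate.py` (the name is an assumption), the tests can be run with `python -m unittest test_truncate`.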
Turn colab demo notebook into integration tests: https://colab.research.google.com/drive/1LE_ifTpOGO5QJCKNQYtZe6c_tjbwnulR https://github.com/artitw/text2text/blob/master/text2text_demo.ipynb