hebrew-gpt_neo

Hebrew text generation models based on EleutherAI's gpt-neo. Each was trained on a TPUv3-8, which was made available to me via the TPU Research Cloud Program.

JS Colab notebook: Open in Google Colab

Gradio Colab notebook: Open in Google Colab

Datasets

  1. An assortment of Hebrew corpora, which I have made available here

  2. oscar / unshuffled_deduplicated_he - Homepage | Dataset Permalink

The Open Super-large Crawled ALMAnaCH coRpus (OSCAR) is a huge multilingual corpus obtained by language classification and filtering of the Common Crawl corpus using the goclassy architecture.
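
To peek at the Hebrew OSCAR subset without downloading all of it, something like the following may work. This is a minimal sketch, not the preprocessing used for training: it assumes the Hugging Face `datasets` package is installed and that the `oscar` dataset with the `unshuffled_deduplicated_he` config can still be loaded by your `datasets` version (newer releases may need extra arguments or a different loading path). `take_texts` and `stream_oscar_hebrew` are hypothetical helper names introduced only for this example.

```python
from itertools import islice


def take_texts(records, n):
    """Return the 'text' field of the first n records of any iterable of dicts."""
    return [record["text"] for record in islice(records, n)]


def stream_oscar_hebrew():
    """Open the Hebrew OSCAR subset as a streaming dataset.

    Assumption: `datasets` is installed and this config is loadable
    in your environment. The import is kept local so the helper above
    can be used without the package.
    """
    from datasets import load_dataset

    return load_dataset(
        "oscar",
        "unshuffled_deduplicated_he",
        split="train",
        streaming=True,  # iterate lazily instead of downloading everything
    )
```

Calling `take_texts(stream_oscar_hebrew(), 3)` should then return the text of the first three documents, provided the dataset loads.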

Models

hebrew-gpt_neo-xl

  • Model configs
  • Available on Huggingface
  • A Google Colab Notebook is available here

hebrew-gpt_neo-small

  • Model configs
  • Available on Huggingface
  • A Google Colab Notebook is available here

hebrew-gpt_neo-tiny

  • Model configs
  • Available on Huggingface
  • A Google Colab Notebook is available here
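
For readers who prefer a local script to the Colab notebooks, generating text from one of the Huggingface checkpoints might look roughly like this. It is a sketch under stated assumptions: `transformers` and PyTorch are installed, and the checkpoints live under a Hub namespace that this README does not spell out, so `HF_NAMESPACE` is a placeholder you need to replace. The helper names (`model_id`, `generate`) and the sampling settings are illustrative choices, not the author's.

```python
# Assumption: placeholder namespace; replace with the account that hosts the models.
HF_NAMESPACE = "your-namespace"

# The three model names listed in this README.
MODEL_NAMES = ("hebrew-gpt_neo-tiny", "hebrew-gpt_neo-small", "hebrew-gpt_neo-xl")


def model_id(name, namespace=HF_NAMESPACE):
    """Build a Hub repository id such as 'namespace/hebrew-gpt_neo-small'."""
    if name not in MODEL_NAMES:
        raise ValueError(f"unknown model {name!r}; expected one of {MODEL_NAMES}")
    return f"{namespace}/{name}"


def generate(prompt, name="hebrew-gpt_neo-small", max_new_tokens=50,
             namespace=HF_NAMESPACE):
    """Sample a continuation of `prompt` from the chosen checkpoint."""
    # Imported here so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = model_id(name, namespace)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,   # illustrative sampling settings, not tuned values
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the namespace filled in, `generate("שלום")` downloads the small model on first use and returns the prompt followed by a sampled continuation. The tiny model is the quickest to try; the xl model needs considerably more memory.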