simpleT5
simpleT5 is built on top of PyTorch Lightning⚡️ and Transformers🤗, letting you quickly train your T5 models.
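For context, here is a minimal sketch of the typical simpleT5 workflow as described in the project's README; parameter names and the `source_text`/`target_text` column convention may differ between versions, and the checkpoint directory name below is a placeholder.

```python
from simplet5 import SimpleT5
import pandas as pd

# train_df / eval_df are pandas DataFrames with two columns:
# "source_text" (model input) and "target_text" (expected output)
train_df = pd.DataFrame({
    "source_text": ["summarize: a long article ..."],
    "target_text": ["a short summary"],
})

model = SimpleT5()
model.from_pretrained(model_type="t5", model_name="t5-base")
model.train(
    train_df=train_df,
    eval_df=train_df,
    source_max_token_len=128,
    target_max_token_len=50,
    batch_size=8,
    max_epochs=3,
    use_gpu=True,
    outputdir="outputs",  # per-epoch checkpoints are written here
)

# load a saved checkpoint directory and predict
model.load_model("t5", "outputs/<checkpoint-dir>", use_gpu=True)
print(model.predict("summarize: a long article ..."))
```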
I tried to train my model for translating English to Bengali. After training, when I run the code, the output is not Unicode Bengali characters. I Eat Rice (eng) => আমি...
```python
import soundfile as sf
from scipy.io import wavfile
from IPython.display import Audio
from transformers import Wav2Vec2ForCTC, Wav2Vec2CTCTokenizer
import speech_recognition as sr
import io
from pydub import AudioSegment

tokenizer = Wav2Vec2CTCTokenizer.from_pretrained("facebook/wav2vec2-base-960h")
```
...
```python
def training_epoch_end(self, training_step_outputs):
    """ save tokenizer and model on epoch end """
    self.average_training_loss = np.round(
        torch.mean(torch.stack([x["loss"] for x in training_step_outputs])).item(),
        4,
    )
    path = f"{self.outputdir}/simplet5-epoch-{self.current_epoch}-train-loss-{str(self.average_training_loss)}-val-loss-{str(self.average_validation_loss)}"
```
Will be very helpful...
I want to train my T5 model from scratch with a BPE tokenizer. Is there an example?
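This is not something simpleT5 does out of the box; the sketch below shows one way to do it directly with the Hugging Face `tokenizers` and `transformers` libraries. File paths, vocabulary size, and special-token choices are placeholders, and note that T5 is usually paired with a SentencePiece tokenizer, so pairing it with BPE is an assumption of this example.

```python
from tokenizers import Tokenizer, models, trainers, pre_tokenizers
from transformers import PreTrainedTokenizerFast, T5Config, T5ForConditionalGeneration

# 1) Train a byte-level BPE tokenizer on your raw text corpus
tok = Tokenizer(models.BPE(unk_token="<unk>"))
tok.pre_tokenizer = pre_tokenizers.ByteLevel()
trainer = trainers.BpeTrainer(
    vocab_size=32000,
    special_tokens=["<pad>", "</s>", "<unk>"],
)
tok.train(files=["corpus.txt"], trainer=trainer)
tok.save("my_bpe.json")

# 2) Wrap it so it can be used with transformers
hf_tok = PreTrainedTokenizerFast(
    tokenizer_file="my_bpe.json",
    pad_token="<pad>", eos_token="</s>", unk_token="<unk>",
)

# 3) Create a T5 model with randomly initialized weights ("from scratch")
config = T5Config(
    vocab_size=hf_tok.vocab_size,
    pad_token_id=hf_tok.pad_token_id,
    eos_token_id=hf_tok.eos_token_id,
    decoder_start_token_id=hf_tok.pad_token_id,
)
model = T5ForConditionalGeneration(config)
```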
I'd like to save the model. It has a load method, but if I save using model.model.save_pretrained and then use model.load, I get: OSError: Can't load tokenizer for 't5.model'. If...
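One possible workaround, sketched below: save the underlying Hugging Face model *and* its tokenizer into the same directory, then point simpleT5's load_model at that directory. The attribute names `model.model` and `model.tokenizer` are assumptions about simpleT5's internals.

```python
# assumes `model` is a trained SimpleT5 instance
save_dir = "my_t5_checkpoint"
model.model.save_pretrained(save_dir)      # T5 weights + config
model.tokenizer.save_pretrained(save_dir)  # tokenizer files (what the OSError says is missing)

# later, in a fresh process
from simplet5 import SimpleT5
model = SimpleT5()
model.load_model("t5", save_dir, use_gpu=False)
```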
I can't find the parameter for multi-GPU training.
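As the issue implies, simpleT5's train() may not expose such a parameter directly; the sketch below only shows how the underlying PyTorch Lightning Trainer is normally asked to use several GPUs (assuming a recent Lightning version), not a simpleT5 API.

```python
import pytorch_lightning as pl

trainer = pl.Trainer(
    accelerator="gpu",  # run on GPUs
    devices=2,          # number of GPUs to use
    strategy="ddp",     # distributed data parallel across the GPUs
    max_epochs=3,
)
# trainer.fit(lightning_module, datamodule)  # with your own LightningModule / DataModule
```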
Hi, I'm trying to create a new language model for Tamil; the downstream task is abstractive question answering. How do I use simpleT5 to build a new language model? I have a dataset...
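A sketch under assumptions: for a non-English language like Tamil it is often easier to fine-tune a multilingual checkpoint (e.g. mT5) than to train from scratch. The "mt5" model type and the `source_text`/`target_text` columns follow simpleT5's README; the prompt format and hyperparameters below are placeholders.

```python
import pandas as pd
from simplet5 import SimpleT5

# abstractive QA framed as text-to-text: question + context in, answer out
train_df = pd.DataFrame({
    "source_text": ["question: <tamil question>  context: <tamil passage>"],
    "target_text": ["<tamil answer>"],
})

model = SimpleT5()
model.from_pretrained(model_type="mt5", model_name="google/mt5-small")
model.train(
    train_df=train_df,
    eval_df=train_df,
    source_max_token_len=256,
    target_max_token_len=64,
    batch_size=8,
    max_epochs=3,
    use_gpu=True,
)
```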
Thanks for open-sourcing this nice repo, @Shivanandroy. Did you also develop functionality to predict multiple strings at once, like batch execution? It's computationally expensive to predict one by one.
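A hedged sketch (not a simpleT5 built-in): push several inputs through the underlying Hugging Face model in a single generate() call instead of calling predict one string at a time. It assumes a trained SimpleT5 instance is loaded and that `model.model` / `model.tokenizer` expose the Hugging Face objects.

```python
import torch

texts = ["summarize: first document ...", "summarize: second document ..."]
device = next(model.model.parameters()).device

# tokenize the whole batch with padding so it fits in one tensor
enc = model.tokenizer(texts, return_tensors="pt", padding=True, truncation=True).to(device)

with torch.no_grad():
    out = model.model.generate(**enc, max_length=64, num_beams=2)

predictions = model.tokenizer.batch_decode(out, skip_special_tokens=True)
print(predictions)
```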
Any chance you can add TPU support in the Colab? I think this is supported more or less out of the box now in the newest PyTorch Lightning versions.
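A minimal sketch, assuming a recent PyTorch Lightning version on a Colab TPU runtime; as the issue suggests, simpleT5 itself does not seem to expose these flags, so this only shows what the underlying Trainer configuration would look like.

```python
import pytorch_lightning as pl

trainer = pl.Trainer(
    accelerator="tpu",  # use the TPU available in the Colab runtime
    devices=8,          # all 8 TPU cores
    precision="bf16",   # bfloat16 is commonly used on TPUs
    max_epochs=3,
)
# trainer.fit(lightning_module, datamodule)
```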