Morgan McGuire
@radekosmulski After a quick look, it seems like it's the parallel tokenization that really slows down for some reason. In your Colab example, for me the progress bar flies for...
Ah understood now...hmmm...I wonder what is going on
Just to put some numbers on this, the decrease in speed is really disproportionate to the increase in dataset size...

## Numbers

### Data sizes, train/val split (lines in .txt...
Just taking another peek at this, the slowness seems to come from the amount of data in the datasets, e.g. `dls.train_ds`. Here is the size of the items in `train_ds`...
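For anyone wanting to reproduce this kind of check, a minimal sketch of measuring per-item sizes with the standard library (the `items` list here is a stand-in for `dls.train_ds`, not the actual fastai object; note `sys.getsizeof` is shallow and undercounts nested objects):

```python
import sys

# Stand-in for dataset items: one large text chunk, one small one
items = ["some tokenized text " * 1000, "short"]

# Shallow per-item size in bytes; nested containers would need a
# recursive walk for an accurate total
sizes = [sys.getsizeof(x) for x in items]
total_mb = sum(sizes) / 1e6
```

Comparing `sizes` across items quickly shows whether a few oversized rows dominate the dataset's memory footprint.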
Tried a few things based on what you said, but first here is a minimal repro:

## Repro

```
import fastai
from fastai.text.all import *
from fastcore.basics import *
path...
```
I spoke too soon: moving `Numericalize` to `item_tfms` **does** speed things up when using a dataframe with large chunks of text in each row. But another issue with @radekosmulski's...
# Potential Solutions

1. Use smaller text files :D **or**
2. After a quick chat with Jeremy on the Discord, a temporary workaround would be to do the numericalization...
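Assuming "do the numericalization" ahead of time means mapping tokens to vocab IDs before the DataLoaders are built, here is a minimal plain-Python sketch of that step (whitespace splitting stands in for fastai's real tokenizer, and the function names are illustrative):

```python
from collections import Counter

def build_vocab(texts, max_vocab=100, min_freq=1):
    # Count token frequencies across the corpus
    counts = Counter(tok for t in texts for tok in t.split())
    # Reserve index 0 for the unknown token, like fastai's "xxunk"
    itos = ["xxunk"] + [tok for tok, c in counts.most_common(max_vocab)
                        if c >= min_freq]
    return {tok: i for i, tok in enumerate(itos)}

def numericalize(text, stoi):
    # Replace each token with its vocab index, falling back to 0 (unknown)
    return [stoi.get(tok, 0) for tok in text.split()]

texts = ["the cat sat", "the dog sat down"]
stoi = build_vocab(texts)
ids = [numericalize(t, stoi) for t in texts]
```

Precomputing `ids` once and feeding integer lists to the pipeline avoids re-tokenizing the large text chunks during DataLoaders setup.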
@ayulockin can probably help here :)
Flagged with Ayush + Soumik
Hey @firezym, @GraesonB, our preferred LangChain integration, W&B Prompts, can be found here: https://docs.wandb.ai/guides/prompts/quickstart The above is an earlier callback that we'll likely be deprecating in the coming...