pablogranolabar

Results 23 comments of pablogranolabar

Ok, I stopped training Large and attempted to restart the training process but I'm getting the following error: ``` $ python3 main.py --model_size=large --per_gpu_train_batch_size=24 --do_test --reload_from=640 number of gpus: 0...

Yeah, that's the hope. Digging into it today to do some memory profiling.

85.0b4 (64-bit) on Ubuntu FYI

Looks like they've pulled ESNI with 85.0b4 but then added it back again: https://bugzilla.mozilla.org/show_bug.cgi?id=1667801

So this is super duper bad on the part of the Firefox development team: they've yanked ESNI without even telling anyone, which could result in someone getting smoked depending on...

Hi @golsun, thanks for the quick response! The two-machine idea makes sense, and I think I can do that with relative ease if it comes to that. For the DialogRPT...

Hi again @golsun. I'm working on ensembling human_vs_rand with updown per your advice, but I'm unsure of the way to proceed with ensemble.yml. Should human_vs_rand and updown be a part...

Ok, I converted these weights from GPT-JT and it generated the model file; however, I'm getting the following error when loading: ``` gptj_model_load: f16 = 1 gptj_model_load: ggml ctx size =...

Yes and no. It's getting a lot of conflicting reviews because GPT-JT is fine-tuned for task-oriented stuff like chain-of-thought reasoning. So for canned general tasks like...

Probably best suited for a new issue, but @ggerganov what do you think about adding 8-bit inference? This would further cut model memory consumption by 50% and with nominal loss...
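
For anyone wondering where the roughly 50% figure would come from, here is a minimal sketch of symmetric per-block 8-bit quantization, assuming the baseline is 16-bit weights. The function names, the block size of 32, and the 4-byte per-block scale are all illustrative assumptions on my part, not ggml's actual format or API.

```python
# Hypothetical sketch of symmetric per-block int8 quantization.
# Nothing here is a ggml function; names and block layout are assumptions.

def quantize_block(weights):
    """Map floats to integers in [-127, 127] plus one shared float scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Avoid dividing by zero for an all-zero block.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_block(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [v * scale for v in q]


def bytes_per_weight(bits, block_size, scale_bytes=4):
    """Storage per weight, including the per-block scale overhead."""
    return bits / 8 + scale_bytes / block_size


weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_block(weights)
restored = dequantize_block(q, scale)

f16_cost = bytes_per_weight(16, 32, scale_bytes=0)  # plain 16-bit, no scale
int8_cost = bytes_per_weight(8, 32)                 # 8-bit plus one scale per 32 weights
```

Under these assumptions the 8-bit layout costs 1.125 bytes per weight against 2 bytes for 16-bit, so the saving is a bit under half once the scale is counted, and each restored weight is off by at most half a quantization step.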