GPT2-Chinese: Why do I get tcmalloc: large alloc 8930017280 bytes = 0x7f5b3f7c600 @ 0x7f5e071601e7 0x..

Open lztz0022 opened this issue 5 years ago • 5 comments

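I'm using the 3.7 GB new2016zh.zip from the recommended corpora. After unzipping it, I renamed the file to train.json. Training with the default batch_size fails with a CUDA out-of-memory error, so I switched to train.py --raw --epochs 5 --batch_size 1 and got this output:

using device: cuda
building files
reading lines
tcmalloc : large alloc 8930017280 bytes = 0x7f5b3f7c6000 @ 0x7f5e071601e7 0x5929fc ....... (a very long line)
tcmalloc : large alloc 8930017280 bytes = 0x7f5b3f7c6000 @ 0x7f5e071601e7 0x5450df 0x53e2c8 ....... (another very long line)
tcmalloc : large alloc 17860034560 bytes = 0x7ff501aca000 @ 0x7f5e071601e7 0x5450df 0x52e319 ...... (a third line)
^c

That's right, the fourth line is just a ^c, as if the process had been forcibly terminated. What could be causing this?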

Jan 13 '20 13:01 lztz0022

Did the original poster ever solve this? Were you running it on Colab? :o

Jul 23 '20 17:07 doulemint

Hi, I ran into the same problem. What was the cause, and did you manage to solve it?

Jan 31 '22 17:01 EagleEyeKestrel

Oh, it's probably just that the JSON file is too large and Colab ran out of memory.
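To make that diagnosis concrete, here is a minimal sketch that converts the allocation sizes from the tcmalloc messages in the original post into GiB and compares them with the runtime's RAM. The 12 GiB figure is an assumption about a typical Colab instance, not something stated in this thread, and the helper names are made up for illustration.

```python
# Allocation sizes copied from the tcmalloc messages in the original post.
ALLOC_BYTES = [8930017280, 8930017280, 17860034560]

# Assumption: about 12 GiB of RAM on the runtime. Check your own instance.
ASSUMED_RAM_GIB = 12.0


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


def report(allocs, ram_gib):
    """Return (bytes, size in GiB rounded to 2 places, exceeds_ram) per allocation."""
    rows = []
    for n_bytes in allocs:
        gib = to_gib(n_bytes)
        rows.append((n_bytes, round(gib, 2), gib > ram_gib))
    return rows


if __name__ == "__main__":
    for n_bytes, gib, too_big in report(ALLOC_BYTES, ASSUMED_RAM_GIB):
        verdict = "exceeds" if too_big else "fits within"
        print(f"{n_bytes} bytes = {gib} GiB, {verdict} {ASSUMED_RAM_GIB} GiB")
```

The third request is exactly twice the size of the first two, so a single allocation of that size would not fit under the assumed limit. That is consistent with the process being killed while reading the corpus. Training on a smaller slice of train.json, or on a machine with more RAM, would be the usual way around it.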

Feb 01 '22 09:02 EagleEyeKestrel

@EagleEyeKestrel Still hard at work over the Lunar New Year, that's dedication. Happy New Year!

Feb 01 '22 13:02 doulemint

@EagleEyeKestrel Still hard at work over the Lunar New Year, that's dedication. Happy New Year!

Hahaha, Happy New Year, brother. By the way, do you remember how good the results were after training with this repo?

Feb 03 '22 16:02 EagleEyeKestrel
