Suharu

12 comments by Suharu

I solved this problem by referencing other people's blog posts (sorry, I forgot the link). I tried many conda envs with different combinations of tensorflow-gpu and Python versions, but they all...

The **tf_examples.tfrecord** file needs to be generated with the create_pretraining_data.py script. To use it, you need to prepare the text files in the expected format.
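For reference, the input format expected by create_pretraining_data.py (in the google-research/bert repo) is plain text with one sentence per line and a blank line between documents. A minimal sketch that writes such a file (the file name and sentences are illustrative):

```python
import os
import tempfile

# Two toy "documents": each is a list of sentences.
documents = [
    ["This is the first sentence of document one.",
     "This is the second sentence of document one."],
    ["Document two has a single sentence."],
]

path = os.path.join(tempfile.mkdtemp(), "sample_corpus.txt")
with open(path, "w", encoding="utf-8") as f:
    for doc in documents:
        for sentence in doc:
            f.write(sentence + "\n")   # one sentence per line
        f.write("\n")                  # blank line marks a document boundary

with open(path, encoding="utf-8") as f:
    text = f.read()
```

A file in this shape can then be passed to the script via `--input_file`, together with `--output_file` and `--vocab_file` for the resulting tf_examples.tfrecord.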

> I tried changing the MirroredStrategy to OneDeviceStrategy and the exception went away. So I'm not sure whether it is an issue caused by the combination of Python and TF versions....

> In my case, I had to specify `--distribution_strategy=one_device` here in my tests https://github.com/open-ce/tensorflow-feedstock/blob/main/tests/open-ce-tests.yaml#L22

@npanpaliya I'm using the tensorflow model-garden, and tried your way of adding the strategy, but that parameter is...
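The workaround above amounts to forcing a single-device strategy instead of MirroredStrategy when launching the training script. A sketch of the invocation (the flag spelling follows the linked open-ce test file; the script name and other flags are placeholders for your setup, not a confirmed model-garden command line):

```shell
# Illustrative: run training with the one-device distribution strategy
# instead of the default MirroredStrategy.
python3 train.py \
  --distribution_strategy=one_device \
  --num_gpus=1
```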

> This happens in TF 2.7 too with Python 3.9
>
> I think it's because MirroredStrategy [creates a multiprocessing ThreadPool](https://github.com/tensorflow/tensorflow/blob/9eb5fdf99053625f6e870e895a7cce6d1d3ed752/tensorflow/python/distribute/cross_device_ops.py#L1104), but doesn't close it before the program ends,...
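The leak described in the quote is the general pattern of a multiprocessing ThreadPool that is never shut down. In user code, the fix is to close and join the pool (or use it as a context manager) so no worker threads linger into interpreter shutdown. A minimal stdlib illustration of the safe pattern:

```python
from multiprocessing.pool import ThreadPool

# The context manager guarantees the pool is terminated and cleaned up
# on exit, so its worker threads don't outlive the program.
with ThreadPool(processes=4) as pool:
    squares = pool.map(lambda x: x * x, range(5))

print(squares)  # [0, 1, 4, 9, 16]
```

(ThreadPool, unlike a process Pool, can map over a lambda because nothing needs to be pickled across processes.)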

@npanpaliya I'm training BERT using the run_pretraining.py here, and got the "Bad descriptor" error. Then I referenced your post and changed the python3.8/multiprocessing/pool.py file where the error shows up. (see...

Hi @npanpaliya, it works! I tried this way before and it didn't work, but after you pointed it out I checked again and found a backslash was missing before I...

@shibing624 Hello teacher, I'd like to train a MacBert4csc model for a foreign language, but I have little NLP experience, so I'd like to ask you a few questions.

1. Confirming the training steps:
1-a. Collect the dataset and process its format, apply LTP word segmentation, and in the MLM task replace [mask] with synonyms and n-grams.
1-b. Use Google's official pretraining_data.py to generate the pretraining data and run training, producing the MacBERT model.
1-c. With the MacBERT generated above, collect a foreign-language error-correction dataset and train following the instructions in the repo's [Read Me](https://github.com/shibing624/pycorrector/tree/master/pycorrector/macbert#%E8%AE%AD%E7%BB%83).
If my understanding of step 1 is wrong, I would appreciate your correction.

2. Is the script for 1-a publicly available? I'd like to know where I can find it.

Sorry for taking up your time, and thank you very much.
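Step 1-a describes MacBERT-style MLM: select whole n-gram spans and replace them with similar words rather than a [MASK] token. As a toy sketch of just the span-selection part (the tokenizer and synonym lookup are out of scope; the function name and parameters are illustrative, not the repository's actual script):

```python
import random

def pick_ngram_spans(num_tokens, mask_ratio=0.15, max_n=4, seed=0):
    """Toy n-gram masking: pick non-overlapping spans of 1..max_n tokens
    until roughly mask_ratio of the sequence is covered. In MacBERT the
    selected spans are then replaced with similar words; that replacement
    step is omitted here."""
    rng = random.Random(seed)
    budget = max(1, int(num_tokens * mask_ratio))
    covered = set()
    spans = []
    attempts = 0
    while len(covered) < budget and attempts < 1000:
        attempts += 1
        n = rng.randint(1, max_n)
        start = rng.randint(0, max(0, num_tokens - n))
        span = set(range(start, start + n))
        if span & covered:
            continue  # keep spans non-overlapping
        covered |= span
        spans.append((start, start + n))
    return sorted(spans)

spans = pick_ngram_spans(num_tokens=100)
```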

Thank you for your reply, teacher! One more question: I want to train a Japanese version. I tried hfl/chinese-macbert-base, and it doesn't seem to have any predictive ability for Japanese. In this case, can I still use hfl/chinese-macbert-base, or do steps 1-a and 1-b need to be redone with Japanese data?

Thank you, teacher! So can I directly use an existing Japanese model, then collect a foreign-language error-correction dataset and train following the homepage instructions? In that case it seems the "replace [mask] with synonyms and n-grams" step is skipped. Is that step not required?