
CRF layer keeps resulting in CUDA: Device Side Assert Triggered error

Open harunkuf opened this issue 4 years ago • 4 comments

Hi, I wrote the code for BERT token classification from scratch and was looking around for how to add a CRF layer on top of the model for an NER task. I ran into your repo and found it useful, thanks for that! However, after I add the CRF layer I keep getting a CUDA error due to limitations with RAM. I'm currently using Colab with a Tesla T4 GPU. For reference, I'm using seq_len = 200, number of labels = 9, and batch size = 64. I tried batch size 1 out of curiosity to see what happens, and I still got the same error. I mean, the card isn't bad; without the CRF layer I was able to train the model even with batch size 128, so I'm really confused here.

My question is: Which GPU did you train your model on? Did you ever run into this problem after adding the CRF layer? If not, do you have any suggestions for me? Thank you very much in advance!
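(A note for anyone hitting the same error: a device-side assert with a CRF head is usually an out-of-range label index, for example the -100 padding label commonly used with nn.CrossEntropyLoss, which is illegal for a CRF transition table. That would also explain why batch size 1 does not help. A minimal debugging sketch, assuming standard PyTorch; the helper name is illustrative:)

```python
import os

# Re-run with synchronous kernel launches so the stack trace points at the
# failing op instead of some later, unrelated line. Must be set before the
# CUDA context is initialized, i.e. before importing torch.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch

def check_labels(labels: torch.Tensor, num_labels: int) -> None:
    """Fail fast on the CPU with a readable error instead of a CUDA assert."""
    bad = (labels < 0) | (labels >= num_labels)
    if bad.any():
        raise ValueError(
            f"{int(bad.sum())} label ids outside [0, {num_labels}), "
            f"e.g. {labels[bad][:5].tolist()} -- remap -100/padding ids "
            "to a valid tag and rely on the CRF mask instead."
        )
```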

harunkuf avatar Dec 27 '20 09:12 harunkuf

I don't understand this point: "CUDA error due to limitations with RAM". I trained the model using Google Colab. Please have a look at this example here.
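(For readers who cannot open the linked example, here is a minimal sketch of the general pattern, not the repo's exact code: BERT token-classification logits fed into a CRF layer. It assumes the `transformers` and `pytorch-crf` packages; `num_labels = 9` matches the numbers in the question above, and the model name is illustrative:)

```python
import torch.nn as nn
from torchcrf import CRF  # provided by the pytorch-crf package
from transformers import BertModel

class BertCrfForNer(nn.Module):
    def __init__(self, model_name: str = "bert-base-cased", num_labels: int = 9):
        super().__init__()
        self.bert = BertModel.from_pretrained(model_name)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_labels)
        self.crf = CRF(num_labels, batch_first=True)

    def forward(self, input_ids, attention_mask, labels=None):
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.classifier(hidden)
        if labels is not None:
            # Every label id must lie in [0, num_labels); mask padded
            # positions via attention_mask rather than using -100 labels.
            loss = -self.crf(emissions, labels, mask=attention_mask.bool())
            return loss
        return self.crf.decode(emissions, mask=attention_mask.bool())
```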

Dhanachandra avatar Dec 29 '20 05:12 Dhanachandra

Thanks, it works fine. I have a question, if you can help me: my work with BERT+CRF (not on English) gives a poor result compared to fine-tuned BERT, and I do not understand why. Do you have any explanation?

Astudnew avatar Feb 09 '21 23:02 Astudnew

Please check if a BERT pretrained model is available for your language. If so, please use it.
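(A hedged sketch of that suggestion; the checkpoint name below is illustrative, with the multilingual model as a fallback only when no monolingual BERT exists for the language:)

```python
from transformers import AutoModel, AutoTokenizer

# Prefer a monolingual checkpoint for your language if one exists on the Hub;
# otherwise fall back to the multilingual model.
model_name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
```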


Dhanachandra avatar Feb 10 '21 05:02 Dhanachandra

Exactly, that is what I did: I used a pretrained model in my language with CRF, but the performance is lower than fine-tuned BERT.

Astudnew avatar Feb 10 '21 14:02 Astudnew