bert_crf
CRF layer keeps resulting in CUDA: Device Side Assert Triggered error
Hi, I wrote the code for BERT token classification from scratch and was looking for a way to add a CRF layer on top of the model for an NER task. I ran into your repo and found it useful, thanks for that! However, after adding the CRF layer I keep getting a CUDA error due to limitations with ram. I'm currently using Colab with a Tesla T4 GPU. For reference, I'm using seq_len = 200, number of labels = 9, and batch size = 64. I tried batch size 1 out of curiosity to see what happens and still got the same error. The card isn't bad: without the CRF layer I was able to train the model even with batch size 128, so I'm really confused here.
My question is: which GPU did you train your model on? Did you ever run into this problem after adding the CRF layer? If not, do you have any suggestions for me? Thank you very much in advance!
I don't understand what you mean by "CUDA error due to limitations with ram". I trained the model using Google Colab. Please have a look at this example here.
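A note for readers who hit the same message: "device-side assert triggered" generally means a check failed inside a CUDA kernel, very often an out-of-range index, whereas running out of GPU memory is normally reported as "CUDA out of memory". That would also fit the error persisting at batch size 1. The thread does not confirm the cause in this case, so the following is only a sketch of one check worth trying, assuming the CRF layer indexes its transition scores by label id. The helper `find_invalid_labels` is hypothetical and not part of the bert_crf repository.

```python
# Hypothetical helper for illustration; not part of the bert_crf repository.
def find_invalid_labels(batch_labels, num_labels):
    """Return (row, column, value) for every label id outside [0, num_labels)."""
    bad = []
    for row, sequence in enumerate(batch_labels):
        for col, label in enumerate(sequence):
            if not 0 <= label < num_labels:
                bad.append((row, col, label))
    return bad


# Toy batch with 9 labels (ids 0..8), matching the setup in the question.
# -100 is the default ignore_index of PyTorch's CrossEntropyLoss and is often
# used for padding; 9 is one past the last valid id. Both would be invalid
# as indices into a 9x9 transition matrix.
batch = [
    [0, 3, 8, -100],
    [1, 9, 2, 0],
]
problems = find_invalid_labels(batch, num_labels=9)
print(problems)  # [(0, 3, -100), (1, 1, 9)]
```

If a check like this comes back clean, running the same batch on the CPU, or setting the environment variable `CUDA_LAUNCH_BLOCKING=1` before starting Python, usually produces a more precise traceback than the asynchronous CUDA error.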
Thanks, it works fine. I have a question, if you can help me. My work with BERT+CRF (not on English) gives poor results compared to fine-tuned BERT, and I do not understand why. Do you have any explanation?
Please check whether a pretrained BERT model is available for your language. If so, please use it.
Exactly, that is what I did: I used a pre-trained model for my language with a CRF, but the performance is still worse than fine-tuned BERT.