deep_reference_parser
Future of CRF layer
The CRF causes some problems, namely:
- It depends on keras_contribs.CRF, which:
  - Is not compatible (afaik) with tensorflow 2.0 (which is where the latest version of keras lives).
  - Prevents deep_reference_parser being published on pypi (it doesn't allow static dependencies).
- Requires logic to rebuild the model prior to predictions in the same way that the model is built for training, rather than just loading the complete model from a file.
Replacing the CRF with some other output would ameliorate these problems. Note that it is already possible to remove it right now by specifying output="softmax" rather than "crf" when building the model with deep_reference_parser.build_model(). A softmax output will almost certainly perform worse than a CRF though.
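To illustrate why a softmax output tends to do worse than a CRF, here is a minimal pure-Python sketch. It is not code from deep_reference_parser: the tag names, emission scores and transition scores are all made up for illustration, and a real CRF learns its transition scores during training rather than having them written by hand. A softmax picks the best tag for each token independently, whereas a CRF also scores transitions between adjacent tags and decodes the best whole sequence (here with Viterbi).

```python
# Hypothetical tags and scores, purely for illustration.
TAGS = ["O", "B-REF", "I-REF"]

# Per-token scores (emissions): one row per token, one column per tag.
emissions = [
    [0.1, 2.0, 0.0],  # token 0: clearly B-REF
    [1.0, 0.0, 0.9],  # token 1: O narrowly beats I-REF
    [0.0, 0.0, 2.0],  # token 2: clearly I-REF
]

# transitions[i][j]: score for moving from tag i to tag j.
# O -> I-REF is heavily penalised (an I- tag should follow a B- or I- tag).
transitions = [
    [0.5, 0.0, -5.0],  # from O
    [0.0, -1.0, 1.0],  # from B-REF
    [0.0, -1.0, 1.0],  # from I-REF
]


def softmax_decode(emissions):
    """Pick the highest-scoring tag for each token independently."""
    return [max(range(len(row)), key=row.__getitem__) for row in emissions]


def viterbi_decode(emissions, transitions):
    """Return the tag sequence maximising emission + transition scores."""
    n_tags = len(emissions[0])
    scores = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for row in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(n_tags):
            # Best previous tag to arrive at tag j from.
            best_i = max(range(n_tags), key=lambda i: scores[i] + transitions[i][j])
            pointers.append(best_i)
            new_scores.append(scores[best_i] + transitions[best_i][j] + row[j])
        scores = new_scores
        backpointers.append(pointers)
    # Walk back from the best final tag to recover the full path.
    best = max(range(n_tags), key=scores.__getitem__)
    path = [best]
    for pointers in reversed(backpointers):
        best = pointers[best]
        path.append(best)
    return path[::-1]


softmax_tags = [TAGS[i] for i in softmax_decode(emissions)]
crf_tags = [TAGS[i] for i in viterbi_decode(emissions, transitions)]

print(softmax_tags)  # ['B-REF', 'O', 'I-REF']
print(crf_tags)      # ['B-REF', 'I-REF', 'I-REF']
```

The per-token argmax yields an O followed by an I-REF, which is not a valid tag sequence, while the sequence-level decode avoids it because the transition penalty outweighs the small emission advantage of O on the middle token.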
Looks like CRF is available in tf 2.0 https://www.tensorflow.org/addons/api_docs/python/tfa/text/crf
Here's an example, but note that the CRF module is now in tfa.text.crf, not contrib: https://github.com/OpenNMT/OpenNMT-tf/blob/master/opennmt/models/sequence_tagger.py
Ahh it is implemented for tf but not for tf.keras, though looks like it could be coming: https://github.com/tensorflow/addons/pull/377#pullrequestreview-335963486
This has been merged: https://github.com/tensorflow/addons/pull/1999
Hello, when using a CRF layer with a BiLSTM for an NER task, I get the following error:
crf_loss * crf, idx = y_pred._keras_history[:2]
AttributeError: 'Tensor' object has no attribute '_keras_history'
I get that it's a problem in the loss function, but I don't know how to get past it. Could you please help if you have found a solution?
Hi @chaalic is this code you are running outside of the deep reference parser?
Yes, it's for another named entity recognition task, but the model I'm using is the same: a BiLSTM with a CRF.
Ah OK. If you post some more of your code here we may be able to spot something.