python-crfsuite

How to train crf in batch?

Open susht3 opened this issue 6 years ago • 1 comment

I have a big dataset. How can I train this CRF in batches?

susht3 Aug 15 '17 10:08

Currently the CRFsuite C++ library doesn't support mini-batch training, so you can't do that with python-crfsuite.

If the problem is memory usage with python-crfsuite, you can generate feature dicts iteratively (see https://github.com/scrapinghub/python-crfsuite/issues/37#issuecomment-224575213). This should reduce memory use, because most of the memory is usually taken by the Python-level feature dicts; the internal feature representation is more efficient. See also: https://github.com/TeamHG-Memex/sklearn-crfsuite/issues/15.
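A minimal sketch of the iterative approach, not taken from the linked comment: it builds the feature dicts for one sentence at a time and passes them straight to `Trainer.append`, so the full list of Python dicts never exists at once. The `(word, label)` input format, the feature names, and the helpers `word2features`, `iter_sequences`, `feed_trainer` and `train_model` are assumptions made up for illustration. This is still not mini-batch training: all sequences end up in the trainer before `train` is called.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical input format: each sentence is a list of (word, label) pairs.
Sentence = List[Tuple[str, str]]


def word2features(sent: Sentence, i: int) -> Dict[str, object]:
    """Build a small feature dict for token i (illustrative features only)."""
    word = sent[i][0]
    features: Dict[str, object] = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
    }
    if i == 0:
        features["BOS"] = True  # beginning of sentence
    if i == len(sent) - 1:
        features["EOS"] = True  # end of sentence
    return features


def iter_sequences(sentences: Iterable[Sentence]) -> Iterator[Tuple[list, list]]:
    """Lazily yield (xseq, yseq) for one sentence at a time."""
    for sent in sentences:
        xseq = [word2features(sent, i) for i in range(len(sent))]
        yseq = [label for _, label in sent]
        yield xseq, yseq


def feed_trainer(trainer, sentences: Iterable[Sentence]) -> int:
    """Append sequences one by one; each xseq can be freed after append()."""
    count = 0
    for xseq, yseq in iter_sequences(sentences):
        trainer.append(xseq, yseq)
        count += 1
    return count


def train_model(sentences: Iterable[Sentence], model_path: str) -> None:
    """Train on a lazily generated dataset (requires python-crfsuite)."""
    import pycrfsuite  # imported here so the helpers above work without it

    trainer = pycrfsuite.Trainer(verbose=False)
    feed_trainer(trainer, sentences)
    trainer.train(model_path)
```

Passing a generator that reads sentences from disk as `sentences` keeps the raw data out of memory too; only CRFsuite's internal representation grows as sequences are appended.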

kmike Aug 15 '17 11:08