Mikhail Korobov

Results 479 comments of Mikhail Korobov

Thanks for taking a look at it! I think it'd be nice to support numpy arrays and/or stdlib [arrays](https://docs.python.org/3/library/array.html) instead of creating a custom wrapper, because this is a format you...

Hey @samgalen, It is not documented, but there is a way to access this training log: it is parsed by https://github.com/tpeng/python-crfsuite/blob/master/pycrfsuite/_logparser.py, and the logparser object is available as `trainer.logparser`. You...

No idea! I can't reproduce it with the changes you described. What is your OS and Python version? What python-crfsuite version are you using? How is python-crfsuite installed (pip, conda)? Google...

Thanks for the catch! The tutorial is outdated and incomplete though; the recommended way is to use crfsuite, not wapiti, and the tutorial should have shown how to use Pattern features,...

Hey @Granitosaurus! I've added a complete example here: https://github.com/scrapinghub/webstruct/tree/master/example; it'd be nice to move some parts of it to the tutorial.

Currently the CRFSuite C++ library doesn't support mini-batch training, so you can't do that with python-crfsuite. If you have issues with memory usage with python-crfsuite, you can generate feature dicts iteratively...
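
A rough sketch of what "generate feature dicts iteratively" could look like: a generator builds the features for one sequence at a time instead of materializing the whole dataset as a list. The input format (lists of `(token, label)` pairs) and the `token_features` helper are hypothetical, made up for illustration only.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical input format: a sentence is a list of (token, label) pairs.
Sentence = List[Tuple[str, str]]
FeatureSeq = List[Dict[str, float]]


def token_features(tokens: List[str], i: int) -> Dict[str, float]:
    """Hypothetical feature function: one feature dict per token."""
    word = tokens[i]
    feats = {"bias": 1.0, "lower=" + word.lower(): 1.0}
    if word.istitle():
        feats["istitle"] = 1.0
    if i == 0:
        feats["BOS"] = 1.0  # beginning of sequence
    return feats


def iter_sequences(
    sentences: Iterable[Sentence],
) -> Iterator[Tuple[FeatureSeq, List[str]]]:
    """Lazily yield (xseq, yseq); only one sequence's dicts exist at a time."""
    for sent in sentences:
        tokens = [tok for tok, _ in sent]
        xseq = [token_features(tokens, i) for i in range(len(tokens))]
        yseq = [label for _, label in sent]
        yield xseq, yseq
```

With python-crfsuite, each yielded pair could then be passed to `trainer.append(xseq, yseq)` in a loop, so the Python-side feature dicts for earlier sequences can be garbage-collected. This only reduces the Python-side overhead; whatever the trainer stores internally is still kept until training.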

My main concern is the Token class and the TextTokenizer thing. Creating Token instances looks like total overkill - why would anyone need to wrap a text token in a Token instance and...

The `.info()` method is a hack: it parses the logging output of the crfsuite C++ library, and I suspect some label/observation values could break the parsing. A random guess - do you use...

You may try changing `.+` to `.*` [here](https://github.com/tpeng/python-crfsuite/blob/b7b72995884ef10abd7e456039cce85373024261/pycrfsuite/_dumpparser.py#L76) and [here](https://github.com/tpeng/python-crfsuite/blob/b7b72995884ef10abd7e456039cce85373024261/pycrfsuite/_dumpparser.py#L83) - it could fix an issue with empty labels. Pull requests (with tests) are welcome :)
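
To illustrate why `.+` vs `.*` matters for empty labels, here is a small standalone sketch. The line format and both patterns are hypothetical and are not copied from `_dumpparser.py`; they only show the general effect of the change.

```python
import re

# Hypothetical "source --> target: weight" line format, for illustration only.
STRICT = re.compile(r"^(.+) --> (.+): ([-\d.]+)$")   # labels need >= 1 char
LENIENT = re.compile(r"^(.*) --> (.*): ([-\d.]+)$")  # labels may be empty

line_ok = "B-LOC --> I-LOC: 1.5"
line_empty = " --> I-LOC: 0.25"  # the source label is an empty string


def parse(pattern, line):
    """Return (source, target, weight) or None if the line doesn't match."""
    m = pattern.match(line)
    if m is None:
        return None
    src, dst, weight = m.groups()
    return src, dst, float(weight)


# Both patterns agree on ordinary labels.
assert parse(STRICT, line_ok) == parse(LENIENT, line_ok)

# Only the `.*` variant accepts the empty label.
strict_result = parse(STRICT, line_empty)
lenient_result = parse(LENIENT, line_empty)
```

Here `strict_result` is `None`, while `lenient_result` has an empty string as the source label. A test for a real fix would need to feed an actual model dump with an empty label through the parser.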