Mikhail Korobov

479 comments by Mikhail Korobov

crfsuite doesn't allow arbitrary CRFs; it implements only a linear-chain CRF model with first-order connections, i.e. there is a connection between the current (i-th) label and the previous ((i-1)-th) label,...
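To make the "first-order" restriction concrete, here is a minimal sketch of how a label sequence could be scored when only adjacent labels are connected. It is an illustration of the idea, not crfsuite's actual code: the function name `sequence_score` and the dictionary-based score tables are assumptions made up for this example.

```python
from typing import Dict, List, Tuple


def sequence_score(
    labels: List[str],
    state_scores: List[Dict[str, float]],
    transition_scores: Dict[Tuple[str, str], float],
) -> float:
    """Unnormalized score of a label sequence in a first-order linear chain.

    state_scores[i][label] is a hypothetical per-position score for a label;
    transition_scores[(prev, cur)] is a hypothetical score for a label pair.
    """
    total = 0.0
    for i, label in enumerate(labels):
        # Contribution of the label at position i on its own.
        total += state_scores[i].get(label, 0.0)
        if i > 0:
            # First-order connection: only the (i-1)-th and i-th labels interact.
            total += transition_scores.get((labels[i - 1], label), 0.0)
    return total
```

Nothing in this score depends on label pairs further apart than one position, which is the limitation the comment describes.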

@karterotte I'm not sure I understood your question correctly, but anyways :) Extracting features from A, B, C is up to you; I don't know Chinese and so don't know...

@chekunkov do you by chance recall why this PR wasn't merged?

@ntugce are you mixing anaconda-installed packages and pip-installed packages?

@murphyd2 something seems to be wrong with paths, as usaddress is installed for Python 3.6, while pycrfsuite is installed for Python 3.7. I don't have much Windows experience, but I...

@murphyd2 could you try creating a virtualenv, and installing python-crfsuite there, to make sure you're starting from a clean state?

Hi, a good tutorial is definitely missing. I have a complete example (an IPython notebook) in the works, but haven't finished it yet. But I'd love to hear more feedback about...

There are a couple of "shortcuts" available:
- there is example training data in the https://github.com/scrapinghub/webstruct/tree/master/webstruct_data folder; try e.g. https://github.com/scrapinghub/webstruct/tree/master/webstruct_data/corpus/business_pages/wa instead of annotating your own pages;
- there is webstruct.features.EXAMPLE_TOKEN_FEATURES constant...

Hey @usptact, Trainer parses the log on the fly using a [_TrainLogParser](https://github.com/tpeng/python-crfsuite/blob/master/pycrfsuite/_logparser.py) object. It should be possible to access loss values using `trainer.logparser.iterations[i]['loss']`.
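A small sketch of collecting those loss values after training, based on the attribute path given in the comment. The helper name `loss_history` is made up for illustration, and the assumption that every iteration entry is a dict which may or may not carry a `'loss'` key is mine, not something the comment states.

```python
from typing import Any, List


def loss_history(trainer: Any) -> List[float]:
    """Collect per-iteration loss values from a trained trainer object.

    Assumes trainer.logparser.iterations is a list of dicts, as the
    expression trainer.logparser.iterations[i]['loss'] suggests.
    """
    return [
        iteration["loss"]
        for iteration in trainer.logparser.iterations
        # Skip entries without a loss value rather than raising KeyError.
        if "loss" in iteration
    ]
```

It would be called after `trainer.train(...)` has finished, e.g. `losses = loss_history(trainer)`, and the resulting list could then be plotted or checked for convergence.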

Hi @wboag, Yes, it is an important issue, but I'm not sure there is a single best solution. Tagger requires an on-disk file to work, so to pickle Tagger so...
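The comment is cut off before any solution is described, so the following is only one possible way to deal with an object that depends on an on-disk file, not necessarily what was proposed. The sketch embeds the file's bytes in the pickle and writes them back to a temporary file on load. `FileBackedModel` is a hypothetical wrapper written for this example and is not part of python-crfsuite.

```python
import os
import tempfile


class FileBackedModel:
    """Hypothetical wrapper around a model that must live in a file on disk."""

    def __init__(self, path: str) -> None:
        self.path = path

    def __getstate__(self) -> dict:
        # Store the file contents in the pickle instead of just the path,
        # so the pickle stays usable on a machine without the original file.
        with open(self.path, "rb") as f:
            return {"data": f.read()}

    def __setstate__(self, state: dict) -> None:
        # Recreate the on-disk file in a temporary location on unpickling.
        fd, path = tempfile.mkstemp(suffix=".model")
        with os.fdopen(fd, "wb") as f:
            f.write(state["data"])
        self.path = path
        # A real wrapper would reopen its tagger from self.path at this point.
```

The trade-offs are that the pickle grows by the size of the model file and that each unpickling leaves a temporary file behind, which the caller has to clean up.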