Vladislavs Dovgalecs comments

Results: 53 comments of Vladislavs Dovgalecs

crfsuite is only used for NLP ?

@jichaoliu basically you provide arbitrarily named sequence element labels + their features. Feature labels, again, can be arbitrarily named. The catch is that this information must be consistent and make...
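A minimal sketch of what such input can look like, assuming CRFsuite's plain-text training format (one item per line, label first, tab-separated attributes, a blank line between sequences). The helper name `to_crfsuite_lines` and the example labels and feature names are hypothetical, chosen only for illustration; nothing here is NLP-specific.

```python
def to_crfsuite_lines(sequence):
    """Turn one sequence into lines of CRFsuite-style training text.

    `sequence` is a list of (label, features) pairs, where `label` is any
    string and `features` is a list of arbitrarily named feature strings.
    """
    lines = []
    for label, features in sequence:
        # Label first, then the attributes, all separated by tabs.
        lines.append("\t".join([label] + list(features)))
    # An empty line marks the end of the sequence.
    lines.append("")
    return lines


# Hypothetical non-NLP example: labelling sensor readings over time.
example = [
    ("idle", ["temp=low", "vibration=none"]),
    ("running", ["temp=high", "vibration=strong"]),
]
print("\n".join(to_crfsuite_lines(example)))
```

The only requirement the sketch illustrates is consistency: the same label and feature names have to be used at training and tagging time.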

Multi core support for training on large number of instances

@tianjianjiang All due respect to the author of CRFSuite (who did a really great job), but it would take a while to get your improvement merged in. Perhaps the best bet for...

Multi core support for training on large number of instances

@CSabty If you need speed when learning from very large datasets, please take a look at Wapiti or use Vowpal Wabbit in learning-to-search mode. I use the latter...

Multi core support for training on large number of instances

@bratao Sure, here you go: ``` vw --data train.feat \ --learning_rate 0.5 \ --cache --kill_cache \ --threads \ --passes 10 \ --search_task sequence \ --search $NUM_LABELS \ --search_rollin=policy \ --search_rollout=none...

Multi core support for training on large number of instances

@CSabty In my experience, performance-wise, the CRF is still the best, although I did not do a thorough comparison.

Multi core support for training on large number of instances

In both CRFSuite and VW, the ":" character is special. In the former you can escape it like this "\\:" but in the latter you can't. Assuming you don't want to change...
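A minimal sketch of handling the colon in feature names for both tools. The CRFSuite half follows the backslash escaping described above; the VW half replaces the colon with a placeholder token, which is an assumption made here for illustration and not necessarily the workaround the comment goes on to describe. Both function names and the `_COLON_` placeholder are hypothetical.

```python
def escape_for_crfsuite(name):
    """Escape a feature name for CRFsuite's text format."""
    # Escape backslashes first so the backslash added for ':' is not doubled.
    return name.replace("\\", "\\\\").replace(":", "\\:")


def sanitize_for_vw(name, placeholder="_COLON_"):
    """Make a feature name safe for VW, which has no escape for ':'.

    Assumption: swapping ':' for a placeholder token is acceptable for the
    task at hand. The placeholder must not occur in real feature names.
    """
    return name.replace(":", placeholder)


print(escape_for_crfsuite("time=12:30"))
print(sanitize_for_vw("time=12:30"))
```

Whatever substitution is chosen for VW has to be applied identically at training and prediction time, otherwise the features will not match.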

Multi core support for training on large number of instances

@jbkoh If you are looking for multi-CPU training of CRFs, take a look at https://github.com/zhongkaifu/CRFSharp

Feature for training CRF

Beyond gazetteer features, adding Brown or Clark cluster features also improves performance. I experimented a lot with Brown cluster features and got consistent improvements across the various models I built. The...
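A minimal sketch of one common way to turn Brown clusters into CRF features: take prefixes of each word's cluster bit string at a few lengths, so the model sees both coarse and fine clusters. The `clusters` mapping, the function name, the feature naming, and the prefix lengths are all assumptions for illustration, not the setup used in the experiments mentioned above.

```python
def brown_features(word, clusters, prefix_lengths=(4, 6, 10, 20)):
    """Return Brown-cluster prefix features for a single word.

    `clusters` maps a word to its Brown cluster path, a string of '0'/'1'
    characters (hypothetical data, e.g. loaded from a clustering tool's output).
    """
    path = clusters.get(word)
    if path is None:
        # Word was not seen when the clusters were built.
        return ["brown=UNK"]
    # Shorter prefixes correspond to coarser clusters higher up the tree.
    return ["brown%d=%s" % (n, path[:n]) for n in prefix_lengths]


# Hypothetical cluster paths.
clusters = {"london": "10110100", "paris": "10110111"}
print(brown_features("london", clusters, prefix_lengths=(2, 4, 6)))
print(brown_features("berlin", clusters))
```

These strings can be appended to each token's existing feature list alongside the baseline and gazetteer features.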

Feature for training CRF

I would say that baseline features work as advertised - you know what information they carry. This is because those are hand-crafted features. The word embedding features encode information about...

Question about CRF model

If you can extract a "sequence" from your image, and expect sequence labels + one global label, I suggest you look at TriCRF instead: https://github.com/minwoo/TriCRF The implemented model by...