python-crfsuite

Does it support sentence labeling?

Open moushumimahato opened this issue 6 years ago • 1 comments

Hi,

I am trying to use crfsuite for sentence labeling (e.g., deciding whether a sentence is a QUESTION, GREETING, COMMAND, etc.). Is this possible with this algorithm? If so, what features are required?

Thanks

moushumimahato, Jan 12 '18 09:01

@moushumimahato If you have a sequence of sentences, have a look at the paper "Automatic classification of sentences to support Evidence Based Medicine" by Su Nam Kim et al. (2011). For medical abstracts, they classify sections into around 6 classes. Along with the features for each sentence, they also take the sequence of sentences into account.

See the section "Conditional random fields" (page 4):

CRFs are undirected graphical models in which each vertex represents a random variable whose distribution is to be inferred, and each edge represents a dependency between two random variables. In our case the sentences in an abstract are represented by vertices, and the edges represent the relationship between sentences. CRFs have the advantage that they both model sequential effects and support the use of a large number of features; they have also been shown to perform comparatively well in other sentence-classification tasks [3, 4]

Also read the papers cited there as references [3, 4], from which they took ideas for their features.
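To make the "features per sentence plus sequence of sentences" idea concrete, here is a minimal sketch in plain Python. It treats each document as a sequence whose items are sentences, and builds one feature dict per sentence. The feature set (first word, final punctuation, bag of words, position, previous sentence's first word) and the helper names `sentence_features` and `document_to_sequence` are illustrative assumptions, not the features from the paper and not part of python-crfsuite.

```python
import re


def sentence_features(sentences, i):
    """Build a feature dict for sentence i of a document.

    The feature set is a hypothetical starting point, not taken from the paper.
    """
    sent = sentences[i]
    tokens = re.findall(r"\w+", sent.lower())
    stripped = sent.rstrip()

    feats = {
        "bias": 1.0,
        "first_word=" + (tokens[0] if tokens else "<empty>"): 1.0,
        "ends_with_question_mark": 1.0 if stripped.endswith("?") else 0.0,
        "ends_with_exclamation": 1.0 if stripped.endswith("!") else 0.0,
        "num_tokens": float(len(tokens)),
        # Relative position of the sentence within the document (0.0 to 1.0).
        "position": i / max(len(sentences) - 1, 1),
    }

    # Bag-of-words features for the sentence itself.
    for tok in set(tokens):
        feats["word=" + tok] = 1.0

    # Context features: mark document boundaries and peek at the previous sentence.
    if i == 0:
        feats["BOS"] = 1.0
    else:
        prev_tokens = re.findall(r"\w+", sentences[i - 1].lower())
        if prev_tokens:
            feats["prev_first_word=" + prev_tokens[0]] = 1.0
    if i == len(sentences) - 1:
        feats["EOS"] = 1.0

    return feats


def document_to_sequence(sentences):
    """Turn a list of sentences into a list of feature dicts (one CRF item each)."""
    return [sentence_features(sentences, i) for i in range(len(sentences))]


# Toy document with one label per sentence (labels taken from the question above).
doc = ["Hello there!", "Can you open the door?", "Please close it afterwards."]
labels = ["GREETING", "QUESTION", "COMMAND"]
xseq = document_to_sequence(doc)

# Assuming python-crfsuite is installed, a pair like (xseq, labels) could be
# passed to pycrfsuite.Trainer().append(xseq, labels) before calling train().
```

Each document then becomes one training sequence, so the CRF can learn transitions between sentence labels (for example, that a GREETING tends to come first) in addition to the per-sentence features. If your sentences are independent of each other rather than ordered within a document, the sequence part adds little.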

kaushikacharya, Apr 22 '18 17:04