
Use MBMA to split compounds

Open kosloot opened this issue 6 years ago • 0 comments

In --deep_morph mode, MBMA can detect all kinds of compounds, and even outputs them. It would be very useful if we could add some code that gives the logical split of the detected compounds.

E.g. for 'appeltaart' Frog now gives: [[appel]noun[taart]noun]noun/singular NN-compound. It seems doable to also give 'appel-taart'.

In practice this can become very complicated: 'appelgebak' gives: [[appel]noun[[ge][bak]noun]noun/singular]noun NN-compound

You would like to get 'appel-gebak', NOT 'appel-ge-bak' or 'appelge-bak'. For longer compounds it gets even more difficult:

'verkeersagent' gives: [[verkeer]noun[s][[ageer]verb[ent]]noun]noun/singular NN-compound

But still it seems worth investigating.
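To make the idea concrete, here is a rough sketch in Python (not Frog's own code) of how a split might be derived from the bracket notation shown above. The function names `parse`, `flatten` and `split_compound` are hypothetical. It rests on several assumptions that would need checking against real MBMA output: only the direct children of the outermost bracket are compound parts, an untagged top-level element such as [s] is a linking element that stays with the part before it, and the parts are re-aligned to the surface word by prefix matching, because morphemes such as [ageer][ent] do not spell the surface form 'agent'.

```python
def parse(analysis):
    """Parse '[[appel]noun[taart]noun]noun' into nested (content, tag) tuples.

    content is a string for a leaf morpheme, or a list of child nodes.
    """
    pos = 0

    def node():
        nonlocal pos
        if pos >= len(analysis) or analysis[pos] != "[":
            raise ValueError(f"expected '[' at position {pos}")
        pos += 1
        children, text = [], ""
        while pos < len(analysis) and analysis[pos] != "]":
            if analysis[pos] == "[":
                children.append(node())
            else:
                text += analysis[pos]
                pos += 1
        if pos >= len(analysis):
            raise ValueError("unbalanced brackets")
        pos += 1  # skip the closing ']'
        start = pos
        # Everything up to the next bracket is this node's tag (may be empty).
        while pos < len(analysis) and analysis[pos] not in "[]":
            pos += 1
        return (children or text, analysis[start:pos].strip())

    return node()


def flatten(node):
    """Concatenate all leaf morphemes below a node."""
    content, _tag = node
    if isinstance(content, str):
        return content
    return "".join(flatten(child) for child in content)


def split_compound(word, analysis):
    """Return e.g. 'appel-gebak', or None if no safe split is found."""
    content, _tag = parse(analysis)
    if isinstance(content, str):
        return None  # a single morpheme, nothing to split
    parts = []
    for child in content:
        text, tag = flatten(child), child[1]
        if not tag and parts:
            parts[-1] += text  # ASSUMPTION: untagged element is a linking element
        else:
            parts.append(text)
    if len(parts) < 2:
        return None
    # Re-align to the surface word: every part except the last must match
    # literally; the last part takes whatever remains of the word.
    pieces, rest = [], word
    for part in parts[:-1]:
        if not rest.startswith(part):
            return None
        pieces.append(part)
        rest = rest[len(part):]
    if not rest:
        return None
    pieces.append(rest)
    return "-".join(pieces)


print(split_compound("appeltaart", "[[appel]noun[taart]noun]noun/singular"))
print(split_compound("appelgebak", "[[appel]noun[[ge][bak]noun]noun/singular]noun"))
print(split_compound("verkeersagent",
                     "[[verkeer]noun[s][[ageer]verb[ent]]noun]noun/singular"))
```

On the three examples above this prints appel-taart, appel-gebak and verkeers-agent. Keeping the linking 's' on the left-hand part is a choice made for this sketch, and the function returns None whenever the morphemes cannot be aligned with the surface word, which is likely to happen for many longer compounds.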

kosloot, Aug 08 '19 15:08