frog
Use MBMA to split compounds
In --deep_morph mode, MBMA can detect all kinds of compounds, and even outputs them.
It would be very useful if we could add some code to give the logical splitting of the detected compounds.
E.g., for 'appeltaart' Frog now gives:
[[appel]noun[taart]noun]noun/singular NN-compound
It seems doable to also give 'appel-taart'.
In practice this can become very complicated:
'appelgebak' gives:
[[appel]noun[[ge][bak]noun]noun/singular]noun NN-compound
You would like to get 'appel-gebak', NOT 'appel-ge-bak' or 'appelge-bak'.
For longer compounds it gets even more difficult.
'verkeersagent' gives:
[[verkeer]noun[s][[ageer]verb[ent]]noun]noun/singular NN-compound
But still it seems worth investigating.
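As a starting point, here is a minimal Python sketch of the naive approach: split only at the top level of the bracketed analysis and glue everything nested below it back together. It works on the analysis strings exactly as quoted in this issue; the function name `split_compound` is hypothetical, and the assumption that text directly after `[` is a morpheme while text after `]` is a tag is mine, not something taken from the MBMA code.

```python
def split_compound(analysis: str) -> str:
    """Join the direct children of the outermost bracket with '-'.

    Assumes the notation quoted in this issue: text right after '[' is
    morpheme material, text right after ']' is a tag (noun, verb, ...).
    """
    parts: list[str] = []
    depth = 0
    in_morpheme = False  # True after '[', False after ']'
    for ch in analysis:
        if ch == "[":
            depth += 1
            in_morpheme = True
            if depth == 2:
                # A new direct child of the outermost bracket starts here.
                parts.append("")
        elif ch == "]":
            depth -= 1
            in_morpheme = False
        elif in_morpheme and depth >= 2:
            # Morpheme text at any nesting level belongs to the current child.
            parts[-1] += ch
    return "-".join(parts)


if __name__ == "__main__":
    examples = [
        "[[appel]noun[taart]noun]noun/singular NN-compound",
        "[[appel]noun[[ge][bak]noun]noun/singular]noun NN-compound",
        "[[verkeer]noun[s][[ageer]verb[ent]]noun]noun/singular NN-compound",
    ]
    for analysis in examples:
        print(split_compound(analysis))
```

For the first two analyses this yields 'appel-taart' and 'appel-gebak'. For 'verkeersagent' it yields 'verkeer-s-ageerent', which shows what still needs work: the linking 's' becomes a part of its own, and the analysis contains the root 'ageer' rather than the surface form 'agent', so the parts would have to be aligned back to the original word.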