Michael Cysouw
hmmm, the more I think about it, this all suggests that we need different kinds of 'segmentations' for the ontology after all. Currently we use different notions of 'segments' for...
@LinguList agree: we leave the generic `Segments` just as it is now, so it can be used for whatever segmentation you would like to add. When you want to become more precise...
OK, we don't change anything, and leave the semantics of Segments open to the user's interpretation
A problem came up in another discussion and I just note it here for future reference: when you have multiple different `Segment` columns, then it becomes a problem to use...
Another approach (as used by @LinguList) is to allow for boundary symbols in the `Segment` parsing. We could think about this using the regular Leipzig Glossing Rules conventions, with an...
The Leipzig Glossing Rules go further (besides using `-` for morphemes) and have symbols for other kinds of boundaries, e.g. `=` for clitics, `~` for reduplication and `< >` for infixes. Some...
Then we could interpret `+` as standing for a superset of just any word-internal boundary :-). Suitable, as it is not used by LGR. (sorry, but I didn't know about...
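To make the boundary-symbol idea concrete, here is a minimal sketch of how such parsing might look. It assumes that segments are space-separated and that each boundary symbol appears as its own token; the function name `split_on_boundaries` and the labels in `BOUNDARIES` are hypothetical and not part of any existing specification or library. Infix brackets `< >` are left out, since they would need paired handling.

```python
# Boundary symbols: "-", "=", "~" follow the Leipzig Glossing Rules as cited
# in this thread; "+" is the generic word-internal boundary proposed here.
# The labels on the right are illustrative names only (an assumption).
BOUNDARIES = {
    "-": "morpheme",
    "=": "clitic",
    "~": "reduplication",
    "+": "generic",
}


def split_on_boundaries(segments):
    """Split a space-separated segment string at boundary symbols.

    Returns a list of (chunk, boundary_before) pairs, where chunk is a list
    of segments and boundary_before is the label of the boundary preceding
    the chunk (None for the first chunk).
    """
    chunks = []
    current = []
    boundary = None
    for token in segments.split():
        if token in BOUNDARIES:
            # Close the running chunk and remember which boundary follows it.
            chunks.append((current, boundary))
            current = []
            boundary = BOUNDARIES[token]
        else:
            current.append(token)
    chunks.append((current, boundary))
    return chunks


result = split_on_boundaries("k a t - s = i t")
```

With this input, `result` holds three chunks: `k a t` with no preceding boundary, `s` after a morpheme boundary, and `i t` after a clitic boundary. Treating `+` as a generic boundary only requires the extra entry in the lookup table, which is why a plain dictionary is used here.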
The term `Form` was conceived as shorthand for _wordform_, i.e. an actual sequence of sounds/letters in some data - at the same time getting rid of this other problem,...
@xrotwang The comment about the wordhood problem was just a nod to @haspelmath :-). In practice: I always interpret the `FormTable` as a unique list of the Forms in some...
@xrotwang about the separators: how would one encode the separator for the `Segment_Slice` specification? In parallel texts I would like to use spaces (i.e. use slice '3:4' as the...