
Spatial Data Training

Open ghost opened this issue 10 years ago • 2 comments

I've been trying for the past couple of days to train the CRF with rich spatial data looking like this:

Sequence1: A1 L=0.0 O=North B1 L=0.8 O=East C1 L=0.8 O=East C2 L=0.8 O=South

Sequence2: A2 L=0.0 O=North A3 L=0.8 O=South A4 L=0.8 O=South B5 L=0.8 O=East

(Something like a pawn traveling on a chessboard on possible paths.)

I then pass in arbitrary data (a short path) and try to match it, expecting a label sequence that tells me the most probable path the pawn took, but I'm getting nowhere. Could you provide an example of chunking with a dataset like this one for path-to-map matching?

Thank you in advance. J.

ghost avatar Mar 13 '15 18:03 ghost

Can you define explicitly the labels and the features?

usptact avatar Apr 07 '15 18:04 usptact

Yes I can. For example I'm doing:

(edge) (features)
x1-y1|x2-y2 dir[0]=W ___BOS___
x2-y2|x3-y3 dir[-1]=W dir[0]=E
.
.
x(n-1)-y(n-1)|x(n)-y(n) dir[-5..5]=.. ___EOS___
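To make the layout above concrete, here is a minimal Python sketch that builds such rows from a list of grid points. Everything in it is an assumption for illustration: the helper names (`direction`, `path_to_items`, `to_crfsuite_lines`), the coordinate convention (x grows east, y grows north), unit steps only, and a symmetric context window. The boundary markers are copied as written in the post. It relies only on the CRFsuite data format of one item per line, label first, fields separated by tabs.

```python
# Hypothetical helpers: names and conventions are illustrative, not from the post.
BOS = "___BOS___"  # boundary markers copied as written above
EOS = "___EOS___"

# Assumed convention: x grows east, y grows north, and every step is a unit step.
STEPS = {(1, 0): "E", (-1, 0): "W", (0, 1): "N", (0, -1): "S"}


def direction(p, q):
    """Compass direction of the unit step from point p to point q."""
    return STEPS[(q[0] - p[0], q[1] - p[1])]


def path_to_items(points, window=1):
    """Turn a path of (x, y) points into (edge_label, features) pairs."""
    dirs = [direction(p, q) for p, q in zip(points, points[1:])]
    items = []
    for i, (p, q) in enumerate(zip(points, points[1:])):
        label = f"{p[0]}-{p[1]}|{q[0]}-{q[1]}"
        # dir[k]=D for every offset k inside the window that stays on the path
        feats = [
            f"dir[{k}]={dirs[i + k]}"
            for k in range(-window, window + 1)
            if 0 <= i + k < len(dirs)
        ]
        if i == 0:
            feats.append(BOS)
        if i == len(dirs) - 1:
            feats.append(EOS)
        items.append((label, feats))
    return items


def to_crfsuite_lines(items):
    """One tab-separated line per item: label first, then its features."""
    return ["\t".join([label] + feats) for label, feats in items]


if __name__ == "__main__":
    path = [(1, 1), (0, 1), (1, 1)]  # one step west, then one step east
    print("\n".join(to_crfsuite_lines(path_to_items(path))))
```

For tagging, the same rows can be written with a placeholder in the label column, since only the features are used for prediction.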

I'm mapping out the valid paths through my navigation matrix and trying to select the most distinctive ones.

Afterwards, when tagging, I use only direction measurements to try to predict valid paths, e.g.:

(features)
dir[0]=W ___BOS___
dir[-1]=W dir[0]=E
.
.
dir[-5..5]=.. ___EOS___

I'm getting good inferences lately (when I'm very strict with directions), but I can't find a way to put numbers on the features, such as an edge frequency rate or any other numeric feature that could explicitly distinguish a path. This means my predictions can never be relative: I must always define the path taken explicitly, and even then I might not get valid paths.

Any help is most appreciated.

Thanks again. J.

ghost avatar Apr 20 '15 17:04 ghost
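
On the numeric-feature question in the last comment: the CRFsuite data format lets an attribute carry a scaling value after a colon (`name:value`), and a colon or backslash inside an attribute name has to be escaped with a backslash. The sketch below only formats such lines. The helper names and the `edge_freq` attribute are hypothetical, and whether a frequency value actually helps to separate paths is untested here.

```python
# Hypothetical helpers: they only format text in the CRFsuite data layout.

def format_attribute(name, value=None):
    """Return 'name' or 'name:value', escaping backslashes and colons in the name."""
    escaped = name.replace("\\", "\\\\").replace(":", "\\:")
    if value is None:
        return escaped  # no explicit value given
    return f"{escaped}:{value:g}"


def format_item(label, attributes):
    """One tab-separated line: label first, then the attributes.

    `attributes` is a list of (name, value) pairs; value may be None.
    """
    fields = [label] + [format_attribute(n, v) for n, v in attributes]
    return "\t".join(fields)


if __name__ == "__main__":
    # 'edge_freq' is an assumed attribute name for the edge frequency rate.
    print(format_item("1-1|0-1", [("dir[0]=W", None), ("edge_freq", 0.8)]))
```

Attributes written without a value keep working alongside valued ones, so direction features such as `dir[0]=W` can stay as they are while numeric ones are added next to them.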