How to do tagging without an "attribute" file?
This may be a stupid question, but I followed your manual and am still stuck on this.
The first step in training a tagging model is to transform the raw data into a "feature/attribute" file using chunking.py, that is:

    train.txt -----> train.crfsuite.txt
    test.txt -----> test.crfsuite.txt

Training and testing are then both done on these "feature/attribute" files, like this:

    crfsuite learn -m CRF.model train.crfsuite.txt
    crfsuite tag -m CRF.model test.crfsuite.txt

My problem is that when I do tagging, I don't want to run an experiment and check accuracy, F1 score and so on. I only have unlabelled text data, so how do I tag it? I tried this:

    crfsuite tag -m CRF.model unlabelled.txt

but every token gets the same label, which is obviously wrong. Should I first transform my unlabelled text data into a "feature/attribute" file? If so, how do I do this? Please help.
The feature extraction is up to you. You should be able to extract the features from unlabeled data without knowing the correct label. Simple features in NLP include word unigrams/bigrams and character n-grams.
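As a rough illustration of that idea, here is a minimal Python sketch that turns unlabeled, whitespace-tokenized sentences into tab-separated attribute lines. It is not chunking.py: the function names, the feature names (`w[0]=`, `suf3=`, `cap`, and so on) and the `DUMMY_LABEL` placeholder are all made up for this example. It assumes the tagger expects the same layout as the training file (label in the first column, attributes after it, a blank line between sequences), so a placeholder label is written in the first column. That assumption is worth checking against the CRFsuite manual. Whatever features you use here must be generated exactly as they were for the training file, otherwise the model will not recognize them.

```python
from typing import Iterable, List

# Assumption: the first column must hold a label even when tagging,
# so a placeholder is written because the true labels are unknown.
DUMMY_LABEL = "O"


def escape(attr: str) -> str:
    """Replace characters that the data format may treat specially
    (assumed here: ':' separates an attribute from its weight)."""
    return attr.replace("\\", "__BACKSLASH__").replace(":", "__COLON__")


def token_features(tokens: List[str], i: int) -> List[str]:
    """Build attributes for token i using only the words themselves."""
    word = tokens[i]
    feats = ["w[0]=" + word.lower(), "suf3=" + word[-3:].lower()]
    if word[0].isupper():
        feats.append("cap")
    if i > 0:
        # Word bigram with the previous token.
        feats.append("w[-1]|w[0]=" + tokens[i - 1].lower() + "|" + word.lower())
    else:
        feats.append("__BOS__")  # beginning of sequence
    if i < len(tokens) - 1:
        # Word bigram with the next token.
        feats.append("w[0]|w[1]=" + word.lower() + "|" + tokens[i + 1].lower())
    else:
        feats.append("__EOS__")  # end of sequence
    return [escape(f) for f in feats]


def sentence_to_crfsuite(tokens: List[str], label: str = DUMMY_LABEL) -> str:
    """One line per token: label, then attributes, tab-separated.
    A trailing blank line marks the end of the sequence."""
    rows = ["\t".join([label] + token_features(tokens, i))
            for i in range(len(tokens))]
    return "\n".join(rows) + "\n\n"


def convert(lines: Iterable[str]) -> str:
    """Convert raw text (one sentence per line) to attribute lines."""
    out = []
    for line in lines:
        tokens = line.split()
        if tokens:  # skip empty lines
            out.append(sentence_to_crfsuite(tokens))
    return "".join(out)
```

The string returned by `convert` could be written to a file (say `unlabelled.crfsuite.txt`, a name chosen only for this example) and passed to `crfsuite tag -m CRF.model` in place of the raw text file.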
@usptact Yes, I've designed some features, but chunking.py can only transform labelled data into a crfsuite "feature/attribute" file. After training, when I want to do tagging, I only have unlabelled data. How do I transform that into a "feature/attribute" file?
Thanks.