Is there an example on how to use the lemmatizer in a pipeline?

Open • bancroftway opened this issue 9 months ago • 1 comment

Could you please document how to use the lemmatizer in a pipeline? I can't find any sample code for it in the samples directory.

bancroftway, Oct 05 '23 13:10

You can try the following code:

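using System;
using Catalyst;
using Mosaik.Core; // provides the Language enum

// Register the English models first (needs the Catalyst.Models.English NuGet package).
Catalyst.Models.English.Register();
var nlp = await Pipeline.ForAsync(Language.English);
var doc = new Document("I used to have dogs", Language.English);
nlp.ProcessSingle(doc);
var tokenList = doc.ToTokenList();
// Print each token next to its lemma.
tokenList.ForEach(token => Console.WriteLine($"{token.Value} -> {token.Lemma}"));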

/* 
Result:
  I -> I
  used -> use
  to -> to
  have -> have
  dogs -> dog
*/

Before relying on this for another language, check that a *_lemma_lookup_*.bin file and an ILemmatizer implementation exist for it.

flor3sc0, Oct 26 '23 08:10