lingua-go

Support absolute language confidence metric

Open warvyvr opened this issue 6 months ago • 3 comments

Hi, in my scenario the goal is to detect whether the input text is in English or in another language, and I'm not sure how to use the library to accomplish this. For instance, if the input text is in a specific language other than English, such as Vietnamese, I expect it to be detected as non-English.

	languages := []lingua.Language{
		lingua.English,
		lingua.Vietnamese,
		lingua.Unknown,
	}

	sentence := "Thông tin tài khoản của bạn"

	detector := lingua.NewLanguageDetectorBuilder().
		FromLanguages(languages...).
		WithMinimumRelativeDistance(0.9).
		Build()

	confidenceValues := detector.ComputeLanguageConfidenceValues(sentence)

	for _, elem := range confidenceValues {
		fmt.Printf("%s: %.2f\n", elem.Language(), elem.Value())
	}

Output:

Vietnamese: 1.00
English: 0.00

When I remove lingua.Vietnamese from the expected language list, the program outputs English: 1.00. I would like the result to be an "other language" type rather than English. Please help me with how to do this. Thanks in advance.

warvyvr Dec 21 '23 04:12
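
The output reported above suggests that the confidence values are relative to the configured candidate languages: with English as the only real candidate, it receives 1.00 whatever the input is. One possible workaround, sketched here under assumptions, is to keep a broad candidate set (for example by building the detector with `FromAllLanguages()` instead of a short list) and to treat everything except a confident English result as "other". The helper `classifyEnglish`, the map layout, and the 0.9 threshold are hypothetical and not part of lingua-go; the map stands in for the values printed by the loop over `ComputeLanguageConfidenceValues`.

```go
package main

import "fmt"

// classifyEnglish is a hypothetical helper, not part of lingua-go.
// confidences maps a language name to its relative confidence (0.0 to 1.0),
// e.g. collected from the values printed in the issue's loop.
// It returns "English" only when English reaches the threshold
// and "Other" in every other case, including when English is absent.
func classifyEnglish(confidences map[string]float64, threshold float64) string {
	if confidences["English"] >= threshold {
		return "English"
	}
	return "Other"
}

func main() {
	// Values reported in the issue for the Vietnamese sentence.
	reported := map[string]float64{"Vietnamese": 1.00, "English": 0.00}
	fmt.Println(classifyEnglish(reported, 0.9)) // prints "Other"

	// With English as the only candidate, the issue reports English: 1.00,
	// so the same check cannot reject the sentence.
	englishOnly := map[string]float64{"English": 1.00}
	fmt.Println(classifyEnglish(englishOnly, 0.9)) // prints "English"
}
```

The second call shows why the threshold alone does not help when the candidate list is too narrow: the check is only as good as the set of languages English is compared against.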