Running training when the dataset has only matches, only non-matches, or too few samples throws errors. We should instead inform the user so they can add training samples.

Open sonalgoyal opened this issue 2 years ago • 3 comments

Reported by Luke from Databricks

zingg_Dec21_0823_log4j-active (1).txt zingg_Dec21_0823_sdtderr.txt

sonalgoyal • Dec 22 '21 08:12

When no training data is available, a NullPointerException is thrown.

navinrathore • Dec 27 '21 09:12

Other problematic scenarios (refer to the attached log file):

  1. When only negative or only positive training data is available.
  2. When too few training samples are available.

Error 1: java.lang.IllegalArgumentException: requirement failed: rawPredictionCol vectors must have length=2, but got 1

Error 2: java.lang.IllegalArgumentException: requirement failed: Nothing has been added to this summarizer

An appropriate error message should be added asking the user to add more training data.
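A minimal sketch of the kind of pre-training check proposed here, in plain Java. It is not Zingg's actual code: the class `TrainingDataCheck`, the exception `InsufficientTrainingDataException`, and the default threshold of 5 samples per class are all assumptions made for illustration. The idea is to count matches and non-matches up front and fail with an actionable message, rather than letting a null, single-class, or tiny training set surface later as an opaque exception.

```java
import java.util.List;

/** Hypothetical pre-training check; not part of Zingg's actual API. */
final class TrainingDataCheck {

    /** Assumed minimum number of samples per class; the real threshold may differ. */
    static final int MIN_PER_CLASS = 5;

    /** Hypothetical exception type used to carry a user-facing message. */
    static final class InsufficientTrainingDataException extends RuntimeException {
        InsufficientTrainingDataException(String message) {
            super(message);
        }
    }

    /**
     * Validates the labels, where true marks a match and false a non-match.
     * Throws with an actionable message instead of letting training fail later.
     */
    static void validate(List<Boolean> labels, int minPerClass) {
        // Covers the "no training data" case that otherwise ends in a NullPointerException.
        if (labels == null || labels.isEmpty()) {
            throw new InsufficientTrainingDataException(
                "No training data found. Please label some pairs before training.");
        }
        long matches = labels.stream().filter(Boolean::booleanValue).count();
        long nonMatches = labels.size() - matches;
        // Covers both the single-class case and the too-few-samples case.
        if (matches < minPerClass || nonMatches < minPerClass) {
            throw new InsufficientTrainingDataException(String.format(
                "Not enough training data: %d matches and %d non-matches found, "
                + "at least %d of each are needed. Please add more training samples.",
                matches, nonMatches, minPerClass));
        }
    }
}
```

Calling `TrainingDataCheck.validate(labels, TrainingDataCheck.MIN_PER_CLASS)` before the training step would reject all three scenarios listed above with one consistent message. Making the threshold a parameter keeps the sketch independent of whatever minimum the real model needs.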

navinrathore • Dec 27 '21 10:12

Let's fix all of them.

sonalgoyal • Dec 27 '21 10:12

@gnanaprakash-ravi please verify this

sonalgoyal • Nov 02 '23 10:11

Hi,

  1. When trainingData is null: [image]

  2. When neg is null and pos is less than 5: [image] Result: [image]

  3. When pos is equal to 5 and neg is null: [image] Result: [image]

  4. When pos and neg are both exactly 5 in the train phase (needs closer analysis). This occurs with a new model and a new zinggdir: [image] I suspect this error is related to the Apache Spark library, but after the code change it is intercepted by zinggbusinessexception: [image]
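As an illustration of the interception described in item 4, here is a hedged Java sketch of wrapping a low-level `IllegalArgumentException` in an exception that carries a user-facing message. The names `FriendlyTraining`, `TrainingException`, and `runTraining` are hypothetical and are not taken from the Zingg codebase; the actual code change may look quite different.

```java
import java.util.function.Supplier;

/** Hypothetical wrapper; illustrates the pattern only, not Zingg's real code. */
final class FriendlyTraining {

    /** Hypothetical user-facing exception that keeps the original cause. */
    static final class TrainingException extends RuntimeException {
        TrainingException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Runs a training step and converts IllegalArgumentException (the type seen
     * in the attached logs) into an exception with a hint about training data.
     */
    static <T> T runTraining(Supplier<T> trainingStep) {
        try {
            return trainingStep.get();
        } catch (IllegalArgumentException e) {
            throw new TrainingException(
                "Training failed, possibly because there is too little labelled data. "
                + "Please add more training samples. Underlying error: " + e.getMessage(),
                e);
        }
    }
}
```

Keeping the original exception as the cause preserves the full stack trace for debugging, while the message gives the user something to act on.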

gnanaprakash-ravi • Dec 06 '23 18:12

Scenarios 1, 2, and 3 are working as expected. Scenario 4 throws an exception whose message points to insufficient data. No fix needed.

sonalgoyal • Dec 08 '23 05:12