Duke
No identity for record no.priv.garshol.duke.CompactRecord@61001b64
Any idea what causes this error? I am trying to use active learning for record linkage of software names from two different sources.
[GeneticConfiguration 0.15 [ID] [VENDOR NumericComparator 0.77 0.12] [PRODUCT DifferentComparator 0.75 0.23] [VERSION QGramComparator 0.53 0.22]]
Exception in thread "main" no.priv.garshol.duke.DukeException: No identity for record no.priv.garshol.duke.CompactRecord@61001b64
    at no.priv.garshol.duke.matchers.TestFileListener.getid(TestFileListener.java:225)
    at no.priv.garshol.duke.matchers.TestFileListener.matches(TestFileListener.java:102)
    at no.priv.garshol.duke.Processor.registerMatch(Processor.java:601)
    at no.priv.garshol.duke.Processor.compareCandidatesBest(Processor.java:493)
    at no.priv.garshol.duke.Processor.match(Processor.java:428)
    at no.priv.garshol.duke.Processor.match(Processor.java:252)
    at no.priv.garshol.duke.Processor.linkBatch(Processor.java:379)
    at no.priv.garshol.duke.Processor.linkRecords(Processor.java:364)
    at no.priv.garshol.duke.Processor.linkRecords(Processor.java:342)
    at no.priv.garshol.duke.genetic.GeneticAlgorithm.evaluate(GeneticAlgorithm.java:348)
    at no.priv.garshol.duke.genetic.GeneticAlgorithm.evolve(GeneticAlgorithm.java:208)
    at no.priv.garshol.duke.genetic.GeneticAlgorithm.run(GeneticAlgorithm.java:188)
    at com.fractal.dataextraction.ACDP.DukeTest$.delayedEndpoint$com$fractal$dataextraction$ACDP$DukeTest$1(DukeTest.scala:26)
    at com.fractal.dataextraction.ACDP.DukeTest$delayedInit$body.apply(DukeTest.scala:7)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.App$$anonfun$main$1.apply(App.scala:76)
    at scala.collection.immutable.List.foreach(List.scala:383)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
    at scala.App$class.main(App.scala:76)
    at com.fractal.dataextraction.ACDP.DukeTest$.main(DukeTest.scala:7)
    at com.fractal.dataextraction.ACDP.DukeTest.main(DukeTest.scala)
It means that you have no ID field for this record. That's a problem, because then Duke has no way to identify the record when reporting back to you. So you need to make sure the schema declares an ID field, and that every record has a value for this field.
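One way to catch records without an ID before Duke ever sees them is a small pre-flight check over the input rows. The following is only a sketch, not part of Duke's API: it assumes each row has already been loaded as a `Map` from column name to value and that the identity column is called `ID` (as in the `[ID]` entry of the output above). The names `IdCheck` and `rowsMissingId` are hypothetical.

```scala
// Hypothetical pre-flight check, independent of Duke's own API.
// Assumption: each input row is a Map from column name to value.
object IdCheck {

  /** Returns the indices of rows whose ID column is absent, null, or blank. */
  def rowsMissingId(rows: Seq[Map[String, String]],
                    idColumn: String = "ID"): Seq[Int] =
    rows.zipWithIndex.collect {
      // Option.forall is true for a missing key, so absent IDs are reported too.
      case (row, index)
          if row.get(idColumn).forall(v => v == null || v.trim.isEmpty) =>
        index
    }
}
```

Running `IdCheck.rowsMissingId` on both sources and dropping or repairing the reported rows before linking should avoid the `No identity for record` exception, assuming the schema itself declares the ID property.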
Hi @larsga, thanks for the quick reply. I made sure to remove all the null values, and it seems to be working. But in active mode Duke is not asking me any questions. Does it expose those questions to any http://localhost:<
```scala
val geneticAlgorithm = new GeneticAlgorithm(config, null, false)
geneticAlgorithm.setActive(true)
// geneticAlgorithm.setThreads(5)
geneticAlgorithm.setConfigOutput("output/config_output.xml")
geneticAlgorithm.setLinkFile("output/label_data.txt")
geneticAlgorithm.setQuestions(10)
geneticAlgorithm.run()
```