
ml4ir-inference integration test seems to be broken when there are missing feature values

Open lastmansleeping opened this issue 4 years ago • 4 comments

Steps to repro

  1. Train a model and generate the model_predictions.csv
  2. In model_predictions.csv, set a feature value for any record to null by removing it. Example:
     query_0,1,0.314,0.0,0.0,1.0,m01h9eeb,0,domain_0,1,0.4095464,1 -> query_0,1,,0.0,0.0,1.0,m01h9eeb,0,domain_0,1,0.4095464,1
  3. Run the integration test:
mvn scala:run "-DaddArgs=../../python/models/end_to_end_test_ranking/final/tfrecord/|../../python/logs/end_to_end_test_ranking/model_predictions.csv|../../python/ml4ir/applications/ranking/tests/data/configs/feature_config_integration_test.yaml"

Error Trace

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.protobuf.UnsafeUtil (file:/Users/ashish.srinivasa/.m2/repository/com/google/protobuf/protobuf-java/3.5.1/protobuf-java-3.5.1.jar) to field java.nio.Buffer.address
WARNING: Please consider reporting this to the maintainers of com.google.protobuf.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at scala_maven_executions.MainHelper.runMain(MainHelper.java:161)
	at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
Caused by: java.lang.NumberFormatException: empty String
	at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1842)
	at java.base/jdk.internal.math.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
	at java.base/java.lang.Float.parseFloat(Float.java:461)
	at scala.collection.immutable.StringLike$class.toFloat(StringLike.scala:281)
	at scala.collection.immutable.StringOps.toFloat(StringOps.scala:29)
	at ml4ir.inference.tensorflow.data.FeatureProcessors$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(FeatureProcessors.scala:14)
	at ml4ir.inference.tensorflow.data.FeatureProcessors$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(FeatureProcessors.scala:14)
	at scala.Option.map(Option.scala:146)
	at ml4ir.inference.tensorflow.data.FeatureProcessors$$anonfun$3$$anonfun$apply$3.apply(FeatureProcessors.scala:14)
	at ml4ir.inference.tensorflow.data.FeatureProcessors$$anonfun$3$$anonfun$apply$3.apply(FeatureProcessors.scala:14)
	at ml4ir.inference.tensorflow.data.FeaturePreprocessor$$anonfun$extractFloatFeatures$1.apply(FeaturePreprocessor.scala:41)
	at ml4ir.inference.tensorflow.data.FeaturePreprocessor$$anonfun$extractFloatFeatures$1.apply(FeaturePreprocessor.scala:38)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.immutable.Map$Map4.foreach(Map.scala:188)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at ml4ir.inference.tensorflow.data.FeaturePreprocessor.extractFloatFeatures(FeaturePreprocessor.scala:38)
	at ml4ir.inference.tensorflow.data.FeaturePreprocessor.apply(FeaturePreprocessor.scala:34)
	at ml4ir.inference.tensorflow.data.FeaturePreprocessor.apply(FeaturePreprocessor.scala:17)
	at scala.collection.immutable.List.map(List.scala:284)
	at ml4ir.inference.tensorflow.data.SequenceExampleBuilder.apply(TFRecordBuilders.scala:40)
	at ml4ir.inference.tensorflow.data.SequenceExampleBuilder.build(TFRecordBuilders.scala:48)
	at ml4ir.inference.tensorflow.SequenceExampleInference$$anonfun$runQueriesAgainstDocs$2.apply(SequenceExampleInference.scala:132)
	at ml4ir.inference.tensorflow.SequenceExampleInference$$anonfun$runQueriesAgainstDocs$2.apply(SequenceExampleInference.scala:130)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.immutable.List.map(List.scala:296)
	at ml4ir.inference.tensorflow.SequenceExampleInference$.runQueriesAgainstDocs(SequenceExampleInference.scala:130)
	at ml4ir.inference.tensorflow.SequenceExampleInference$.evaluateRankingInferenceAccuracy(SequenceExampleInference.scala:68)
	at ml4ir.inference.tensorflow.SequenceExampleInference$.main(SequenceExampleInference.scala:64)
	at ml4ir.inference.tensorflow.SequenceExampleInference.main(SequenceExampleInference.scala)
	... 6 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  12.111 s
[INFO] Finished at: 2021-06-09T12:32:05-07:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.3.1:run (default-cli) on project ml4ir-inference: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 240 (Exit value: 240) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
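
For readers skimming the trace: the root cause is the `Caused by: java.lang.NumberFormatException: empty String` frame, reached through `Option.map` and `StringOps.toFloat`. A minimal sketch that appears to reproduce the same failure shape outside ml4ir is below. The object and method names (`EmptyCellRepro`, `parse`) are hypothetical and are not part of the ml4ir code base.

```scala
import scala.util.Try

// Hypothetical stand-alone repro; not taken from FeatureProcessors.scala.
object EmptyCellRepro {
  // Mirrors the call chain in the trace: Option.map -> toFloat -> Float.parseFloat.
  // Wrapped in Try so the exception can be inspected instead of killing the JVM.
  def parse(cell: Option[String]): Try[Option[Float]] = Try(cell.map(_.toFloat))

  def main(args: Array[String]): Unit = {
    val emptied = parse(Some(""))  // a CSV cell whose value was removed
    val missing = parse(None)      // a feature that is absent altogether
    println(emptied.isFailure)                    // true
    println(emptied.failed.get.getClass.getName)  // java.lang.NumberFormatException
    println(missing)                              // Success(None)
  }
}
```

The sketch suggests the distinction that matters here: an absent value (`None`) passes through `map` untouched, while a present but empty string reaches `toFloat` and throws.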

lastmansleeping, Jun 09 '21 19:06

Error while creating work item!

uip-robot-zz, Jun 09 '21 19:06

As discussed on Slack, to let the CSVReader properly "skip" empty cell values, we just need to change line 49 of the integration test to:

      val lineMapper: String => Map[String, String] = (line: String) =>
        colNames.zip(line.split(",")).toMap.filter(kv => kv._2.nonEmpty)
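
A rough, self-contained sketch of how this mapper behaves on the row from the repro steps follows. The column names (`col_0` to `col_11`) and the object name `SkipEmptyCells` are placeholders assumed for illustration; the real header comes from model_predictions.csv.

```scala
// Hypothetical harness around the proposed lineMapper; names are placeholders.
object SkipEmptyCells {
  // Assumed column names, one per cell of the example row.
  val colNames: Seq[String] = (0 until 12).map(i => s"col_$i")

  // Pair each column name with its cell, then drop entries whose cell is empty.
  val lineMapper: String => Map[String, String] = (line: String) =>
    colNames.zip(line.split(",")).toMap.filter(kv => kv._2.nonEmpty)

  def main(args: Array[String]): Unit = {
    val row = "query_0,1,,0.0,0.0,1.0,m01h9eeb,0,domain_0,1,0.4095464,1"
    val parsed = lineMapper(row)
    println(parsed.size)                         // 11
    println(parsed.get("col_2"))                 // None
    println(parsed.get("col_2").map(_.toFloat))  // None
    println(parsed.get("col_3").map(_.toFloat))  // Some(0.0)
  }
}
```

With the empty cell filtered out, a lookup for that column yields `None`, so a downstream `Option.map(_.toFloat)` never sees an empty string. Java's `String.split` also drops trailing empty strings, so a blank final cell is omitted by the `zip` as well.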

jakemannix, Jun 10 '21 05:06

@lastmansleeping / @ducouloa - do you know if this fix is getting folded into another PR that one of you is working on, or should I open a tiny PR for this?

jakemannix, Jun 11 '21 04:06

@jakemannix I can integrate it when I add the tool to predict scores from the CLI.

ducouloa, Jun 11 '21 13:06