ml4ir
ml4ir-inference integration test seems to be broken when there are missing feature values
Steps to repro
- Train a model and generate the model_predictions.csv
- In the model_predictions.csv, set the feature for any record to null by removing its value. Example:
  query_0,1,0.314,0.0,0.0,1.0,m01h9eeb,0,domain_0,1,0.4095464,1 -> query_0,1,,0.0,0.0,1.0,m01h9eeb,0,domain_0,1,0.4095464,1
- Run integration test
mvn scala:run "-DaddArgs=../../python/models/end_to_end_test_ranking/final/tfrecord/|../../python/logs/end_to_end_test_ranking/model_predictions.csv|../../python/ml4ir/applications/ranking/tests/data/configs/feature_config_integration_test.yaml"
Error Trace
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.google.protobuf.UnsafeUtil (file:/Users/ashish.srinivasa/.m2/repository/com/google/protobuf/protobuf-java/3.5.1/protobuf-java-3.5.1.jar) to field java.nio.Buffer.address
WARNING: Please consider reporting this to the maintainers of com.google.protobuf.UnsafeUtil
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
java.lang.reflect.InvocationTargetException
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:564)
at scala_maven_executions.MainHelper.runMain(MainHelper.java:161)
at scala_maven_executions.MainWithArgsInFile.main(MainWithArgsInFile.java:26)
Caused by: java.lang.NumberFormatException: empty String
at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1842)
at java.base/jdk.internal.math.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
at java.base/java.lang.Float.parseFloat(Float.java:461)
at scala.collection.immutable.StringLike$class.toFloat(StringLike.scala:281)
at scala.collection.immutable.StringOps.toFloat(StringOps.scala:29)
at ml4ir.inference.tensorflow.data.FeatureProcessors$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(FeatureProcessors.scala:14)
at ml4ir.inference.tensorflow.data.FeatureProcessors$$anonfun$3$$anonfun$apply$3$$anonfun$apply$4.apply(FeatureProcessors.scala:14)
at scala.Option.map(Option.scala:146)
at ml4ir.inference.tensorflow.data.FeatureProcessors$$anonfun$3$$anonfun$apply$3.apply(FeatureProcessors.scala:14)
at ml4ir.inference.tensorflow.data.FeatureProcessors$$anonfun$3$$anonfun$apply$3.apply(FeatureProcessors.scala:14)
at ml4ir.inference.tensorflow.data.FeaturePreprocessor$$anonfun$extractFloatFeatures$1.apply(FeaturePreprocessor.scala:41)
at ml4ir.inference.tensorflow.data.FeaturePreprocessor$$anonfun$extractFloatFeatures$1.apply(FeaturePreprocessor.scala:38)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.Map$Map4.foreach(Map.scala:188)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.AbstractTraversable.map(Traversable.scala:104)
at ml4ir.inference.tensorflow.data.FeaturePreprocessor.extractFloatFeatures(FeaturePreprocessor.scala:38)
at ml4ir.inference.tensorflow.data.FeaturePreprocessor.apply(FeaturePreprocessor.scala:34)
at ml4ir.inference.tensorflow.data.FeaturePreprocessor.apply(FeaturePreprocessor.scala:17)
at scala.collection.immutable.List.map(List.scala:284)
at ml4ir.inference.tensorflow.data.SequenceExampleBuilder.apply(TFRecordBuilders.scala:40)
at ml4ir.inference.tensorflow.data.SequenceExampleBuilder.build(TFRecordBuilders.scala:48)
at ml4ir.inference.tensorflow.SequenceExampleInference$$anonfun$runQueriesAgainstDocs$2.apply(SequenceExampleInference.scala:132)
at ml4ir.inference.tensorflow.SequenceExampleInference$$anonfun$runQueriesAgainstDocs$2.apply(SequenceExampleInference.scala:130)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:392)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:296)
at ml4ir.inference.tensorflow.SequenceExampleInference$.runQueriesAgainstDocs(SequenceExampleInference.scala:130)
at ml4ir.inference.tensorflow.SequenceExampleInference$.evaluateRankingInferenceAccuracy(SequenceExampleInference.scala:68)
at ml4ir.inference.tensorflow.SequenceExampleInference$.main(SequenceExampleInference.scala:64)
at ml4ir.inference.tensorflow.SequenceExampleInference.main(SequenceExampleInference.scala)
... 6 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 12.111 s
[INFO] Finished at: 2021-06-09T12:32:05-07:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal net.alchim31.maven:scala-maven-plugin:4.3.1:run (default-cli) on project ml4ir-inference: wrap: org.apache.commons.exec.ExecuteException: Process exited with an error: 240 (Exit value: 240) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
As discussed on Slack, to let the CSVReader properly "skip" empty cell values, we just need line 49 of the integration test to be changed to:
val lineMapper: String => Map[String, String] = (line: String) =>
colNames.zip(line.split(",")).toMap.filter(kv => kv._2.nonEmpty)
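For anyone following along, here is a minimal, self-contained sketch of why dropping empty cells should avoid the `NumberFormatException: empty String` in the trace. The trace shows the parse happening inside an `Option.map`, so a missing key would skip the `toFloat` call entirely. The column names, the sample rows, and the `floatFeature` helper are hypothetical stand-ins for illustration, not the actual ml4ir code.

```scala
object LineMapperSketch {
  // Hypothetical column names; the real test reads them from the CSV header.
  val colNames: Array[String] = Array("query_id", "rank", "feature_a", "feature_b")

  // The proposed mapper: pair each column name with its cell, then drop empty cells.
  val lineMapper: String => Map[String, String] = (line: String) =>
    colNames.zip(line.split(",")).toMap.filter(kv => kv._2.nonEmpty)

  // Hypothetical stand-in for the downstream lookup: parse only when a value is present.
  def floatFeature(row: Map[String, String], name: String): Option[Float] =
    row.get(name).map(_.toFloat)

  def main(args: Array[String]): Unit = {
    val complete = lineMapper("query_0,1,0.314,0.0")
    val missing  = lineMapper("query_0,1,,0.0")
    println(floatFeature(complete, "feature_a")) // a defined value
    println(floatFeature(missing, "feature_a"))  // None, no parse is attempted
  }
}
```

One thing to keep in mind: `String.split(",")` discards trailing empty strings, so empty cells at the end of a line are already dropped by the `zip`; the `filter` is what handles empty cells in the middle of a line.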
@lastmansleeping / @ducouloa - do you know if this fix is getting folded into another PR one of you are working on, or should I open a tiny PR for this?
@jakemannix I can integrate it when providing the tool to predict scores from the CLI.