
Could not regenerate tf_text_regression model on s390x BE system and tensorflow_model_server_test is failing

Open · Sidong-Wei opened this issue 3 years ago

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 20.04 on s390x
  • TensorFlow Serving installed from (source or binary): Source
  • TensorFlow Serving version: 2.5.1

Describe the problem

When running test suites on an s390x machine, test_tf_text in //tensorflow_serving/model_servers:tensorflow_model_server_test will fail with the following error message:

grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.INTERNAL
	details = "U_INVALID_FORMAT_ERROR: Could not retrieve ICU NFKC normalizer
	 [[{{node NormalizeUTF8WithOffsets/NormalizeUTF8WithOffsetsMap}}]]"
	debug_error_string = "{"created":"@1626196655.285737838","description":"Error received from peer ipv4:127.0.0.1:41813","file":"src/core/lib/surface/call.cc","file_line":1066,"grpc_message":"U_INVALID_FORMAT_ERROR: Could not retrieve ICU NFKC normalizer\n\t [[{{node NormalizeUTF8WithOffsets/NormalizeUTF8WithOffsetsMap}}]]","grpc_status":13}"
>

This seems to be caused by TF Text shipping Little Endian ICU normalization data, so I tried to manually generate this normalization data file on a Big Endian machine from the ICU repo.
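
For reference, ICU's icupkg tool can repackage an existing data bundle in big-endian byte order, so a command along these lines should produce a BE data file (the file names here are illustrative; the actual name depends on the ICU version TF Text bundles):

icupkg -tb icudt69l.dat icudt69b.dat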

The next step I wanted to try was to regenerate the tf_text_regression model for serving. However, running the script here does not produce the correct model for this test. TF Serving 2.5.1 depends on TF Text 2.4.3, so I tried the script with both TF Text 2.4.3 and 2.5.0 (2.5.0 being the first TF Text version that supports TF 2.5.0, which TF Serving 2.5.1 also depends on). But either the script would not run, or the generated model still makes the test case fail.

My question: is there a documented way to regenerate the tf_text_regression model for a particular TF Serving version (here, 2.5.1)? Simply running the script does not work, and there is a version discrepancy between the projects, so I would appreciate it if someone could guide me through generating this model on a local machine.
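
For illustration, here is a minimal sketch of the kind of model I am trying to regenerate, assuming the test only needs the NormalizeUTF8WithOffsetsMap op named in the error; the module, output path, and signature names below are my guesses, not the actual generation script:

import tensorflow as tf
import tensorflow_text as text  # version must match the TF Text pinned by TF Serving

class NormModel(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([None], tf.string)])
    def serve(self, inputs):
        # Calls the op behind the failing NormalizeUTF8WithOffsetsMap node.
        normalized, _ = text.normalize_utf8_with_offsets_map(
            inputs, normalization_form="NFKC")
        return {"normalized": normalized}

model = NormModel()
tf.saved_model.save(
    model, "/tmp/tf_text_regression/00000001",
    signatures={"serving_default": model.serve})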

Exact Steps to Reproduce

This will reproduce the test case failure

bazel test --build_tests_only --test_output=errors --verbose_failures -c opt //tensorflow_serving/model_servers:tensorflow_model_server_test

Sidong-Wei · Jul 23 '21

The model servers are supposed to be built inside Docker rather than in a native OS environment. Please try prepending the bazel test command with tools/run_in_docker.sh:

tools/run_in_docker.sh bazel test --build_tests_only --test_output=errors --verbose_failures -c opt //tensorflow_serving/model_servers:tensorflow_model_server_test

godot73 · Sep 08 '21

@Sidong-Wei Could you please respond to the above comment from @godot73 and confirm whether this issue can be closed. Thanks!

UsharaniPagadala · Nov 12 '21

@UsharaniPagadala @godot73 Thanks for the reply; it is good to know that the test is supposed to run inside a docker container. However, what I am trying to figure out is how to generate the tf_text_regression model locally instead of using the pre-built one here. In other words, I am trying to confirm that the pre-built model binary is not compatible with Big Endian machines, so I would appreciate it if you could explain how this model is generated on your end, especially since TF Serving, TF, and TF Text do not always update their versions at the same time.

Sidong-Wei · Nov 12 '21

@UsharaniPagadala @godot73 In TFS 2.9.1, tensorflow_model_server_test did not run by default (bazel test -c opt tensorflow_serving/...). Could you please confirm whether this test case is deprecated? Thanks!

rposts · Sep 28 '22

@Sidong-Wei,

TF Text is updated to v2.9.0 in the latest TF Serving release, 2.11.0. Ref: commit

Please try to generate the tf_text_regression model with the latest TF Serving release (2.11.0) and let us know if your issue has been resolved. Thank you!

singhniraj08 · Jan 23 '23

@singhniraj08 tensorflow_model_server_test did not run by default (bazel test -c opt tensorflow_serving/...). Could you please confirm whether this test case is deprecated? Thanks!

rposts · Jan 25 '23

Either way, it is unclear from the posted commit whether the pre-built model binary is compatible with Big Endian machines.

rposts · Jan 25 '23

@rposts,

Let me check with the team why this test case didn't run by default. Could you let me know which TF Serving and Bazel releases you faced the issue with? You can execute the command below to test your build, as shown here.

To test your build, execute:

tools/run_in_docker.sh bazel test -c opt tensorflow_serving/...

On the second point, I couldn't get much info on the pre-built model binary, but a SavedModel contains raw binary data in the byte order of the host that produced it. The best workaround would be to create and save a new model on the target host and then serve it using TF Serving on LE/BE systems. Thank you!
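
As a quick local sanity check before serving (a sketch, assuming the model was regenerated under /tmp/tf_text_regression/00000001 as in the example above):

import tensorflow as tf
import tensorflow_text  # importing registers the custom TF Text ops the SavedModel needs

loaded = tf.saved_model.load("/tmp/tf_text_regression/00000001")
print(loaded.signatures["serving_default"](inputs=tf.constant(["Hello, wörld!"])))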

singhniraj08 · Jan 30 '23

Closing this due to inactivity. Please take a look at the answers provided above; feel free to reopen and post your comments if you still have queries on this. Thank you!

singhniraj08 · Feb 17 '23