inference mode issue
After exporting the model for prediction, I found that setting `de.enable_inference_mode()` is required. The comment on `de.enable_inference_mode()` says:
`TrainableWrapper.read_value` is not thread-safe, which causes thread contention and Out-Of-Bound exceptions in concurrent serving scenarios. To resolve this, we define the ModelMode APIs to instruct the TrainableWrapper to build a different thread-safe sub-graph for `TrainableWrapper.read_value` in inference mode.
But after exporting the model and serving it with TensorFlow Serving or Triton, how should enable_inference_mode() be set, or is it even necessary? My export code is:
model = build_model(xxx)
de.enable_inference_mode()
model.save(export_dir)
enable_inference_mode is used to change the graph-building logic inside TFRA. It saves two memory copies in TrainableWrapper: one for the IDs and one for the embedding values.
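Roughly, the export flow looks like the sketch below; build_model and export_dir are just the placeholders from the snippet above, and the comments are an interpretation of the behavior described here rather than official documentation:

```python
# Sketch of the export flow; build_model and export_dir are placeholders
# from the snippet above, not TFRA APIs.
from tensorflow_recommenders_addons import dynamic_embedding as de

model = build_model(...)   # model is built (and trained) in the default train mode

# enable_inference_mode() changes how TFRA builds the graph, so it is called in
# the export script before model.save() traces the serving functions.
de.enable_inference_mode()
model.save(export_dir)

# TensorFlow Serving / Triton only load the exported SavedModel, so no further
# Python call to enable_inference_mode() is made on the serving side.
```

In other words, the flag should take effect when the SavedModel is exported; nothing extra should be needed inside the serving runtime.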
@beijinggao Do you have any questions? If not, I will close the issue.