How to set "tf.data.experimental.AutoShardPolicy.OFF" when tf.data is not used? This message still appears when using tf.keras.Model.fit.
System information
- TensorFlow version (you are using): 2.5
Describe the feature and the current behavior/state. How to set "tf.data.experimental.AutoShardPolicy.OFF" when tf.data is not used? This message still appears when using tf.keras.Model.fit.
@rrklearn2020, can you please elaborate on your feature request? Also, please specify the use cases for this feature. Thanks!
It's difficult to use tf.data.Dataset with a relational database holding camera, point cloud, map, and other data. On a single system with multiple GPUs, it is better to use distributed training with MirroredStrategy (TensorFlow). It would be great if 'AutoShardPolicy' could be set to OFF in such use cases, either in the model.fit call (tf.keras) or in some other way.
Please also explain the effect of setting 'AutoShardPolicy' to OFF in a single-machine application, since 'Sharding is a method for distributing data across multiple machines'.
Keras may want to make autosharding configurable for non-dataset inputs. Assigning to @fchollet for triage.
@fchollet and @aaudiber, it would be a great help if you could address this concern about 'AutoShardPolicy'.
With 'AutoShardPolicy' in place, training spends a long time searching for a tf.data.Dataset even when no tf.data.Dataset is used. This costs a lot of time in each epoch.
Setting 'AutoShardPolicy' to OFF reduces the training time on a single machine with multiple GPUs.
Thanks for providing your insight. I think the issue you are referring to is in the code here: https://github.com/keras-team/keras/blob/v2.11.0/keras/engine/training.py#L2303, where tf.data.experimental.AutoShardPolicy is set to DATA.
In the Keras training process, every type of input is converted to tf.data format internally. Even if you provide non-tf.data input to the model, Keras converts it to tf.data format and feeds that to the model.
Given this behavior, tf.data.experimental.AutoShardPolicy.OFF does not fit into the existing logic. Thanks!
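For readers who land here with the same question: since the input ends up as a tf.data pipeline either way, one option is to wrap the in-memory arrays in a tf.data.Dataset yourself and set the shard policy on it before calling model.fit. The following is only a minimal sketch under assumptions. The random arrays, the tiny model, and the helper name make_unsharded_dataset are placeholders that are not from this thread, and whether this removes the message or the per-epoch overhead in the original setup has not been confirmed here.

```python
import numpy as np
import tensorflow as tf


def make_unsharded_dataset(features, labels, batch_size):
    """Hypothetical helper: wrap arrays in a tf.data.Dataset with auto-sharding off."""
    dataset = tf.data.Dataset.from_tensor_slices((features, labels))
    dataset = dataset.batch(batch_size)

    # Ask tf.distribute not to auto-shard this dataset.
    options = tf.data.Options()
    options.experimental_distribute.auto_shard_policy = (
        tf.data.experimental.AutoShardPolicy.OFF
    )
    return dataset.with_options(options)


# Placeholder data standing in for arrays loaded from some other source.
features = np.random.rand(64, 8).astype("float32")
labels = np.random.rand(64, 1).astype("float32")
dataset = make_unsharded_dataset(features, labels, batch_size=16)

# Build and compile the (placeholder) model under MirroredStrategy.
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = tf.keras.Sequential(
        [tf.keras.Input(shape=(8,)), tf.keras.layers.Dense(1)]
    )
    model.compile(optimizer="sgd", loss="mse")

# Pass the dataset, not the raw arrays, so the options above are the ones in effect.
model.fit(dataset, epochs=1, verbose=0)
```

Note that the batch size given to the dataset is the global batch size, which MirroredStrategy splits across the available replicas.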
Closing due to lack of recent activity. Please update the issue when new information becomes available, and we will reopen the issue. Thanks!