HybridBackend issues

to_sparse failed for Value with ragged_rank > 1 read from parquet file

9

# Current behavior When hb reads nested lists with ragged_rank > 1, the resulting Value cannot be converted to a SparseTensor by the function hb.data.to_sparse. For example: dense_feature is one of the...

SamJia

bug

Throughput is lower than TFRecords when there are many strings in Parquet files

# System information - OS Platform: Ubuntu 18.04.5 LTS - Docker version: 18.09.5 - GCC version: 7.5.0 - Python version: 3.6.9 - TensorFlow/PyTorch version: tf1.15.5 # Willing to contribute Yes

deepllz

Deeprec hangs in distributed mode.

# Current behavior In distributed mode, DeepRec works fine when training on one hour of data, but hangs when training on one day or more. Log: ![6ca9fe77321c27383b3b3de9bb8fc5d5](https://user-images.githubusercontent.com/35439432/229059537-ea1626df-2411-46bf-acb8-fb61fada092d.png) Nvidia-smi: ![a3ee237e24abfd35d1c087126b6331f8](https://user-images.githubusercontent.com/35439432/229059658-5b425fcf-f027-4f71-9d2c-908ffca14bf5.png) cpu:...

silingtong123

feature_column bucket_size is 6, use 8 gpus, then worker-5 and worker-6 'save/RestoreV2' failed

feature_column bucket_size is 6, use 8 gpus, then worker-5 and worker-6 'save/RestoreV2' failed; backtrace: Traceback (most recent call last): File "neg_feedback_multi.py", line 1252, in tf.app.run() File "/home/pai/lib/python3.6/site-packages/tensorflow_core/python/platform/app.py", line 40, in...

zhbhhb

EarlyStopping callback does not work well in multi-worker distributed training jobs

# Current behavior If there is only one worker, training with the EarlyStopping callback works. When multiple workers run distributed training with the EarlyStopping callback, all workers hang and...

taoyun951753

Dataset iterator can't be wrapped in the HybridBackend scope

# Current behavior I am using HybridBackend for data parallelism. I create a dataset and make an iterator from it. When I use the HybridBackend scope to wrap the whole pipeline,...

fuhailin

Support default values for filename dataset

This patch fixes #156

2sin18

ParquetDataset support configuration with default value

# User Story The fixed-length features in TFRecord support configuration with default values (https://www.tensorflow.org/api_docs/python/tf/io/FixedLenFeature), but Parquet currently does not support this. If a non-existent feature is encountered, an error will be...

Markz2z

enhancement
