elasticdl
Support better index reading from multiple RecordIO files
Currently, when the master starts, it needs to read all RecordIO shards to find out the number of records in each shard. With a large number of files, this may add too much startup overhead. We could store this metadata at RecordIO build time instead, or migrate to a self-adaptive RecordIO format such as Riegeli: https://github.com/google/riegeli
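
As a rough illustration of the first option (keeping per-shard record counts as metadata written at build time), here is a minimal sketch. It assumes a JSON sidecar file stored next to the shards. The file name `shard_index.json` and the helpers `write_shard_index` and `load_shard_index` are hypothetical and not part of ElasticDL or RecordIO. The writer is assumed to already know each shard's record count when it finishes building the shard.

```python
import json
import os
import tempfile

# Hypothetical sidecar file name; not an existing ElasticDL convention.
INDEX_FILENAME = "shard_index.json"


def write_shard_index(data_dir, record_counts):
    """Persist {shard file name: record count} next to the shards.

    Meant to run once at build time, when the writer already knows
    how many records it put into each shard.
    """
    index_path = os.path.join(data_dir, INDEX_FILENAME)
    with open(index_path, "w") as f:
        json.dump(record_counts, f, indent=2, sort_keys=True)
    return index_path


def load_shard_index(data_dir):
    """Return {shard file name: record count} without opening any shard."""
    with open(os.path.join(data_dir, INDEX_FILENAME)) as f:
        return json.load(f)


if __name__ == "__main__":
    # Demo with made-up shard names and counts in a temporary directory.
    with tempfile.TemporaryDirectory() as data_dir:
        write_shard_index(data_dir, {"data-00000": 1000, "data-00001": 750})
        counts = load_shard_index(data_dir)
        print("total records:", sum(counts.values()))
```

With something like this, the master would read one small file at startup instead of scanning every shard. It could fall back to the current full scan when the sidecar is missing.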