
Difference in available batch size between PyTorch and TensorFlow on the same GPU

numahha opened this issue 2 years ago • 3 comments

Varying the batch size, I tried both the PyTorch and TensorFlow versions of RandLANet on SemanticKITTI. With PyTorch, training starts with batch size 5 but fails with batch size 6 due to a CUDA out-of-memory error. With TensorFlow, it starts with batch size 2 but fails with batch size 3 with the error "ResourceExhaustedError: OOM when allocating ...". So on the same GPU I can only use less than half the batch size with TensorFlow. The code I use is:

import os
import open3d.ml as _ml3d
# import open3d.ml.torch as ml3d   # PyTorch backend
import open3d.ml.tf as ml3d        # TensorFlow backend
import pprint

cfg_file = "ml3d/configs/randlanet_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.RandLANet(**cfg.model)
cfg.dataset['dataset_path'] = "./"
dataset = ml3d.datasets.SemanticKITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.SemanticSegmentation(model, dataset=dataset, device="gpu", **cfg.pipeline)

pipeline.cfg_tb = {
    "readme": "readme",
    "cmd_line": "cmd_line",
    "dataset": pprint.pformat(cfg.dataset, indent=2),
    "model": pprint.pformat(cfg.model, indent=2),
    "pipeline": pprint.pformat(cfg.pipeline, indent=2),
}

pipeline.run_train()
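
For reference, the batch size itself comes from the batch_size entry in the pipeline section of the YAML config; an equivalent in-script override (assuming cfg.pipeline supports item assignment like cfg.dataset above, and that it is applied before the pipeline is constructed) would be something like:

# Hypothetical override of the training batch size; 'batch_size' is the key used in the
# pipeline section of the stock randlanet_semantickitti.yml. Set this before building the pipeline.
cfg.pipeline['batch_size'] = 2   # e.g. 2 trains with TF, 3 runs out of memory on my GPU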

I'd like to know where the problem lies and how to solve it.

Thanks.

numahha avatar Jul 29 '21 13:07 numahha

@numahha Thanks for reporting this. I am looking into this.

sanskar107 avatar Oct 22 '21 14:10 sanskar107

@numahha A possible reason is that TensorFlow prefetches the data for the next iteration. It happens here in the code (https://github.com/isl-org/Open3D-ML/blob/master/ml3d/tf/dataloaders/tf_dataloader.py#L158). Could you try again after disabling it?
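
For context, prefetching in tf.data keeps one or more upcoming batches materialized while the current one is being consumed, which costs extra memory. A minimal generic illustration (plain tf.data, not the exact Open3D-ML loader code):

import tensorflow as tf

ds = tf.data.Dataset.range(10).batch(2)

# With prefetch, up to buffer_size extra batches are prepared (and held in memory)
# while the current batch is still being processed.
ds_prefetched = ds.prefetch(buffer_size=1)

# Without prefetch, each batch is produced only when the training loop asks for it.
for batch in ds:
    print(batch.numpy())   # same elements either way; only memory/latency behaviour differs

Disabling it should just mean skipping the prefetch(...) call at the linked line.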

sanskar107 avatar Nov 19 '21 10:11 sanskar107

Thank you for the suggestion. I tried disabling it but got another error. This happened regardless of the batch size.

How I disabled it:

        if (self.model is None or 'batcher' not in self.model_cfg.keys()
            # or self.model_cfg.batcher == 'DefaultBatcher'   # <-- clause commented out
            ):
            loader = loader.batch(batch_size)

Error message when disabling it:

(o3dmltf) ub18@ub18-desktop:~/Open3D-ML$ python test_tf.py 
2021-12-03 10:20:26.671009: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-12-03 10:20:27.430333: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-12-03 10:20:27.967629: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-12-03 10:20:27.967769: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:27.968141: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.785GHz coreCount: 40 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2021-12-03 10:20:27.968186: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-12-03 10:20:27.970059: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-12-03 10:20:27.970112: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-12-03 10:20:27.970766: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-12-03 10:20:27.970960: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-12-03 10:20:27.971430: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-12-03 10:20:27.972098: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-12-03 10:20:27.972220: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-12-03 10:20:27.972290: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:27.972624: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:27.972929: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-12-03 10:20:27.973160: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-03 10:20:27.973464: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:27.973793: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.785GHz coreCount: 40 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2021-12-03 10:20:27.973840: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:27.974138: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:27.974429: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-12-03 10:20:28.369102: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-12-03 10:20:28.369144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-12-03 10:20:28.369166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-12-03 10:20:28.369281: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:28.369645: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:28.369947: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:20:28.370241: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6222 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2070 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
INFO - 2021-12-03 10:20:28,530 - semantic_segmentation - <open3d._ml3d.tf.models.randlanet.RandLANet object at 0x7fde282a6d90>
INFO - 2021-12-03 10:20:28,530 - semantic_segmentation - Logging in file : ./logs/RandLANet_SemanticKITTI_tf/log_train_2021-12-03_10:20:28.txt
INFO - 2021-12-03 10:20:28,559 - semantickitti - Found 19130 pointclouds for training
INFO - 2021-12-03 10:20:31,531 - semantickitti - Found 4071 pointclouds for validation
INFO - 2021-12-03 10:20:32,204 - semantic_segmentation - Writing summary in train_log/00015_RandLANet_SemanticKITTI_tf.
INFO - 2021-12-03 10:20:32,205 - semantic_segmentation - Initializing from scratch.
INFO - 2021-12-03 10:20:32,205 - semantic_segmentation - === EPOCH 0/100 ===
training:   0%|                                                                                                                                        | 0/19130 [00:00<?, ?it/s]2021-12-03 10:20:32.223916: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-12-03 10:20:32.242234: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3600000000 Hz
2021-12-03 10:20:32.368198: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-12-03 10:20:32.752362: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
training:   0%|                                                                                                                                        | 0/19130 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ub18/Open3D-ML/test_tf.py", line 23, in <module>
    pipeline.run_train()
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/pipelines/semantic_segmentation.py", line 250, in run_train
    results = model(inputs, training=True)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1030, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/models/randlanet.py", line 259, in call
    f_encoder_i = self.forward_dilated_res_block(
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/models/randlanet.py", line 223, in forward_dilated_res_block
    f_pc = m_conv2d(feature, training=self.training)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1030, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/utils/helper_tf.py", line 50, in call
    x = self.conv(x)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1013, in __call__
    input_spec.assert_input_compatibility(self.input_spec, inputs, self.name)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/keras/engine/input_spec.py", line 230, in assert_input_compatibility
    raise ValueError('Input ' + str(input_index) + ' of layer ' +
ValueError: Input 0 of layer conv2d_1 is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: (45056, 8, 1)

Error message when not disabling it:

(o3dmltf) ub18@ub18-desktop:~/Open3D-ML$ python test_tf.py 
2021-12-03 10:23:23.552434: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-12-03 10:23:24.305214: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-12-03 10:23:24.842576: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-12-03 10:23:24.842695: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:24.843063: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.785GHz coreCount: 40 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2021-12-03 10:23:24.843114: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-12-03 10:23:24.844961: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-12-03 10:23:24.845016: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-12-03 10:23:24.845716: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-12-03 10:23:24.845909: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-12-03 10:23:24.846345: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-12-03 10:23:24.846964: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-12-03 10:23:24.847090: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-12-03 10:23:24.847163: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:24.847495: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:24.847794: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-12-03 10:23:24.848037: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-03 10:23:24.848393: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:24.848715: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 2070 SUPER computeCapability: 7.5
coreClock: 1.785GHz coreCount: 40 deviceMemorySize: 7.79GiB deviceMemoryBandwidth: 417.29GiB/s
2021-12-03 10:23:24.848762: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:24.849049: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:24.849317: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-12-03 10:23:25.238891: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-12-03 10:23:25.238923: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264]      0 
2021-12-03 10:23:25.238943: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0:   N 
2021-12-03 10:23:25.239057: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:25.239391: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:25.239691: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-12-03 10:23:25.239985: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6228 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 2070 SUPER, pci bus id: 0000:01:00.0, compute capability: 7.5)
INFO - 2021-12-03 10:23:25,397 - semantic_segmentation - <open3d._ml3d.tf.models.randlanet.RandLANet object at 0x7f6061d11d90>
INFO - 2021-12-03 10:23:25,397 - semantic_segmentation - Logging in file : ./logs/RandLANet_SemanticKITTI_tf/log_train_2021-12-03_10:23:25.txt
INFO - 2021-12-03 10:23:25,426 - semantickitti - Found 19130 pointclouds for training
INFO - 2021-12-03 10:23:28,380 - semantickitti - Found 4071 pointclouds for validation
INFO - 2021-12-03 10:23:29,046 - semantic_segmentation - Writing summary in train_log/00016_RandLANet_SemanticKITTI_tf.
INFO - 2021-12-03 10:23:29,048 - semantic_segmentation - Initializing from scratch.
INFO - 2021-12-03 10:23:29,048 - semantic_segmentation - === EPOCH 0/100 ===
training:   0%|                                                                                                                                         | 0/4783 [00:00<?, ?it/s]2021-12-03 10:23:29.066488: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-12-03 10:23:29.085860: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 3600000000 Hz
2021-12-03 10:23:29.460843: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-12-03 10:23:29.849283: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-12-03 10:23:29.948460: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-12-03 10:23:30.323331: I tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded cuDNN version 8201
2021-12-03 10:23:40.506214: W tensorflow/core/common_runtime/bfc_allocator.cc:456] Allocator (GPU_0_bfc) ran out of memory trying to allocate 88.00MiB (rounded to 92274688)requested by op BiasAdd
If the cause is memory fragmentation maybe the environment variable 'TF_GPU_ALLOCATOR=cuda_malloc_async' will improve the situation. 
Current allocation summary follows.
2021-12-03 10:23:40.506317: I tensorflow/core/common_runtime/bfc_allocator.cc:991] BFCAllocator dump for GPU_0_bfc

...

2021-12-03 10:23:40.523610: I tensorflow/core/common_runtime/bfc_allocator.cc:1066] Stats: 
Limit:                      6531317760
InUse:                      6508513280
MaxInUse:                   6508513280
NumAllocs:                         487
MaxAllocSize:                184549376
Reserved:                            0
PeakReserved:                        0
LargestFreeBlock:                    0

2021-12-03 10:23:40.523648: W tensorflow/core/common_runtime/bfc_allocator.cc:467] ****************************************************************************************************
2021-12-03 10:23:40.523770: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at bias_op.cc:331 : Resource exhausted: OOM when allocating tensor with shape[11264,16,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
training:   0%|                                                                                                                                         | 0/4783 [00:11<?, ?it/s]
Traceback (most recent call last):
  File "/home/ub18/Open3D-ML/test_tf.py", line 23, in <module>
    pipeline.run_train()
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/pipelines/semantic_segmentation.py", line 250, in run_train
    results = model(inputs, training=True)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1030, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/models/randlanet.py", line 259, in call
    f_encoder_i = self.forward_dilated_res_block(
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/models/randlanet.py", line 225, in forward_dilated_res_block
    f_pc = self.forward_building_block(xyz, f_pc, neigh_idx, name + 'LFA')
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/models/randlanet.py", line 208, in forward_building_block
    f_pc_agg = self.forward_att_pooling(f_concat, name + 'att_pooling_1')
  File "/home/ub18/.local/lib/python3.9/site-packages/open3d/_ml3d/tf/models/randlanet.py", line 173, in forward_att_pooling
    att_activation = m_dense(f_reshaped)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1030, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/keras/layers/core.py", line 1253, in call
    outputs = nn_ops.bias_add(outputs, self.bias)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/ops/nn_ops.py", line 3377, in bias_add
    return gen_nn_ops.bias_add(
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 678, in bias_add
    _ops.raise_from_not_ok_status(e, name)
  File "/home/ub18/anaconda3/envs/o3dmltf/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 6897, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[11264,16,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:BiasAdd]

numahha avatar Dec 03 '21 01:12 numahha