spikingjelly
I am getting the following NotImplementedError with spikingjelly
I am using spikingjelly 0.0.0.0.14. I am not sure why I am getting this error:
```
Comet is not installed, Comet logger will not be available.
2023-09-28 20:09:07.520963: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-09-28 20:09:07.546317: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-09-28 20:09:08.020003: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Namespace(dataset='gen1', path='../atis_data/atis/train_a/', num_classes=2, b=48, sample_size=100000, T=5, tbin=2, image_shape=(240, 304), epochs=50, lr=0.001, wd=0.0001, num_workers=4, train=True, test=False, device=0, precision=16, save_ckpt=True, comet_api=None, model='vgg-11', bn=True, pretrained_backbone=None, pretrained=None, extras=[640, 320, 320], min_ratio=0.05, max_ratio=0.8, aspect_ratios=[[2], [2, 3], [2, 3], [2, 3], [2], [2]], box_coder_weights=[10.0, 10.0, 5.0, 5.0], iou_threshold=0.5, score_thresh=0.01, nms_thresh=0.45, topk_candidates=200, detections_per_img=100)
[256, 512, 512, 640, 320, 320]
/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/lightning_fabric/connector.py:554: UserWarning: 16 is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!
  rank_zero_warn(
/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:508: UserWarning: You passed `Trainer(accelerator='cpu', precision='16-mixed')` but AMP with fp16 is not supported on CPU. Using `precision='bf16-mixed'` instead.
  rank_zero_warn(
Using bfloat16 Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/pytorch_lightning/trainer/setup.py:176: PossibleUserWarning: GPU available but not used. Set `accelerator` and `devices` using `Trainer(accelerator='gpu', devices=1)`.
  rank_zero_warn(
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used.
File loaded.
File loaded.
Number of parameters: 12652695

  | Name             | Type                        | Params
-----------------------------------------------------------------
0 | backbone         | DetectionBackbone           | 11.9 M
1 | anchor_generator | GridSizeDefaultBoxGenerator | 0
2 | head             | SSDHead                     | 742 K
-----------------------------------------------------------------
12.7 M    Trainable params
0         Non-trainable params
12.7 M    Total params
50.611    Total estimated model params size (MB)
Sanity Checking DataLoader 0: 0%| | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ubinet_admin/Documents/argha_work/object-detection-with-spiking-neural-networks/object_detection.py", line 140, in <module>
```
Do you use a TPU? CuPy cannot work on TPU.
No. I am using CPU.
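For what it's worth, the CuPy neuron backend also requires a CUDA GPU, so it cannot run on CPU either. Below is a minimal sketch, assuming the model uses spikingjelly's activation-based neurons (`net` is a hypothetical variable standing in for the SNN built in object_detection.py), of switching any CuPy-backed neurons to the pure-PyTorch backend:

```python
from spikingjelly.activation_based import neuron

# Hypothetical: `net` is the SNN model used for training.
# Neurons constructed with backend='cupy' need CUDA; the 'torch'
# backend runs on any device.
for m in net.modules():
    if isinstance(m, neuron.BaseNode) and getattr(m, 'backend', None) == 'cupy':
        m.backend = 'torch'
```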
> AMP with fp16 is not supported on CPU
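That warning means fp16 mixed precision needs CUDA; on CPU, Lightning falls back to bf16. A minimal sketch of both options using the PyTorch Lightning Trainer API (the surrounding setup is assumed, not taken from the actual training script):

```python
import pytorch_lightning as pl

# fp16 mixed precision only works on CUDA devices.
trainer = pl.Trainer(accelerator='gpu', devices=1, precision='16-mixed')

# CPU-only alternative: bf16 mixed precision is supported on CPU.
# trainer = pl.Trainer(accelerator='cpu', precision='bf16-mixed')
```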
Okay, I tried the GPU, but now I am getting the following `OSError: [Errno 24] Too many open files`.
Here it is.
```
Comet is not installed, Comet logger will not be available.
2023-10-11 10:29:25.291527: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-10-11 10:29:25.312584: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-10-11 10:29:25.698530: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Namespace(dataset='gen1', path='../atis_data/atis/train_a', num_classes=2, b=8, sample_size=100000, T=5, tbin=2, image_shape=(240, 304), epochs=50, lr=0.001, wd=0.0001, num_workers=4, train=True, test=False, device=0, precision=16, save_ckpt=True, comet_api=None, model='vgg-11', bn=True, pretrained_backbone=None, pretrained=None, extras=[640, 320, 320], min_ratio=0.05, max_ratio=0.8, aspect_ratios=[[2], [2, 3], [2, 3], [2, 3], [2], [2]], box_coder_weights=[10.0, 10.0, 5.0, 5.0], iou_threshold=0.5, score_thresh=0.01, nms_thresh=0.45, topk_candidates=200, detections_per_img=100)
[256, 512, 512, 640, 320, 320]
/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/lightning_fabric/connector.py:554: UserWarning: 16 is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!
  rank_zero_warn(
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
`Trainer(limit_train_batches=1.0)` was configured so 100% of the batches per epoch will be used.
File loaded.
File loaded.
You are using a CUDA device ('NVIDIA RTX A2000 12GB') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Number of parameters: 12652695
  | Name             | Type                        | Params
-----------------------------------------------------------------
0 | backbone         | DetectionBackbone           | 11.9 M
1 | anchor_generator | GridSizeDefaultBoxGenerator | 0
2 | head             | SSDHead                     | 742 K
-----------------------------------------------------------------
12.7 M    Trainable params
0         Non-trainable params
12.7 M    Total params
50.611    Total estimated model params size (MB)
Sanity Checking DataLoader 0: 0%| | 0/2 [00:00<?, ?it/s]
/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/pytorch_lightning/utilities/data.py:76: UserWarning: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 8. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`.
  warning_cache.warn(
Sanity Checking DataLoader 0: 100%|████████████████████████████████████████████████████| 2/2 [00:00<00:00, 3.36it/s]
[0] val results:
creating index...
index created!
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type bbox
DONE (t=0.05s).
Accumulating evaluation results...
DONE (t=0.00s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Epoch 0: 13%|▏| 225/1682 [01:09<07:32, 3.22it/s, v_num=76, train_loss_bbox_step=2.900, train_loss_classif_step=0.83
Traceback (most recent call last):
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/queues.py", line 244, in _feed
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 51, in dumps
  File "/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/torch/multiprocessing/reductions.py", line 370, in reduce_storage
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Epoch 0: 13%|▏| 226/1682 [01:10<07:31, 3.22it/s, v_num=76, train_loss_bbox_step=4.100, train_loss_classif_step=0.77
Traceback (most recent call last):
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/queues.py", line 244, in _feed
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 51, in dumps
  File "/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/torch/multiprocessing/reductions.py", line 370, in reduce_storage
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Epoch 0: 13%|▏| 227/1682 [01:10<07:31, 3.22it/s, v_num=76, train_loss_bbox_step=2.960, train_loss_classif_step=0.70
Traceback (most recent call last):
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/queues.py", line 244, in _feed
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 51, in dumps
  File "/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/torch/multiprocessing/reductions.py", line 370, in reduce_storage
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Epoch 0: 14%|▏| 228/1682 [01:10<07:31, 3.22it/s, v_num=76, train_loss_bbox_step=3.440, train_loss_classif_step=0.70
Traceback (most recent call last):
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/queues.py", line 244, in _feed
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 51, in dumps
  File "/home/ubinet_admin/anaconda3/lib/python3.11/site-packages/torch/multiprocessing/reductions.py", line 370, in reduce_storage
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 198, in DupFd
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/resource_sharer.py", line 48, in __init__
OSError: [Errno 24] Too many open files
Epoch 0: 14%|▏| 229/1682 [01:11<07:30, 3.22it/s, v_num=76, train_loss_bbox_step=3.610, train_loss_classif_step=0.85
Traceback (most recent call last):
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/resource_sharer.py", line 145, in _serve
    send(conn, destination_pid)
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/resource_sharer.py", line 50, in send
    reduction.send_handle(conn, new_fd, pid)
  File "/home/ubinet_admin/anaconda3/lib/python3.11/multiprocessing/reduction.py", line 183, in send_handle
    with socket.fromfd(conn.fileno(), socket.AF_UNIX, socket.SOCK_STREAM) as s:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubinet_admin/anaconda3/lib/python3.11/socket.py", line 546, in fromfd
    nfd = dup(fd)
          ^^^^^^^
OSError: [Errno 24] Too many open files
Traceback (most recent call last):
  File "/home/ubinet_admin/Documents/argha_work/object-detection-with-spiking-neural-networks/object_detection.py", line 140, in <module>
```
Hi, can you provide minimal code to reproduce the errors?