Kunwar Raj Singh comments

Results: 47 comments of Kunwar Raj Singh

MaskRCNN Inference

The result should now exactly match the reference. The ResNet used in the reference implementation had strides in the first layer of the bottleneck, causing the difference in outputs. Now I'm...

MaskRCNN Inference

@geohot getting Box AP 0.374 and Mask AP 0.342, so close now!

@geohot Added the model to eval: `MODEL=mrcnn python examples/mlperf/model_eval.py`. So we are short of the MLPerf requirement in bbox by only 0.001 pts. I strongly believe such a small difference is because of...

MaskRCNN Inference

I started hitting the issue of kernels having too many args after I removed numpy from the hot paths. For now I used the fix in https://github.com/geohot/tinygrad/issues/953, which isn't fully correct...

MaskRCNN Inference

Updates: 1. RoI Align is now implemented, but it still has to use numpy for gathers, because the gathers are on huge tensors. A single forward pass needs to do all...

MaskRCNN Inference

The master merge seems to have broken something; the model isn't working now. Looking into it. EDIT: fixed now.

MaskRCNN Inference

@geohot, Latest results `GPU=1 OPT=1 MODEL=mrcnn python examples/mlperf/model_eval.py` Inference on 5k images ran in 9 hours 59 mins with OPENCL and Nvidia RTX 3060 Mobile bbox ``` Average Precision (AP)...
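As a rough sanity check on the timing quoted above, the per-image cost can be worked out from the two stated figures. This is only a sketch: it assumes "5k images" means exactly 5000 and that the rate was roughly constant over the run, neither of which the comment states outright.

```python
# Figures taken from the comment: 5k images in 9 hours 59 mins.
# Assumption: "5k" is exactly 5000 images.
images = 5000
elapsed_s = 9 * 3600 + 59 * 60  # 9 h 59 min expressed in seconds

# Average wall-clock time per image, assuming a roughly constant rate.
sec_per_image = elapsed_s / images

print(f"{elapsed_s} s total, {sec_per_image:.3f} s per image")
```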

MaskRCNN Inference

> Is this ready for me to test? Will run on 7900XTX and confirm it meets the target

Yes @geohot, the run on 7900XTX should take around 3-4 hours `GPU=1...

MaskRCNN Inference

> Made it to: 3%|████ | 136/5000 [1:35:07
> and got
>
> pyopencl._cl.MemoryError: create_buffer failed: MEM_OBJECT_ALLOCATION_FAILURE
>
> 7900XTX with 24GB of VRAM

So I’ve been using OPT=1...

Limit the number of ops which can be evaluated lazily

`433 arguments with a total size of 3464 bytes` means each arg is 8 bytes, and I found the no. of args to be `counter + 1` in my case,...
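The arithmetic behind the "8 bytes per arg" reading can be checked directly from the two numbers in the quoted message. The sketch below is illustrative only: the variable names are made up here, and the last line assumes the `counter + 1` relationship from the comment applies to this same message, which the truncated text does not confirm.

```python
# Numbers copied from the quoted error message.
num_args = 433
total_bytes = 3464

# If every argument has the same size, the total divides evenly.
bytes_per_arg, remainder = divmod(total_bytes, num_args)
print(bytes_per_arg, remainder)

# Assumption: the comment's "no. of args = counter + 1" holds for this message,
# so the (otherwise unspecified) counter would be one less than the arg count.
counter = num_args - 1
print(counter)
```

An even split of 8 bytes would be consistent with each argument being a 64-bit pointer, though the comment itself does not say what the arguments are.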