Kunwar Raj Singh
The results should now exactly match the reference. The ResNet used in the reference implementation had strides in the first layer of the bottleneck, which caused the difference in outputs. Now I'm...
@geohot getting Box AP 0.374 and Mask AP 0.342, so close now!
@geohot Added the model to eval: `MODEL=mrcnn python examples/mlperf/model_eval.py`. So we are short of the MLPerf requirement in bbox by only 0.001 pts. I strongly believe such a small difference is because of...
I started facing the issue of kernels having too many args after I removed numpy from hot paths. For now I used the fix in https://github.com/geohot/tinygrad/issues/953, which isn't fully correct...
Updates: 1. RoI Align is now implemented, but it still has to use numpy for gathers, because the gathers are on huge tensors. A single forward pass needs to do all...
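To illustrate what a numpy gather in RoI Align looks like, here is a minimal sketch of bilinear sampling built from four indexed lookups. It is not the code from this PR: the function name `bilinear_gather`, the `(C, H, W)` layout, and the clamping at the borders are all assumptions made for the example.

```python
import numpy as np

def bilinear_gather(feat, ys, xs):
    """Hypothetical helper: sample feat (C, H, W) at fractional points.

    ys, xs are float arrays of shape (N,); the result has shape (C, N).
    """
    H, W = feat.shape[-2:]
    # Clamp sample points to the feature map (an assumption of this sketch).
    ys = np.clip(ys, 0, H - 1)
    xs = np.clip(xs, 0, W - 1)
    # Integer corners surrounding each sample point.
    y0 = np.floor(ys).astype(np.int64)
    x0 = np.floor(xs).astype(np.int64)
    y1 = np.minimum(y0 + 1, H - 1)
    x1 = np.minimum(x0 + 1, W - 1)
    # Interpolation weights.
    ly, lx = ys - y0, xs - x0
    hy, hx = 1.0 - ly, 1.0 - lx
    # Four gathers via numpy fancy indexing, one per corner.
    return (feat[:, y0, x0] * hy * hx + feat[:, y0, x1] * hy * lx
            + feat[:, y1, x0] * ly * hx + feat[:, y1, x1] * ly * lx)

feat = np.arange(4, dtype=np.float64).reshape(1, 2, 2)
samples = bilinear_gather(feat, np.array([0.5, 0.0]), np.array([0.5, 1.0]))
```

Each of the four indexed reads is a gather over the whole feature map, which is why this step gets expensive when the tensors are large.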
The master merge seems to have broken something; the model isn't working now. Looking into it. EDIT: fixed now.
@geohot, latest results: `GPU=1 OPT=1 MODEL=mrcnn python examples/mlperf/model_eval.py` Inference on 5k images ran in 9 hours 59 mins with OpenCL on an Nvidia RTX 3060 Mobile. bbox ``` Average Precision (AP)...
> Is this ready for me to test? Will run on 7900XTX and confirm it meets the target Yes @geohot, a run on the 7900XTX should take around 3-4 hours `GPU=1...
> Made it to: 3%|████ | 136/5000 [1:35:07 > and got > > pyopencl._cl.MemoryError: create_buffer failed: MEM_OBJECT_ALLOCATION_FAILURE > > 7900XTX with 24GB of VRAM So I’ve been using OPT=1...
`433 arguments with a total size of 3464 bytes` means each arg is 8 bytes (3464 / 433 = 8), and I found the number of args to be `counter + 1` in my case,...
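The arithmetic behind that error message can be checked in a couple of lines. The variable names are illustrative only, and the `counter` value is simply what the `counter + 1` relation above would imply for this one error:

```python
# Figures taken from the quoted error message.
num_args = 433
total_bytes = 3464

# 8 bytes per argument, with nothing left over.
bytes_per_arg, remainder = divmod(total_bytes, num_args)

# If the number of args is counter + 1, this error corresponds to:
counter = num_args - 1

print(bytes_per_arg, remainder, counter)
```

An even 8 bytes per argument is consistent with every argument being passed as a 64-bit value.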