py-videocore6
Results do not match the `examples` output
Dear author,
I am using a Raspberry Pi 4B, and here are my results:
- sgemm.py
pi@node01:~/py-videocore6 $ PYTHONPATH=sandbox/ python3 examples/sgemm.py
==== sgemm example (1024x1024 times 1024x1024) ====
numpy: 0.08994 sec, 23.91 Gflop/s
QPU: 0.5661 sec, 3.799 Gflop/s
Minimum absolute error: 0.0
Maximum absolute error: 0.0003814697265625
Minimum relative error: 0.0
Maximum relative error: 0.13134673237800598
- pctr_gpu_clock.py
pi@node01:~/py-videocore6 $ sudo PYTHONPATH=sandbox/ python3 examples/pctr_gpu_clock.py
==== QPU clock measurement with performance counters ====
500.08264399999996 MHz
- memset.py
pi@node01:~/py-videocore6 $ PYTHONPATH=sandbox/ python3 examples/memset.py
==== memset example (64.0 MiB) ====
Preparing for buffers...
Traceback (most recent call last):
File "/home/pi/py-videocore6/examples/memset.py", line 148, in <module>
main()
File "/home/pi/py-videocore6/examples/memset.py", line 143, in main
memset(fill=0x5a5a5a5a, length=16 * 1024 * 1024)
File "/home/pi/py-videocore6/examples/memset.py", line 119, in memset
X.fill(~fill)
OverflowError: Python integer -1515870811 out of bounds for uint32
- scopy.py
pi@node01:~/py-videocore6 $ PYTHONPATH=sandbox/ python3 examples/scopy.py
==== scopy example (16.0 Mi elements) ====
Preparing for buffers...
Traceback (most recent call last):
File "/home/pi/py-videocore6/examples/scopy.py", line 201, in <module>
main()
File "/home/pi/py-videocore6/examples/scopy.py", line 196, in main
scopy(length=16 * 1024 * 1024)
File "/home/pi/py-videocore6/examples/scopy.py", line 181, in scopy
unif[-1] = 4 * (-len(unif) + 3)
~~~~^^^^
OverflowError: Python integer -8 out of bounds for uint32
- summation.py
pi@node01:~/py-videocore6 $ PYTHONPATH=sandbox/ python3 examples/summation.py
==== summaton example (32.0 Mi elements) ====
Preparing for buffers...
Traceback (most recent call last):
File "/home/pi/py-videocore6/examples/summation.py", line 189, in <module>
main()
File "/home/pi/py-videocore6/examples/summation.py", line 184, in main
summation(length=32 * 1024 * 1024)
File "/home/pi/py-videocore6/examples/summation.py", line 169, in summation
unif[-1] = 4 * (-len(unif) + 3)
~~~~^^^^
OverflowError: Python integer -12 out of bounds for uint32
In summary, only the QPU clock measurement (about 500 MHz) matches the tutorial. The sgemm run on the QPU takes 0.5661 sec, far slower than numpy at 0.08994 sec. In addition, memset.py, scopy.py, and summation.py all abort with an OverflowError while preparing buffers. Could you please share your thoughts on a possible fix?
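The three `OverflowError` tracebacks look consistent with NumPy 2.x, which raises an error instead of silently wrapping when an out-of-range Python integer is stored into a `uint32` array. That is only an assumption here, since the installed NumPy version is not shown in the logs. Under that assumption, a minimal sketch of a possible workaround is to mask the value into the unsigned 32-bit range before storing it. The helper `to_u32` and the array sizes below are hypothetical and are not part of py-videocore6.

```python
import numpy as np

U32_MASK = 0xFFFFFFFF


def to_u32(value: int) -> int:
    """Wrap a Python int into the unsigned 32-bit range (two's complement).

    Hypothetical helper for illustration; not part of py-videocore6.
    """
    return value & U32_MASK


# Same pattern as examples/memset.py line 119: X.fill(~fill)
fill = 0x5A5A5A5A
X = np.zeros(4, dtype=np.uint32)  # small stand-in for the real buffer
X.fill(to_u32(~fill))             # ~fill is -1515870811 as a Python int

# Same pattern as examples/scopy.py line 181: unif[-1] = 4 * (-len(unif) + 3)
unif = np.zeros(5, dtype=np.uint32)      # length 5 reproduces the -8 in the log
unif[-1] = to_u32(4 * (-len(unif) + 3))  # -8 wraps to 0xFFFFFFF8

print(hex(int(X[0])), hex(int(unif[-1])))
```

Masking keeps the same bit pattern that the older wrapping behaviour would have produced, so the values seen by the QPU should be unchanged. Whether the examples are meant to be patched this way is for the author to confirm.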