Danylo Lykov
@mkshah5 I'm getting some strange compression timings: ``` Compression end timestamp: 6482.277344 ms CUDA Error: no error Compress: Measure: 6.961s, 256.00MB -> 2.22MB (115.254 in/out ratio) CUDA Error: no error...
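For reference, the figures in that log line can be turned into a throughput number with a couple of lines of Python. This is only a sketch: the helper name `throughput_mb_s` is made up for illustration, and the inputs are the rounded values printed in the log (256.00 MB in, 2.22 MB out, 6.961 s).

```python
def throughput_mb_s(size_mb: float, seconds: float) -> float:
    """Uncompressed megabytes processed per second (hypothetical helper)."""
    return size_mb / seconds


# Rounded values copied from the log line above.
in_mb, out_mb, elapsed_s = 256.00, 2.22, 6.961

ratio = in_mb / out_mb                     # compression ratio (in/out)
rate = throughput_mb_s(in_mb, elapsed_s)   # compression throughput

print(f"ratio ~ {ratio:.1f}, throughput ~ {rate:.1f} MB/s")
```

With these numbers the compression rate comes out at roughly 37 MB/s, which is why the timing looks suspicious.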
I pushed a preprocessing [file](https://github.com/danlkv/QTensor/blob/compression/bench/qc_simulation/data/preprocess/qaoa_maxcut/3reg_N256_p1.jsonterms_Otamaki_120_M30). The circuit is pretty large, but you can see the timings early on before it gets too big. To test, go to `bench/qc_simulation` and run...
@mkshah5 Any updates on the compression speed?
@mkshah5 Thanks! I am seeing around 0.1-1 GB/s throughput. Sometimes I see some inconsistency between the Python and C time measurements: ``` CUDA Error: no error GPU decompression timing: 1.644544 ms...
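One possible source of such a mismatch is that GPU work is launched asynchronously, so a host-side timer can stop before or after the device has actually finished. Below is a minimal sketch of a wall-clock timer that accepts an optional synchronization callback. The names `timed`, `results` and `sync` are hypothetical and not part of QTensor or the cuSZx wrapper. With CuPy, one could pass `cupy.cuda.Device().synchronize` as `sync`; that is an assumption about the setup, not something taken from the repository.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results, sync=None):
    """Store the wall-clock duration of the enclosed block in results[label].

    sync is an optional zero-argument callable that blocks until pending
    device work is done; it is called before starting and before stopping
    the clock so that asynchronous work is attributed to the right block.
    """
    if sync is not None:
        sync()  # drain work queued before the block
    start = time.perf_counter()
    try:
        yield
    finally:
        if sync is not None:
            sync()  # wait for work launched inside the block
        results[label] = time.perf_counter() - start


results = {}
with timed("sleep", results):
    time.sleep(0.01)  # stand-in for a compress/decompress call
print(f"sleep took {results['sleep'] * 1e3:.3f} ms")
```

Comparing the time measured this way with the time reported by the C side should show whether the gap comes from missing synchronization or from extra work done in the Python wrapper.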
@mkshah5 I added a smaller instance for local tests. Run from `bench/qc_simulation`: ``` ./main.py simulate ./data/preprocess/qaoa_maxcut/3reg_N52_p3.jsonterms_Otamaki_3_M30 ./data/simulations/qaoa_maxcut/{in_file}_cM{M} --sim qtensor -M 25 --backend=cupy --compress=szx ``` This creates a json file with...
@mkshah5 I can see that decompression times dropped by ~100x, that's great! The compression throughput is now the bottleneck, being about 0.2 GB/s. Is it possible to improve this?
This is the test script I use for benchmarking (it's a bit smaller than the previous one): ``` ./main.py simulate ./data/preprocess/qaoa_maxcut/3reg_N52_p3.jsonterms_Otamaki_3_M30 ./data/simulations/qaoa_maxcut/{in_file}_cM{M} --sim qtensor -M 25,26,27 --backend=cupy --compress=szx ```
@mkshah5 Here's a line profile of cuszx_wrapper compression: ``` Total time: 3.50393 s File: /home/danlkv/qsim/QTensor/qtensor/compression/szx/src/cuszx_wrapper.py Function: cuszx_device_compress at line 71 Line # Hits Time Per Hit % Time Line Contents...
@mkshah5 Why do you have [.get()](https://docs.cupy.dev/en/stable/reference/generated/cupy.ndarray.html#cupy.ndarray.get)? > Returns a copy of the array on host memory. I hope this doesn't mean we copy the data 4 times to host and...
I think the issue is that there seems to be no way to output only the functions I care about (but maybe there's another way?), only the full file. The...