Danylo Lykov comments

Results: 43 comments of Danylo Lykov

Compression

@mkshah5 I'm trying to build sz compression as described in `README_python.md`. I get the following error: ``` DynamicByteArray.c:13:10: fatal error: DynamicByteArray.h: No such file or directory 13 | #include "DynamicByteArray.h"...

Compression

@mkshah5 compression with arbitrary dtype seems to work for small tensors (tested float32, float64, and complex128), but fails on larger tensors. What I do is adjust num_elements to be larger...

Compression

I think the issue is that it doesn't make sense to treat a single float64 as two float32s :) I tested complex64 and it works OK. Let's go with complex64 from now...
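To illustrate the point above, here is a minimal sketch (not the compressor's code) of what happens when the bytes of one float64 are reinterpreted as two float32 values, compared with a complex64, which is two float32s by definition. The function names are hypothetical, and a little-endian byte layout is assumed.

```python
import struct

def float64_as_two_float32(x):
    """Reinterpret the 8 bytes of a float64 as two float32 values (little-endian assumed)."""
    return struct.unpack("<2f", struct.pack("<d", x))

def complex64_as_two_float32(z):
    """Store a complex number as float32 real and imaginary parts, then read them back."""
    return struct.unpack("<2f", struct.pack("<2f", z.real, z.imag))

# The two halves of a float64 are unrelated to the original value.
print(float64_as_two_float32(1.0))                    # (0.0, 1.875)

# The two halves of a complex64 are meaningful numbers in their own right.
print(complex64_as_two_float32(complex(1.5, -2.0)))   # (1.5, -2.0)
```

A lossy float32 compressor that perturbs either half of the float64 pair changes the exponent or mantissa bits of the original value unpredictably, whereas perturbing the real or imaginary part of a complex64 only introduces a bounded error in that component.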

Compression

@mkshah5 I think using complex64 may actually be better. If one is using a lossy-compression simulator, they might tolerate a loss in precision. After all, it's the same precision-memory tradeoff. Additionally,...

Compression

@mkshah5 It seems there is a memory leak when I compress data. Take a look at this test file: https://github.com/danlkv/QTensor/blob/9b787d17c0f877a44ed3e4310cad5519fec0754a/qtensor/compression/tests/test_memory_leak.py If you run `pytest -sk leak` in the `qtensor/compression/tests` folder the...
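The linked test file is not reproduced here. As a generic, host-side illustration of the same idea (call an operation repeatedly and check that memory use does not grow), a minimal sketch using Python's standard `tracemalloc` module might look like the following. `leaked_bytes` and the two demo functions are hypothetical names. Note that `tracemalloc` only sees Python heap allocations, so it would not detect a leak of GPU memory such as the one discussed in this thread.

```python
import tracemalloc

def leaked_bytes(fn, repeats=50, warmup=5):
    """Return the growth in traced Python memory after calling fn `repeats` times."""
    # Warm-up calls let one-time allocations (caches, lazy imports) happen first.
    for _ in range(warmup):
        fn()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(repeats):
        fn()
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before

# Hypothetical stand-ins for a compress call: one keeps every buffer alive,
# the other lets each buffer be freed when the call returns.
_kept = []

def leaky_op():
    _kept.append(bytearray(10_000))

def clean_op():
    bytearray(10_000)

print("leaky growth:", leaked_bytes(leaky_op))
print("clean growth:", leaked_bytes(clean_op))
```

In a pytest-style check, one would assert that the growth stays below some threshold chosen for the workload.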

Compression

@mkshah5 I added some `cudaFree` calls for arrays that were allocated. https://github.com/danlkv/QTensor/blob/e86e30331bb2cbb634da83daf654cfd4156f346a/qtensor/compression/szx/src/cuszx_entry.cu#L620-L624 It seems there are quite a few mallocs going on for each compression. Could this significantly slow down the...

Compression

There's another thing that got my attention: I ran the simulation with `nvprof` and noticed that the `device_post_proc` function takes a lot of time. In the code, it is called with...

Compression

@mkshah5 I've been running the simulations and, unfortunately, at some point they fail with ```CUDA error at cuszx_entry.cu:987 code=700(cudaErrorIllegalAddress) "cudaMalloc(&d_newdata, nbBlocks_h*bs*sizeof(float))"``` Additionally, I noticed that over time the compression ratio...

Compression

> Could compression run asynchronously? If so, we could have the GPU compress some data while there is some tensor operations going on simultaneously, masking compression cost to some degree....

Compression

> When running test_memory_leak, there seems to be consistent CR now. I'll keep looking for memory leaks that may be causing the CUDA error you mentioned here. Are there any...