QuantumClifford.jl
CUDA.@sync for timing
Document that CUDA.@sync is necessary for timing. I removed all unnecessary synchronizations to improve performance. As a result, when apply! is used on the GPU, the call returns before the result has been computed, because the work runs asynchronously. Timing a bare apply! call therefore measures only the launch, not the computation. (However, everything is synchronized when users copy the data back to the CPU to see the result.)
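A possible snippet for the docs, as a rough sketch only: it assumes a CUDA-capable device, that to_gpu is the helper used to move a stabilizer to the GPU, and that apply! accepts a small symbolic gate such as sHadamard on GPU-backed data. The system size and the gate are arbitrary choices for illustration.

```julia
using CUDA
using QuantumClifford

# Assumption: to_gpu moves the tableau to GPU memory.
s_gpu = to_gpu(random_stabilizer(1000))
gate = sHadamard(1)  # arbitrary example gate

# Warm-up call so that compilation time is not included in the measurements.
apply!(s_gpu, gate)
CUDA.synchronize()

# Without synchronization this mostly measures the kernel launch,
# because apply! returns before the GPU has finished.
t_launch = @elapsed apply!(s_gpu, gate)
CUDA.synchronize()  # drain pending work before the next measurement

# With CUDA.@sync the timer stops only after the GPU work is done.
t_total = @elapsed CUDA.@sync apply!(s_gpu, gate)

println("launch only: ", t_launch, " s")
println("synchronized: ", t_total, " s")
```

The same applies to benchmarking macros: the expression being timed should be wrapped in CUDA.@sync, otherwise the reported numbers reflect launch overhead rather than compute time.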