The async variant of NR works, but its read throughput is still much lower than the non-async variant (a 30% drop). This is likely due to the allocations the async path needs.