Ben Gorlick
Because there is no horizontal scroll bar on this page (https://test.elementsproject.org/elements/confidential-transactions/), the TX code runs off the edge of the screen in the browser (Chrome).
Clarified the instructions for installing cuDNN and the front-end headers. Added cuDNN installation steps for Arch Linux.
Optimization in matmul_backward_bias_kernel: a measurable 7-8% speedup in the kernel in isolated benchmarking.

## Summary
Optimizes the matmul_backward_bias_kernel, resulting in an average speedup of 7-8%.

## Changes include:
reduced cast...
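For context, a bias-gradient kernel of this kind typically sums the upstream gradient over the batch and time dimensions for each output channel. The sketch below is a plain CPU reference in C++ for that reduction, not the code from the change itself. The function name `matmul_backward_bias_reference`, the row-major `(BT, OC)` layout, and the accumulate-into-`dbias` convention are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not the CUDA kernel from the change).
// Assumes dout is row-major with shape (BT, OC) and dbias has OC entries.
// Assumes gradients are accumulated (+=) rather than overwritten.
void matmul_backward_bias_reference(std::vector<float>& dbias,
                                    const std::vector<float>& dout,
                                    std::size_t BT, std::size_t OC) {
    assert(dout.size() == BT * OC);
    assert(dbias.size() == OC);
    for (std::size_t o = 0; o < OC; ++o) {
        // Sum column o over all BT rows; double reduces rounding error.
        double sum = 0.0;
        for (std::size_t bt = 0; bt < BT; ++bt) {
            sum += dout[bt * OC + o];
        }
        dbias[o] += static_cast<float>(sum);
    }
}
```

A reference like this is mainly useful for checking an optimized GPU kernel's output against a known-good result on small inputs.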
In the forward pass in gpt2_train.cu:
- added CUDA streams with events for asynchronous layered operations
- added offset precalculations and cache prefetching for efficient data access with high temporal...
The gradient norm calculation is improved by:
- Reusing variables (ShardInfo tensor and ShardInfo shard) to reduce redundancy and enhance readability.
- Introducing is_first_pass flag to clearly determine the first...
qitari - Rust tool to streamline setup, config and usage of the Tari toolchain (starting with Tor).

Description
---
Currently implements the complete Tor setup and more, with cli -h qitari...