aur61

Results 3 issues of aur61

Fixes https://github.com/pytorch/tutorials/issues/2687

## Description
1. Fixed the unpack error
2. Added code for transferring data to the GPU

## Checklist
- [x] The issue that is being fixed is referred...

cla signed
C++
CUDA

### Add Link
https://pytorch.org/tutorials/advanced/cpp_extension.html

### Describe the bug
Code to reproduce the issue.

```python
import time
import torch

batch_size = 16
input_features = 32
state_size = 128
# Check if...
```

bug

Great job, starred! I do have a few questions:

1. Did you test the e2e generation speed, specifically in terms of tokens/second or the latency of the first token?
2. ...