Vedant Roy
I wrote a helper that lets you use cuDNN attention within PyTorch seamlessly. ```python import cudnn import torch import math # export CUDNN_FRONTEND_LOG_FILE=fe.log # export CUDNN_FRONTEND_LOG_INFO=1 # import os...
Right now, my k/v vectors are padded because my sequences have different lengths. Is performance better with ragged tensors (non-padded key/value vectors)?
I'd like to understand what things my model is caching during the forward pass for the backward pass. Ideally sorted, so I can see the biggest memory-hogs. Do you have...
I'm running my code with: ``` env CUDNN_LOGERR_DBG=1 CUDNN_LOGDEST_DBG=stderr torchrun --standalone --nproc_per_node=8 -m extra_scripts.model_playground_train ``` and getting errors like: ``` [rank5]: RuntimeError: /home/ved/TransformerEngine/transformer_engine/common/fused_attn/fused_attn_f16_arbitrary_seqlen.cu:358 in function fused_attn_arbitrary_seqlen_fwd_impl: cuDNN Error: execute(handle, plan->get_raw_desc(),...
Does this support it/sec or sec/it (it = iteration), similar to the Python tqdm package? Would be useful to calculate things like throughput.
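To make the request concrete, here is a minimal sketch of the two rate formats in plain Python. The helper name `format_rate` is hypothetical and not part of any library mentioned here; it assumes the convention of showing iterations per second when the loop is fast and seconds per iteration when it is slower than one iteration per second.

```python
def format_rate(iterations: int, elapsed_s: float) -> str:
    """Format throughput as it/s, or as s/it when slower than one iteration per second.

    Hypothetical helper for illustration only.
    """
    # Without at least one iteration and some elapsed time, no rate can be computed.
    if iterations <= 0 or elapsed_s <= 0:
        return "?it/s"
    rate = iterations / elapsed_s  # iterations per second
    if rate >= 1:
        return f"{rate:.2f}it/s"
    # Slow loops read better as the inverse: seconds per iteration.
    return f"{1 / rate:.2f}s/it"


fast = format_rate(50, 10.0)  # 50 iterations in 10 seconds
slow = format_rate(2, 10.0)   # 2 iterations in 10 seconds
```

Multiplying the it/s figure by the batch size gives samples per second, which is usually the throughput number of interest.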
**Version**: 5.1.0. **Platform**: `python:3.10-slim` Docker image. **Description**: I can connect if I manually specify the hostname: ``` self.redis = aioredis.Redis( host='my-upstash-subdomain.upstash.io', port=6379, password='****', ssl=True, decode_responses=False ) ``` But I get...
I see there's a command, but I have no idea how to use it. It says my torrent's meta-info has no tracker information. Can you give an example CLI command?
### What version of Bun is running? 1.1.38+bf2f153f5 ### What platform is your computer? Linux 6.8.0-36-generic x86_64 x86_64 ### What steps can reproduce the bug? `bunx prisma generate` fails (infinite...
I have the following code in my training loop: ``` if rank == 0: t = Thread( target=save_file, args=(model_sd, f"{cfg.model_dir}/model_{step + 1}.safetensors"), daemon=True ) t.start() ``` Which saves the checkpoint...
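The background-save pattern above can be sketched with the standard library alone. This is a sketch under stated assumptions, not the code from the post: `save_file` here is a hypothetical stand-in that writes JSON instead of safetensors, and `save_in_background` is an illustrative name. It shows two things that matter with this pattern: snapshotting the state before handing it to the thread, and joining the thread before the process exits, since a daemon thread is not waited for at interpreter shutdown.

```python
import copy
import json
import tempfile
import threading
from pathlib import Path


def save_file(state: dict, path: str) -> None:
    # Hypothetical stand-in for the real save function (e.g. a safetensors writer).
    Path(path).write_text(json.dumps(state))


def save_in_background(state: dict, path: str) -> threading.Thread:
    # Copy first so later in-place updates by the training loop
    # cannot change what the thread writes.
    snapshot = copy.deepcopy(state)
    t = threading.Thread(target=save_file, args=(snapshot, path), daemon=True)
    t.start()
    return t


model_sd = {"weight": [1.0, 2.0]}
out_path = str(Path(tempfile.mkdtemp()) / "model_1.json")

t = save_in_background(model_sd, out_path)
model_sd["weight"][0] = 99.0  # simulated next training step mutating the live state

# Daemon threads are abandoned at interpreter exit, so join before exiting
# to avoid a truncated checkpoint.
t.join()
saved = json.loads(Path(out_path).read_text())
```

With real tensors the snapshot step would typically be a copy to CPU memory rather than `copy.deepcopy`, but the ordering (copy, start thread, join before exit) is the same.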
I am trying to launch an ECS cluster that is backed by an auto-scaling group that uses spot instances. To do this, I use the following code: ```typescript const projectName...