joshpopelka20

44 comments by joshpopelka20

How do you manually change these Lambda functions: "amplify-login-(verify/create/custom/define)-(ID)"? I don't see any CloudFormation templates for them.

Any updates? I'm also looking to have this feature.

I have a similar use case, where I need to shard a large model (gradient.ai llama3 262K context) across multiple GPUs. It looks like PyTorch has "fully sharded data parallel" [https://pytorch.org/blog/introducing-pytorch-fully-sharded-data-parallel-api/...

Not sure if I'm doing this right, but this is the code I have so far:
```
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
predictions = outputs.logits
print(PostProcessPicker.get_threshold_max(predictions, 1.8982457699258832e-06))
```
...

I'm not understanding this piece of code: ``` # Case 2: Get the predictions - where we also pass a labels list(that can be used to ignore predictions at certain...

The decode method seems to require the labels list. I've tried to create a labels list with the same shape as the predictions tensor, but I'm getting a different error. Code:...

Did you add the HuggingFace Token? I got the same error `RequestError(Status(401, Response[status: 401, status_text: Unauthorized, url: https://huggingface.co/api/models/revision/main]))` until I added the token. Here are the ways you can add...

I've been researching the algorithm further, and I think I'm going to have a problem implementing this in Rust. To start, based on my understanding, I'd need to split the...

I've been trying to implement this algorithm from the paper [https://arxiv.org/pdf/2310.01889](https://arxiv.org/pdf/2310.01889), and it really isn't working. ![image](https://github.com/user-attachments/assets/09ef9903-511b-4cc0-b378-9507d50a572f) The KV cache isn't being split, so that's a big problem, but I'm...

Just adding a little more info. I think the biggest problem I'm facing is that the KV cache needs to cycle between GPUs. I'm trying to do this with a...
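
To make the "KV cache cycles between GPUs" idea concrete, here is a rough single-process sketch in plain Python. It is not the paper's code and not what I'm running in Rust. It assumes each "device" keeps its own query block, starts with one key/value block, and passes that block to the next device after every step, combining partial results with a running (online) softmax. The names `full_attention`, `ring_attention`, and the `held` list are made up for illustration, there is no causal mask, and the real thing would overlap the transfer with compute.

```python
import math


def full_attention(q, k, v):
    """Reference: softmax(q k^T / sqrt(d)) v, with q, k, v as lists of row vectors."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out


def ring_attention(q_blocks, k_blocks, v_blocks):
    """Simulate n devices: device i owns q_blocks[i] and starts with KV block i."""
    n = len(q_blocks)
    d = len(q_blocks[0][0])
    dv = len(v_blocks[0][0])
    # Running softmax state per query row: max score, denominator, weighted sum.
    state = [[{"m": -math.inf, "l": 0.0, "acc": [0.0] * dv} for _ in qb]
             for qb in q_blocks]
    held = list(zip(k_blocks, v_blocks))  # held[i] = KV block currently on device i
    for _ in range(n):
        for dev in range(n):
            k_blk, v_blk = held[dev]
            for qi, st in zip(q_blocks[dev], state[dev]):
                for kj, vj in zip(k_blk, v_blk):
                    s = sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
                    m_new = max(st["m"], s)
                    scale = math.exp(st["m"] - m_new)  # rescale earlier sums
                    w = math.exp(s - m_new)
                    st["l"] = st["l"] * scale + w
                    st["acc"] = [a * scale + w * x for a, x in zip(st["acc"], vj)]
                    st["m"] = m_new
        # Ring step: every device hands its KV block to the next device.
        held = [held[(dev - 1) % n] for dev in range(n)]
    # Normalize the accumulated sums to get the attention output per query row.
    return [[[a / st["l"] for a in st["acc"]] for st in dev_state]
            for dev_state in state]
```

After `n` ring steps every query block has seen every KV block exactly once, so concatenating the per-device outputs should match `full_attention` on the unsplit inputs up to floating-point error. The point of the running max and denominator is that no device ever needs the whole KV cache in memory at once.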