Alex Cheema
- Right now, device capabilities are statically defined. It makes more sense for them to be dynamic, since available resources and utilisation can change over time.
This will require some core changes to how distributed inference works, hence the higher bounty of $500. It would be a great contribution to exo.
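A minimal sketch of what "dynamic" could look like: re-probe capabilities on an interval instead of caching them once at startup. The `DeviceCapabilities` fields, `CapabilityMonitor` class, and refresh interval here are illustrative assumptions, not exo's actual API.

```python
import dataclasses
import os
import shutil
import time


@dataclasses.dataclass
class DeviceCapabilities:
    # Hypothetical capability fields; live figures, not install-time constants.
    disk_free_bytes: int
    cpu_count: int


def probe_capabilities() -> DeviceCapabilities:
    # Read current values each call so topology decisions track real usage.
    return DeviceCapabilities(
        disk_free_bytes=shutil.disk_usage("/").free,
        cpu_count=os.cpu_count() or 1,
    )


class CapabilityMonitor:
    """Refresh capabilities on a fixed interval instead of defining them once."""

    def __init__(self, interval_s: float = 30.0):
        self.interval_s = interval_s
        self._last_probe = time.monotonic()
        self._caps = probe_capabilities()

    def capabilities(self) -> DeviceCapabilities:
        now = time.monotonic()
        if now - self._last_probe >= self.interval_s:
            self._caps = probe_capabilities()
            self._last_probe = now
        return self._caps
```

Callers would ask the monitor rather than a static table, so a node whose disk fills up or whose load spikes is re-weighted on the next probe.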
- It's only used for tokenizers (and the processor for the llava VLM). The tokenizer code is fuzzy and bloated as a result, and this use of AutoTokenizer is hard to understand. Should be...
https://github.com/exo-explore/exo/issues/23#issuecomment-2241521048 Perhaps after each inference, we synchronise the full kv cache between all nodes. This should be fairly straightforward: we can broadcast the entire cache. This would allow for saving...
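A rough sketch of the broadcast step, assuming the kv cache can be treated as a serialisable per-layer mapping. The `Peer.send` transport, `FakePeer`, and the cache layout are all hypothetical stand-ins for exo's real networking layer.

```python
import pickle


def serialize_kv_cache(cache: dict) -> bytes:
    # Flatten the whole per-layer cache into one payload so a single
    # broadcast after inference is enough to bring every node up to date.
    return pickle.dumps(cache)


def broadcast_kv_cache(cache: dict, peers: list) -> None:
    payload = serialize_kv_cache(cache)
    for peer in peers:
        peer.send(payload)  # hypothetical transport call


class FakePeer:
    """In-memory peer used only to illustrate the round-trip."""

    def __init__(self):
        self.received = None

    def send(self, payload: bytes) -> None:
        self.received = pickle.loads(payload)
```

In practice the full cache can be large, so a real implementation would likely want compression or delta updates rather than re-sending everything each turn.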
**Prerequisite:** https://github.com/exo-explore/exo/issues/1 **Motivation:** exo should use device resources as efficiently as possible; the current implementation underutilises available resources. **What:** See https://pytorch.org/docs/stable/pipeline.html **Reward:** $500 bounty, paid out in USDC on Ethereum; email...
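The core idea behind the linked pipeline-parallelism docs can be sketched without any ML framework: split the model into stages and feed micro-batches through them so that while one stage works on micro-batch i, the previous stage is already processing i+1. Threads and queues below stand in for devices, and the stage functions are illustrative assumptions.

```python
import queue
import threading


def run_pipeline(stages, microbatches):
    """Run each micro-batch through every stage, with stages overlapping.

    stages: list of callables, one per pipeline stage ("device").
    microbatches: list of inputs fed through the pipeline in order.
    """
    qs = [queue.Queue() for _ in range(len(stages) + 1)]
    done = object()  # sentinel marking the end of the stream

    def worker(fn, q_in, q_out):
        while True:
            item = q_in.get()
            if item is done:
                q_out.put(done)
                return
            q_out.put(fn(item))

    threads = [
        threading.Thread(target=worker, args=(fn, qs[i], qs[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for mb in microbatches:
        qs[0].put(mb)
    qs[0].put(done)

    out = []
    while True:
        item = qs[-1].get()
        if item is done:
            break
        out.append(item)
    for t in threads:
        t.join()
    return out
```

FIFO queues preserve micro-batch order, and because each stage only waits on its own input queue, every stage stays busy once the pipeline fills; that overlap is the efficiency win the issue is after.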
Right now, FLOPs are displayed using a lookup table. Users are often confused when it shows 0 FLOPs, so we should show an estimate of device FLOPs even if it's not...
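One possible fallback, sketched here as an assumption rather than exo's actual approach: when the lookup table has no entry, time a small matmul and report the measured rate instead of 0. Pure-Python matmul is only a stand-in; a real probe would benchmark the device's tensor library.

```python
import time


def estimate_flops(n: int = 64) -> float:
    # Naive n x n matmul: 2*n**3 floating-point ops (a multiply and an add
    # per inner-product term). Timing it gives a crude FLOPS estimate.
    a = [[1.0] * n for _ in range(n)]
    b = [[1.0] * n for _ in range(n)]
    start = time.perf_counter()
    c = [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]
    elapsed = time.perf_counter() - start
    assert c[0][0] == n  # sanity check: dot product of all-ones rows
    return (2 * n**3) / elapsed


def device_flops(lookup: dict, model: str) -> float:
    # Prefer the lookup table, but never show 0: fall back to the estimate.
    return lookup.get(model) or estimate_flops()
```

Even a rough measured number beats displaying 0 for unrecognised hardware, and the lookup table still wins whenever an entry exists.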
ComfyUI is pretty awesome: https://github.com/comfyanonymous/ComfyUI We've had a request to integrate it. It would be really cool to build and run pipelines across multiple devices.
Right now, `max_generate_tokens` option limits the total number of tokens a given request can return. The desired behaviour is that it should limit the number of tokens on a given...
