
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

Results: 92 petals issues (sorted by recently updated)

I'm imagining a situation where people can run inference for others and make a small profit. In practice, it is similar to, but more useful than, cryptocurrency mining. You would get...

The [Grok](https://github.com/xai-org/grok-1) architecture and weights were just released. Does Petals support Grok and MoE models, or is support planned? Having a first-in-class 314B-parameter model running...
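
For context on why MoE support is a distinct feature request: a mixture-of-experts layer routes each token through only a few expert feed-forward networks chosen by a gating network, yet every expert's weights must stay resident in memory, which changes how blocks would be sharded across servers. Below is a minimal, illustrative sketch of top-k routing; the sizes and `top_k` value are arbitrary, and this is not Grok-1's actual layer.

```python
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    """Illustrative mixture-of-experts feed-forward layer (top-k routing)."""
    def __init__(self, hidden: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        # All experts must be held in memory even though only top_k run per token.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, 4 * hidden), nn.GELU(),
                          nn.Linear(4 * hidden, hidden))
            for _ in range(num_experts)
        )
        self.router = nn.Linear(hidden, num_experts)  # gating network
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, hidden)
        scores = self.router(x).softmax(dim=-1)          # (tokens, num_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # pick top_k experts per token
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e                    # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, k, None] * expert(x[mask])
        return out
```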

Fixes issue #536; edits calls to mask methods to be consistent with https://github.com/huggingface/transformers/pull/27086

An install script to help with project deployment and ease of use. At the moment it only covers the Windows WSL NVIDIA configuration, but feel free to put it in a folder with...

#331 introduced a bug during inference retries that caused this:

```
[INFO] Route found: 0:18 via …1EBzGt
[WARN] [petals.client.inference_session.step:327] Caught exception when running inference via RemoteSpanInfo(peer_id=, start=0, end=18, server_info=ServerInfo(state=, throughput=1040.823002928876,...
```
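
For readers unfamiliar with the code path under discussion, client-side retries broadly follow this shape: when a remote span fails mid-inference, the client rebuilds the route and retries. This is a hypothetical sketch under assumed names, not Petals' actual implementation; `find_route` and `run_span` are invented placeholders.

```python
import logging
import time

logger = logging.getLogger(__name__)

def run_with_retries(session, inputs, max_attempts=3):
    """Hypothetical sketch of span-level retries (placeholder APIs)."""
    for attempt in range(max_attempts):
        try:
            route = session.find_route()             # placeholder, not a real Petals API
            return session.run_span(route, inputs)   # placeholder, not a real Petals API
        except Exception as exc:
            logger.warning("Caught exception when running inference: %r", exc)
            time.sleep(2 ** attempt)  # back off before rebuilding the route
    raise RuntimeError(f"inference failed after {max_attempts} attempts")
```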

Revert #251, since it's not needed after #311. This may improve fine-tuning efficiency for medium-sized batches.

**TODO:**

- [ ] Test it with increasingly larger batches. Watch that we switch...

This PR achieves a ~7% improvement in inference throughput (measured on a single A100-80GB) by merging the query/key/value projections into a single large matrix multiplication. This reduces the overhead...
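
A minimal sketch of the fused-QKV idea described above (illustrative only, not the PR's actual code): three separate hidden-to-hidden projections are replaced with one projection to three times the width, so the GPU launches a single large matmul kernel instead of three smaller ones, and the result is split back into query, key, and value.

```python
import torch
import torch.nn as nn

class FusedQKV(nn.Module):
    """Illustrative fused query/key/value projection."""
    def __init__(self, hidden_size: int):
        super().__init__()
        # One (3 * hidden, hidden) weight matrix replaces three
        # (hidden, hidden) matrices, amortizing kernel-launch overhead.
        self.qkv_proj = nn.Linear(hidden_size, 3 * hidden_size, bias=False)

    def forward(self, hidden_states: torch.Tensor):
        qkv = self.qkv_proj(hidden_states)        # (batch, seq, 3 * hidden)
        query, key, value = qkv.chunk(3, dim=-1)  # three (batch, seq, hidden) tensors
        return query, key, value
```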