
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading

92 petals issues

This would help take the server load into account when planning a route for inference and fine-tuning.
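The load-aware routing idea could be sketched as follows. This is a minimal illustration, not Petals' actual routing code: the server records and their `load` field are hypothetical stand-ins for whatever metric (queue depth, throughput) servers would advertise via the DHT.

```python
# Sketch: pick the least-loaded candidate chain of servers.
# Server records and the "load" field are hypothetical; real Petals
# routing works over the DHT and considers block coverage as well.

def route_cost(chain):
    """Total advertised load along a candidate chain of servers."""
    return sum(server["load"] for server in chain)

def pick_route(candidate_chains):
    """Choose the candidate chain with the lowest aggregate load."""
    return min(candidate_chains, key=route_cost)

chains = [
    [{"id": "a", "load": 0.9}, {"id": "b", "load": 0.2}],
    [{"id": "c", "load": 0.3}, {"id": "d", "load": 0.3}],
]
best = pick_route(chains)  # the "c" -> "d" chain (cost 0.6 vs 1.1)
```

A real implementation would also have to balance load against latency and block availability, but the core idea is just a cost term added to route selection.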

We should create a detailed API reference for the public interface to ease development for new users.

documentation

I ran into this when trying to run https://github.com/petals-infra/chat.petals.dev:

```
$ flask run --host=0.0.0.0 --port=5000
Floating point exception (core dumped)
```

But I believe this is an issue with the...

I'm running a 660Ti, so I'm pretty used to it not playing well with other things. If there's something I can do, that would be great. Otherwise, this is just...

As the title says, how can I parallelize this?

```python
def generate_output(row):
    inputs = tokenizer(prompt, return_tensors="pt")["input_ids"]
    outputs = model.generate(inputs, max_new_tokens=185, temperature=0.0, eos_token_id=tokenizer.encode("}")[0])
    result = tokenizer.decode(outputs[0])
    completion = extract_completion(result)
    for index,...
```
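One way to parallelize a per-row generation loop like the one above is a thread pool. The sketch below uses a stub in place of the real tokenizer/model pipeline so it is self-contained; in actual use, `generate_output` would be the function from the snippet, and each thread should use its own inference session rather than sharing one.

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch: map generate_output over many rows concurrently.
# generate_output here is a stub standing in for the
# tokenizer -> model.generate -> decode pipeline above.

def generate_output(row):
    # placeholder for the real generation call
    return f"completion for {row}"

rows = ["row0", "row1", "row2", "row3"]
with ThreadPoolExecutor(max_workers=4) as pool:
    # pool.map preserves input order in its results
    completions = list(pool.map(generate_output, rows))
```

Note that with a distributed client the bottleneck is usually network round-trips, so threads (rather than processes) are typically enough to overlap requests.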

I added my RTX 3080 to the swarm using:

```
conda install pytorch pytorch-cuda=11.7 -c pytorch -c nvidia
pip install git+https://github.com/bigscience-workshop/petals
python -m petals.cli.run_server enoch/llama-65b-hf --adapters timdettmers/guanaco-65b
```

But I still find my...

One of the obstacles to using Petals is the fact that there is no privacy. It would be great to add some features for this. I'm not an expert, but...

I am following this basic tutorial, and I'm wondering how I can save the fine-tuned model and use it later on. For example, in this tutorial, we fine-tune a...
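With parameter-efficient fine-tuning (as in the Petals tutorials), only the small set of trainable parameters needs saving; the frozen remote blocks stay in the swarm. A hedged sketch of that pattern, with a tiny `nn.Linear` standing in for the distributed model:

```python
import os
import tempfile

import torch
from torch import nn

# Sketch: save only the trainable parameters (e.g. prompt embeddings
# or adapter weights). The tiny Linear below is a stand-in for the
# distributed Petals model.

model = nn.Linear(4, 2)
model.weight.requires_grad_(False)  # pretend the backbone is frozen
# model.bias stays trainable, like prompt/adapter weights would

trainable = {n: p for n, p in model.named_parameters() if p.requires_grad}
path = os.path.join(tempfile.mkdtemp(), "trainable.pt")
torch.save(trainable, path)

# Later: rebuild the model and load just the trainable part.
# strict=False lets us load a partial state dict.
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path), strict=False)
```

Saving only the trainable subset keeps checkpoints small and avoids trying to serialize model weights you never held locally in the first place.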

Problem: if some (but not all) servers support a longer sequence length, running inference at that length would be very inefficient, because the client will constantly bump into short-length servers. Suggested...
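One possible direction, sketched under assumed field names: filter out servers whose advertised maximum sequence length cannot cover the request before planning the route, instead of discovering the mismatch mid-inference.

```python
# Sketch: pre-filter servers by advertised max sequence length.
# The server records and "max_seq_len" field are hypothetical.

def usable_servers(servers, seq_len):
    """Keep only servers whose advertised max length covers the request."""
    return [s for s in servers if s["max_seq_len"] >= seq_len]

servers = [
    {"id": "a", "max_seq_len": 2048},
    {"id": "b", "max_seq_len": 8192},
    {"id": "c", "max_seq_len": 4096},
]
long_ctx = usable_servers(servers, 4096)  # only "b" and "c" qualify
```

The trade-off is a smaller candidate pool for long requests, so the filter should probably apply per-request rather than globally.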

How would it run on the [FLUX network](https://runonflux.io/)? They have enterprise-grade hardware, lots of compute power, and much cheaper prices. I don't know about GPUs, but nevertheless, I...