llm-inference topics

Tensor parallelism is all you need. Run LLMs on an AI cluster at home using any device. Distribute the workload, divide RAM usage, and increase inference speed.

b4rtaz

distributed-computing

distributed-llm

llama2

llm

LLM-FineTuning-Large-Language-Models

438

Stars

108

Forks

LLM (Large Language Model) fine-tuning

rohan-paul

gpt-3

gpt3-turbo

large-language-models

llama2

embedding_studio

377

Stars

5

Forks

Embedding Studio is a framework that lets you transform your vector database into a feature-rich search engine.

EulerSearch

embeddings

embeddings-similarity

fine-tuning

llm-inference

LLMtuner

228

Stars

14

Forks

Fine-tune LLMs in a few lines of code (Text2Text, Text2Speech, Speech2Text)

promptslab

fine-tuning

fine-tuning-llm

finetune

finetune-gpt

lmdeploy

4.5k

Stars

404

Forks

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

InternLM

codellama

cuda-kernels

deepspeed

fastertransformer

LeanCopilot

957

Stars

83

Forks

LLMs as Copilots for Theorem Proving in Lean

lean-dojo

formal-mathematics

lean

lean4

llm-inference