✨ The Open-Source Serverless GPU Container Runtime ✨

Beta9

Beta9 makes it easy for developers to run serverless functions on cloud GPUs.

Features:

Run Python functions on thousands of GPUs in the cloud
Automatically scale up and scale down resources
Flexible: run workloads on the public cloud or your own hardware
Built for AI: store model weights in distributed storage and deploy custom models with ultra-fast, serverless cold starts

We use Beta9 internally at Beam to run AI applications for users at scale.

Use-Cases

Serverless Inference Endpoints

Decorate Any Python Function

from beta9 import Image, endpoint


@endpoint(
    cpu=1,
    memory="16Gi",
    gpu="T4",
    image=Image(
        python_packages=[
            "vllm==0.4.1",
        ],  # These dependencies will be installed in your remote container
    ),
)
def predict():
    from vllm import LLM

    prompts = ["The future of AI is"]
    llm = LLM(model="facebook/opt-125m")
    output = llm.generate(prompts)[0]

    return {"prediction": output.outputs[0].text}

Deploy It to the Cloud

$ beta9 deploy app.py:predict --name llm-inference

=> Building image
=> Using cached image
=> Deploying endpoint
=> Deployed 🎉
=> Invocation details

curl -X POST 'https://app.beam.cloud/endpoint/llm-inference/v1' \
-H 'Authorization: Bearer [YOUR_AUTH_TOKEN]' \
-d '{}'
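
If you would rather call the endpoint from Python than from curl, the request can be sketched with the standard library. The URL and token placeholder are copied from the deploy output above. The function names build_request and invoke are hypothetical helpers, the Content-Type header is an assumption (the curl command does not set one), and the sketch assumes the endpoint replies with JSON.

```python
import json
import urllib.request

# URL copied from the deploy output above; the token is a placeholder.
ENDPOINT_URL = "https://app.beam.cloud/endpoint/llm-inference/v1"
AUTH_TOKEN = "[YOUR_AUTH_TOKEN]"


def build_request(payload: dict, url: str = ENDPOINT_URL, token: str = AUTH_TOKEN) -> urllib.request.Request:
    """Build a POST request equivalent to the curl command, without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: the endpoint accepts a JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def invoke(payload: dict) -> dict:
    """Send the request and decode the response body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(payload), timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a real token in place, invoke({}) sends the same empty JSON body as the curl command, since predict() takes no arguments.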

Fan-Out Workloads to Hundreds of Containers

from beta9 import function

# This decorator allows you to parallelize this function
# across multiple remote containers
@function(cpu=1, memory=128)
def square(i: int):
    return i**2


def main():
    numbers = list(range(100))
    squared = []

    # Run a remote container for every item in the list
    for result in square.map(numbers):
        squared.append(result)

    print(squared)


if __name__ == "__main__":
    main()
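
To see what the fan-out pattern computes without a cluster, here is a local sketch that uses threads as a stand-in for remote containers. The helper local_map is hypothetical and is not part of Beta9. It only mirrors the shape of square.map, and it relies on Executor.map returning results in input order, which may or may not match how the remote version behaves.

```python
from concurrent.futures import ThreadPoolExecutor


def square(i: int) -> int:
    return i**2


def local_map(fn, items, max_workers: int = 8):
    """Local stand-in for a fan-out map: apply fn to every item concurrently.

    Threads replace remote containers here; this is not Beta9's implementation.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the same order as the inputs.
        yield from pool.map(fn, items)


numbers = list(range(100))
squared = list(local_map(square, numbers))
```

The calling code looks the same either way: iterate over the mapped results and collect them.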

Enqueue Async Jobs

from beta9 import task_queue, Image


@task_queue(
    cpu=1.0,
    memory=128,
    gpu="T4",
    image=Image(python_packages=["torch"]),
    keep_warm_seconds=1000,
)
def multiply(x):
    result = x * 2
    return {"result": result}

# Manually insert task into the queue
multiply.put(x=10)
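
The enqueue-then-process pattern can be tried locally with the standard library. This is only an analogy for how a task queue decouples the caller from the worker: the queue, the worker thread, and the None sentinel below are illustrative choices and say nothing about how Beta9 implements task_queue.

```python
import queue
import threading

tasks = queue.Queue()
results = []


def multiply(x):
    return {"result": x * 2}


def worker():
    """Pull tasks off the queue until a None sentinel arrives."""
    while True:
        kwargs = tasks.get()
        if kwargs is None:
            tasks.task_done()
            break
        results.append(multiply(**kwargs))
        tasks.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

# Enqueueing returns immediately; the worker does the work later.
tasks.put({"x": 10})
tasks.put(None)  # sentinel: tell the worker to stop
tasks.join()  # block until everything queued has been processed
thread.join()
```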

How It Works

Beta9 is designed for launching remote serverless containers quickly. There are a few things that make this possible:

A custom, lazy loading image format (CLIP) backed by S3/FUSE
A fast, Redis-based container scheduling engine
Content-addressed storage for caching images and files
A custom runc container runtime
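
As a rough illustration of the content-addressed storage idea from the list above, here is a toy in-memory sketch. It assumes SHA-256 as the hash and a plain dict as the backing store. The class name ContentAddressedStore is hypothetical, and none of this is taken from Beta9's code.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store where each blob's key is the SHA-256 of its bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]

    def __len__(self):
        return len(self._blobs)


store = ContentAddressedStore()
key_a = store.put(b"layer contents")
key_b = store.put(b"layer contents")  # same bytes, same key
key_c = store.put(b"different layer")
```

Because the key is derived from the content, storing the same image layer or file twice costs nothing extra, which is what makes this scheme a good fit for caching.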

Get Started

Beam Cloud (Recommended)

The fastest and most reliable way to get started is by signing up for our managed service, Beam Cloud. Your first 10 hours of usage are free, and afterwards you pay based on usage.

Open-Source Deploy (Advanced)

You can run Beta9 locally, or in an existing Kubernetes cluster using our Helm chart.

Local Development

Setting Up the Server

k3d is used for local development. You'll need Docker and Make to get started.

To use our fully automated setup, run the setup make target.

[!NOTE] This will overwrite some of the tools you may already have installed. Review setup.sh to learn more.

make setup

Setting Up the SDK

The SDK is written in Python. You'll need Python 3.8 or higher. Use the setup-sdk make target to get started.

[!NOTE] This will install the Poetry package manager.

make setup-sdk

Using the SDK

After you've set up the server and SDK, check out the SDK README.

Contributing

We welcome contributions, big or small! These are the most helpful things for us:

Rank features in our roadmap
Open a PR
Submit a feature request or bug report

Community & Support

If you need support, you can reach out through any of these channels:

Slack (Chat live with maintainers and community members)
GitHub issues (Bug reports, feature requests, and anything roadmap related)
Twitter (Updates on releases and stuff)
