[RFC] Cross-Platform Refactor: Overview + Link Hub
Disclaimer: This document is dynamic and will be updated to reflect the evolving consensus and decisions made throughout our discussions in each RFC issue.
Central Hub for Cross-Platform Enhancements
Welcome to the meta RFC (Request for Comments) that maps out the enhancements needed to enable cross-platform compatibility within bitsandbytes. This thread is intended to be the nucleus of all RFC discussions, interlinking the various topics and proposals.
Our mission is to consolidate efforts from across our community, establish clear objectives, agree on optimal strategies, and coordinate contributions to ensure a seamless transition to full cross-platform support.
We would like an approach where the individual algorithms of the backend can be implemented gradually, or even only partially, by whoever is willing to put in the work.
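As a rough illustration of what such a gradual, partial implementation could look like, a backend might only override the operations it already supports and signal the rest clearly (a minimal sketch; the `Backend` base class and method names here are hypothetical, not the actual bitsandbytes interface):

```python
# Minimal sketch of a partially implemented backend. The base class and the
# operation names are hypothetical placeholders, not the real bitsandbytes API.
class Backend:
    def quantize_4bit(self, tensor, **kwargs):
        raise NotImplementedError(
            f"{type(self).__name__} does not implement quantize_4bit yet"
        )

    def igemmlt(self, A, B, **kwargs):
        raise NotImplementedError(
            f"{type(self).__name__} does not implement igemmlt yet"
        )


class PartialCpuBackend(Backend):
    # Only the ops someone has had time to port are overridden; everything
    # else raises a clear NotImplementedError instead of failing obscurely.
    def quantize_4bit(self, tensor, **kwargs):
        ...  # CPU implementation would go here
```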
The core areas identified by the BNB maintainers for community input and collaboration include:
Testing and CI/CD Infrastructure
To facilitate community contributions and ensure rapid integration, we're focusing on establishing a robust CI/CD framework. This system should support all platforms, providing immediate and actionable feedback on submitted PRs. Essential to this effort is addressing and resolving existing issues like flaky tests.
- Detailed Discussion: [RFC] Cross-Platform Refactor: Testing and CI/CD Strategy #1031
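As one concrete, purely illustrative possibility for per-platform gating in such a CI setup, tests could be skipped based on the available hardware and OS via pytest markers:

```python
# Illustrative sketch of per-platform test gating with pytest markers; this is
# not the current bitsandbytes test suite, just an example of the mechanism.
import platform

import pytest
import torch

requires_cuda = pytest.mark.skipif(
    not torch.cuda.is_available(), reason="CUDA not available on this runner"
)


@requires_cuda
def test_cuda_kernel_path():
    assert torch.zeros(1, device="cuda").sum().item() == 0


@pytest.mark.skipif(platform.system() == "Windows", reason="not yet supported on Windows")
def test_unix_only_path():
    assert True
```

Known-flaky tests could similarly be quarantined behind a dedicated marker so they don't block otherwise green PRs.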
Build Process and Distribution
Adapting our build and distribution processes to accommodate various platforms is crucial. We're exploring efficient strategies, including CMake and GitHub Actions, for building binaries. Additionally, we need to tackle challenges related to package hosting, size constraints, and the distribution of binary wheels.
- In-Depth Conversation: [RFC] Cross-Platform Refactor: Build System and Binary Distribution #1032
- related: [RFC] Cross-Platform Refactor: CPU-only implementation #1021
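One small but recurring piece of the binary-distribution puzzle is that the Python layer has to resolve a platform-specific library name at import time; a hedged sketch (the naming scheme and directory layout are assumptions, not the actual packaging):

```python
# Hedged sketch of resolving a platform-specific native library name at import
# time. The naming scheme and layout are illustrative assumptions, not the
# actual bitsandbytes packaging.
import platform
from pathlib import Path


def native_library_name(backend: str = "cuda") -> str:
    ext = {"Linux": ".so", "Darwin": ".dylib", "Windows": ".dll"}.get(
        platform.system(), ".so"
    )
    return f"libbitsandbytes_{backend}{ext}"


def find_native_library(search_dir: Path, backend: str = "cuda") -> Path:
    candidate = search_dir / native_library_name(backend)
    if not candidate.exists():
        raise FileNotFoundError(f"expected native library not found: {candidate}")
    return candidate
```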
Setup
There's been ongoing discussion around improving bitsandbytes/cuda_setup, and it plays a role in the Intel and Windows related PRs, so we need to make sure we align how this module evolves in light of the other topics.
Another topic: we receive many issues that are assumed to be the same error but aren't. We need to make sure that when people hit problems surfaced through this module, the resulting reports are actionable (e.g. introduce error codes, improve the issue template).
- In-Depth Conversation: #918
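To make the error-code idea concrete, here is a purely hypothetical sketch (the codes, names, and message format are assumptions, not an agreed-upon design):

```python
# Hypothetical sketch of stable setup error codes; the codes and messages are
# illustrative assumptions, not an agreed-upon bitsandbytes design.
from enum import Enum


class SetupErrorCode(Enum):
    CUDA_RUNTIME_NOT_FOUND = "BNB-E001"
    UNSUPPORTED_CUDA_VERSION = "BNB-E002"
    NO_GPU_DETECTED = "BNB-E003"
    BINARY_MISSING_FOR_PLATFORM = "BNB-E004"


def report(code: SetupErrorCode, detail: str) -> str:
    # A stable, greppable prefix lets the issue template ask "which error code
    # did you see?" and helps maintainers separate look-alike reports.
    return f"[{code.value}] {code.name}: {detail}"


print(report(SetupErrorCode.NO_GPU_DETECTED, "torch.cuda.device_count() == 0"))
```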
Intel CPU + GPU backend
This RFC proposes extending the bitsandbytes library to support Intel CPUs and GPUs. The approach includes introducing a device abstraction layer to simplify adding non-CUDA devices and leveraging the PyTorch 2.x compiler stack alongside Intel Extension for PyTorch (IPEX) for lightweight integration.
This enables 8-bit and 4-bit precision features on Intel platforms without the need for native backend code, reducing complexity and maintenance. Key performance functions will utilize IPEX, while others will be optimized using PyTorch's compilation technology. The plan involves phased PRs to implement these features, alongside proposed changes to the Transformers library to expand bitsandbytes API usage across multiple devices.
Steps (as PRs):
- #898 (important ongoing discussion about the main abstraction for supporting other backends)
- jianan-gu#3
- jianan-gu#4
(Since the unmerged PRs have dependencies on each other, some of them live in other repos; they will be rebased once everything is ready.)
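For orientation, a very rough sketch of the kind of per-device backend dispatch being discussed in #898 (the registry, class names, and dispatch key are illustrative assumptions; see the PR for the actual design):

```python
# Rough illustration of a per-device backend registry; names and structure are
# assumptions for discussion, not the design adopted in #898.
import torch


class CudaBackend:
    def quantize_4bit(self, tensor: torch.Tensor, **kwargs):
        ...  # existing CUDA kernel path would live here


class IntelBackend:
    def quantize_4bit(self, tensor: torch.Tensor, **kwargs):
        ...  # IPEX / torch.compile based path would live here


# Dispatch keyed on the tensor's device type.
BACKENDS = {
    "cuda": CudaBackend(),
    "cpu": IntelBackend(),
    "xpu": IntelBackend(),
}


def quantize_4bit(tensor: torch.Tensor, **kwargs):
    backend = BACKENDS.get(tensor.device.type)
    if backend is None:
        raise NotImplementedError(
            f"No backend registered for device '{tensor.device.type}'"
        )
    return backend.quantize_4bit(tensor, **kwargs)
```

Dispatching on `tensor.device.type` would keep the public API unchanged while letting each backend be filled in independently.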
AMD
- PRs: #756
- AMD semi-official fork issues: https://github.com/nixified-ai/flake/issues/56
@arlo-phoenix or @fxmarty, please create an RFC issue in the same format as the Apple Silicon one to centralize discussions, decisions, and the tracking of open work.
I only skimmed through #898, but from what I can see the idea is to add the ability to have different backends, with the current implementation becoming a CudaBackend. From my perspective this won't really change this PR that much (I'd only have to move some checks), since there isn't really a need for a separate HIP backend and AMD GPUs should just use the CudaBackend as well.
- One improvement could be moving the defines to a separate header, hip-compat.h, so they are better separated.
- The Makefile definitely still needs work; as mentioned, I've never worked with Makefiles directly.
- If there is a move towards CMake for Windows support (I think there are several PRs), I could try to make this work with CMake. It should be easier to add good integration that doesn't interfere with the CUDA compilation, since I'm more experienced with that.
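As a small, purely illustrative footnote to the point about reusing the CudaBackend: ROCm builds of PyTorch expose HIP devices under the `cuda` device type, so on the Python side AMD GPUs would already hit the CUDA entry of a registry like the hypothetical one sketched in the Intel section above.

```python
# Illustration only: on a ROCm build of PyTorch, HIP devices still report the
# "cuda" device type, which is why a dedicated HIP backend may not be needed
# on the Python side (the native kernels are compiled for HIP instead).
import torch

if torch.cuda.is_available():
    print(torch.zeros(1, device="cuda").device.type)  # prints "cuda", also on ROCm
    print(torch.version.hip)  # a version string on ROCm builds, None on CUDA builds
```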
Apple Silicon
Summary and coordination of the ongoing efforts can be found and should be discussed here: [RFC] Cross-Platform Refactor: Mac M1 support.
Related issues: #252, #485
Windows
@wkpark: please provide the overview given in https://github.com/TimDettmers/bitsandbytes/discussions/990#discussioncomment-8314733 in a dedicated RFC issue akin to the Apple Silicon one above.
Community Contributions and Engagement
We believe in the power of community-driven development and encourage contributions from everyone. Your insights, code snippets, and solution proposals are super important for making this a reality.
The RFCs are meant for focused, goal-directed technical discussions. But don't hold back: let us know what you think, even if it's unstructured or you're not sure about it:
- For brainstorming and informal discussions: Join our community forum.
> @arlo-phoenix or @fxmarty, please create an RFC issue in the same format as the Apple Silicon one to centralize discussions, decisions, and the tracking of open work.
I'll write the RFC after the device abstraction is merged since most stuff going forward for HIP depends on that.
Edit: AMD themselves will open a PR, so there won't be an RFC.