mlc-llm
Universal LLM Deployment Engine with ML Compilation
## Instructions:

1. Clone https://huggingface.co/openlm-research/open_llama_7b_700bt_preview to local.
2. Link the cloned repo to `dist/models/open-llama-700bt-7b`.
3. Run `python3 build.py --debug-dump --model open-llama-700bt-7b --use-cache=0 --quantization q3f16_0`.
4. Run `./build/mlc_chat_cli --local-id open-llama-700bt-7b-q3f16_0`.

Then...
This PR adds the following: 1) a Python chat module with the same functionality as defined in the CLI (note that this requires a module without the `tvm_runtime` dependency; see changes to...
Just FYI: I tested your TestFlight build and it works just fine on my iPad Air (4th generation), which has only 4 GB of RAM. Is there a benchmark prompt or something?
Hi, this pull request introduces support for a new quantization method: GPTQ. This addition is motivated by the observation that GPTQ delivers acceptable performance at lower bit widths, and tends...
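For readers unfamiliar with what a "q3"-style low-bit weight format means, here is a minimal, self-contained sketch of plain round-to-nearest (RTN) symmetric quantization in pure Python. Note this is *not* the GPTQ algorithm itself (GPTQ additionally uses second-order/Hessian information to compensate rounding error); it only illustrates the baseline that GPTQ improves on, and all names below are illustrative.

```python
# Illustrative round-to-nearest (RTN) n-bit symmetric weight quantization.
# NOT the GPTQ algorithm: GPTQ corrects quantization error column by column
# using second-order statistics; RTN simply rounds each weight independently.

def quantize_rtn(weights, bits=3):
    """Quantize a list of floats to signed n-bit integers plus one scale."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 3 for signed 3-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    # Round each weight to the nearest representable integer level,
    # clamping to the signed n-bit range [-qmax-1, qmax].
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integers and scale."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.9, -0.77]
q, scale = quantize_rtn(weights, bits=3)
approx = dequantize(q, scale)
max_err = max(abs(a - w) for a, w in zip(approx, weights))
```

At 3 bits there are only 8 integer levels per scale group, so the per-weight rounding error is substantial; this is exactly the regime where GPTQ's error-compensating updates matter most.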
How can I build the `cpp` directory as a standalone executable binary for Android? Thanks
I tried to run the command-line tools on Android with OpenCL enabled. I followed the instructions in the README file under the `android` folder. But when I run the tools,...
I would suggest using CMake to organize the C++ code in an Android project instead of using ndk-build.
I built the model OK, but I don't know how to run it with Python. How should I configure `python tests/chat.py`? It fails when I run it.
Hey all, ## TLDR I managed to build and deploy the `dolly-v2-3b` model to iOS (`iPad Pro M1`), thanks to the helpful suggestions in #129 and #116. However, I noticed...