Bojun Feng issues

Results: 14 issues of Bojun Feng

ENH: Add old ui as example

Rewrote the deprecated Gradio UI using the RESTful client and added it to the examples as `gradio_arena.py`. Instructions are available by running the file with the `--help` flag, e.g. `python path/to/inference/examples/gradio_arena.py --help`...

enhancement

FEAT: Support Launching Model with Uid

Add `--model-uid -i` as an optional command-line argument that overrides the default randomly generated UID. Resolves #254. Example: run the following two commands in two separate terminal windows....

feature
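As an illustration of the `--model-uid` entry above, here is a minimal sketch of an optional command-line argument that overrides a randomly generated UID. It uses only Python's standard `argparse` and `uuid` modules. The function names `build_parser` and `resolve_model_uid` are hypothetical, and so is the choice of `uuid4` for the default; this is not the project's actual implementation.

```python
import argparse
import uuid


def build_parser():
    """Build a parser for a hypothetical launch command."""
    parser = argparse.ArgumentParser(description="Hypothetical model launch command")
    # Optional flag; when omitted, a random UID is generated instead.
    parser.add_argument(
        "--model-uid",
        "-i",
        default=None,
        help="UID to assign to the launched model (default: random)",
    )
    return parser


def resolve_model_uid(argv=None):
    """Return the user-supplied UID, or a freshly generated one."""
    args = build_parser().parse_args(argv)
    if args.model_uid is not None:
        return args.model_uid
    # Assumption: any random unique string works as the default UID.
    return str(uuid.uuid4())


if __name__ == "__main__":
    print(resolve_model_uid())
```

Calling `resolve_model_uid(["--model-uid", "my-model"])` returns the supplied value unchanged, while `resolve_model_uid([])` falls back to a random UID.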

FEAT: Support Phi-1 & Phi-1.5

Resolve #462

feature

ENH: Disable 4-bit and 8-bit quantization on MacOS

Resolve #483 at the frontend level by filtering the options on render if the machine is Mac-like. Tested locally: the 4-bit and 8-bit quantization options were successfully removed on a MacBook.

enhancement
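The filtering described in the entry above can be sketched roughly as follows. The actual change is in the frontend; this Python version only mirrors the logic. The helper names, the platform strings, and the `"none"` option label are all assumptions made for illustration.

```python
def is_mac_like(platform_name):
    """Heuristic check: treat platform strings starting with 'mac' or 'darwin' as Mac-like."""
    return platform_name.lower().startswith(("mac", "darwin"))


def available_quantizations(options, platform_name):
    """Drop the 4-bit and 8-bit options on Mac-like machines; keep everything elsewhere."""
    if is_mac_like(platform_name):
        return [opt for opt in options if opt not in ("4-bit", "8-bit")]
    return list(options)


if __name__ == "__main__":
    all_options = ["none", "4-bit", "8-bit"]  # hypothetical option labels
    print(available_quantizations(all_options, "MacIntel"))
    print(available_quantizations(all_options, "Linux x86_64"))
```

Filtering at render time means the unsupported options never reach the user, so no extra validation step is needed for them in this sketch.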

ENH: Add Language Settings to Theme

Resolves #504. Refactors the theme to implement translations and translates the page names in the dashboard. Effect: ![output](https://github.com/xorbitsai/inference/assets/102875484/8a1d7e18-b6a8-4634-8eb5-d87543534251) This PR does not translate everything, but hopefully provides a framework to...

enhancement

FEAT: Support RWKV Pile

Resolve #533. Add support for rwkv-4-pile models. Successfully tested the 169m variant on a local Mac setup. I would like some guidance on setting the model size. Historically, we've rounded...

feature

[Misc]: Server Does Not Follow Scheduler Policy

### Anything you want to discuss about vllm. I was testing out vLLM on Colab and noticed something weird. It seems from the code that vLLM is using first come...

misc

Fix: Minor discrepancy between comment and code

Updated the comment in [toy_example/point-estimation.ipynb](https://github.com/litian96/TERM/blob/adb12e31a3f063ac9f01d84e75871423f6e14c17/toy_example/point-estimation.ipynb) to match what the code does. Resolves #13

Minor discrepancy between comment and code

In the following screenshot from [toy_example/point-estimation.ipynb](https://github.com/litian96/TERM/blob/adb12e31a3f063ac9f01d84e75871423f6e14c17/toy_example/point-estimation.ipynb), it seems that the re-centered data is centered around (-0.6, -0.6) instead of (3,3) as mentioned in the comments.

FEAT: JSON mode for Llama.cpp

### Is your feature request related to a problem? Please describe. Similar to OpenAI's JSON mode, we can use custom grammar files to ask Llama.cpp to generate a valid JSON...

feature

stale