TaoOfTao issues

Results: 7 issues by TaoOfTao

Inference Failed Because of '500 Internal Server Error'

After launching the distribution server with `llama distribution start --name local-llama-8b --port 5000 --disable-ipv6`, running any inference example, such as `python examples/scripts/vacation.py localhost 5000 --disable-safety`, gives the following...

FP8 Quantization Does Not Work

I tried to run inference with FP8 quantization and got the following error: ``` Configuring API surface: inference Enter value for model (existing: Meta-Llama3.1-8B-Instruct) (required): Meta-Llama3.1-8B-Instruct Enter value for quantization (optional):...

Checkpoint Cannot Be Found For Llama 405B Model

I am trying to run inference with the FP8 version of the Llama 3.1 405B model (Meta-Llama3.1-405B-Instruct). The model was downloaded with `llama download --source huggingface --model-id Meta-Llama3.1-405B-Instruct --hf-token TOKEN`. However, the command `llama...

Custom Tool Call Not Working For Inflation Example

I am running the `inflation.py` example from the repo. I expect it to call the custom tool for the `get_ticker_data` function, which is defined in `ticker_data.py` in the `custom_tools` folder. However, based on...

Mesop App Requires pillow Package to Run GUI

Running the chatbot GUI with the command `mesop app/main.py` reports a "No module of 'PIL'" error because the pillow package is missing. Running `pip install pillow` resolved the issue. Suggest to...
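
A minimal sketch of a pre-flight check for the missing dependency described in this issue, assuming the only requirement is that the top-level `PIL` module (installed by the pillow package) is importable. The helper name `missing_modules` is hypothetical and is not part of Mesop or the repo.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names in `names` that cannot be found.

    Hypothetical helper for illustration; pass top-level names only.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # pillow installs the importable module `PIL`
    missing = missing_modules(["PIL"])
    if missing:
        print("Missing modules:", ", ".join(missing), "(try: pip install pillow)")
    else:
        print("PIL is importable")
```

Running this in the same environment as `mesop app/main.py` shows whether pillow is visible to that interpreter before the GUI is started.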

Tool Calling Not Working

I am trying out tool calling with the Brave Search engine by following the simple instructions at https://github.com/meta-llama/llama-agentic-system#add-api-keys-for-tools. Basically, I saved the API key in a .env file as `BRAVE_SEARCH_API_KEY=xxxxx`...
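
A small sketch for confirming that the key saved in the `.env` file is readable, assuming the file contains plain `KEY=VALUE` lines. The snippet above does not show how the repo itself loads `.env`, so the parser `parse_env_text` is a hypothetical stand-in and not the project's loader.

```python
import os


def parse_env_text(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments.

    Hypothetical helper for illustration; real .env loaders support more syntax.
    """
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    path = ".env"
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            env = parse_env_text(fh.read())
        # Report only whether the key is present, never its value
        print("BRAVE_SEARCH_API_KEY set:", bool(env.get("BRAVE_SEARCH_API_KEY")))
    else:
        print("no .env file in", os.getcwd())
```

The script prints only whether the key is present, so it is safe to paste its output into a bug report.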

Guardrail Loading Failed with Unexpected Large GPU Memory Requirement at Multi-GPU Server

### System Info
Python version: 3.10.12
Pytorch version:
llama_models version: 0.0.42
llama_stack version: 0.0.42
llama_stack_client version: 0.0.41
Hardware: 4xA100 (40GB VRAM/GPU)
The content of the local-gpu-run.yaml file is as follows: ``` version: '2'...