How do I adjust the logging level when launching via the Docker container?
System Info
Hi, how can I change the logging level of TGI when launching it via a Docker container? Currently I'm only seeing messages at INFO level and above.
2024-05-08T21:29:45.452006Z INFO text_generation_launcher: Runtime environment:
Target: x86_64-unknown-linux-gnu
Cargo version: 1.75.0
Commit sha: 6073ece4fc2d7180c2057cb49b9ea436463fd52b
Docker label: sha-6073ece
nvidia-smi:
Wed May 8 21:29:45 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04 Driver Version: 535.171.04 CUDA Version: 12.2 |
|-----------------------------------------+----------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+======================+======================|
| 0 Tesla T4 Off | 00000000:00:1E.0 Off | 0 |
| N/A 24C P8 9W / 70W | 2MiB / 15360MiB | 0% Default |
| | | N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=======================================================================================|
| No running processes found |
+---------------------------------------------------------------------------------------+
xpu-smi:
N/A
2024-05-08T21:29:45.452036Z INFO text_generation_launcher: Args {
    model_id: "bigscience/bloom-560m",
    revision: None,
    validation_workers: 2,
    sharded: None,
    num_shard: None,
    quantize: None,
    speculate: None,
    dtype: None,
    trust_remote_code: false,
    max_concurrent_requests: 128,
    max_best_of: 2,
    max_stop_sequences: 4,
    max_top_n_tokens: 5,
    max_input_tokens: None,
    max_input_length: None,
    max_total_tokens: None,
    waiting_served_ratio: 0.3,
    max_batch_prefill_tokens: None,
    max_batch_total_tokens: None,
    max_waiting_tokens: 20,
    max_batch_size: None,
    cuda_graphs: None,
    hostname: "39de305bfebc",
    port: 80,
    shard_uds_path: "/tmp/text-generation-server",
    master_addr: "localhost",
    master_port: 29500,
    huggingface_hub_cache: Some(
        "/data",
    ),
    weights_cache_override: None,
    disable_custom_kernels: false,
    cuda_memory_fraction: 1.0,
    rope_scaling: None,
    rope_factor: None,
    json_output: false,
    otlp_endpoint: None,
    cors_allow_origin: [],
    watermark_gamma: None,
    watermark_delta: None,
    ngrok: false,
    ngrok_authtoken: None,
    ngrok_edge: None,
    tokenizer_config_path: None,
    disable_grammar_support: false,
    env: true,
    max_client_batch_size: 4,
}
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [X] An officially supported command
- [ ] My own modifications
Reproduction
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
-e HUGGING_FACE_HUB_TOKEN=$token ghcr.io/huggingface/text-generation-inference:2.0.2 \
--model-id $model --quantize bitsandbytes-fp4 --max-input-length 8000 --max-total-tokens 8192
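My best guess so far is to pass RUST_LOG into the container, on the assumption that the launcher and router are tracing-based Rust binaries that read their log filter from the environment (I haven't confirmed that this is the intended mechanism), something like:
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    -e HUGGING_FACE_HUB_TOKEN=$token -e RUST_LOG=debug \
    ghcr.io/huggingface/text-generation-inference:2.0.2 \
    --model-id $model --quantize bitsandbytes-fp4 --max-input-length 8000 --max-total-tokens 8192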
Expected behavior
I'd like to see TRACE/DEBUG-level logs and above.
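If a global debug filter turns out to be too noisy, I assume the usual EnvFilter syntax for per-target levels would also apply. The launcher target name matches the log prefix above; the router name is my guess and may need adjusting:
-e RUST_LOG=text_generation_launcher=debug,text_generation_router=debug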