Issues by meowcoder22 (1 result)
`mpirun -n 1 --allow-run-as-root python3 /app/TensorRT-LLM/examples/run.py \
  --tokenizer_dir ./llama33_70b \
  --draft_engine_dir ./draft-engine \
  --engine_dir /app/all_models/inflight_batcher_llm/tensorrt_llm/1 \
  --draft_target_model_config "[10,[0],[0], False]" \
  --kv_cache_free_gpu_memory_fraction=0.35 \
  --run_profiling \
  --max_output_len=1024 \
  --kv_cache_enable_block_reuse \
  --input_text="user\nA 3-digit...