Tom Dörr
I got `AttributeError: module 'tensorflow.python.ops.summary_op_util' has no attribute 'skip_summary'`
You could use my Docker setup to avoid issues with incompatible TensorFlow versions: https://github.com/tom-doerr/TecoGAN
Not sure what the exact issue is, but the URL that is passed to wget doesn't seem to be constructed correctly. You could try running it in WSL or...
You could just create one big image out of multiple images; that worked well for me.
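Roughly what I mean, as a sketch (using Pillow; the paths, grid width, and tile size are just placeholders, not the exact code I used):
```python
# Tile a list of images into one big image with Pillow.
from PIL import Image

def tile_images(paths, cols=4, tile_size=(256, 256)):
    images = [Image.open(p).resize(tile_size) for p in paths]
    rows = (len(images) + cols - 1) // cols
    sheet = Image.new("RGB", (cols * tile_size[0], rows * tile_size[1]))
    for i, img in enumerate(images):
        x = (i % cols) * tile_size[0]
        y = (i // cols) * tile_size[1]
        sheet.paste(img, (x, y))
    return sheet

# tile_images(["a.png", "b.png", "c.png"]).save("combined.png")
```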
What is the performance hit when using this? From the lm-format-enforcer:
```
# Note on batched generation:
# For some reason, I achieved better batch performance by manually adding a...
```
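If you want to measure it yourself, something like this rough sketch should work (not from lm-format-enforcer or vLLM; it just times plain requests against an OpenAI-compatible endpoint, so you'd run it once with the enforcer enabled and once without; BASE_URL and MODEL are placeholders):
```python
# Time a handful of completion requests and report the mean latency.
import time
import requests

BASE_URL = "http://localhost:8000/v1"   # placeholder, adjust to your server
MODEL = "your-model-name"               # placeholder

def mean_latency(n=10):
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        requests.post(
            f"{BASE_URL}/completions",
            json={"model": MODEL, "prompt": "Hello", "max_tokens": 64},
            timeout=120,
        )
        latencies.append(time.perf_counter() - start)
    return sum(latencies) / len(latencies)

print("mean latency:", mean_latency())
```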
@itaybar I don't have the ability to do reviews, but I really like the feature. @simon-mo agreed at the bottom of this issue that it would be a good idea: https://github.com/vllm-project/vllm/issues/1707
Same issue here, I'm using the OpenAI API. Here's how I started the server:
```
python3 -m vllm.entrypoints.openai.api_server --model TheBloke/Xwin-LM-70B-V0.1-AWQ --quantization awq --dtype half --tensor-parallel-size 2 --port 8427 --gpu-memory-utilization 0.6...
```
@simon-mo Thank you! I really like all other aspects of vLLM so far. If you need help reproducing it, I'm happy to help. I attached the versions of the packages in...
Any idea how long it might take to fix this or if there is a chance we can fix it ourselves?
Generating multiple completions in parallel also only works efficiently when there are no other requests. With other requests, the completion time goes from ~10 seconds to ~120 seconds for n=30.
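For context, this is roughly the kind of request I mean (a sketch using the OpenAI client pointed at the local vLLM server; the prompt, max_tokens, and timing harness are placeholders):
```python
# Request n parallel completions of one prompt from the OpenAI-compatible server.
import time
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8427/v1", api_key="EMPTY")

start = time.perf_counter()
response = client.completions.create(
    model="TheBloke/Xwin-LM-70B-V0.1-AWQ",
    prompt="Write a short poem about GPUs.",
    n=30,             # 30 completions generated in parallel for the same prompt
    max_tokens=256,
)
print(f"{len(response.choices)} completions in {time.perf_counter() - start:.1f}s")
```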
Could the format enforcer slow down all requests, or only the requests where a format is used?