aphrodite-engine
Fix metrics and allow disabling block manager v2
The latest block manager v2 can cause max recursion depth errors with some models and certain sampler combinations. The workaround is to use block manager v1, but the arg parser did not allow this. This change makes it possible to select block manager v1 by passing the argument --use-v2-block-manager false
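As a rough illustration of the arg parser side, a flag that accepts an explicit false value could be sketched with argparse as below. This is not the parser code from this PR: the helper names str_to_bool and build_parser are hypothetical, and the default of True is an assumption based on v2 being the current behavior.

```python
import argparse


def str_to_bool(value: str) -> bool:
    """Parse common true/false spellings given on the command line."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with a single boolean-valued flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--use-v2-block-manager",
        type=str_to_bool,
        nargs="?",      # the value is optional
        const=True,     # bare flag (no value) still means "enabled"
        default=True,   # assumption: v2 is on unless disabled
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--use-v2-block-manager", "false"])
    print(args.use_v2_block_manager)
```

With nargs="?" and const=True, existing invocations that pass the bare flag keep working, while an explicit false turns v2 off.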
Using chunked prefill and LoRA with the old metrics, instead of request-level metrics, can cause this error:
ERROR: Engine background task failed
Exception in callback functools.partial(<function _log_task_completion at 0x7e38ccf05d00>, error_callback=<bound method AsyncAphrodite._error_callback of <aphrodite.engine.async_aphrodite.AsyncAphrodite object at 0x7e38c8a91cd0>>)
handle: <Handle functools.partial(<function _log_task_completion at 0x7e38ccf05d00>, error_callback=<bound method AsyncAphrodite._error_callback of <aphrodite.engine.async_aphrodite.AsyncAphrodite object at 0x7e38c8a91cd0>>)>
Traceback (most recent call last):
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/async_aphrodite.py", line 54, in _log_task_completion
    return_value = task.result()
                   ^^^^^^^^^^^^^
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/async_aphrodite.py", line 809, in run_engine_loop
    result = task.result()
             ^^^^^^^^^^^^^
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/async_aphrodite.py", line 735, in engine_step
    request_outputs = await self.engine.step_async(virtual_engine)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/async_aphrodite.py", line 391, in step_async
    self.do_log_stats(scheduler_outputs, outputs)
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/aphrodite_engine.py", line 1421, in do_log_stats
    loggers.log(stats)
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/metrics.py", line 559, in log
    self._log_prometheus(stats)
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/metrics.py", line 510, in _log_prometheus
    self._log_counter(self.metrics.counter_generation_tokens,
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/metrics.py", line 471, in _log_counter
    counter.labels(**self.labels).inc(data)
  File "/home/arli/miniconda3/envs/aphrodite/lib/python3.11/site-packages/prometheus_client/metrics.py", line 313, in inc
    raise ValueError('Counters can only be incremented by non-negative amounts.')
ValueError: Counters can only be incremented by non-negative amounts.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "uvloop/cbhandles.pyx", line 63, in uvloop.loop.Handle._run
  File "/home/arli/aphro-latest/aphrodite-engine/aphrodite/engine/async_aphrodite.py", line 66, in _log_task_completion
    raise AsyncEngineDeadError(
aphrodite.engine.async_aphrodite.AsyncEngineDeadError: Task finished unexpectedly. This should never happen! Please open an issue on Github. See stack trace above for the actual cause.
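The traceback shows the generation token counter being incremented with a negative value, which the Prometheus client rejects. The sketch below illustrates that failure mode and one defensive guard. It is not necessarily the fix in this PR: StandInCounter is a hypothetical stand-in that only mimics the non-negative check seen in the traceback (so the example runs without prometheus_client), and safe_inc is a hypothetical helper.

```python
class StandInCounter:
    """Hypothetical stand-in that mimics the non-negative check in the traceback."""

    def __init__(self) -> None:
        self.value = 0.0

    def inc(self, amount: float = 1.0) -> None:
        if amount < 0:
            raise ValueError(
                "Counters can only be incremented by non-negative amounts.")
        self.value += amount


def safe_inc(counter, amount: float) -> bool:
    """Increment only when amount is non-negative.

    Returns True if the increment was applied, False if it was skipped.
    """
    if amount < 0:
        return False
    counter.inc(amount)
    return True


if __name__ == "__main__":
    counter = StandInCounter()
    safe_inc(counter, 5)    # applied
    safe_inc(counter, -3)   # skipped instead of raising
    print(counter.value)
```

Skipping a negative increment keeps the engine loop alive but hides the underlying miscount, so a guard like this is at best a stopgap next to computing the stat correctly.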
Can you limit this PR to just the metrics fix? Block Manager V1 is going to be deprecated as of #1300, so the other part of this PR will conflict with those changes.