[Bug]: UnboundLocalError: cannot access local variable 'x' where it is not associated with a value
Bug Description
The variable `x` below is unbound in a well-defined case:
https://github.com/run-llama/llama_index/blob/9859599ecc8d2f61b96c3cde7aa42c733d8c537a/llama-index-core/llama_index/core/llms/callbacks.py#L150-L172
namely, when the `for` body never runs (see the reproduction below).
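For context, this is plain Python behavior, independent of llama_index: if the iterable driving a `for` loop yields nothing, the loop variable is never bound, and touching it afterwards raises exactly this error. A minimal, self-contained sketch:

```python
from typing import Iterator


def empty_gen() -> Iterator[int]:
    # A generator that never yields, mirroring a stream that produces no chunks.
    return
    yield  # unreachable, but makes this function a generator


def consume() -> None:
    for x in empty_gen():
        print(x)
    # The loop body never ran, so the local `x` was never assigned:
    print(x)  # UnboundLocalError: cannot access local variable 'x' ...


consume()
```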
Version
llama-index==0.10.32
Steps to Reproduce
```python
from llama_index.llms.ollama import Ollama
from llama_index.core.llms import ChatMessage

llm = Ollama(model="phi3", request_timeout=30.0)
response = llm.stream_chat([ChatMessage(role="user", content="")])
for r in response:
    print(r.delta, end="", flush=True)
```
Relevant Logs/Tracebacks
❯ python
Python 3.12.3 (main, Apr 15 2024, 18:25:56) [Clang 17.0.6 ] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from llama_index.llms.ollama import Ollama
>>> from llama_index.core.llms import ChatMessage
>>> llm = Ollama(model="phi3", request_timeout=30.0)
>>> response = llm.stream_chat([ChatMessage(role="user", content="")])
>>> for r in response:
... print(r.delta, end="", flush=True)
...
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/schneider/Dropbox/eleanor/.venv/lib/python3.12/site-packages/llama_index/core/ll
ms/callbacks.py", line 172, in wrapped_gen
response=x,
^
UnboundLocalError: cannot access local variable 'x' where it is not associated with a value
Hey there, @schneiderfelipe! 🎉 I'm Dosu, a bot here to lend a hand with bugs, answer your questions, and guide you through contributing while we wait for a human maintainer to jump in. Think of me as your friendly neighborhood code helper. I'm diving into your issue as we speak and will get back to you with a full answer shortly. Stay tuned!
why would the for loop not run? response timeout?
The `UnboundLocalError` you're seeing happens because the variable `x` is referenced before it's assigned any value. This arises specifically when the `for` loop in `llama_index/core/llms/callbacks.py` doesn't execute because the generator `f_return_val` yields no values, either because the generator is empty or because an exception occurs before any yield statement runs.
To address this, consider the following steps:
- Adjust the `request_timeout` parameter: if it's too short, the request might fail before any data is received, leaving the generator empty and triggering the `UnboundLocalError`.
- Validate the API key and model compatibility: ensure they are correctly set up for use with the `stream_chat` method.
- Check external dependencies and network issues: these could affect the execution of the `stream_chat` method.
- Debug the method and file: look into `stream_chat` and `llama_index/core/llms/callbacks.py` to verify whether `f_return_val` yields values and `x` gets assigned (see the sketch after this list).
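As a rough, illustrative sketch of that last step (reusing the model name, timeout, and calls from the reproduction above; the chunk counting is only for diagnosis), counting the chunks makes it obvious when the stream never yields:

```python
from llama_index.core.llms import ChatMessage
from llama_index.llms.ollama import Ollama

llm = Ollama(model="phi3", request_timeout=30.0)
stream = llm.stream_chat([ChatMessage(role="user", content="")])

count = 0
try:
    for chunk in stream:
        count += 1
        print(chunk.delta, end="", flush=True)
finally:
    # If this prints 0, the wrapped generator never yielded, which is exactly
    # the case where the end-of-stream callback has no `x` to reference.
    print(f"\nchunks received: {count}")
```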
If the issue remains after these steps, providing more details, such as the full traceback, the `llama_index` library version, and any other relevant context, would help with further diagnosis.
Details
Sources
- llama-index-legacy/llama_index/legacy/multi_modal_llms/ollama.py
- llama-index-integrations/multi_modal_llms/llama-index-multi-modal-llms-ollama/llama_index/multi_modal_llms/ollama/base.py
- llama-index-core/llama_index/core/instrumentation/events/chat_engine.py
- llama-index-core/llama_index/core/llms/callbacks.py
- llama-index-integrations/llms/llama-index-llms-litellm/llama_index/llms/litellm/base.py
- llama-index-legacy/llama_index/legacy/llms/litellm.py
why would the for loop not run? response timeout?
This will happen whenever the generator `f_return_val` never yields, which apparently happens when `messages` is either empty (see below) or consists of a single message with empty (`""`) content (as in the original bug report above). It is definitely not because of a timeout.
Variations of the example given in the issue (full output included below):
Forcing a request timeout:
```python
from llama_index.llms.ollama import Ollama
from llama_index.core.llms import ChatMessage

llm = Ollama(model="phi3", request_timeout=0.1)  # <--
response = llm.stream_chat([ChatMessage(role="user", content="")])
for r in response:
    print(r.delta, end="", flush=True)
```
Python 3.12.3 (main, Apr 26 2024, 13:22:08) [GCC 13.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.24.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from llama_index.llms.ollama import Ollama
...: from llama_index.core.llms import ChatMessage
...:
...: llm = Ollama(model="phi3", request_timeout=0.1) # <--
...: response = llm.stream_chat([ChatMessage(role="user", content="")])
...: for r in response:
...: print(r.delta, end="", flush=True)
...:
---------------------------------------------------------------------------
ReadTimeout Traceback (most recent call last)
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_transports/default.py:69, in map_httpcore_exceptions()
68 try:
---> 69 yield
70 except Exception as exc:
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_transports/default.py:233, in HTTPTransport.handle_request(self, request)
232 with map_httpcore_exceptions():
--> 233 resp = self._pool.handle_request(req)
235 assert isinstance(resp.stream, typing.Iterable)
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py:216, in ConnectionPool.handle_request(self, request)
215 self._close_connections(closing)
--> 216 raise exc from None
218 # Return the response. Note that in this case we still have to manage
219 # the point at which the response is closed.
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/connection_pool.py:196, in ConnectionPool.handle_request(self, request)
194 try:
195 # Send the request on the assigned connection.
--> 196 response = connection.handle_request(
197 pool_request.request
198 )
199 except ConnectionNotAvailable:
200 # In some cases a connection may initially be available to
201 # handle a request, but then become unavailable.
202 #
203 # In this case we clear the connection and try again.
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/connection.py:101, in HTTPConnection.handle_request(self, request)
99 raise exc
--> 101 return self._connection.handle_request(request)
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/http11.py:143, in HTTP11Connection.handle_request(self, request)
142 self._response_closed()
--> 143 raise exc
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/http11.py:113, in HTTP11Connection.handle_request(self, request)
104 with Trace(
105 "receive_response_headers", logger, request, kwargs
106 ) as trace:
107 (
108 http_version,
109 status,
110 reason_phrase,
111 headers,
112 trailing_data,
--> 113 ) = self._receive_response_headers(**kwargs)
114 trace.return_value = (
115 http_version,
116 status,
117 reason_phrase,
118 headers,
119 )
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/http11.py:186, in HTTP11Connection._receive_response_headers(self, request)
185 while True:
--> 186 event = self._receive_event(timeout=timeout)
187 if isinstance(event, h11.Response):
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_sync/http11.py:224, in HTTP11Connection._receive_event(self, timeout)
223 if event is h11.NEED_DATA:
--> 224 data = self._network_stream.read(
225 self.READ_NUM_BYTES, timeout=timeout
226 )
228 # If we feed this case through h11 we'll raise an exception like:
229 #
230 # httpcore.RemoteProtocolError: can't handle event type
(...)
234 # perspective. Instead we handle this case distinctly and treat
235 # it as a ConnectError.
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_backends/sync.py:124, in SyncStream.read(self, max_bytes, timeout)
123 exc_map: ExceptionMapping = {socket.timeout: ReadTimeout, OSError: ReadError}
--> 124 with map_exceptions(exc_map):
125 self._sock.settimeout(timeout)
File ~/.pyenv/versions/3.12.3/lib/python3.12/contextlib.py:158, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
157 try:
--> 158 self.gen.throw(value)
159 except StopIteration as exc:
160 # Suppress StopIteration *unless* it's the same exception that
161 # was passed to throw(). This prevents a StopIteration
162 # raised inside the "with" statement from being suppressed.
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpcore/_exceptions.py:14, in map_exceptions(map)
13 if isinstance(exc, from_exc):
---> 14 raise to_exc(exc) from exc
15 raise
ReadTimeout: timed out
The above exception was the direct cause of the following exception:
ReadTimeout Traceback (most recent call last)
Cell In[1], line 6
4 llm = Ollama(model="phi3", request_timeout=0.1) # <--
5 response = llm.stream_chat([ChatMessage(role="user", content="")])
----> 6 for r in response:
7 print(r.delta, end="", flush=True)
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/llama_index/core/llms/callbacks.py:150, in llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen()
148 def wrapped_gen() -> ChatResponseGen:
149 last_response = None
--> 150 for x in f_return_val:
151 dispatcher.event(
152 LLMChatInProgressEvent(
153 messages=messages,
(...)
156 )
157 )
158 yield cast(ChatResponse, x)
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/llama_index/llms/ollama/base.py:140, in Ollama.stream_chat(self, messages, **kwargs)
124 payload = {
125 "model": self.model,
126 "messages": [
(...)
136 **kwargs,
137 }
139 with httpx.Client(timeout=Timeout(self.request_timeout)) as client:
--> 140 with client.stream(
141 method="POST",
142 url=f"{self.base_url}/api/chat",
143 json=payload,
144 ) as response:
145 response.raise_for_status()
146 text = ""
File ~/.pyenv/versions/3.12.3/lib/python3.12/contextlib.py:137, in _GeneratorContextManager.__enter__(self)
135 del self.args, self.kwds, self.func
136 try:
--> 137 return next(self.gen)
138 except StopIteration:
139 raise RuntimeError("generator didn't yield") from None
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:870, in Client.stream(self, method, url, content, data, files, json, params, headers, cookies, auth, follow_redirects, timeout, extensions)
847 """
848 Alternative to `httpx.request()` that streams the response body
849 instead of loading it into memory at once.
(...)
855 [0]: /quickstart#streaming-responses
856 """
857 request = self.build_request(
858 method=method,
859 url=url,
(...)
868 extensions=extensions,
869 )
--> 870 response = self.send(
871 request=request,
872 auth=auth,
873 follow_redirects=follow_redirects,
874 stream=True,
875 )
876 try:
877 yield response
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:914, in Client.send(self, request, stream, auth, follow_redirects)
906 follow_redirects = (
907 self.follow_redirects
908 if isinstance(follow_redirects, UseClientDefault)
909 else follow_redirects
910 )
912 auth = self._build_request_auth(request, auth)
--> 914 response = self._send_handling_auth(
915 request,
916 auth=auth,
917 follow_redirects=follow_redirects,
918 history=[],
919 )
920 try:
921 if not stream:
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:942, in Client._send_handling_auth(self, request, auth, follow_redirects, history)
939 request = next(auth_flow)
941 while True:
--> 942 response = self._send_handling_redirects(
943 request,
944 follow_redirects=follow_redirects,
945 history=history,
946 )
947 try:
948 try:
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:979, in Client._send_handling_redirects(self, request, follow_redirects, history)
976 for hook in self._event_hooks["request"]:
977 hook(request)
--> 979 response = self._send_single_request(request)
980 try:
981 for hook in self._event_hooks["response"]:
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_client.py:1015, in Client._send_single_request(self, request)
1010 raise RuntimeError(
1011 "Attempted to send an async request with a sync Client instance."
1012 )
1014 with request_context(request=request):
-> 1015 response = transport.handle_request(request)
1017 assert isinstance(response.stream, SyncByteStream)
1019 response.request = request
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_transports/default.py:232, in HTTPTransport.handle_request(self, request)
218 assert isinstance(request.stream, SyncByteStream)
220 req = httpcore.Request(
221 method=request.method,
222 url=httpcore.URL(
(...)
230 extensions=request.extensions,
231 )
--> 232 with map_httpcore_exceptions():
233 resp = self._pool.handle_request(req)
235 assert isinstance(resp.stream, typing.Iterable)
File ~/.pyenv/versions/3.12.3/lib/python3.12/contextlib.py:158, in _GeneratorContextManager.__exit__(self, typ, value, traceback)
156 value = typ()
157 try:
--> 158 self.gen.throw(value)
159 except StopIteration as exc:
160 # Suppress StopIteration *unless* it's the same exception that
161 # was passed to throw(). This prevents a StopIteration
162 # raised inside the "with" statement from being suppressed.
163 return exc is not value
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/httpx/_transports/default.py:86, in map_httpcore_exceptions()
83 raise
85 message = str(exc)
---> 86 raise mapped_exc(message) from exc
ReadTimeout: timed out
Giving no message at all:
```python
from llama_index.llms.ollama import Ollama
from llama_index.core.llms import ChatMessage

llm = Ollama(model="phi3", request_timeout=30.0)
response = llm.stream_chat([])  # <--
for r in response:
    print(r.delta, end="", flush=True)
```
❯ ipython
Python 3.12.3 (main, Apr 26 2024, 13:22:08) [GCC 13.2.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.24.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from llama_index.llms.ollama import Ollama
...: from llama_index.core.llms import ChatMessage
...:
...: llm = Ollama(model="phi3", request_timeout=30.0)
...: response = llm.stream_chat([]) # <--
...: for r in response:
...: print(r.delta, end="", flush=True)
...:
---------------------------------------------------------------------------
UnboundLocalError Traceback (most recent call last)
Cell In[1], line 6
4 llm = Ollama(model="phi3", request_timeout=30.0)
5 response = llm.stream_chat([]) # <--
----> 6 for r in response:
7 print(r.delta, end="", flush=True)
File ~/.pyenv/versions/3.12.3/lib/python3.12/site-packages/llama_index/core/llms/callbacks.py:172, in llm_chat_callback.<locals>.wrap.<locals>.wrapped_llm_chat.<locals>.wrapped_gen()
159 last_response = x
161 callback_manager.on_event_end(
162 CBEventType.LLM,
163 payload={
(...)
167 event_id=event_id,
168 )
169 dispatcher.event(
170 LLMChatEndEvent(
171 messages=messages,
--> 172 response=x,
173 span_id=span_id,
174 )
175 )
UnboundLocalError: cannot access local variable 'x' where it is not associated with a value
Although an empty messages list or a single message with empty content serves no practical purpose, I feel there is a genuine issue here from Python's perspective: in theory, a for loop might not execute at all.
I guess a possible solution is to use `last_response`, which is guaranteed to be predefined
(https://github.com/run-llama/llama_index/blob/9859599ecc8d2f61b96c3cde7aa42c733d8c537a/llama-index-core/llama_index/core/llms/callbacks.py#L149),
instead of `x`:
```diff
 dispatcher.event(
     LLMChatEndEvent(
         messages=messages,
-        response=x,
+        response=last_response,
         span_id=span_id,
     )
 )
```
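For illustration, here is a minimal sketch of the wrapper pattern with that change applied. This is a simplified stand-in for `wrapped_gen`, not the actual llama_index source, and `on_end` is a hypothetical placeholder for the `dispatcher.event(...)` call:

```python
from typing import Callable, Iterator, Optional


def wrapped_gen(
    source: Iterator[str],
    on_end: Callable[[Optional[str]], None],
) -> Iterator[str]:
    # Bound before the loop, so it exists even when `source` never yields
    # (whereas the loop variable `x` would not be).
    last_response: Optional[str] = None
    for x in source:
        last_response = x
        yield x
    # Safe to report: None signals that nothing was streamed.
    on_end(last_response)


def report(response: Optional[str]) -> None:
    print(f"end event, response={response!r}")


# Usage: an empty stream no longer trips over an unbound loop variable.
list(wrapped_gen(iter([]), on_end=report))          # end event, response=None
list(wrapped_gen(iter(["a", "b"]), on_end=report))  # end event, response='b'
```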
But I'm not sure whether this is acceptable, as it would introduce a change to `LLMChatEndEvent`:

```diff
 class LLMChatEndEvent(BaseEvent):
     messages: List[ChatMessage]
-    response: ChatResponse
+    response: Optional[ChatResponse] = None
```
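For illustration only, a small sketch of what the relaxed event could look like as a Pydantic model; `ChatMessage` and `ChatResponse` here are simplified stand-ins for the real llama_index types, and the real event derives from `BaseEvent`, not `BaseModel`:

```python
from typing import List, Optional

from pydantic import BaseModel


class ChatMessage(BaseModel):   # simplified stand-in
    role: str
    content: str


class ChatResponse(BaseModel):  # simplified stand-in
    delta: str


class LLMChatEndEvent(BaseModel):
    messages: List[ChatMessage]
    response: Optional[ChatResponse] = None  # None means nothing was streamed


# An empty stream can now still produce a well-formed end event.
event = LLMChatEndEvent(messages=[ChatMessage(role="user", content="")])
print(event.response)  # None
```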
I think this is the way to go. There is a situation where there's simply nothing to put in `response`.
I also encountered the same problem. When will it be fixed?
When I get CI/CD working for the above PR.