HttpResponseError in prompt_pipeline.py and prompt_eval.py: "Value cannot be null. (Parameter 'bytes')"
We are experiencing an intermittent issue when running the prompt_pipeline.py or prompt_eval.py scripts in our pipelines. The error encountered is:
azure.core.exceptions.HttpResponseError: (UserError) Value cannot be null. (Parameter 'bytes') Code: UserError Message: Value cannot be null. (Parameter 'bytes')
This issue seems to occur randomly. Most of the time it blocks pipeline execution, but occasionally the scripts run without any errors. The problem first appeared on Monday, March 24th, 2025.
Complete Traceback:
File "/home/azureuser/myagent/_work/5/s/llmops/common/prompt_pipeline.py", line 356, in prepare_and_execute
run = pf.run(
File "/home/azureuser/myagent/_work/_tool/Python/3.9.19/x64/lib/python3.9/site-packages/promptflow/azure/_pf_client.py", line 305, in run
return self.runs.create_or_update(run=run, **kwargs)
File "/home/azureuser/myagent/_work/_tool/Python/3.9.19/x64/lib/python3.9/site-packages/promptflow/_sdk/_telemetry/activity.py", line 265, in wrapper
return f(self, *args, **kwargs)
File "/home/azureuser/myagent/_work/_tool/Python/3.9.19/x64/lib/python3.9/site-packages/promptflow/azure/operations/_run_operations.py", line 187, in create_or_update
self.stream(run=run.name)
File "/home/azureuser/myagent/_work/_tool/Python/3.9.19/x64/lib/python3.9/site-packages/promptflow/_sdk/_telemetry/activity.py", line 265, in wrapper
return f(self, *args, **kwargs)
File "/home/azureuser/myagent/_work/_tool/Python/3.9.19/x64/lib/python3.9/site-packages/promptflow/azure/operations/_run_operations.py", line 641, in stream
available_logs = self._get_log(flow_run_id=run.name)
File "/home/azureuser/myagent/_work/_tool/Python/3.9.19/x64/lib/python3.9/site-packages/promptflow/azure/operations/_run_operations.py", line 543, in _get_log
return self._service_caller.caller.bulk_runs.get_flow_run_log_content(
File "/home/azureuser/myagent/_work/_tool/Python/3.9.19/x64/lib/python3.9/site-packages/azure/core/tracing/decorator.py", line 116, in wrapper_use_tracer
return func(*args, **kwargs)
File "/home/azureuser/myagent/_work/_tool/Python/3.9.19/x64/lib/python3.9/site-packages/promptflow/azure/_restclient/flow/operations/_bulk_runs_operations.py", line 973, in get_flow_run_log_content
raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: (UserError) Value cannot be null. (Parameter 'bytes') Code: UserError Message: Value cannot be null. (Parameter 'bytes')
Additional Information:
The issue started on March 24th, 2025. The environment uses Python 3.9.19, and all promptflow packages were updated to 1.17.2.
Request: We need assistance in identifying the root cause of this error and a potential fix or workaround to ensure the scripts run consistently without interruption.
We found out that the problem is related to streaming mode (stream=True in the scripts). With this parameter set, the SDK calls the /logContent API, which is not yet available at the time the call is made, so the request fails with a 400 error.
The workaround is to set stream=False in both scripts and add a polling mechanism that checks the status of the job. We implemented the following helpers and call them after the runs are submitted in the scripts:
import logging
import time

from promptflow.azure import PFClient
from promptflow.entities import Run

logger = logging.getLogger(__name__)


def is_job_completed(job: Run) -> bool:
    """
    Check if the job is completed.
    Returns:
        bool: True if the job is completed, False otherwise.
    """
    return job.status in ("Completed", "Finished")


def poll_job_status(
    pf_client: PFClient, job: Run, max_retries: int = 200, polling_interval: int = 15
) -> Run:
    """
    Poll the job status until it is completed or max retries are reached.
    Args:
        pf_client: PFClient instance
        job: Run instance
        max_retries: maximum number of polls before giving up
        polling_interval: seconds to wait between polls
    Returns:
        Run: the last job state retrieved
    """
    number_of_retries = 0
    while not is_job_completed(job) and number_of_retries < max_retries:
        logger.info(f"Job status: {job.status}")
        time.sleep(polling_interval)
        job = pf_client.runs.get(job.name)
        number_of_retries += 1
    if is_job_completed(job):
        logger.info("Job completed")
    else:
        logger.info(f"Max retries ({max_retries}) exceeded. Job not completed")
    return job
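To sanity-check the polling flow offline, the logic can be exercised against stand-in objects. The sketch below uses a condensed copy of the helper plus hypothetical FakeRun/FakeClient classes (not promptflow types) that simulate a job transitioning from "Running" to "Completed"; in the real scripts the call would be `run = pf.run(..., stream=False)` followed by `run = poll_job_status(pf, run)`.

```python
import time


def is_job_completed(job) -> bool:
    return job.status in ("Completed", "Finished")


def poll_job_status(pf_client, job, max_retries=200, polling_interval=15):
    retries = 0
    while not is_job_completed(job) and retries < max_retries:
        time.sleep(polling_interval)
        job = pf_client.runs.get(job.name)
        retries += 1
    return job


class FakeRun:
    """Stand-in for promptflow's Run: just a name and a status."""
    def __init__(self, name, status):
        self.name, self.status = name, status


class FakeRuns:
    """Reports 'Running' twice, then 'Completed', mimicking a live job."""
    def __init__(self):
        self._statuses = iter(["Running", "Running", "Completed"])

    def get(self, name):
        return FakeRun(name, next(self._statuses))


class FakeClient:
    def __init__(self):
        self.runs = FakeRuns()


final = poll_job_status(FakeClient(), FakeRun("demo", "Running"), polling_interval=0)
```

This confirms the loop keeps re-fetching the run until a terminal status is seen, rather than relying on the broken log-streaming endpoint.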
With this change, the pipelines now run reliably.