agenta
[Bug] Errors when serving a non-functional variant due to null deployment
Describe the bug:
When a variant is served with code that fails to execute (e.g., because of a syntax error in Python), its deployment is set to null. This causes two problems:
- Launching the playground with a base that does not have a deployment results in an error because the deployment is missing.
- Attempting to overwrite the deployment for this variant fails with an error.
Steps to Reproduce:
- Create a variant containing code that will not execute properly, such as including a syntax error in Python.
- Serve the variant locally.
- Navigate to the playground -> observe the resulting error (base does not have a deployment) which is not user-friendly.
- Try to overwrite the variant -> a second error occurs.
Expected Behavior:
When encountering a non-functional variant, the playground should provide a clear and informative error message. Moreover, users should be able to overwrite the variant despite the initial issues without encountering further errors.
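The expected behavior above could look roughly like the following on the backend. This is only a sketch under assumptions: `Base`, `MissingDeploymentError`, and `playground_uri` are hypothetical names made up for illustration, not Agenta's actual models or functions.

```python
from dataclasses import dataclass
from typing import Optional


class MissingDeploymentError(Exception):
    """Raised when a base has no deployment to launch the playground against."""


@dataclass
class Base:
    # Hypothetical stand-in for the backend's base record.
    name: str
    deployment_uri: Optional[str] = None  # None models the "null deployment" case


def playground_uri(base: Base) -> str:
    """Return the URI the playground should call, or fail with a readable message."""
    if base.deployment_uri is None:
        raise MissingDeploymentError(
            f"Variant base '{base.name}' has no running deployment. "
            "The variant code probably failed to start (for example, a syntax error). "
            "Fix the code and serve the variant again."
        )
    return base.deployment_uri
```

The idea is that the null deployment is detected in one place and turned into a message the playground can show directly, instead of surfacing as an unexpected server error.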
The same problem occurs on cloud. It's part of the same issue.
Is it possible to add backend tests for this?
(agenta-py3.9) (base) mahmoudmabrouk@MacBook-Pro-3 baby_name_generator % agenta variant serve app.py
Checking and updating config file...
? This variant already exists. Do you want to overwrite it? Yes
Preparing code base app into a tar file...
Building code base app for app.default into a docker image...
Updating app to server...
Error while updating variant: Request to update app_variant failed with status code 500 and error message: {'detail': 'Unexpected error while trying to update the app variant: Failed to start Docker container for app variant babty/app.default \n Error while creating and starting Lambda function: An error occurred (InvalidParameterValueException) when calling the CreateFunction operation: Source image 412267349317.dkr.ecr.eu-central-1.amazonaws.com/agenta/llm-apps/654e742f0cc87615c0a757d1-app:latest does not exist. Provide a valid source image.'}.
https://agenta-ai.sentry.io/issues/4621410780/?referrer=digest-slack&notification_uuid=444176a4-9208-4eff-a68e-11ef1ab441ec&alert_rule_id=14774441&alert_type=issue
https://agenta-ai.sentry.io/issues/4621410781/?referrer=digest-slack&notification_uuid=444176a4-9208-4eff-a68e-11ef1ab441ec&alert_rule_id=14774441&alert_type=issue
Is it possible to add backend tests for this?
Yes, we'll need to include unit and integration tests for the CLI as well. I was planning to bring it up in our next standup (on Monday).
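As a starting point for that discussion, a backend unit test for the overwrite case might be shaped like this. It is a sketch only: `VariantStore` and `overwrite_variant` are hypothetical in-memory stand-ins, not the real backend code, and a real test would exercise the actual update path.

```python
import unittest
from typing import Dict, Optional

# Hypothetical in-memory stand-in for the variant store, not Agenta's data model.
# Maps variant name -> deployment id; None models the "null deployment" case.
VariantStore = Dict[str, Optional[str]]


def overwrite_variant(store: VariantStore, name: str, new_deployment: str) -> None:
    """Replace a variant's deployment, tolerating a previous null deployment."""
    old = store.get(name)
    if old is not None:
        # A real backend would stop and remove the old deployment here.
        pass
    store[name] = new_deployment


class OverwriteVariantTest(unittest.TestCase):
    def test_overwrite_when_deployment_is_null(self):
        # State left behind by a variant whose code failed to start.
        store: VariantStore = {"app.default": None}
        overwrite_variant(store, "app.default", "deployment-2")
        self.assertEqual(store["app.default"], "deployment-2")

    def test_overwrite_existing_deployment(self):
        store: VariantStore = {"app.default": "deployment-1"}
        overwrite_variant(store, "app.default", "deployment-2")
        self.assertEqual(store["app.default"], "deployment-2")
```

Saved to a file, this can be run with `python -m unittest <file>`.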
To replicate this bug, I had to run the cloud development compose and modify the CLI to use agenta cloud on localhost. As a result, I found the following:
Findings:
- I served a broken variant with a syntax error, and its deployment was successful. A deployment document was generated for it, as shown in the first screenshot.
- Additionally, I attempted to overwrite the same broken variant, and the deployment was also successful, as seen in the second screenshot.
Screenshots:
Question:
- How can we identify the variant error in the docker container? At present, when accessing the variant container's openapi.json endpoint, you are redirected to a Next.js 404 page. It would be helpful to have an additional exposed endpoint or some other means to obtain the error originating from the variant backend container.
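One possible direction for the question above, sketched with the standard library only: the container could try to import the variant's entry file at startup, keep the traceback if that fails, and return it from a diagnostics endpoint. `load_variant` and `startup_status` are hypothetical names, and the endpoint itself is an assumption, not something Agenta exposes today.

```python
import importlib.util
import traceback
from types import ModuleType
from typing import Optional, Tuple


def load_variant(path: str) -> Tuple[Optional[ModuleType], Optional[str]]:
    """Try to import the variant's entry file.

    Returns (module, None) on success or (None, traceback text) on failure.
    """
    spec = importlib.util.spec_from_file_location("variant_app", path)
    if spec is None or spec.loader is None:
        return None, f"Cannot load a Python module from {path!r}"
    module = importlib.util.module_from_spec(spec)
    try:
        spec.loader.exec_module(module)
    except Exception:  # includes SyntaxError raised while compiling the file
        return None, traceback.format_exc()
    return module, None


def startup_status(path: str) -> dict:
    """Payload that a hypothetical diagnostics endpoint could return."""
    _, error = load_variant(path)
    return {"status": "ok" if error is None else "failed", "error": error}
```

With something like this in place, the playground or CLI could show the stored traceback instead of the 404 page.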
Update:
- Fixed: variant overwriting.
- Not fixed: make the error message more tolerant to syntax errors.