
assertWithAI insists it's 2023

Open owens-ben opened this issue 2 months ago • 8 comments

Is there an existing issue for this?

  • [x] I have searched the existing issues and didn't find mine.

Steps to reproduce

Using this command

  • assertWithAI: "Check date and time matches ${output.APPOINTMENT_TIME_STRING}"

Actual results

leads to a false negative like this.

 ║    ⚠️   Assert with AI: Check date and time matches Friday, Oct 31st 10:30 AM - 11:00 AM (warned)
 ║         Warning:
 ║           Assertion "Check date and time matches Friday, Oct 31st
 ║           10:30 AM - 11:00 AM" failed:
 ║           The date in the screenshot is Friday, Oct 31st, but October 31st
 ║           is not a Friday. In 2023, October 31st falls on a Tuesday.

I understand this is a result of the training data cutoff. However, can't the model be trained to be agnostic about the current year, rather than believing it's permanently stuck in 2023? Even a workaround would be helpful, but I can't even say "ignore the year" without the AI fighting me on it and saying "no, it's 2023!!!!"

Expected results

The AI either detects the current year or is trained not to assume it's 2023.

About app

React Native iOS app

About environment

M4 MacBook Pro, macOS Tahoe

Logs

                                                                
 ║      
 ║  > Flow: appt-card-future                                     
 ║                                                               
 ║    ⚠️   Assert with AI: Check date and time matches undefined (agnostic of whether the day of week matches during the current year) (warned)                                                     
 ║         Warning:
 ║           Assertion "Check date and time matches undefined (agnostic of whether the day                                        
 ║           of week matches during the current year)" failed:
 ║           The date and time shown in the screenshot is 'Friday, Oct 31st                                                       
 ║           10:30 - 11:00 AM'. However, October 31st does not fall on a                                                          
 ║           Friday in the current year. Therefore, the assertion that the date and
 ║           time match is false.                                
 ║                                                               

Maestro version

2.0.5

How did you install Maestro?

install script (https://get.maestro.mobile.dev)

Anything else?

No response

owens-ben avatar Oct 30 '25 14:10 owens-ben

MAE-319

linear[bot] avatar Oct 30 '25 14:10 linear[bot]

🤦

 ║    ⚠️   Assert with AI: The displayed date and time loosely match 'Friday, Oct 31st 11:00 - 11:30 AM'. (warned)
 ║         Warning:
 ║           Assertion "The displayed date and time loosely match 'Friday, Oct 31st
 ║           11:00 - 11:30 AM'." failed:
 ║           The displayed date is 'Friday, Oct 31st', but October 31st does not
 ║           fall on a Friday in any recent or upcoming year. Therefore, the
 ║           assertion about the date being 'Friday, Oct 31st' is incorrect.

owens-ben avatar Oct 30 '25 15:10 owens-ben

More examples of how it's basically impossible to get around this:

 ║    ⚠️   Assert with AI: The displayed date time is Friday, Oct 31st 11:30 - 12:00 PM 2025 (but without the year shown) (warned)
 ║         Warning:
 ║           Assertion "The displayed date time is Friday, Oct 31st
 ║           11:30 - 12:00 PM 2025 (but without the year shown)" failed:
 ║           The assertion is false because October 31st, 2025, is not a Friday.
 ║           It is a Wednesday. Therefore, the date and day do not match
 ║           the assertion.

owens-ben avatar Oct 30 '25 15:10 owens-ben

Yeah, AI hallucinates stuff. assertWithAI intentionally defaults to optional: true to ensure human review for exactly this reason.

In this specific example, I'd probably not ask an AI model for the current date, but instead provide the date in some format.

- assertWithAI: "Check that the current date and time (${new Date().toString()}) matches ${output.APPOINTMENT_TIME_STRING}"
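
For the weekday part specifically, a deterministic check avoids relying on the model's sense of the calendar at all. The following is only a minimal sketch in plain JavaScript; `weekdayOf` is a hypothetical helper written for illustration, not a Maestro API, and wiring it into a flow is left as an assumption.

```javascript
// Hypothetical helper (not part of Maestro): returns the weekday name for a
// calendar date. It works in UTC so the runner's time zone cannot shift the day.
const WEEKDAYS = ['Sunday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday'];

function weekdayOf(year, month, day) {
  // month is 1-based here (10 = October); Date.UTC expects a 0-based month.
  return WEEKDAYS[new Date(Date.UTC(year, month - 1, day)).getUTCDay()];
}

console.log(weekdayOf(2025, 10, 31)); // Friday
console.log(weekdayOf(2023, 10, 31)); // Tuesday
```

The two calls line up with the logs in this issue: October 31st is a Friday in 2025, and a Tuesday only if the year is assumed to be 2023.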

Fishbowler avatar Nov 10 '25 13:11 Fishbowler

Times seem to totally break the assertions, regardless of how you try to get around them.

 ║           The assertion is false because the current time shown in the screenshot
 ║           is 9:53, which is more than 30 minutes before the appointment time
 ║           of 10:00 AM. Therefore, the appointment is not imminent (less than 30
 ║           minutes away).

owens-ben avatar Nov 25 '25 15:11 owens-ben

It also makes assertNoDefectsWithAI completely worthless if there happens to be a date on the screen.

 ║    ⚠️   Assert no defects with AI (warned)
 ║         Warning:
 ║           Found 1 possible defect:
 ║           - The date 'Tuesday, Nov 25th' is incorrect as November 25th, 2023,
 ║           is a Saturday.

owens-ben avatar Nov 25 '25 15:11 owens-ben

Even when it says the assertion is true within the message itself, it throws a warning...

 ║    ⚠️   Assert with AI: The app displays the appointment card, with a date and time that is anything BUT Wednesday, Nov 26th 12:00 - 12:30 PM. (warned)
 ║         Warning:
 ║           Assertion "The app displays the appointment card, with a date and time
 ║           that is anything BUT Wednesday, Nov 26th
 ║           12:00 - 12:30 PM. " failed:
 ║           The appointment card displays the date and time as Thursday, Nov 27th
 ║           2:00 - 2:30 AM, which is not Wednesday, Nov 26th 12:00 -
 ║           12:30 PM. Therefore, the assertion is true.

It's far too wrong to be blamed on hallucinations. It truly does not handle assertions properly.

owens-ben avatar Nov 25 '25 17:11 owens-ben

That last one is clearly bananas 😂

Those commands are experimental, and clearly still imperfect. Thanks for reporting this.

Fishbowler avatar Dec 01 '25 10:12 Fishbowler