CA - Last Mile Failures
Problem statement
Hello, there were a few last mile failures for California. Please see screenshot for details.
What you need to know
Acceptance criteria
To do
- Resubmit files
Hey @victor-chaparro, could you please take a look to see what is causing the failures?
update
More last mile failures from CA have appeared
These errors are showing up because the files we send are being flagged as duplicates. We recently upgraded CA to use REST instead of SFTP, and we were sending duplicate files from both receivers. I've turned off their SFTP receiver, so we should no longer see these errors. There's no need to resend the files since they are duplicates.
@victor-chaparro just to confirm, is this still the case? I see more last mile failures coming through (screenshot attached below). If these are still duplicates and no action is needed, can we close the ticket? Please provide the receiver information so we can ignore these going forward.
We've determined that a sender in SimpleReport is sending duplicate reports. I've contacted SimpleReport about the issue so they can reach out to the sender. Here's the thread: https://nava.slack.com/archives/C0411VC78DN/p1725896603292609
Leaving this open because we are still seeing duplicate errors for CA. We thought it would get fixed by adjusting their limit. Keeping this open until the duplicate issue is solved.
The number of duplicates rejected by CA over the past week has dropped significantly (to about 2-3 a day). We are still waiting for a response from Manifest regarding their investigation into this issue on their side.
For the time being, we have capped our file limit at 400 items per send/file. This seems to have fixed the majority of the issue with CA experiencing a high level of duplicates.
However, to completely solve the issue, either:
- Manifest needs to update the connection to allow more than 400 items in a short time period
- OR ReportStream needs to update OUR settings to send less frequently (once per 10 min or so)
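For illustration only, the two levers above (a 400-item cap per file and a longer gap between sends) could be sketched roughly as follows. This is not ReportStream's actual batching code: `chunk_items`, `plan_sends`, and the 600-second interval are hypothetical names and values chosen to mirror the numbers in this ticket.

```python
from typing import List, Sequence, Tuple

MAX_ITEMS_PER_FILE = 400     # cap mentioned in this ticket
SEND_INTERVAL_SECONDS = 600  # assumed: "once per 10 min or so"


def chunk_items(items: Sequence[str], max_items: int = MAX_ITEMS_PER_FILE) -> List[List[str]]:
    """Split items into consecutive batches of at most max_items each."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return [list(items[i:i + max_items]) for i in range(0, len(items), max_items)]


def plan_sends(
    items: Sequence[str],
    max_items: int = MAX_ITEMS_PER_FILE,
    interval_seconds: int = SEND_INTERVAL_SECONDS,
) -> List[Tuple[int, List[str]]]:
    """Return (offset_seconds, batch) pairs, one batch per interval."""
    batches = chunk_items(items, max_items)
    return [(index * interval_seconds, batch) for index, batch in enumerate(batches)]
```

Under these assumptions, 1,000 queued items would go out as three files (400, 400, and 200 items) spaced ten minutes apart, so the receiver never sees more than 400 items in a short window.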
These items on this ticket were identified as duplicates and do not need to be resubmitted.
However, we will keep this ticket open until we have another conversation with CA to determine next steps.
@victor-chaparro will ping Manifest via email for an update on the issue from their side.
Still no response from Manifest on this. Will continue to monitor.
Still no response from Manifest on this issue. Ticket was created on the Platform board for refinement today: #17706
This will add a small timeout every time we send reports, which may solve this issue regardless of Manifest's response.
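As a rough sketch of the idea (not the implementation tracked in #17706), a pause after each send might look like the following. `send_with_pause`, its parameters, and the one-second default are hypothetical; the `sleep` argument is injectable only so the behavior can be checked without actually waiting.

```python
import time
from typing import Callable, Iterable


def send_with_pause(
    files: Iterable[str],
    send: Callable[[str], None],
    pause_seconds: float = 1.0,  # assumed value; the ticket only says "small"
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Send each file in order, pausing after every send. Returns the count sent."""
    sent = 0
    for file in files:
        send(file)
        sent += 1
        sleep(pause_seconds)  # give the receiver time before the next file arrives
    return sent
```

A fixed pause is the simplest option; whether it is enough depends on how the receiving side decides a file is a duplicate, which is still under investigation.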
@Jcavallo7 We received a response from Manifest today via email. They are creating a script to help identify the problem that should be completed today. Then they would follow up with us regarding next steps.
Received another response from Manifest today:
The test is still in progress. We’re currently working on a script to simulate Report Stream messages to SaPHIRE and are aiming to complete our troubleshooting by the end of the day today.
Victor meeting with Manifest today to help reproduce the error
ReportStream is working on #17744, which MAY temporarily reduce the potential for this error on our side. However, that is not a long-term solution, and we think the root cause is likely on Manifest's side. We will continue to work with Manifest to solve this issue and reach a long-term solution.
Manifest has tried to reproduce the error without success. Victor has confirmed that RS does produce an error during testing.
Victor to follow up with Manifest to let them know RS is receiving errors and ask if they can share their script so we can determine if it may be missing something that would trigger the error.
Victor continuing to work with Manifest on reproducing the error via a script.
Platform team continuing to work on #17744
Platform moving forward with deploying #17744 this week.
Victor reaching back out to Manifest this week to work on their script.