msexports_ExecuteETL pipeline fails for empty files

Open NicholasBrand opened this issue 8 months ago • 5 comments

🐛 Problem

The ADF pipeline "msexports_ExecuteETL" fails with the following error when processing reservation transaction data. (I have obscured the storage account name, billing account, and billing profile below.)

_Operation on target Get Existing Parquet Files failed: ADLS Gen2 operation failed for: Operation returned an invalid status code 'NotFound'. Account: 'STORAGEACCOUNTNAMEHERE'. FileSystem: 'ingestion'. Path: 'Transactions/2025/04/providers/microsoft.billing/billingaccounts/BILLINGACCOUNTID/billingprofiles/BILLINGPROFILEID'. ErrorCode: 'PathNotFound'. Message: 'The specified path does not exist.'. RequestId: '0ce578c0-101f-0088-5675-b4721f000000'. TimeStamp: 'Wed, 23 Apr 2025 17:27:37 GMT'._

The exported reservation transaction CSV had no records (just the column headers). No directory structure was created in the ingestion container for reservation transactions. This is with an MCA agreement.

👣 Repro steps

  1. Export reservation transaction data to the storage account.
  2. Review ADF pipeline monitoring.
  3. msexports_ExecuteETL shows the error above.
  4. In the ingestion container, no transaction directory has been created.
  5. The reservation transaction CSV is still in the msexports container.

🤔 Expected

When transaction data is exported into the msexports container, the ADF pipeline should run without errors, even if the export contains no records.

NicholasBrand avatar Apr 24 '25 18:04 NicholasBrand

Get Existing Parquet Files is expected to fail when there's no data previously exported. That can be ignored.

If there's no data, the pipeline should complete successfully, but without moving anything around since there's no data. (Or so I thought 🤔) Are any other activities in that pipeline failing?

#needs-info

flanakin avatar Apr 25 '25 07:04 flanakin
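As a rough illustration of the point above (a missing path only means nothing was exported before), here is a minimal Python sketch. It uses the local filesystem as a stand-in for the ingestion container, and `list_existing_parquet` is a hypothetical helper name; it is not how the ADF activity is implemented.

```python
from pathlib import Path


def list_existing_parquet(folder: Path) -> list[str]:
    """Return the parquet file names in `folder`, or [] if it does not exist.

    Assumption: a local directory stands in for the ingestion path.
    """
    if not folder.is_dir():
        # Nothing was ingested before: report "no existing files"
        # instead of raising a not-found error.
        return []
    return sorted(p.name for p in folder.glob("*.parquet"))
```

With this shape, a first-time export and an export with no rows both see an empty list rather than a failure.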

This is the result of a pipeline run that failed:

[Two images attached]

NicholasBrand avatar Apr 25 '25 16:04 NicholasBrand

@NicholasBrand the CSV file conversion is the issue. Can you grab some time on my calendar to walk thru this together? Please include a link to this issue for context. You can also create a UAT to get more direct help from the CSU Infra team, if needed.

flanakin avatar Apr 27 '25 20:04 flanakin

Turns out the issue here is that we are failing to check the file when there are no rows in it. While it is an error that we need to resolve, it's not impacting anything.

flanakin avatar May 23 '25 01:05 flanakin

To the user, it is still a failed pipeline run that they then need to investigate and spend time on. It happened to me this week. I'd also vote for it not to end in a failed state: files with dataRowCount = 0 should let the pipeline run complete successfully.

bmargula avatar Sep 12 '25 11:09 bmargula
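For anyone wanting to reason about the zero-row case discussed above, here is a minimal Python sketch of a header-only check. It is only an illustration of the idea, not the pipeline's actual logic; the function names `count_data_rows` and `should_convert` are hypothetical, and the sketch assumes the first line of the export is always a header row.

```python
import csv
import io


def count_data_rows(csv_text: str) -> int:
    """Count the rows after the header, ignoring blank lines."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row, if there is one
    return sum(1 for row in reader if row)


def should_convert(csv_text: str) -> bool:
    """Skip conversion (rather than fail) when there are no data rows."""
    return count_data_rows(csv_text) > 0
```

A run that finds `should_convert(...)` false could then end as succeeded with nothing ingested, which is the behavior requested in the last comment.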