incubator-devlake

[Bug][Deployment Frequency] Consider as a success production pipeline when the deploy stage has never been executed successfully

Open estefaniasuasti opened this issue 1 year ago • 15 comments

Question

As you can see here, we had 4 deployments during April. However, when I checked the pipelines I discovered that "4244451" never passed the deploy stage, it only passed the "dry-run" stage. What would be the reason?

Screenshots

Screenshot 2024-02-22 at 2 31 03 PM (1) image (54) image image

estefaniasuasti avatar Feb 23 '24 16:02 estefaniasuasti

Hi @estefaniasuasti , may I have your version number?

Startrekzky avatar Feb 27 '24 15:02 Startrekzky

Good morning @Startrekzky, I am using version 21, beta5.

estefaniasuasti avatar Feb 27 '24 15:02 estefaniasuasti

Hi @d4x1 It looks like a bug to me. Could you follow up when you're available? Thank you.

Startrekzky avatar Feb 27 '24 16:02 Startrekzky

Sorry for my late reply.

@estefaniasuasti

After debugging and testing, I think the root cause is: image

The target branch nso/frankfurt matches the regex in your scope config, (?i)(^(nso/frankfurt)$), so the current pipeline is treated as a DevLake deployment even though its jobs failed.

You can verify it by checking your database:

  1. environment field in cicd_deployments should be empty (not PRODUCTION).
  2. type field in cicd_pipelines should be DEPLOYMENT.

FYI, I checked the code and found the following:

  1. In most cases, the data in the cicd_deployments table comes from cicd_pipelines and cicd_tasks.
  2. The status shown as passed on the GitLab pipeline web page is success in its API response.

d4x1 avatar Mar 05 '24 13:03 d4x1

Good morning @d4x1,

Just to clarify, I am not having an issue with the regex for the branch, "(?i)(^(nso/frankfurt)$)"; it is working fine. However, I am facing an issue with the regex for the job, "(?i)(fra-prod-deploy)", because it looks like DevLake ignores whether the job's result is "success".

I checked the parameters that you mention:

  1. "environment" field in "cicd_deployments": it is "Production", and I think that is correct because it is happening in the Production branch.
image
  2. "type" field in "cicd_pipelines" should be "DEPLOYMENT": it is "Deployment", but the result is "Success", which is not correct because the deployment job was never successfully executed.
image

Additional information: we are using that regex for the branch because the same repo has another branch called "nso/frankfurt-dev".
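To make the two patterns discussed here concrete, the following is a minimal sketch of how they behave on a few sample names. It uses Python's re module purely for illustration; DevLake itself is not written in Python, so its regex engine may differ in details. The patterns are copied from this thread, while the helper name and the sample branch and job names are assumptions.

```python
import re

# Patterns copied from this thread.
BRANCH_PATTERN = re.compile(r"(?i)(^(nso/frankfurt)$)")
JOB_PATTERN = re.compile(r"(?i)(fra-prod-deploy)")


def matches(pattern, value):
    """Return True if the pattern is found anywhere in value (hypothetical helper)."""
    return pattern.search(value) is not None


# Illustrative sample names, not taken from a real DevLake instance.
branch_results = {
    name: matches(BRANCH_PATTERN, name)
    for name in ["nso/frankfurt", "NSO/Frankfurt", "nso/frankfurt-dev"]
}
job_results = {
    name: matches(JOB_PATTERN, name)
    for name in ["fra-prod-deploy", "dry-run"]
}

print(branch_results)
print(job_results)
```

The anchors ^ and $ are what keep "nso/frankfurt-dev" from matching the branch pattern. The job pattern has no anchors, so it matches any job name that contains "fra-prod-deploy". Neither pattern says anything about whether the matched job succeeded.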

estefaniasuasti avatar Mar 05 '24 15:03 estefaniasuasti

  1. Is DevLake correctly capturing the status of the pipeline? In our case "Warning"

  2. Should we consider Pipeline status or job status to determine a "Success" deployment?

estefaniasuasti avatar Mar 05 '24 18:03 estefaniasuasti

  1. Is DevLake correctly capturing the status of the pipeline? In our case "Warning"

The pipeline's status comes from _raw_gitlab_api_pipeline_details.data, which contains a status field that you can check. In the GitLab API, there is no warning status. image I would also like to know its value in the API response.
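As a rough sketch of that check, assuming the data column holds the JSON body returned by the GitLab API, the status can be read like this. The payload below is a hypothetical, trimmed stand-in; only the status field is taken from this thread, and the function name is made up for illustration.

```python
import json

# Hypothetical, trimmed payload standing in for one value of
# _raw_gitlab_api_pipeline_details.data. Only "status" is taken from the thread;
# the other keys are illustrative.
raw_data = '{"id": 4244451, "ref": "nso/frankfurt", "status": "success"}'


def pipeline_status(raw):
    """Return the status field of a raw pipeline payload, or None if absent."""
    return json.loads(raw).get("status")


status = pipeline_status(raw_data)
print(status)
```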

  2. Should we consider Pipeline status or job status to determine a "Success" deployment?

Of course, only success or failed pipelines or jobs can be regarded as deployments. The key field is result in cicd_pipelines and cicd_tasks.

I would also like to know the values of cicd_pipelines.environment and cicd_tasks.environment (with pipeline_id = {your pipeline id}). Thank you.
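One way to pull those values is sketched below. It runs against an in-memory SQLite database with made-up rows so that it is self-contained; a real DevLake database would be queried through its own driver. The table names and the type, result, environment, and pipeline_id fields come from this thread. The id and name columns, the pipeline id value, and all sample rows are assumptions.

```python
import sqlite3

# In-memory stand-in for the DevLake database; all rows are made-up samples.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE cicd_pipelines (id TEXT, type TEXT, result TEXT, environment TEXT);
    CREATE TABLE cicd_tasks (pipeline_id TEXT, name TEXT, result TEXT, environment TEXT);
    INSERT INTO cicd_pipelines VALUES ('example-pipeline', 'DEPLOYMENT', 'SUCCESS', '');
    INSERT INTO cicd_tasks VALUES
        ('example-pipeline', 'dry-run', 'SUCCESS', ''),
        ('example-pipeline', 'fra-prod-deploy', 'FAILED', 'PRODUCTION');
    """
)

pipeline_id = "example-pipeline"  # replace with {your pipeline id}

# Pipeline-level fields discussed in the thread.
pipeline_row = conn.execute(
    "SELECT type, result, environment FROM cicd_pipelines WHERE id = ?",
    (pipeline_id,),
).fetchone()

# Task-level fields for every job of that pipeline.
task_rows = conn.execute(
    "SELECT name, result, environment FROM cicd_tasks "
    "WHERE pipeline_id = ? ORDER BY name",
    (pipeline_id,),
).fetchall()

print(pipeline_row)
for row in task_rows:
    print(row)
```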

d4x1 avatar Mar 06 '24 04:03 d4x1

  1. The value is success. image
  2. cicd_pipelines.result = "SUCCESS" and cicd_pipelines.environment = "" (empty)

image

The values of cicd_tasks.result and cicd_tasks.environment are shown below.

image

estefaniasuasti avatar Mar 06 '24 20:03 estefaniasuasti

  1. It seems GitLab regards it as success; we cannot fetch the warning status, so we can do nothing with it.
  2. The field values in cicd_tasks are what I expected.

In cicd_pipelines, it is marked "success" (although it shows a warning on the GitLab web page; I think this is the root cause, because it is not in fact a successful pipeline), and this pipeline contains a "PRODUCTION" task. So this pipeline becomes a PRODUCTION DEPLOYMENT.

Do you have any advice about how to fix it?

d4x1 avatar Mar 08 '24 07:03 d4x1

Hi @d4x1 ,

I think the missing part is to validate when the Environment is "Production" the Result should be "Success" in the cicd_tasks table for that specific job. In our case, the job that deploys to production is "fra-prod-deploy," and even if the other jobs fail, this is the indicator to say if it reached production or not. I think this is the only way to be 100% sure if the deployment happened in production despite the pipeline status.
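The rule proposed above can be sketched as follows. This is only an illustration of the suggestion, not DevLake's actual logic: the Task class and function name are hypothetical, and the field names mirror the cicd_tasks columns mentioned in this thread.

```python
import re
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical stand-in for one cicd_tasks row."""
    name: str
    result: str
    environment: str


# Job pattern copied from this thread.
DEPLOY_JOB = re.compile(r"(?i)(fra-prod-deploy)")


def is_successful_production_deployment(tasks):
    """True only if a job matching the deploy pattern is PRODUCTION and SUCCESS.

    The pipeline-level result is deliberately ignored, as proposed above.
    """
    return any(
        DEPLOY_JOB.search(t.name) is not None
        and t.environment == "PRODUCTION"
        and t.result == "SUCCESS"
        for t in tasks
    )


# Made-up samples: the deploy job failed in the first pipeline and
# succeeded in the second, where a different job failed instead.
deploy_failed = [
    Task("dry-run", "SUCCESS", ""),
    Task("fra-prod-deploy", "FAILED", "PRODUCTION"),
]
deploy_succeeded = [
    Task("dry-run", "FAILED", ""),
    Task("fra-prod-deploy", "SUCCESS", "PRODUCTION"),
]

print(is_successful_production_deployment(deploy_failed))
print(is_successful_production_deployment(deploy_succeeded))
```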

estefaniasuasti avatar Mar 08 '24 20:03 estefaniasuasti

@estefaniasuasti But a task can be PRODUCTION even if its status is FAILED. That means there is a failed production task.

I think the root cause is that this pipeline should not be marked as SUCCESS (both status and result). The API response is so weird!

d4x1 avatar Mar 12 '24 09:03 d4x1

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar May 12 '24 00:05 github-actions[bot]

This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter the similar problem in the future.

github-actions[bot] avatar May 19 '24 00:05 github-actions[bot]

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Aug 12 '24 00:08 github-actions[bot]

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Oct 14 '24 00:10 github-actions[bot]

This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter the similar problem in the future.

github-actions[bot] avatar Oct 21 '24 00:10 github-actions[bot]