incubator-devlake

[Question][CI/CD] How can we stop auto collection of CI/CD metrics?

Open thiDucTran opened this issue 1 year ago • 11 comments

Question

Hi, what is the correct config to use if we do not want to automatically collect deployment/build/etc. data? Referencing https://devlake.apache.org/docs/DataModels/DevLakeDomainLayerSchema#data-models, I have tried not selecting CI/CD in my data scope config, but deployment data is still being collected. Using v1beta5 with the Azure DevOps Go connection.

FYI: I also asked this in Slack. See https://devlake-io.slack.com/archives/C03APJ20VM4/p1714025708040129

Screenshots

image image

thiDucTran avatar Apr 25 '24 07:04 thiDucTran

@thiDucTran Hi, by design, previously collected data is not deleted when entities are deselected; the related subtasks are simply skipped. Please check whether the collectApiBuilds subtask appears in the pipeline plan JSON for the plugin.
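
The check described above can be scripted. This is a minimal sketch, assuming the plan JSON has been copied out of the UI and contains task objects with `plugin` and `subtasks` keys, possibly nested inside lists. The function name `find_cicd_subtasks` and the keyword list `CICD_HINTS` are hypothetical helpers for illustration; they are not part of DevLake, and the keywords are a guess at what a CI/CD-related subtask name might contain.

```python
import json

# Assumption: substrings that suggest a CI/CD-related subtask name.
# This list is illustrative and is not taken from the plugin source.
CICD_HINTS = ("build", "timeline", "deployment", "pipeline")


def find_cicd_subtasks(plan, plugin="azuredevops_go"):
    """Return subtask names of `plugin` whose name contains a CI/CD hint."""
    found = []
    if isinstance(plan, list):
        # The plan may nest tasks inside lists, so walk it recursively.
        for item in plan:
            found.extend(find_cicd_subtasks(item, plugin))
    elif isinstance(plan, dict) and plan.get("plugin") == plugin:
        for name in plan.get("subtasks", []):
            if any(hint in name.lower() for hint in CICD_HINTS):
                found.append(name)
    return found


if __name__ == "__main__":
    sample = json.loads("""
    [
      {
        "plugin": "azuredevops_go",
        "subtasks": ["collectAccounts", "collectApiPullRequests", "collectApiBuilds"]
      }
    ]
    """)
    print(find_cicd_subtasks(sample))
```

An empty result would suggest that no CI/CD subtask matching these keywords is scheduled for the plugin; it says nothing about rows that were collected by earlier runs.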

klesh avatar Apr 26 '24 03:04 klesh

I don't see collectApiBuilds

    [
      {
        "plugin": "azuredevops_go",
        "subtasks": [
          "collectAccounts",
          "collectApiPullRequests",
          "convertRepo",
          "extractAccounts",
          "convertAccounts",
          "extractApiPullRequests",
          "collectApiPullRequestCommits",
          "convertApiBuilds",
          "convertApiPullRequests",
          "convertPrLabels",
          "extractApiPullRequestCommits",
          "convertApiPullRequestsCommits",
          "convertApiTimelineRecords"
        ],

thiDucTran avatar Apr 26 '24 06:04 thiDucTran

I don't know if this is an issue, or whether it is a separate issue that needs its own GitHub issue, but I'm sharing it again from the Slack thread.

Whenever I do a new pipeline run, I see that updated_at for all of the pipeline runs is changed to the same time (see the before and after pictures).

Also, what is this data used for? I don't think it is used to calculate DORA, because when I go to the DORA - Deployment frequency dashboard, it's empty (as expected). So there seems to be a difference between the pipeline runs that you see in the Azure DevOps dashboard and the deployments that you would see in the DORA dashboards.

image image

thiDucTran avatar Apr 26 '24 06:04 thiDucTran

Weird, why is there a convertApiBuilds in the subtasks list? It does look like a bug. Would you like to file it as a separate issue so we can look into it?

klesh avatar Apr 28 '24 03:04 klesh

@thiDucTran The fix is already in effect in v1.0.0-beta6

abeizn avatar Apr 28 '24 10:04 abeizn

The issue is not resolved for me in v1.0.0-beta6. CI/CD metrics still get collected. I even deleted the project, purged the scope's data, re-created the project, made sure CI/CD is not in my scope config, and collected data, and I still see CI/CD metrics in the Azure DevOps dashboard.

        "plugin": "azuredevops_go",
        "subtasks": [
          "collectAccounts",
          "collectApiPullRequests",
          "convertRepo",
          "extractAccounts",
          "convertAccounts",
          "extractApiPullRequests",
          "collectApiPullRequestCommits",
          "convertApiPullRequests",
          "convertPrLabels",
          "extractApiPullRequestCommits",
          "convertApiPullRequestsCommits",
          "convertApiTimelineRecords"
        ],

I think part of the issue is that the MySQL data is not really removed. I still see records like this after deleting the project, clearing the data scope's historical data, and even removing the data scope:

    SELECT
      *
    FROM
      cicd_pipelines

image

thiDucTran avatar Apr 29 '24 02:04 thiDucTran

@thiDucTran What is the value of your environment variable ENABLE_SUBTASKS_BY_DEFAULT?

abeizn avatar Apr 29 '24 02:04 abeizn

@thiDucTran That is weird, I don't see any related subtasks in the list. Can you check the raw tables and see if the data gets purged?

klesh avatar Apr 29 '24 02:04 klesh

I edited my previous comment. It seems the data is not purged.

thiDucTran avatar Apr 29 '24 02:04 thiDucTran

To be investigated.

klesh avatar Apr 29 '24 02:04 klesh

Hi, would I need to create another issue for the table cicd_deployment_commits? I have deleted the project and data scope, but data of the deleted project is still there in cicd_deployment_commits. Using 1.0.0-beta9.

Edit: I am also seeing unpurged data in the table cicd_deployments. There could be other tables with unpurged data as well.
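
To see which of these tables still hold rows, one could generate a row-count query per table and run each in a MySQL client. This is a rough sketch only: the helper `leftover_count_queries` is hypothetical, and grouping by a `_raw_data_params` column is an assumption about the schema that should be verified against the actual tables before relying on it.

```python
# Tables named in this thread as still holding rows after the purge.
TABLES = ["cicd_pipelines", "cicd_deployments", "cicd_deployment_commits"]


def leftover_count_queries(tables=TABLES, group_column="_raw_data_params"):
    """Build one row-count query per table, grouped by the assumed origin column."""
    queries = []
    for table in tables:
        # Identifiers come from the fixed list above, not from user input,
        # so plain string formatting is acceptable in this sketch.
        queries.append(
            f"SELECT {group_column}, COUNT(*) AS row_count "
            f"FROM {table} GROUP BY {group_column}"
        )
    return queries


if __name__ == "__main__":
    for query in leftover_count_queries():
        print(query + ";")
```

If a group that belongs to the deleted scope still shows a non-zero count, that table was not purged.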

thiDucTran avatar May 29 '24 07:05 thiDucTran

This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in next 7 days if no further activity occurs.

github-actions[bot] avatar Jul 29 '24 00:07 github-actions[bot]

This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter the similar problem in the future.

github-actions[bot] avatar Aug 06 '24 00:08 github-actions[bot]