
Question regarding easy caching approach

Open cyberblast opened this issue 1 year ago • 13 comments


Obviously, loading the whole NVD database for every pipe run is a bad idea. So I thought about how to improve it without requiring too much effort or even hosting costs.

Then I came across the --data CLI argument. Using that, we could easily use Azure DevOps Cache task to cache/restore the data.

But I'm wondering if that is a valid approach, as the description for the argument says: "This option should generally not be set." Also that approach is suggested nowhere.

Anyway, I started to implement it like this, but unfortunately I'm unable to test it currently due to some issues on NVD API side of things (HTTP 503).

Any idea if that should work or not at all or if there is any reason why it should not be done like this?

- task: Cache@2
  displayName: ODC NVD Database Cache
  inputs:
    key: 'ODCNVD | "$(Agent.OS)"'
    path: $(Pipeline.Workspace)/odc/data
- task: dependency-check-build-task@6
  displayName: 'OWASP Dependency Check'
  continueOnError: ${{ parameters.warningOnly }}
  inputs:
    projectName: ${{ parameters.projectName }}
    scanPath: ${{ parameters.scanPath }}
    format: ${{ parameters.format }}
    enableVerbose: ${{ parameters.verbose }}
    failOnCVSS: ${{ parameters.cvssThreshold }}
    warnOnCVSSViolation: ${{ parameters.warningOnly }}
    additionalArguments: --nvdApiKey <secret> --data $(Pipeline.Workspace)/odc/data ${{ parameters.additionalArguments }}

cyberblast avatar Nov 23 '23 11:11 cyberblast

Yes, several people use this option. They build a database on one node, save the database, and then copy the DB to any other node that is running ODC and simply use the --noupdate option. I suppose the docs should be updated for this use case.
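
A minimal sketch of that split, assuming a shared data directory: one command refreshes the database, the other scans against a copy without contacting NVD. `ODC_BIN`, `DATA_DIR` and the function names are placeholders made up for illustration; the functions only print the command lines, and a real update would normally also pass an NVD API key.

```shell
#!/usr/bin/env bash
# Sketch only: ODC_BIN and DATA_DIR are hypothetical placeholders, not values from this thread.
ODC_BIN="${ODC_BIN:-dependency-check.sh}"
DATA_DIR="${DATA_DIR:-/tmp/odc/data}"

# Command line for the single node that refreshes the database.
odc_update_cmd() {
  printf '%s --updateonly --data %s\n' "$ODC_BIN" "$DATA_DIR"
}

# Command line for every other node: scan against the copied database, no update.
odc_scan_cmd() {
  local scan_path="$1"
  printf '%s --noupdate --data %s --scan %s\n' "$ODC_BIN" "$DATA_DIR" "$scan_path"
}

odc_update_cmd
odc_scan_cmd ./src
```

The point of the design is that only the update node talks to the NVD API, while every scanning node reads the same prepared data directory.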

jeremylong avatar Nov 23 '23 11:11 jeremylong

great 😄 thank you for the confirmation

cyberblast avatar Nov 23 '23 11:11 cyberblast

In @cyberblast's example, however, --noupdate is not used, so it always downloads the data. And if --noupdate were used, I would not be sure whether the cache contains an up-to-date DB. There should be a daily triggered pipeline that only updates the cache.

pippolino avatar Nov 24 '23 22:11 pippolino

I added --nvdValidForHours with a high value instead. I agree, a dedicated update pipe may be more reliable for concurrency reasons. But in general I guess it should work already.

cyberblast avatar Nov 25 '23 07:11 cyberblast

It seems I forgot about the Azure Cache task's hard-wired and strict scoping, which makes sharing a cache between pipes impossible 😞 I will look into setting up a dedicated NVD DB update pipe instead...

cyberblast avatar Nov 26 '23 09:11 cyberblast

Hi, I'm not 100% sure how much sense it makes currently, as ODC is down due to NVD issues, but I'd like to share an implementation approach: run a dedicated pipeline to update the NVD database and use its output in other pipes for ODC execution.

Anyway, I believe it may be a good starting point for anybody implementing a similar approach.

NVD Update Pipe

parameters:
- name: purge
  displayName: Purge Database
  type: boolean
  default: false
- name: verbose
  displayName: Verbose
  type: boolean
  default: false
- name: nvdValidForHours
  displayName: NVD valid for hours
  type: number
  default: 23
- name: additionalArguments
  displayName: Additional arguments
  type: string
  default: ' '
- name: nvdApiKey
  displayName: NVD API key
  type: string

trigger: none
schedules:
- cron: '0 0 * * *'
  displayName: Daily midnight run
  branches:
    include:
    - master
stages:
- stage: update_odc_nvd
  pool:
    name: Azure Pipelines
    vmImage: ubuntu-latest
  jobs:
  - job: build
    workspace:
      clean: outputs
    displayName: Update NIST NVD
    steps:
    - checkout: none
    - task: Cache@2
      displayName: ODC Cache
      inputs:
        key: 'ODC | "$(Agent.OS)"'
        path: $(Pipeline.Workspace)/odc/app
    - task: Cache@2
      displayName: NVD Cache
      inputs:
        key: 'NVD | "$(Agent.OS)"'
        path: $(Pipeline.Workspace)/odc/data
    - bash: |
        set -x # echo on
        VERSION=$(curl -s

        # Download and unpack ODC only when this version is not cached yet
        if [ ! -d "$(Pipeline.Workspace)/odc/app/$VERSION" ]; then
          rm -rf $(Pipeline.Workspace)/odc/app/*
          mkdir -p $(Pipeline.Workspace)/odc/app/$VERSION
          curl -Ls "$VERSION/dependency-check-$" --output
          unzip -uq ./ -d $(Pipeline.Workspace)/odc/app/$VERSION
        fi

        $(Pipeline.Workspace)/odc/app/$VERSION/dependency-check/bin/ --updateonly --nvdApiKey ${{ parameters.nvdApiKey }} --data $(Pipeline.Workspace)/odc/data --nvdValidForHours ${{ parameters.nvdValidForHours }} $PURGE ${{ parameters.additionalArguments }}
      displayName: Update NVD
      env:
        ${{ if eq( parameters.purge, true ) }}:
          PURGE: '--purge'
        ${{ else }}:
          PURGE: ''
    - task: ArchiveFiles@2
      displayName: Compress NVD Artifact
      inputs:
        rootFolderOrFile: '$(Pipeline.Workspace)/odc/data'
        includeRootFolder: false
        archiveFile: '$(Build.ArtifactStagingDirectory)/'
    - task: PublishPipelineArtifact@1
      displayName: Publish NVD Artifact
      inputs:
        targetPath: '$(Build.ArtifactStagingDirectory)/'
        artifact: 'NVD'
        publishLocation: 'pipeline'

Please be aware that I'm still not 100% sure if this code works well, as NVD DB is currently unavailable. Also, your API Key gets exposed to the logs.
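
One way to reduce that exposure, sketched under assumptions: read the key from an environment variable and switch off command echoing before the key is used. The variable name `NVD_API_KEY` is made up here (it would have to be mapped from a secret pipeline variable), and `ODC_BIN` is a placeholder for the path to the ODC launcher script.

```shell
#!/usr/bin/env bash
# Sketch only: NVD_API_KEY and ODC_BIN are hypothetical names, not taken from the pipeline above.
run_update() {
  local data_dir="$1"
  # Refuse to run without a key instead of silently sending an empty one.
  if [ -z "${NVD_API_KEY:-}" ]; then
    echo "NVD_API_KEY is not set" >&2
    return 1
  fi
  # Stop echoing commands before the key appears on a command line,
  # even if the calling script enabled `set -x` earlier.
  set +x
  "${ODC_BIN:-dependency-check.sh}" --updateonly --nvdApiKey "$NVD_API_KEY" --data "$data_dir"
}
```

This only keeps the key out of the echoed script output; it is still passed as a command-line argument to the process.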

Also, this code doesn't make use of the Azure DevOps Marketplace task "dependency-check-build-task@6". I started with it, but you need to add additional parameters and other mandatory parameters of the task are not needed at all for this use case. So I decided to get rid of it eventually.

cyberblast avatar Nov 28 '23 00:11 cyberblast

Ciao @cyberblast, the pipeline seems correct to me, but then you need to download the artifact in all pipelines. Why don't you use a database for storage directly, as mentioned here? I'm trying to do that right now: I use a dedicated pipeline that updates the NVD data with the Maven plugin, and then everything is ready for all client pipelines.

pippolino avatar Nov 28 '23 08:11 pippolino

Hi @pippolino, thank you for the suggestion. Yes, that sounds like a reasonable idea. However, it also means additional infrastructure setup and maintenance. Having an up-to-date pipe artifact managed completely within Azure DevOps pipes is much easier in our specific setup. At least for now.

cyberblast avatar Nov 28 '23 14:11 cyberblast

Hi, I just wanted to give some short feedback: the above code works very well now that the issue with querying the NVD API has been solved. Maybe it helps someone set it up.

For completeness I'm also pasting consumer code, executing ODC. It's a task template.

parameters:
- name: verbose
  type: boolean
  default: false
- name: projectName
  type: string
  default: 'OWASP'
- name: scanPath
  type: string
  default: './'
- name: warningOnly
  type: boolean
  default: false
- name: additionalArguments
  type: string
  default: ''
- name: cvssThreshold
  type: number
  default: 4
- name: format
  type: string
  default: 'HTML, JUNIT, JSON'
- name: publishTestResults
  type: boolean
  default: true
- name: NistNvdTeamProject
  type: string
  default: '<Name of DevOps Project>'
- name: NistNvdPipeId
  type: string
  default: '<Name of NVD DB Pipe>'
- name: NistNvdPipeBranch
  type: string
  default: 'refs/heads/master'
- name: NistNvdArtifactName
  type: string
  default: 'NVD'
- name: NistNvdFileName
  type: string
  default: ''

steps:
- task: DownloadPipelineArtifact@2
  displayName: Download NVD Artifact
  continueOnError: ${{ parameters.warningOnly }}
  inputs:
    source: specific
    project: ${{ parameters.NistNvdTeamProject }}
    pipeline: ${{ parameters.NistNvdPipeId }}
    runVersion: latestFromBranch
    runBranch: ${{ parameters.NistNvdPipeBranch }}
    artifact: ${{ parameters.NistNvdArtifactName }}
    path: '$(Pipeline.Workspace)/odc'
- task: ExtractFiles@1
  displayName: Unpack NVD
  continueOnError: ${{ parameters.warningOnly }}
  inputs:
    archiveFilePatterns: '$(Pipeline.Workspace)/odc/${{ parameters.NistNvdFileName }}'
    destinationFolder: '$(Pipeline.Workspace)/odc/data'
    overwriteExistingFiles: true
- task: dependency-check-build-task@6
  displayName: 'OWASP Dependency Check'
  continueOnError: ${{ parameters.warningOnly }}
  condition: succeeded()
  inputs:
    projectName: ${{ parameters.projectName }}
    scanPath: ${{ parameters.scanPath }}
    format: ${{ parameters.format }}
    enableVerbose: ${{ parameters.verbose }}
    failOnCVSS: ${{ parameters.cvssThreshold }}
    warnOnCVSSViolation: ${{ parameters.warningOnly }}
    additionalArguments: --noupdate --data $(Pipeline.Workspace)/odc/data ${{ parameters.additionalArguments }}
- ${{ if eq(parameters.publishTestResults, true) }}:
  - task: PublishTestResults@2
    displayName: 'Publish ODC results'
    continueOnError: ${{ parameters.warningOnly }}
    condition: succeededOrFailed()
    inputs:
      testResultsFormat: 'JUnit'
      searchFolder: $(Common.TestResultsDirectory)
      testResultsFiles: 'dependency-check/*junit.xml'
      failTaskOnFailedTests: ${{ not(parameters.warningOnly) }}

The task can be used in a pipe like this (here with the pipe in the same repo, for C#):

- template: ../task/test-owasp-dependencies.yml
  parameters:
    scanPath: '**/*.csproj'
    warningOnly: true

and for npm (e.g. react):

- template: ../task/test-owasp-dependencies.yml
  parameters:
    scanPath: '**/yarn.lock'
    additionalArguments: '--scan "$(Build.SourcesDirectory)/**/package.json" --scan "$(Build.SourcesDirectory)/**/node_modules" --disableYarnAudit --nodeAuditSkipDevDependencies --nodePackageSkipDevDependencies'
    warningOnly: true

Please note that we are here disabling Yarn Audit (--disableYarnAudit) only because we are using yarn berry (v4) which seems to not work well with ODC currently. Most likely you can/should remove that flag...

cyberblast avatar Dec 05 '23 07:12 cyberblast

I also used a similar approach after all of the issues. One suggestion for your pipeline is that you don't need to have tasks to archive and unarchive the files. Azure Pipeline Artifacts does all that already and has optimizations for uploading and downloading to skip redundant files.

Here's my pipeline that caches the data files. It runs every 4 hours to always have the latest NVD data while following their recommended best practice for frequency. The nvd and oss variables are stored as secret pipeline variables.

appendCommitMessageToRunName: false

trigger:
  batch: true
  branches:
    include:
    - '*'
  paths:
    include:
    - OwaspResourceDownload.yml

schedules:
- cron: '0 0,4,8,12,16,20 * * *'
  displayName: 'Q.4H Update'
  branches:
    include:
    - main
  always: true

variables:
  dependencyCheckVersion: latest

pool:
  vmImage: 'windows-latest'

stages:
- stage: update
  displayName: Update OWASP Dependency Check Data
  jobs:
  - job: update
    displayName: Update OWASP Dependency Check Data
    steps:
    - checkout: none

    - task: PowerShell@2
      displayName: Update Build Name
      inputs:
        targetType: 'inline'
        script: |
          # OWASP Dependency Check Version
          $latestOnlineVersion = Invoke-RestMethod -Uri ''
          $odcVersion = if ($env:dependencyCheckVersion -eq 'latest' -and $latestOnlineVersion) {
              $latestOnlineVersion
          }
          else {
              $env:dependencyCheckVersion
          }
          Write-Host -Object "Dependency Check Version: $odcVersion"

          # NVD Last Change
          $headers = @{
              'Accept' = 'application/json'
              'apiKey' = $env:nvdApiKey
          }

          $startDate = ( Get-Date ).ToUniversalTime().AddHours(-4).ToString('o')
          $endDate = ( Get-Date ).ToUniversalTime().ToString('o')

          $uri = "$startDate&changeEndDate=$endDate"
          try {
              $lastChange = Invoke-RestMethod -Uri $uri -Headers $headers -ErrorAction Stop |
                  Select-Object -ExpandProperty cveChanges |
                  Select-Object -Last 1
              $nvcLastChangeTime = $lastChange.change.created | Get-Date -Format 'yyyyMMdd.HHmm'
          }
          catch {
              # Fall back to the current time so the cache key is still set
              Write-Warning -Message "##[warning] Failed to get NVD Last Change: $($_.Exception.Message)"
              $nvcLastChangeTime = $endDate | Get-Date -Format 'yyyyMMdd.HHmm'
          }
          Write-Host -Object "NVD Last Change: $nvcLastChangeTime"

          Write-Host -Object "##vso[task.setvariable variable=nvcLastChangeTime;]$nvcLastChangeTime"
          Write-Host -Object "##vso[Build.UpdateBuildNumber]ODC-$($odcVersion)_NVD-$($nvcLastChangeTime)"
      env:
        # Secret variables are not exposed to scripts unless mapped explicitly
        nvdApiKey: $(nvdApiKey)

    - task: Cache@2
      inputs:
        key: 'owasp-dependency-check | data | "$(nvcLastChangeTime)"'
        path: '$(Pipeline.Workspace)/owasp-dependency-check-data'
        restoreKeys: 'owasp-dependency-check | data'

    - task: dependency-check-build-task@6
      displayName: OWASP Dependency Check
      retryCountOnTaskFailure: 1
      inputs:
        dependencyCheckVersion: $(dependencyCheckVersion)
        projectName: 'Update'
        scanPath: '$(Pipeline.Workspace)'
        additionalArguments: >
          --nvdApiKey $(nvdApiKey)
          --nvdApiDelay 6000
          --data "$(Pipeline.Workspace)/owasp-dependency-check-data"
          --ossIndexUsername $(ossIndexUsername)
          --ossIndexPassword $(ossIndexPassword)

    - publish: $(Pipeline.Workspace)/owasp-dependency-check-data
      artifact: owasp-dependency-check-data

I then consume it with the following tasks (can't include the whole pipeline for IP reasons):

Declare the above pipeline as a resource:

resources:
  pipelines:
  - pipeline: OWASPResources
    source: OWASP Resource Download
    branch: main

I use variables for CVSS score and ODC version

variables:
  failOnCVSS: 7 # More info ->
  dependencyCheckVersion: latest

And the steps, using the --data param for the resource artifact and --noupdate.

          - download: OWASPResources
            artifact: owasp-dependency-check-data
            displayName: Download OWASP Dependency Check Data

          - task: dependency-check-build-task@6
            displayName: OWASP Dependency Check
            inputs:
              dependencyCheckVersion: $(dependencyCheckVersion)
              projectName: '${{ parameters.release }}'
              scanPath: '$(Pipeline.Workspace)/${{ parameters.release }}Artifact/${{ coalesce(parameters.artifactName, parameters.product, ''drop'') }}'
              format: 'HTML, JUNIT'
              failOnCVSS: '$(failOnCVSS)'
              suppressionPath: '$(Pipeline.Workspace)\owasp-suppression.xml'
              enableExperimental: ${{ parameters.enableExperimental }}
              additionalArguments: >
                --data "$(Pipeline.Workspace)/OWASPResources/owasp-dependency-check-data"
                --noupdate

thisjustin816 avatar Dec 05 '23 18:12 thisjustin816

Hello @cyberblast.

After downloading the artifact, I'm trying to run an ODC scan with the Maven plugin by providing -DautoUpdate=false and also -DdataDirectory=$(Pipeline.Workspace)/odc/data. But it keeps returning: NoDataException: Autoupdate is disabled and the database does not exist

I also tried to extract the NVD zip to $(Pipeline.Workspace)/.m2/repository/org/owasp/dependency-check-data/9.0.2, but it is still not working.

If by any chance you have an idea ;)

omgdota123 avatar Dec 05 '23 18:12 omgdota123


@thisjustin816 thanks for sharing. Contains some interesting aspects. But also maybe depends a bit on usage scenario/environment. Will also look up again on the artifact topic. I wasn't aware of it.

@omgdota123 you need to extract it to the data directory $(Pipeline.Workspace)/odc/data, as described here.
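
A quick guard can make this kind of failure easier to spot before the scan starts. The sketch below assumes the default embedded database is a file named `odc.mv.db` directly inside the data directory; that file name, the helper name `has_odc_db` and the `DATA_DIR` default are assumptions for illustration, so adjust them to your setup.

```shell
#!/usr/bin/env bash
# Assumption: the default embedded database is a file named odc.mv.db
# located directly inside the data directory.
has_odc_db() {
  [ -f "$1/odc.mv.db" ]
}

data_dir="${DATA_DIR:-/tmp/odc/data}"   # hypothetical location
if has_odc_db "$data_dir"; then
  echo "database found in $data_dir"
else
  echo "no database in $data_dir; extract the artifact there before disabling updates" >&2
fi
```

If the check fails, the archive was most likely extracted one level too high or too low relative to the directory passed to ODC.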

cyberblast avatar Dec 05 '23 19:12 cyberblast

Hi everyone! I saw this owasp/dependency-check-action image used for GitHub Actions, in which the NVD database is updated nightly. I didn't get any example for Azure DevOps, so I decided to do the following:

parameters:
  - name: projectName
    type: string
  - name: isExperimentalEnabled
    type: boolean

steps:
  - task: DockerInstaller@0
    inputs:
      dockerVersion: "$(LATEST_DOCKER_VERSION)"

  - script: |
      docker pull owasp/dependency-check-action:latest
    displayName: Pull OWASP Image

  - script: |
      # A boolean parameter is rendered as the text True or False, so compare explicitly
      experimental=""
      if [ "${{ parameters.isExperimentalEnabled }}" = "True" ]; then
        echo "Experimental analyzers enabled"
        experimental="--enableExperimental"
      fi
      docker run --rm -v $(System.DefaultWorkingDirectory):/workspace owasp/dependency-check-action:latest \
                 --project ${{ parameters.projectName }} \
                 --failOnCVSS 7 \
                 --scan /workspace \
                 --format HTML --format JUNIT \
                 --noupdate \
                 --out /workspace $experimental
    displayName: Run dependency check

  - task: PublishTestResults@2
    inputs:
      testResultsFormat: "JUnit"
      testResultsFiles: "dependency-check-junit.xml"
      searchFolder: "$(System.DefaultWorkingDirectory)"
    displayName: "Publish Dependency Check Test Results if Available"

  - task: PublishPipelineArtifact@1
    inputs:
      targetPath: "$(System.DefaultWorkingDirectory)/dependency-check-report.html"
      artifact: "OWASP DOCKER HTML"
      publishLocation: "pipeline"
    displayName: "Publish OWASP Artifact"

  # Just in case, removing:
  - script: |
      docker image rm owasp/dependency-check-action:latest
    displayName: Remove OWASP Image

What are your thoughts on this approach?
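
When a boolean template parameter is passed into a bash script, a test such as `[ "$value" ]` only checks for a non-empty string, so it succeeds for both `True` and `False`. A minimal sketch of an explicit comparison, assuming the parameter arrives as the text `True` or `False`; the helper name `is_true` and the variable `flag_value` are made up for illustration.

```shell
#!/usr/bin/env bash
# is_true is a hypothetical helper, not part of ODC or Azure DevOps.
# It succeeds only for the literal text true or True.
is_true() {
  case "$1" in
    true|True) return 0 ;;
    *) return 1 ;;
  esac
}

experimental=""
flag_value="False"   # stand-in for the rendered parameter value
if is_true "$flag_value"; then
  experimental="--enableExperimental"
fi
echo "extra args: '${experimental}'"
```

Initialising `experimental` to an empty string also keeps the later command line well defined when the flag is off.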

gsarapura avatar Sep 09 '24 19:09 gsarapura

Hi everyone,

@thisjustin816, I recreated your template as a bash script (I'm a Linux user), and your logic worked fine.

Based on @thisjustin816's script, updated for Linux users:

appendCommitMessageToRunName: false

trigger:
  batch: true
  branches:
    include:
    - '*'
  paths:
    include:
    - update-dpdc-check.yml

schedules:
- cron: '0 0,4,8,12,16,20 * * *'
  displayName: 'Q.4H Update'
  branches:
    include:
    - master
    - feature/*
  always: true

variables:
  dependencyCheckVersion: 10.0.4

pool:
  vmImage: 'ubuntu-latest'

stages:
- stage: update
  displayName: Update OWASP Dependency Check Data
  jobs:
  - job: update
    variables:
      - group: DEV
    displayName: Update OWASP Dependency Check Data
    steps:
    - checkout: none

    - task: Bash@3
      displayName: Update Build Name
      inputs:
        targetType: 'inline'
        script: |
          # OWASP Dependency Check Version
          latestOnlineVersion=$(curl -s '')
          # Determine the Dependency Check version
          if [[ "$(dependencyCheckVersion)" == "latest" && -n "$latestOnlineVersion" ]]; then
            odcVersion=$latestOnlineVersion
          else
            odcVersion=$(dependencyCheckVersion)
          fi
          echo "Dependency Check Version: $odcVersion"

          # NVD Last Change
          headers=(
              -H "Accept: application/json"
              -H "apiKey: $(nvdApiKey)"
          )

          startDate=$(date -u -d '-4 hours' +"%Y-%m-%dT%H:%M:%SZ")
          endDate=$(date -u +"%Y-%m-%dT%H:%M:%SZ")

          # Fetch the last change
          response=$(curl -s "${headers[@]}" "$uri")
          lastChange=$(printf '%s\n' "$response" | jq -r '.cveChanges[-1].change.created')

          # jq -r prints the literal text "null" when the field is missing
          if [[ -n "$lastChange" && "$lastChange" != "null" ]]; then
              nvcLastChangeTime=$(date -d "$lastChange" +"%Y%m%d.%H%M")
          else
              echo "##[warning] Failed to get NVD Last Change"
              nvcLastChangeTime=$(date -d "$endDate" +"%Y%m%d.%H%M")
          fi

          echo "NVD Last Change: $nvcLastChangeTime"

          # Set variables for Azure DevOps
          echo "##vso[task.setvariable variable=nvcLastChangeTime;]$nvcLastChangeTime"
          echo "##vso[Build.UpdateBuildNumber]ODC-${odcVersion}_NVD-$nvcLastChangeTime"

    - task: Cache@2
      inputs:
        key: 'owasp-dependency-check | data | "$(nvcLastChangeTime)"'
        path: '$(Pipeline.Workspace)/owasp-dependency-check-data'
        restoreKeys: 'owasp-dependency-check | data'

    - task: dependency-check-build-task@6
      displayName: OWASP Dependency Check
      retryCountOnTaskFailure: 1
      inputs:
        dependencyCheckVersion: $(dependencyCheckVersion)
        projectName: 'Update'
        scanPath: '$(Pipeline.Workspace)'
        additionalArguments: >
          --nvdApiKey $(nvdApiKey)
          --nvdApiDelay 6000
          --data "$(Pipeline.Workspace)/owasp-dependency-check-data"
          --ossIndexUsername $(sonartypeossindex)
          --ossIndexPassword $(sonarToken)

    - publish: $(Pipeline.Workspace)/owasp-dependency-check-data
      artifact: owasp-dependency-check-data

Instead of declaring the pipeline as a resource, I use a download task, because I run the dependency check from a template:

- task: DownloadPipelineArtifact@2
  inputs:
    buildType: 'specific'
    project: 'PROJECT'
    definition: NNNN
    buildVersionToDownload: 'latest'
    artifactName: 'owasp-dependency-check-data'
    targetPath: '$(Build.Repository.LocalPath)/owasp-dependency-check-data'
  displayName: 'Download OWASP Dependency Check Data'

Thank you so much for your contribution, it helped me a lot.

twrb avatar Oct 07 '24 20:10 twrb