
Incorrectly detected "unknown-license-reference"

Open chinyeungli opened this issue 6 months ago • 0 comments

The following detection result is from busybox-1.37.0/coreutils/nproc.c

- matches:
    - score: '100.0'
      matcher: 2-aho
      end_line: 4
      rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/gpl-2.0_147.RULE
      from_file:
      start_line: 4
      matched_text: ' * Licensed under GPLv2, see LICENSE in this source tree'
      match_coverage: '100.0'
      matched_length: 3
      rule_relevance: 100
      rule_identifier: gpl-2.0_147.RULE
      license_expression: gpl-2.0
      license_expression_spdx: GPL-2.0-only
    - score: '100.0'
      matcher: 2-aho
      end_line: 4
      rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/unknown-license-reference_see-license_1.RULE
      from_file:
      start_line: 4
      matched_text: ' * Licensed under GPLv2, see LICENSE in this source tree'
      match_coverage: '100.0'
      matched_length: 2
      rule_relevance: 100
      rule_identifier: unknown-license-reference_see-license_1.RULE
      license_expression: unknown-license-reference
      license_expression_spdx: LicenseRef-scancode-unknown-license-reference
  identifier: gpl_2_0_and_unknown_license_reference-aa9a8299-cf53-dffd-d6ca-ec59773a44e3
  license_expression: gpl-2.0 AND unknown-license-reference
  license_expression_spdx: GPL-2.0-only AND LicenseRef-scancode-unknown-license-reference

The exact same line is detected as both gpl-2.0 and unknown-license-reference, each with the same score of 100.

Another sample from ./gnutls28_3.8.3-1.1ubuntu3.3/src/gl/sys_socket.in.h

- matches:
    - score: '50.0'
      matcher: 2-aho
      end_line: 101
      rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/license-intro_2.RULE
      from_file:
      start_line: 101
      matched_text: '   2009-05-08, licensed under LGPLv2.1+, plus portability fixes. */'
      match_coverage: '100.0'
      matched_length: 2
      rule_relevance: 50
      rule_identifier: license-intro_2.RULE
      license_expression: unknown-license-reference
      license_expression_spdx: LicenseRef-scancode-unknown-license-reference
    - score: '95.0'
      matcher: 2-aho
      end_line: 101
      rule_url: https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/rules/lgpl-2.1-plus_70.RULE
      from_file:
      start_line: 101
      matched_text: '   2009-05-08, licensed under LGPLv2.1+, plus portability fixes. */'
      match_coverage: '100.0'
      matched_length: 2
      rule_relevance: 95
      rule_identifier: lgpl-2.1-plus_70.RULE
      license_expression: lgpl-2.1-plus
      license_expression_spdx: LGPL-2.1-or-later

The exact same line is detected as both unknown-license-reference and lgpl-2.1-plus. Even though the lgpl-2.1-plus match has a higher score (95 vs. 50), the unknown-license-reference match is still returned.

The unknown-license-reference should not be returned.
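To illustrate the expected result, here is a minimal post-processing sketch. It is not part of ScanCode's API: `drop_redundant_unknown_refs` is a hypothetical helper, and it assumes the matches have already been loaded as plain dicts with the `license_expression`, `start_line` and `end_line` keys shown in the output above. It drops an unknown-license-reference match only when its line span lies entirely inside the span of a match for a specific license.

```python
UNKNOWN_REF = "unknown-license-reference"


def drop_redundant_unknown_refs(matches):
    """Return matches without unknown-license-reference entries that are
    fully covered, line-wise, by a match for a specific license.

    Hypothetical helper for illustration; not a ScanCode function.
    """
    # Line spans of every match that names a specific license.
    specific_spans = [
        (m["start_line"], m["end_line"])
        for m in matches
        if m["license_expression"] != UNKNOWN_REF
    ]

    kept = []
    for m in matches:
        is_covered_unknown = m["license_expression"] == UNKNOWN_REF and any(
            start <= m["start_line"] and m["end_line"] <= end
            for start, end in specific_spans
        )
        if not is_covered_unknown:
            kept.append(m)
    return kept


# The two matches from the gnutls sample above, reduced to the relevant keys.
matches = [
    {
        "rule_identifier": "license-intro_2.RULE",
        "license_expression": "unknown-license-reference",
        "start_line": 101,
        "end_line": 101,
    },
    {
        "rule_identifier": "lgpl-2.1-plus_70.RULE",
        "license_expression": "lgpl-2.1-plus",
        "start_line": 101,
        "end_line": 101,
    },
]

filtered = drop_redundant_unknown_refs(matches)
print([m["rule_identifier"] for m in filtered])
```

With the sample data, only the lgpl-2.1-plus match remains. An unknown-license-reference match on a line with no specific license match is kept, since in that case it may still point at a real, unidentified license.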

chinyeungli, May 20 '25 03:05