file version extracted from file properties

Open yahym opened this issue 3 years ago • 4 comments

Short Description

Is it possible to also extract the details from the file properties and reflect them in the final report? This would allow better traceability to the file version (which is not available now) and to the copyright details. [image]

Possible Labels

  • new feature

Select Category

  • [ ] Enhancement
  • [ ] Add License/Copyright
  • [x] Scan Feature
  • [ ] Packaging
  • [ ] Documentation
  • [ ] Expand Support
  • [ ] Other

Describe the Update

Add a new field for extracting the file properties that could return relevant information for licensing.

How This Feature will help you/your organization

Increased report accuracy, better traceability, and faster delta reviews.

Possible Solution/Implementation Details

New per-file fields for file version, product name, product version, and copyright, populated from the file properties.

Example/Links if Any

Can you help with this Feature

yahym • May 18 '22 09:05

What about this using the --package option:

$ mkdir dll
$ wget -P dll/ https://github.com/icsharpcode/SharpZipLib/releases/download/v1.3.3/SharpZipLib-v1.3.3.nupkg
$ extractcode dll/
$ scancode --package --yaml dll.yaml.txt dll/SharpZipLib-v1.3.3.nupkg-extract/lib/net45/ICSharpCode.SharpZipLib.dll
$ cat dll.yaml.txt
headers:
    -   tool_name: scancode-toolkit
        tool_version: 31.0.0b5
        options:
            input:
                - dll/SharpZipLib-v1.3.3.nupkg-extract/lib/net45/ICSharpCode.SharpZipLib.dll
            --package: yes
            --yaml: dll.yaml.txt
        notice: |
            Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES
            OR CONDITIONS OF ANY KIND, either express or implied. No content created from
            ScanCode should be considered or used as legal advice. Consult an Attorney
            for any legal advice.
            ScanCode is a free software code scanning tool from nexB Inc. and others.
            Visit https://github.com/nexB/scancode-toolkit/ for support and download.
        start_timestamp: '2022-05-18T124204.929953'
        end_timestamp: '2022-05-18T124206.685962'
        output_format_version: 2.0.0
        duration: '1.756018877029419'
        message:
        errors: []
        warnings: []
        extra_data:
            system_environment:
                operating_system: linux
                cpu_architecture: 64
                platform: Linux-4.15.0-177-generic-x86_64-with-glibc2.17
                platform_version: '#186~16.04.1-Ubuntu SMP Wed Apr 20 09:41:17 UTC 2022'
                python_version: "3.8.12 (default, Jan 29 2022, 10:00:28) \n[GCC 5.4.0 20160609]"
            spdx_license_list_version: '3.16'
            files_count: 1
dependencies: []
packages: []
files:
    -   path: ICSharpCode.SharpZipLib.dll
        type: file
        package_data:
            -   type: winexe
                namespace:
                name: ICSharpCode.SharpZipLib
                version: 1.3.3+1b1ab013ce1df02d8f27cf582197759c614d9126
                qualifiers: {}
                subpath:
                primary_language:
                description: |
                    ICSharpCode.SharpZipLib
                    SharpZipLib (#ziplib, formerly NZipLib) is a compression library for Zip, GZip, BZip2, and Tar written entirely in C# for .NET. It is implemented as an assembly (installable in the GAC), and thus can easily be incorporated into other projects (in any .NET language)
                release_date:
                parties:
                    -   type: organization
                        role: author
                        name: ICSharpCode
                        email:
                        url:
                keywords: []
                homepage_url:
                download_url:
                size:
                sha1:
                md5:
                sha256:
                sha512:
                bug_tracking_url:
                code_view_url:
                vcs_url:
                copyright: Copyright © 2000-2021 SharpZipLib Contributors
                license_expression: unknown AND unknown
                declared_license:
                    LegalCopyright: Copyright © 2000-2021 SharpZipLib Contributors
                    LegalTrademarks:
                    License:
                notice_text:
                source_packages: []
                file_references: []
                extra_data: {}
                dependencies: []
                repository_homepage_url:
                repository_download_url:
                api_data_url:
                datasource_id: windows_executable
                purl: pkg:winexe/ICSharpCode.SharpZipLib@1.3.3%2B1b1ab013ce1df02d8f27cf582197759c614d9126
        for_packages: []
        scan_errors: []

dll.yaml.txt
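
A note on the `purl` field in the output above: the `+` in the version shows up as `%2B` because the version is percent-encoded inside a Package URL. The following is only a minimal sketch of that encoding using the Python standard library; `simple_purl` is a hypothetical helper written for illustration, not ScanCode's code, and it ignores namespaces, qualifiers and subpaths.

```python
from urllib.parse import quote


def simple_purl(pkg_type, name, version):
    """Build a bare-bones Package URL string (hypothetical helper, illustration only)."""
    # safe="" makes quote() percent-encode every reserved character,
    # so "+" becomes "%2B"; letters, digits and "." are left untouched.
    return f"pkg:{pkg_type}/{quote(name, safe='')}@{quote(version, safe='')}"


purl = simple_purl(
    "winexe",
    "ICSharpCode.SharpZipLib",
    "1.3.3+1b1ab013ce1df02d8f27cf582197759c614d9126",
)
print(purl)
```

For real use, a dedicated Package URL library is a safer choice than hand-built strings.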

pombredanne • May 18 '22 12:05

or this:

$ scancode --package --yaml full.yaml.txt dll/SharpZipLib-v1.3.3.nupkg-extract/
$ cat full.yaml.txt
headers:
    -   tool_name: scancode-toolkit
        tool_version: 31.0.0b5
        options:
            input:
                - dll/SharpZipLib-v1.3.3.nupkg-extract/
            --package: yes
            --yaml: full.yaml.txt
        notice: |
            Generated with ScanCode and provided on an "AS IS" BASIS, WITHOUT WARRANTIES
            OR CONDITIONS OF ANY KIND, either express or implied. No content created from
            ScanCode should be considered or used as legal advice. Consult an Attorney
            for any legal advice.
            ScanCode is a free software code scanning tool from nexB Inc. and others.
            Visit https://github.com/nexB/scancode-toolkit/ for support and download.
        start_timestamp: '2022-05-18T124645.754674'
        end_timestamp: '2022-05-18T124647.806079'
        output_format_version: 2.0.0
        duration: '2.051415205001831'
        message:
        errors: []
        warnings: []
        extra_data:
            system_environment:
                operating_system: linux
                cpu_architecture: 64
                platform: Linux-4.15.0-177-generic-x86_64-with-glibc2.17
                platform_version: '#186~16.04.1-Ubuntu SMP Wed Apr 20 09:41:17 UTC 2022'
                python_version: "3.8.12 (default, Jan 29 2022, 10:00:28) \n[GCC 5.4.0 20160609]"
            spdx_license_list_version: '3.16'
            files_count: 14
dependencies: []
packages:
    -   type: nuget
        namespace:
        name: SharpZipLib
        version: 1.3.3
        qualifiers: {}
        subpath:
        primary_language:
        description: SharpZipLib (#ziplib, formerly NZipLib) is a compression library for Zip,
            GZip, BZip2, and Tar written entirely in C# for .NET. It is implemented as an assembly
            (installable in the GAC), and thus can easily be incorporated into other projects
            (in any .NET language)
        release_date:
        parties:
            -   type:
                role: author
                name: ICSharpCode
                email:
                url:
        keywords: []
        homepage_url: https://github.com/icsharpcode/SharpZipLib
        download_url:
        size:
        sha1:
        md5:
        sha256:
        sha512:
        bug_tracking_url:
        code_view_url:
        vcs_url: git+https://github.com/icsharpcode/SharpZipLib
        copyright: Copyright © 2000-2021 SharpZipLib Contributors
        license_expression: mit
        declared_license: https://licenses.nuget.org/MIT
        notice_text:
        source_packages: []
        extra_data: {}
        repository_homepage_url: https://www.nuget.org/packages/SharpZipLib/1.3.3
        repository_download_url: https://www.nuget.org/api/v2/package/SharpZipLib/1.3.3
        api_data_url: https://api.nuget.org/v3/registration3/sharpziplib/1.3.3.json
        package_uid: pkg:nuget/SharpZipLib@1.3.3?uuid=e678e4ec-cc6c-4b80-84ce-ec6c5f8c9a41
        datafile_paths:
            - SharpZipLib-v1.3.3.nupkg-extract/SharpZipLib.nuspec
        datasource_ids:
            - nuget_nupsec
        purl: pkg:nuget/SharpZipLib@1.3.3
files:
    -   path: SharpZipLib-v1.3.3.nupkg-extract
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/[Content_Types].xml
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/SharpZipLib.nuspec
        type: file
        package_data:
            -   type: nuget
                namespace:
                name: SharpZipLib
                version: 1.3.3
                qualifiers: {}
                subpath:
                primary_language:
                description: SharpZipLib (#ziplib, formerly NZipLib) is a compression library
                    for Zip, GZip, BZip2, and Tar written entirely in C# for .NET. It is implemented
                    as an assembly (installable in the GAC), and thus can easily be incorporated
                    into other projects (in any .NET language)
                release_date:
                parties:
                    -   type:
                        role: author
                        name: ICSharpCode
                        email:
                        url:
                keywords: []
                homepage_url: https://github.com/icsharpcode/SharpZipLib
                download_url:
                size:
                sha1:
                md5:
                sha256:
                sha512:
                bug_tracking_url:
                code_view_url:
                vcs_url: git+https://github.com/icsharpcode/SharpZipLib
                copyright: Copyright © 2000-2021 SharpZipLib Contributors
                license_expression: mit
                declared_license: https://licenses.nuget.org/MIT
                notice_text:
                source_packages: []
                file_references: []
                extra_data: {}
                dependencies: []
                repository_homepage_url: https://www.nuget.org/packages/SharpZipLib/1.3.3
                repository_download_url: https://www.nuget.org/api/v2/package/SharpZipLib/1.3.3
                api_data_url: https://api.nuget.org/v3/registration3/sharpziplib/1.3.3.json
                datasource_id: nuget_nupsec
                purl: pkg:nuget/SharpZipLib@1.3.3
        for_packages:
            - pkg:nuget/SharpZipLib@1.3.3?uuid=e678e4ec-cc6c-4b80-84ce-ec6c5f8c9a41
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/_rels
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/_rels/.rels
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/images
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/images/sharpziplib-nuget-256x256.png
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/net45
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/net45/ICSharpCode.SharpZipLib.dll
        type: file
        package_data:
            -   type: winexe
                namespace:
                name: ICSharpCode.SharpZipLib
                version: 1.3.3+1b1ab013ce1df02d8f27cf582197759c614d9126
                qualifiers: {}
                subpath:
                primary_language:
                description: |
                    ICSharpCode.SharpZipLib
                    SharpZipLib (#ziplib, formerly NZipLib) is a compression library for Zip, GZip, BZip2, and Tar written entirely in C# for .NET. It is implemented as an assembly (installable in the GAC), and thus can easily be incorporated into other projects (in any .NET language)
                release_date:
                parties:
                    -   type: organization
                        role: author
                        name: ICSharpCode
                        email:
                        url:
                keywords: []
                homepage_url:
                download_url:
                size:
                sha1:
                md5:
                sha256:
                sha512:
                bug_tracking_url:
                code_view_url:
                vcs_url:
                copyright: Copyright © 2000-2021 SharpZipLib Contributors
                license_expression: unknown AND unknown
                declared_license:
                    LegalCopyright: Copyright © 2000-2021 SharpZipLib Contributors
                    LegalTrademarks:
                    License:
                notice_text:
                source_packages: []
                file_references: []
                extra_data: {}
                dependencies: []
                repository_homepage_url:
                repository_download_url:
                api_data_url:
                datasource_id: windows_executable
                purl: pkg:winexe/ICSharpCode.SharpZipLib@1.3.3%2B1b1ab013ce1df02d8f27cf582197759c614d9126
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/net45/ICSharpCode.SharpZipLib.pdb
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/net45/ICSharpCode.SharpZipLib.xml
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/netstandard2.0
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/netstandard2.0/ICSharpCode.SharpZipLib.dll
        type: file
        package_data:
            -   type: winexe
                namespace:
                name: ICSharpCode.SharpZipLib
                version: 1.3.3+1b1ab013ce1df02d8f27cf582197759c614d9126
                qualifiers: {}
                subpath:
                primary_language:
                description: |
                    ICSharpCode.SharpZipLib
                    SharpZipLib (#ziplib, formerly NZipLib) is a compression library for Zip, GZip, BZip2, and Tar written entirely in C# for .NET. It is implemented as an assembly (installable in the GAC), and thus can easily be incorporated into other projects (in any .NET language)
                release_date:
                parties:
                    -   type: organization
                        role: author
                        name: ICSharpCode
                        email:
                        url:
                keywords: []
                homepage_url:
                download_url:
                size:
                sha1:
                md5:
                sha256:
                sha512:
                bug_tracking_url:
                code_view_url:
                vcs_url:
                copyright: Copyright © 2000-2021 SharpZipLib Contributors
                license_expression: unknown AND unknown
                declared_license:
                    LegalCopyright: Copyright © 2000-2021 SharpZipLib Contributors
                    LegalTrademarks:
                    License:
                notice_text:
                source_packages: []
                file_references: []
                extra_data: {}
                dependencies: []
                repository_homepage_url:
                repository_download_url:
                api_data_url:
                datasource_id: windows_executable
                purl: pkg:winexe/ICSharpCode.SharpZipLib@1.3.3%2B1b1ab013ce1df02d8f27cf582197759c614d9126
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/netstandard2.0/ICSharpCode.SharpZipLib.pdb
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/netstandard2.0/ICSharpCode.SharpZipLib.xml
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/netstandard2.1
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/netstandard2.1/ICSharpCode.SharpZipLib.dll
        type: file
        package_data:
            -   type: winexe
                namespace:
                name: ICSharpCode.SharpZipLib
                version: 1.3.3+1b1ab013ce1df02d8f27cf582197759c614d9126
                qualifiers: {}
                subpath:
                primary_language:
                description: |
                    ICSharpCode.SharpZipLib
                    SharpZipLib (#ziplib, formerly NZipLib) is a compression library for Zip, GZip, BZip2, and Tar written entirely in C# for .NET. It is implemented as an assembly (installable in the GAC), and thus can easily be incorporated into other projects (in any .NET language)
                release_date:
                parties:
                    -   type: organization
                        role: author
                        name: ICSharpCode
                        email:
                        url:
                keywords: []
                homepage_url:
                download_url:
                size:
                sha1:
                md5:
                sha256:
                sha512:
                bug_tracking_url:
                code_view_url:
                vcs_url:
                copyright: Copyright © 2000-2021 SharpZipLib Contributors
                license_expression: unknown AND unknown
                declared_license:
                    LegalCopyright: Copyright © 2000-2021 SharpZipLib Contributors
                    LegalTrademarks:
                    License:
                notice_text:
                source_packages: []
                file_references: []
                extra_data: {}
                dependencies: []
                repository_homepage_url:
                repository_download_url:
                api_data_url:
                datasource_id: windows_executable
                purl: pkg:winexe/ICSharpCode.SharpZipLib@1.3.3%2B1b1ab013ce1df02d8f27cf582197759c614d9126
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/netstandard2.1/ICSharpCode.SharpZipLib.pdb
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/lib/netstandard2.1/ICSharpCode.SharpZipLib.xml
        type: file
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/package
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/package/services
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/package/services/metadata
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/package/services/metadata/core-properties
        type: directory
        package_data: []
        for_packages: []
        scan_errors: []
    -   path: SharpZipLib-v1.3.3.nupkg-extract/package/services/metadata/core-properties/7de003b19cf14fd4a3e3d2f0354f1b91.psmdcp
        type: file
        package_data: []
        for_packages: []
        scan_errors: []


full.yaml.txt

pombredanne • May 18 '22 12:05

Long story short: this is collected as part of "package_data" (or "packages" in the previous version), because we essentially run a parser on Windows DLLs and executables to extract their metadata, including the version.

See https://github.com/nexB/scancode-toolkit/blob/19d77e99c0c37270752849a1f27bb3ec9ad96c72/src/packagedcode/win_pe.py#L42 for internal details.
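
To pull those fields back out of a scan result, something along these lines may work. It is a rough sketch that assumes the scan was loaded into a plain Python dict with the same layout as the YAML output shown earlier in this thread (for example from a JSON output file via `json.load`); `iter_file_properties` is a hypothetical helper name, and the sample data is trimmed from the output above.

```python
def iter_file_properties(scan):
    """Yield (path, name, version, copyright) for each Windows PE package_data entry.

    Hypothetical helper: assumes `scan` mirrors the "files" / "package_data"
    layout shown in the scan output in this thread.
    """
    for resource in scan.get("files", []):
        for data in resource.get("package_data") or []:
            # "winexe" is the type reported for DLLs and executables above.
            if data.get("type") != "winexe":
                continue
            yield (
                resource.get("path"),
                data.get("name"),
                data.get("version"),
                data.get("copyright"),
            )


# Trimmed sample mirroring the scan output in this thread.
sample_scan = {
    "files": [
        {"path": "SharpZipLib.nuspec", "type": "file",
         "package_data": [{"type": "nuget", "name": "SharpZipLib", "version": "1.3.3"}]},
        {"path": "lib/net45/ICSharpCode.SharpZipLib.dll", "type": "file",
         "package_data": [{
             "type": "winexe",
             "name": "ICSharpCode.SharpZipLib",
             "version": "1.3.3+1b1ab013ce1df02d8f27cf582197759c614d9126",
             "copyright": "Copyright © 2000-2021 SharpZipLib Contributors",
         }]},
        {"path": "lib/net45/ICSharpCode.SharpZipLib.pdb", "type": "file",
         "package_data": []},
    ]
}

rows = list(iter_file_properties(sample_scan))
for path, name, version, copyright_ in rows:
    print(f"{path}: {name} {version} ({copyright_})")
```

Only the DLL entry is reported here; the `.nuspec` entry is skipped because its type is `nuget`, and files with empty `package_data` produce nothing.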

(NB: FWIW, we can also extract data from a Windows registry hive and parse NuGet manifests. The next step would be to parse Visual Studio config files too.)

pombredanne • May 18 '22 12:05

Also see these closely related issues: https://github.com/nexB/scancode-toolkit/issues/2734 and https://github.com/nexB/scancode-toolkit/issues/2733, as well as this other approach: https://github.com/nexB/commoncode/blob/main/src/commoncode/version.py

pombredanne • May 18 '22 12:05

--package does the job! Thank you for all the details and support. This issue can be closed. 👍

yahym • Aug 16 '22 13:08