
Enums for modelExplainability and anonymizationMethodUsed

Open bact opened this issue 1 year ago • 1 comments

From this snippet

{
    "type": "dataset_DatasetPackage",
    "dataset_anonymizationMethodUsed" : "pseudonymization",
    "description": "replace direct identifiers (such as name or social security number) with artificial identifiers to prevent the data from being directly linked back to the individual"
}
  • The anonymizationMethodUsed field holds only the method name (which could come from a list/enum).
  • Details of how the method was applied go in the description field.
  • A similar style of usage can be applied to the modelExplainability field in the AI Profile.

This is an area where, in 3.1, we could introduce new vocabularies: AnonymizationMethod and ModelExplainabilityAlgorithm, to be used with anonymizationMethodUsed and modelExplainability respectively.

They may look similar to HashAlgorithm.
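
As a rough illustration of the vocabulary idea above, here is a minimal Python sketch of what an AnonymizationMethod enum might look like. The entry names other than "pseudonymization" (which comes from the snippet) are assumptions chosen for illustration, as is the catch-all "other" entry and the helper function; none of this is taken from the SPDX specification.

```python
from enum import Enum


class AnonymizationMethod(Enum):
    """Hypothetical vocabulary; entries are illustrative, not from the SPDX spec."""

    PSEUDONYMIZATION = "pseudonymization"  # the value used in the snippet
    GENERALIZATION = "generalization"      # assumed entry
    SUPPRESSION = "suppression"            # assumed entry
    OTHER = "other"                        # assumed catch-all entry


def parse_anonymization_method(value: str) -> AnonymizationMethod:
    """Map a free-form string to an enum entry, falling back to OTHER."""
    try:
        return AnonymizationMethod(value.strip().lower())
    except ValueError:
        return AnonymizationMethod.OTHER
```

With something like this, the method name becomes machine-checkable while the free-form explanation stays in description.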

  • dataCollectionProcess and dataPreprocessing (both currently free-form text) could potentially use this kind of enum as well.
  • We may even be able to borrow steps from data processing "pipelines" in frameworks like langchain and huggingface for dataPreprocessing.

bact avatar Aug 29 '24 07:08 bact

AnonymizationMethod could be a class similar to IntegrityMethod, and we could develop its enums like the ones in https://github.com/spdx/crypto-algorithms/
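
A minimal sketch of the class-based variant, assuming a shape loosely modeled on how a class pairs with an enum elsewhere in the model. The names AnonymizationMethodKind, kind, comment, to_dict, and the "dataset_AnonymizationMethod" type string are all hypothetical and only meant to show the idea:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AnonymizationMethodKind(Enum):
    """Hypothetical enum of method names (illustrative entries only)."""

    PSEUDONYMIZATION = "pseudonymization"
    OTHER = "other"


@dataclass(frozen=True)
class AnonymizationMethod:
    """Hypothetical class pairing an enum value with an optional comment."""

    kind: AnonymizationMethodKind
    comment: Optional[str] = None

    def to_dict(self) -> dict:
        # The type name below is an assumption, not an SPDX-defined type.
        data = {"type": "dataset_AnonymizationMethod", "kind": self.kind.value}
        if self.comment is not None:
            data["comment"] = self.comment
        return data
```

Compared with a bare enum value, a class leaves room for per-use details (such as a comment) without overloading the package-level description.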

bact avatar Jul 30 '25 13:07 bact