CDM Reference Data Code List Management

Open brianlynn2 opened this issue 2 years ago • 5 comments

With reference to my proposal in https://github.com/finos/common-domain-model/issues/2015 (CDM Design Guidelines), here are some thoughts on how to manage reference data such as lists of codes that may change frequently without otherwise affecting the underlying logic of CDM applications.

  • FpML code lists are updated frequently (recently around 6 times per year on average, and at least 140 times over the past 11 years, i.e., roughly monthly). Production implementations need an easy way to use new code lists without extensive retesting.

  • The bulk of the existing reference data (and particularly code lists) for CDM is contained in Rosetta enumerated lists (which are then mapped to Java enumerations or equivalent in other programming languages). These enumerations are either manually created or generated automatically based on FpML code lists. Sometimes there are issues mapping code values to legal enumerated values; this can require manual intervention.

  • Currently the only way for CDM deployments to take advantage of the latest code lists is to wait for a new version of CDM to be released with updated enumerations regenerated from the FpML code lists, and then to adopt that version. To use the latest code lists, CDM users therefore need to integrate and regression test their functionality against the latest version of CDM, and possibly risk breakage. (In my personal development experience with CDM, essentially every time I integrated with a new version, there was significant breakage and extensive work required to adapt to it. Even for a research and development project this was challenging. For a production implementation this would be unacceptable.)

  • A better solution is required to allow supportability of CDM implementations.

  • I can see 2 options for this:

  • [ ] 1) Improved packaging: move the CDM enumerations (or at least those that purely encode reference data) to a separate JAR file that does not contain any type or function definitions, so that CDM users can update the enumeration JAR file to the latest version without taking the latest version of the functionality. This would minimize the amount of regression testing required. It would still mean some deployment work, as well as CDM .jar file generation work by the CDM maintainers on each release of FpML codes.

  • [ ] 2) Dynamic code lists: move rapidly changing code lists that have no direct link to CDM functionality (e.g. floating rate index names, CRPs, business centers, currencies) to a mechanism in which validation is driven by data, either data files (such as JSON files) or API calls. This would allow new codes to be retrieved dynamically without changes to the CDM implementation. Note that this is very easy to implement in CDM. I was able to prototype a complete solution in which all of FpML's Genericode XML files were converted to JSON files and supplied as resources to CDM, and an easy-to-use CDM validation mechanism tested code values against the latest or a specified version of the code lists with minimal programming effort. It took me around a day of effort to implement this, which is far less than was spent addressing the issues of mapping floating rate index names to Java-legal enumerated values.

  • Of the two solutions, I believe the second to be superior, because it is more flexible (e.g. allowing different code lists to be used with the same version of the software, and changing code list versions or supporting multiple versions at run time) and requires less manual intervention and simpler software release control and distribution, but either could potentially work. (Or potentially a mix/hybrid of both; there is no reason that one implementation couldn’t use enumerations while another uses dynamic code lists, as long as the string values are consistent).

  • A second benefit of the dynamic code list solution is that these code list JSON files could contain enriched metadata which would be ignored for list validation purposes, but could be accessed by functionality needing data enrichment. For example, this could be used for mapping from floating rate index names to ISO benchmark codes, or to ISINs where they are available.
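As a rough illustration of the dynamic code list idea (option 2) and the metadata enrichment point, here is a minimal Python sketch. It is not the prototype described above: the JSON layout, the field names (`codeLists`, `version`, `codes`, `isoBenchmark`), the version labels, and the `CodeListRegistry` class are all assumptions made up for this example, and the sample values are illustrative only.

```python
import json

# Illustrative sample only: the layout and field names ("codeLists", "version",
# "codes", "isoBenchmark") are assumptions, not an FpML or CDM format.
SAMPLE_JSON = """
{
  "codeLists": [
    {"name": "floating-rate-index", "version": "1-0",
     "codes": {"EUR-EURIBOR": {"isoBenchmark": "EURI"}}},
    {"name": "floating-rate-index", "version": "1-1",
     "codes": {"EUR-EURIBOR": {"isoBenchmark": "EURI"},
               "USD-SOFR": {"isoBenchmark": "SOFR"}}}
  ]
}
"""


class CodeListRegistry:
    """Hypothetical registry that validates code values against data, not enums."""

    def __init__(self, raw_json):
        document = json.loads(raw_json)
        # name -> version -> {code: metadata}
        self._lists = {}
        for entry in document["codeLists"]:
            versions = self._lists.setdefault(entry["name"], {})
            versions[entry["version"]] = entry["codes"]

    def _resolve(self, name, version=None):
        versions = self._lists[name]
        if version is None:
            # Pick the highest version, comparing "major-minor" numerically.
            version = max(versions, key=lambda v: tuple(int(p) for p in v.split("-")))
        return versions[version]

    def is_valid(self, name, code, version=None):
        """True if the code exists in the latest (or the specified) version."""
        return code in self._resolve(name, version)

    def metadata(self, name, code, version=None):
        """Return the enrichment metadata for a code, or None if the code is unknown."""
        return self._resolve(name, version).get(code)


registry = CodeListRegistry(SAMPLE_JSON)
print(registry.is_valid("floating-rate-index", "USD-SOFR"))         # latest version
print(registry.is_valid("floating-rate-index", "USD-SOFR", "1-0"))  # pinned older version
print(registry.metadata("floating-rate-index", "USD-SOFR"))
```

Because validation only reads the `codes` keys, the extra metadata is ignored for list checks but remains available to any function that needs enrichment. Swapping in a newer JSON file changes the accepted codes without rebuilding the application.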

brianlynn2 avatar Jul 15 '23 15:07 brianlynn2

A further validation requirement that has become clear in the commodity derivative regulatory reporting work recently is cross-field validation of large, dynamic lists. For instance, if the bottom level commodity product type is London Brent Crude, the second level product type should be something like Oils, and the top level should be Energy. Our solutions for validation should ideally consider this as a generic requirement/feature that can be used anywhere this kind of validation is required.
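One possible way to express this kind of cross-field check generically is to store, for each code, its parent at the next level up and then verify that the supplied levels form a consistent chain. The sketch below is only an illustration under that assumption: the `PARENT_OF` table and the `validate_path` function are hypothetical, and the only data taken from the example above is the London Brent Crude / Oils / Energy chain (the Metals entry is an invented counter-example).

```python
# Hypothetical parent links: each code maps to its parent one level up,
# and top-level codes map to None. In practice this would be loaded from data.
PARENT_OF = {
    "Energy": None,
    "Oils": "Energy",
    "London Brent Crude": "Oils",
    "Metals": None,  # illustrative extra top-level entry
}


def validate_path(path, parent_of):
    """Check a list of codes ordered from top level to bottom level.

    Returns a list of error messages; an empty list means the path is consistent.
    """
    errors = [f"unknown code: {code!r}" for code in path if code not in parent_of]
    if errors:
        return errors

    if parent_of[path[0]] is not None:
        errors.append(f"{path[0]!r} is not a top-level code")

    # Every child must name the code directly above it as its parent.
    for parent, child in zip(path, path[1:]):
        if parent_of[child] != parent:
            errors.append(
                f"{child!r} belongs under {parent_of[child]!r}, not {parent!r}"
            )
    return errors


print(validate_path(["Energy", "Oils", "London Brent Crude"], PARENT_OF))
print(validate_path(["Metals", "Oils", "London Brent Crude"], PARENT_OF))
```

The same function works for any number of levels and any list that can be described by parent links, so it is not specific to commodity product types.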

brianlynn2 avatar Aug 04 '23 13:08 brianlynn2

Hi @brianlynn2 I hope you are doing well. Is this issue okay to be closed? Please let me know if there are any tags you would like added.

eteridvalishvili avatar Jan 16 '24 22:01 eteridvalishvili

Hi Eteri –

This is the issue I presented on at the SWG last week. I don’t think this should be closed until the SWG takes an action on this, but, David, do you have any thoughts?

brianlynn2 avatar Jan 16 '24 23:01 brianlynn2

Hi @brianlynn2 Sounds good! I was reviewing all open issues and just wanted to double-check. Thanks for confirming!

eteridvalishvili avatar Jan 17 '24 04:01 eteridvalishvili

I've attached the SWG presentation for reference.

CDM Ref Data Development Strategy for Jan 2024 v3.pdf

brianlynn2 avatar Jan 17 '24 13:01 brianlynn2

Hi @brianlynn2 As ISDA are now supporting this initiative, can this issue now be closed? Thanks!

chrisisla avatar Jan 31 '25 13:01 chrisisla

Sure, that seems reasonable.

brianlynn2 avatar Jan 31 '25 15:01 brianlynn2

Thanks @brianlynn2 , closing this issue now.

chrisisla avatar Jan 31 '25 15:01 chrisisla