EDI.Net

Syntax validator

Open KoosBusters opened this issue 7 years ago • 16 comments

I'm implementing a protocol that is very strict in validating the exchanged messages and requires extensive syntactic validation and detailed errors to be produced while serializing and validating the input.

Currently the Picture class only seems to be used to pick the correct formatting to read the elements, and the provided length does not seem to trigger any validation errors.

Ideally the serializer would take care of syntax validation and either throw an exception with detailed information on all validation errors, or expose a separate property that indicates whether the message is syntactically OK and lists the errors if it is not. The last option would be a non-breaking change, as you have to check the validation results and act on them yourself.

I'll be happy to implement this feature and create a pull request, but we need to specify exactly how this would fit into the current structure.

KoosBusters avatar Apr 14 '17 06:04 KoosBusters

Hi @koosbusters

First of all thanks for your interest in EDI.Net!

I am currently on leave for Easter and will be back on Wednesday. I would love to discuss this further and provide you with clear implementation guidelines.

Until then I will try to find some time to add my thoughts on this matter.

cleftheris avatar Apr 14 '17 13:04 cleftheris

Hi, I think you may be talking about CONTRL or APERAK? I have to implement it as well, but I have not thought much about it yet, since I first need the POCO classes. I thought about extending the EdiAttributes to describe how each field/value should be handled to pass validation: mandatory and syntax checks for CONTRL, and another set per process type and property for APERAK.

Nekeniehl avatar Apr 14 '17 17:04 Nekeniehl

Hi @cleftheris, thank you for the nice library!

@Nekeniehl: Yes, it's the automation of the CONTRL and APERAK messages. These specify a lot of scenarios that need to be validated and handled by responding with specific error codes that are also part of the specification.

I created some POCO classes as a proof of concept, but before implementing all the messages I would like to have the format for validation as well, so I can go through the specification once, very carefully, and hopefully implement all the happy and unhappy flows in one run.

KoosBusters avatar Apr 15 '17 15:04 KoosBusters

@cleftheris did you find any time to think about some implementation guidelines? I would like to start working on this issue, but I would like to know whether validations should be added to the current attributes or should be created with a new set of attributes.

Personally I would like to set segment-specific validations with attributes on the segments themselves. I think I will also need some kind of fluent API (like the ModelBuilder in Entity Framework working together with the data annotation attributes) to configure validations that span multiple segments.

We should also assign a place to keep track of the validation errors, maybe an overload of the deserialize method with an optional out parameter that provides a validation object.
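A minimal sketch of what such an overload could look like. Every type and method name here (ValidationError, ValidationResult, SketchSerializer) is hypothetical and not part of EDI.Net, and the `read` delegate merely stands in for the real deserialization logic:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration only; none of these exist in EDI.Net.
public sealed class ValidationError
{
    public ValidationError(string path, string message)
    {
        Path = path;
        Message = message;
    }

    public string Path { get; }     // e.g. "XXX/1/0"
    public string Message { get; }
}

public sealed class ValidationResult
{
    private readonly List<ValidationError> _errors = new List<ValidationError>();

    public IReadOnlyList<ValidationError> Errors => _errors;
    public bool IsValid => _errors.Count == 0;

    public void Add(string path, string message) =>
        _errors.Add(new ValidationError(path, message));
}

public static class SketchSerializer
{
    // Existing-style call: validation findings are simply discarded.
    public static T Deserialize<T>(string edi, Func<string, ValidationResult, T> read) =>
        Deserialize(edi, read, out _);

    // Additional overload: findings are collected and handed back to the caller.
    public static T Deserialize<T>(
        string edi,
        Func<string, ValidationResult, T> read,
        out ValidationResult validation)
    {
        validation = new ValidationResult();
        return read(edi, validation);
    }
}
```

Because the second method is an additional overload, callers of the first one would not have to change anything, which is what would keep the change non-breaking.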

KoosBusters avatar Apr 21 '17 13:04 KoosBusters

@KoosBusters I have not found the time yet. It's on the list.

cleftheris avatar Apr 21 '17 14:04 cleftheris

@Nekeniehl I did some more reading about the error message flow, and I think the validation errors that we retrieve from the deserialization process of this library should be sufficient for the CONTRL process (syntactic validation). The APERAK messages are for rejections and confirmations at the business layer of the application, and should therefore be out of scope of any validation that the EDI parser does.

As far as I see the tasks of the validator are not that large:

  • Segment repetition (check against a minimum and a maximum number of segment repetitions)
  • Type validation (can the string supplied be parsed to the format that is specified)
  • Are all mandatory elements present in the segment
  • Length of the elements should be within specified length
  • Validate all the text in the message is within the specified character set
  • Check if the control count adds up (UNT and UNZ segment)
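As an illustration of the last point, here is a rough sketch of a UNT control count check. It assumes the default EDIFACT service characters (`'` as segment terminator, `+` as element separator, `?` as release character) and works on the raw text rather than on EDI.Net types; the class and method names are made up for this example:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ControlCountSketch
{
    // Splits on the segment terminator ('), leaving released characters (?x) untouched.
    // Text after the last terminator is ignored.
    public static List<string> SplitSegments(string message)
    {
        var segments = new List<string>();
        var current = new StringBuilder();
        for (int i = 0; i < message.Length; i++)
        {
            char c = message[i];
            if (c == '?' && i + 1 < message.Length)
            {
                current.Append(c).Append(message[++i]); // keep the escaped pair intact
            }
            else if (c == '\'')
            {
                string segment = current.ToString().Trim();
                if (segment.Length > 0) segments.Add(segment);
                current.Clear();
            }
            else
            {
                current.Append(c);
            }
        }
        return segments;
    }

    // True when the count declared in UNT equals the number of segments
    // from UNH to UNT, both included.
    public static bool UntCountMatches(string message)
    {
        List<string> segments = SplitSegments(message);
        int start = segments.FindIndex(s => s.StartsWith("UNH+", StringComparison.Ordinal));
        int end = segments.FindIndex(s => s.StartsWith("UNT+", StringComparison.Ordinal));
        if (start < 0 || end < start) return false;

        string declared = segments[end].Split('+')[1];
        return int.TryParse(declared, out int count) && count == end - start + 1;
    }
}
```

The same idea would apply to UNZ, which declares the number of messages (or functional groups) in the interchange instead of the number of segments.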

KoosBusters avatar Apr 24 '17 07:04 KoosBusters

Hi @KoosBusters, I think I am on the same page with you. I also love the list and it's quite accurate, so let's do this.

I think that validation on this list should be off by default and be enabled through a setting. That is because some transmissions are not very well formed and the receiving party, as I have experienced myself, cannot do anything about it.

  1. Segment repetition (check against a minimum and a maximum number of segment repetitions): may be tricky.
  2. Type validation (can the string supplied be parsed to the format that is specified): parsing to the CLR type should already throw on failure. We could do an additional regex check before parsing, regardless of the mapped type. This should be easy.
  3. Are all mandatory elements present in the segment: there is already a Mandatory property on the EdiValueAttribute to indicate that a value should be present, but it is not used (#18).
  4. Length of the elements should be within specified length: I think it is related to (2).
  5. Validate all the text in the message is within the specified character set: I think it is related to (2).
  6. Check if the control count adds up (UNT and UNZ segment): I would love it if this were implemented as #17 suggests.
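The additional regex check from (2) could look roughly like the sketch below. It is a simplification that only understands pictures of the form `9(n)` (digits) and `X(n)` (any characters), treats `n` as a maximum length, and leaves the mandatory check to a separate step; PictureCheckSketch is a hypothetical helper, not something EDI.Net provides:

```csharp
using System;
using System.Text.RegularExpressions;

public static class PictureCheckSketch
{
    // Recognises only the simple "9(n)" and "X(n)" forms.
    private static readonly Regex PicturePattern =
        new Regex(@"\A(?<kind>[9X])\((?<length>[0-9]+)\)\z");

    // Returns true when the raw element text fits the picture.
    // An empty value passes; presence is a separate (mandatory) check.
    public static bool Matches(string picture, string value)
    {
        Match match = PicturePattern.Match(picture);
        if (!match.Success)
            throw new ArgumentException("Unsupported picture: " + picture, nameof(picture));

        int maxLength = int.Parse(match.Groups["length"].Value);
        string charClass = match.Groups["kind"].Value == "9" ? "[0-9]" : ".";
        string valuePattern = @"\A" + charClass + "{0," + maxLength + @"}\z";
        return Regex.IsMatch(value, valuePattern, RegexOptions.Singleline);
    }
}
```

Running this on the raw text before any conversion means the check works the same way whether the property is mapped to an int, a string or an enum.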

Also, related to (2), I would like to point out that since the deserialiser uses whatever formatting info is annotated on the properties of the CLR model, it is up to the model author to do the right thing.

For example, I prefer this binding

[EdiValue("9(3)", Path = "XXX/1/0")]
public int? MyIntegerField { get; set; }

to the string equivalent

[EdiValue("9(3)", Path = "XXX/1/0")]
public string MyIntegerField { get; set; }

The library even supports binding to Enum by name or value (string/number).

cleftheris avatar Apr 24 '17 10:04 cleftheris

Hi @KoosBusters, you are right, APERAK should be checked after the serialisation of the message, since partners can have agreements that differ from the standard (the German AHB).

The CONTRL also will check:

  • That the receiver is you and no other partner. (Not for the serializer step, but just for info)
  • The sender exists. (Same as above)
  • Not only type and length, but also that the element value exists in a list of possible values for the segments that are predefined that way, e.g. sending out a "ZRV" would result in a negative CONTRL, as it is not one of the possible values for the CCI segment.
  • Besides the mandatory segments, it should also check the names, e.g. a point of delivery (LOC segment) would result in a negative CONTRL if it is written as LCO. It will also check that the qualifier, in this case 172, is the correct one.
  • It should also check the segment structure and the order of the elements that make up each segment.

I will start soon with this too as I already have 3 POCO classes to test with. I will share any progress regarding this as soon as possible.

Nekeniehl avatar Apr 24 '17 10:04 Nekeniehl

@cleftheris: I think that 'Validate all the text in the message is within the specified character set' is not typically done on the attributes themselves (and is therefore not related to (2)), because it is an interchange-wide requirement: it should also trigger a validation error if there is an illegal character (e.g. @) somewhere in the transmission, even if it is not included in the elements specified in our EDI POCO class. Maybe this should be added to the interchange validation/generation services as an addition to #17.
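A rough sketch of such an interchange-wide scan, assuming the level A character set (UNOA: upper case letters, digits, space and a fixed set of punctuation, as I understand the syntax rules; please verify against the specification before relying on it). Tolerating line breaks between segments is also an assumption, and CharacterSetSketch is a hypothetical name:

```csharp
using System.Collections.Generic;

public static class CharacterSetSketch
{
    // Space and punctuation allowed by UNOA, including the default service characters.
    private const string UnoaPunctuation = " .,-()/='+:?!\"%&*;<>";

    public static bool IsUnoa(char c) =>
        (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') || UnoaPunctuation.IndexOf(c) >= 0;

    // Returns the zero-based offsets of every character outside UNOA in the raw interchange.
    public static List<int> FindIllegalOffsets(string interchange)
    {
        var offsets = new List<int>();
        for (int i = 0; i < interchange.Length; i++)
        {
            char c = interchange[i];
            if (c == '\r' || c == '\n') continue; // assumption: line breaks between segments are tolerated
            if (!IsUnoa(c)) offsets.Add(i);
        }
        return offsets;
    }
}
```

Since the scan runs over the raw text, it also catches illegal characters in segments that are not mapped to any POCO property.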

@Nekeniehl:

3: Using an enum should be sufficient for these cases, I think.

4 + 5: I added a new to-do point: I think we should validate that all EDI segments/elements/components of the interchange are deserialized to the POCO class. I think this is the only way to conclusively detect all elements that are missing (because I think writing LCO should just trigger the missing segment and unrecognized segment validations).

What is left to define is the structure in which we keep track of the validation errors. As a reference, all the different kinds of error codes for EDIFACT: here

If I try to summarize all the points, I end up with the following list:

  1. Segment count validation (min. max. number of segments)
  2. Extend CLR validation
    • specified format validation (type and length)
    • check the Mandatory property on the EDI attributes
  3. Validate that all segments/elements/components in the EDI message are parsed to the POCO class
  4. Implement #17 + character set validation

KoosBusters avatar Apr 24 '17 19:04 KoosBusters

Hi, here is an updated list of error codes, valid since 01.10.2014 - Open, since the ones you are referencing are quite old and the majority have been deprecated, moved to APERAK or changed.

2 Syntax version or level not supported
7 Interchange recipient not actual recipient
12 Invalid value
13 Missing Mandatory
16 Too many constituents
20 Character invalid as service character
21 Invalid character(s)
23 Unknown Interchange sender
25 Test indicator not supported
26 Duplicate detected
28 References do not match
29 Control count does not match number of instances received
32 Lower level empty
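For reference, the codes above could be captured in an enum so that validation errors carry the CONTRL code directly. The type and member names are only a suggestion and do not exist in EDI.Net:

```csharp
// Hypothetical enum; the numeric values are the error codes from the list above.
public enum SyntaxErrorCode
{
    SyntaxVersionOrLevelNotSupported = 2,
    InterchangeRecipientNotActualRecipient = 7,
    InvalidValue = 12,
    MissingMandatory = 13,
    TooManyConstituents = 16,
    CharacterInvalidAsServiceCharacter = 20,
    InvalidCharacters = 21,
    UnknownInterchangeSender = 23,
    TestIndicatorNotSupported = 25,
    DuplicateDetected = 26,
    ReferencesDoNotMatch = 28,
    ControlCountDoesNotMatch = 29,
    LowerLevelEmpty = 32
}
```

Casting a member to int gives the value to put into the CONTRL response, e.g. `(int)SyntaxErrorCode.MissingMandatory` is 13.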

From my point of view, the best structure we have to keep track of validation errors is the EdiPropertyDescriptor, since it is the one that contains the Value for the elements and the EdiValue for the validation.

Nekeniehl avatar Apr 25 '17 10:04 Nekeniehl

@Nekeniehl thanks for the update! EdiPropertyDescriptor sounds good to me.

KoosBusters avatar Apr 25 '17 16:04 KoosBusters

@cleftheris I completed most of my POCO classes, which means that I need to start on the message flow, including validations. Is there any way you can take a look at #49? If I start implementing, I would rather base it on a generic specification interface (IFormatSpec) than on the old Picture class.

KoosBusters avatar May 03 '17 08:05 KoosBusters

@cleftheris do you know if you can find any time to look at #49? If not I will create the validator in my fork since I need to validate more properties than the Picture class supports at this moment.

@Nekeniehl did you get to implementing validations for your messages?

KoosBusters avatar May 29 '17 13:05 KoosBusters

Hi @KoosBusters, sorry for the late response, I was on vacation.

I'm currently (still) working on the APERAK. I have made an XML file for a specific process (MSCONS - 13008) with all the conditions from the AHB (MSCONS AHB). I parse the XML and create the conditions, then later run them over the POCO class and check all of them. So far it is working: I get all the conditions that are not valid, and can check children, parent values, other field values, and so on. However, I'm facing a problem with the operators when there is more than one condition, as they can get quite complex, and I'm still thinking about how to solve it. I can share it in my fork, but at the moment it is quite chaotic because of the vacation, and I have to clean it up a little first.

How is the CONTRL check going?

Nekeniehl avatar Jun 06 '17 16:06 Nekeniehl

I'm also interested in this. Specifically the CONTRL check. As the rules for APERAK vary between regions (countries), I agree with @KoosBusters that that is out of scope for the parser.

jkmyklebust avatar Mar 27 '18 11:03 jkmyklebust

Any update on this? I would love to have the ability to validate an input file before parsing it into a POCO.

Patrickyp avatar Oct 30 '21 09:10 Patrickyp