Adding clarity for 1.5.1 and 5.1.4 (related 5.1.3, 1.8.1)
Suggest we augment 5.1.4 from:
| 5.1.4 | Verify that structured data is strongly typed and validated against a defined schema including allowed characters, length and pattern (e.g. credit card numbers, e-mail addresses, telephone numbers, or validating that two related fields are reasonable, such as checking that suburb and zip/postcode match). (C5) | ✓ | ✓ | ✓ | 20 |
|---|
to:
| 5.1.4 | Verify that structured data (e.g., JSON and XML) is strongly typed and validated against a defined schema, including allowed characters, length, and patterns such as regular expressions for credit card numbers, e-mail addresses, and telephone numbers. (C5) | ✓ | ✓ | ✓ | 20 |
|---|
I would use JSON and XML schema validation separately, 5.1.4 can send more clear message as "allow-listed pattern" for validation.
JSON and XML schemas are covered with requirements:
| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 13.2.2 | Verify that JSON schema validation is in place and verified before accepting input. | ✓ | ✓ | ✓ | 20 |
| 13.3.1 | Verify that XSD schema validation takes place to ensure a properly formed XML document, followed by validation of each input field before any processing of that data takes place. | ✓ | ✓ | ✓ | 20 |
Thanks @elarlang - would you suggest we delete 5.1.4 as is then? It seems really JSON/XML specific.
No, I don't suggest that.
With requirement 1.5.1 must be defined, how some data must be validated and with 5.1.4 analyzt must follow the ruleset from 1.5.1 and verify that. The word schema maybe was directing you to JSON and XML fields, but otherwise I think those requirements are in correct place.
| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 1.5.1 | Verify that input and output requirements clearly define how to handle and process data based on type, content, and applicable laws, regulations, and other policy compliance. | ✓ | ✓ | 1029 |
You opened separate issues for 1.5.1 (https://github.com/OWASP/ASVS/issues/1543, https://github.com/OWASP/ASVS/issues/1546)
Yes, schema implies JSON and XML schema and is throwing me off for that requirement, its confusing and need fixing IMO.
While I see how you can see where schema may imply JSON or XML, it really is using the definition meaning "identified or specified pattern". Unless we have another requirement that addresses pattern specific data fields like email address, SSN, phone number, etc., I'd say we leave 5.1.4 as is. If the word "schema" is confusing for many, maybe we swap it for something like "specified patterns".
Need to make sure we don't overlap with 1.8.1:
| 1.8.1 | [MODIFIED, MERGED FROM 8.3.4, LEVEL L2 > L1] Verify that all sensitive data created and processed by the application has been identified and classified into protection levels, and ensure that a policy is in place on how to deal with sensitive data. | ✓ | ✓ | ✓ | 213 |
|---|
Any further comments?
If 1.5.1 is pre-condition for 5.1.4, then 1.5.1 should be also level 1...
And second thing - we need to take also 5.1.3 into the game and make clear separation, which requirement is meant for what.
Note: for updated version, move to comment https://github.com/OWASP/ASVS/issues/1552#issuecomment-1968358403
Related requirements 1.5.1, 1.8.1, 5.1.3, 5.1.4.
| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 1.5.1 | Verify that input and output requirements clearly define how to handle and process data based on type, content, and applicable laws, regulations, and other policy compliance. | ✓ | ✓ | 1029 | |
| 1.8.1 | [MODIFIED, MERGED FROM 8.3.4, LEVEL L2 > L1] Verify that all sensitive data created and processed by the application has been identified and classified into protection levels, and ensure that a policy is in place on how to deal with sensitive data. | ✓ | ✓ | ✓ | 213 |
| 5.1.3 | [MODIFIED] Verify that all input (HTML form fields, REST requests, URL parameters, HTTP headers, cookies, batch files, RSS feeds, etc) is validated using positive validation (allow lists). Where HTML markup must be accepted for input, refer to requirement 5.2.1. (C5) | ✓ | ✓ | ✓ | 20 |
| 5.1.4 | Verify that structured data is strongly typed and validated against a defined schema including allowed characters, length and pattern (e.g. credit card numbers, e-mail addresses, telephone numbers, or validating that two related fields are reasonable, such as checking that suburb and zip/postcode match). (C5) | ✓ | ✓ | ✓ | 20 |
I think 5.1.3 and 5.1.4 are overlapping at the moment and those should be more clear. I think those can be merged (and maybe logically related fields from 5.1.4 as separate requirement)
Goals for requirements:
- 1.5.1 - everything must be analyzed and documented - otherwise it is not possible to implement input validation and it's not possible to test, is it done correctly.
- 1.8.1 - everything must be analyzed and documented from "is it sensitive" perspective. It's included here just to point out potential overlap with 1.5.1
- handled separately (like it was in issue #1546)
- 5.1.3 everything must be validated using positive validation (allow lists) == do not use disallow/block-lists
- moved to separate issue: https://github.com/OWASP/ASVS/issues/1878
- 5.1.4 every element must be with expected type, contain only allowed characters, in min-max value range for numbers, in min-max length for texts, follow patterns for expected type (phone, email, address, ...), logically related fields are matching
If we have agreement on the requirement goals, then we can start finetune them.
ping @tghosth @jmanico - do you agree with my definitions (https://github.com/OWASP/ASVS/issues/1552#issuecomment-1497203051) and we can move it further?
Yes, I’m with you so far.
PS: And I am a bit wary of input validation in general because valid data is often still vulnerable to injection and similar.
Yes, I’m with you so far. PS: And I am a bit wary of input validation in general because valid data is often still vulnerable to injection and similar.
Defenses against injection attacks are sanitization, escaping and encoding, it's a separate section.
Agree on your definitions @elarlang
This is just an update, to make the focus for the issue more clear.
1.8.1 is solved and for 5.1.3 I opened a separate issue (https://github.com/OWASP/ASVS/issues/1878).
Now we have 1.5.1 and 5.1.4 to solve, currently those are:
| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 1.5.1 | [MODIFIED, LEVEL L2 > L1] Verify that input and output requirements clearly define how to handle and process data based on type and content. | ✓ | ✓ | ✓ | 20 |
| 5.1.4 | [GRAMMAR] Verify that structured data is strongly typed and validated against a defined schema including allowed characters, length and pattern (e.g. credit card numbers, e-mail addresses, telephone numbers, or validating that two related fields are reasonable, such as checking that suburb and zipcode match). (C5) | ✓ | ✓ | ✓ | 20 |
The points listed here are agreed:
- 1.5.1 - everything must be analyzed and documented - otherwise it is not possible to implement input validation and it's not possible to test, is it done correctly.
- 5.1.4 - every element must be with expected type, contain only allowed characters, in min-max value range for numbers, in min-max length for texts, follow patterns for expected type (phone, email, address, ...), logically related fields are matching
I suggest 1.5.1 goes away and we just keep 5.1.4. 5.1.4 is clear enough.
n+1'th time - you can not develop (and later test) without having documentation requirements in place.
I agree. But ASVS is already very bulky. Some things are already implied. And no one - ever - got hacked because of missing documentation.
If you don't have documentation, on how you need to implement something, there's quite a big chance that you will not implement it (correctly), and... you may get hacked because of that.
But in general, at the moment we create documentation requirements AND implementation requirements. What we will do or how we organize the documentation requirement is up for discussion (in https://github.com/OWASP/ASVS/discussions/1831) and out of scope for this issue.
I surrender and see the value of 1.5.1. I think we are all in sync now.
Ok @elarlang I think there are some tweaks I would make here based on the "documentation and implementation" concept.
I would also like to differentiate this from 5.1.3 by referring to "business specific" data which might have business logic rules for validating it.
Also, I don't really know what output requirements are and I am not sure how they are relevant here.
| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 1.5.1 | [MODIFIED, LEVEL L2 > L1] Verify that input validation rules are clearly defined for how to check the structure of business specific data such as credit card numbers, e-mail addresses, telephone numbers, and how to verify that two fields are reasonable, such as checking that suburb and zipcode match . | ✓ | ✓ | ✓ | 20 |
| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 5.1.4 | [MODIFIED] Verify that business specific data is validated in according to the rules for the relevant data item or combination of items. | ✓ | ✓ | ✓ | 20 |
What do you think?
(I created https://github.dev/OWASP/ASVS/tree/v5_refresh_1552 but haven't done anything with it yet)
I think the word "business" can be misleading, that something must be related to finances. The goal from my point of view is that all logically related things must match with each other.
So instead of "business specific" I was thinking of "semantically meaningful" but I worry that this would not be clear enough.
Do we just want to call it data which is expected to be in a particular format? @elarlang
I think the word "business" can be misleading, that something must be related to finances. The goal from my point of view is that all logically related things must match with each other.
I agree with this. Good observation @elarlang.
I would just suggest:
Verify that all data is validated according to the rules for the relevant data item or combination of items.
Wordsmithing needed, but I think we should mention or spotlight the "logical combination of related items". Verify that all data is validated according to the rules for the relevant data item, including data items separately and with a logical combination of related items.
Can I suggest: Verify that all data is validated according to the rules applicable to each individual data item and that sets of related data items meet the logical and contextual validation rules when considered in combination.
My suggestion:
Verify that each data item is individually validated according to its specific rules, and that related data items are validated together to meet requirements as a group.
I had a discussion with @elarlang and based on your input @jmanico @ryarmst we came up with some split and refined requirements:
| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 1.5.1 | [MODIFIED, SPLIT TO 1.5.5, LEVEL L2 > L1] Verify that input validation rules define how to check the validity of data items against an expected structure. This could be common data formats such as credit card numbers, e-mail addresses, telephone numbers, or it could be an internal data format. | ✓ | ✓ | ✓ | 20 |
| 1.5.5 | [ADDED, SPLIT FROM 1.5.1] Verify that input validation rules define how to ensure that the combination of two or more data items are reasonable, logically and contextually, such as checking that suburb and zipcode match. | ✓ | ✓ | ✓ | 20 |
| # | Description | L1 | L2 | L3 | CWE |
|---|---|---|---|---|---|
| 5.1.4 | [MODIFIED, SPLIT TO 5.1.7] Verify that data items with an expected structure are validated according to the pre-defined rules. | ✓ | ✓ | ✓ | 20 |
| 5.1.7 | [ADDED, SPLIT FROM 5.1.4] Verify that the application ensures that combinations of related data items are reasonable according to the pre-defined rules. | ✓ | ✓ | ✓ | 20 |
What do you think? @jmanico @ryarmst
This is great over all.
For 1.5.5 I suggest removing the "reasonable". Perhaps: Verify that input validation rules specify how to ensure the logical and contextual consistency of combined data items, such as checking that suburb and zipcode match.
I think 5.1.7 can go away. It's a reduced duplicate of 1.5.5