ASVS Request for Clarification/Refinement: V1.1 "not stored in an encoded or escaped" Data Storage Principle


Open ajayojha opened this issue 4 months ago • 22 comments


Please refer to this discussion: https://github.com/OWASP/ASVS/discussions/3184. I am adding further input here because I think the current requirement needs more clarity and practical alignment.

Please see the code references below for Discourse and Ghost, which store encoded or sanitized data in the database, depending on architectural needs. Many other popular open-source frameworks also store data this way; I can share more examples if needed.

Discourse: Code Reference -- stores both the raw and the HTML (cooked) versions of user posts.
Ghost: Code Reference -- stores sanitized/encoded data.

Here is a brief summary of the linked GitHub discussion:
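
The raw-plus-cooked pattern described for Discourse can be sketched roughly as follows. This is a minimal illustration in Python, not Discourse's actual code: the `Post` class, its field names, and the `cook` function are hypothetical, and `html.escape` stands in for a real Markdown renderer and sanitizer.

```python
import html
from dataclasses import dataclass, field


def cook(raw: str) -> str:
    """Hypothetical renderer; a stand-in for a real Markdown/sanitizer pipeline."""
    return "<p>" + html.escape(raw) + "</p>"


@dataclass
class Post:
    raw: str                         # exactly what the user submitted
    cooked: str = field(init=False)  # derived HTML, can always be regenerated

    def __post_init__(self) -> None:
        self.cooked = cook(self.raw)

    def edit(self, new_raw: str) -> None:
        # The raw value stays the source of truth; cooked is rebuilt from it.
        self.raw = new_raw
        self.cooked = cook(new_raw)


post = Post(raw="1 < 2 & 3 > 2")
```

In this sketch the encoded copy is only ever derived from `raw`, so the user's original input is never lost even though an HTML version is also stored.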

Quoting and responding to a few points raised by @elarlang in his response.

it is not possible to encode data for HTML before saving it to the database

Encoding before storage does occur in edge cases and depends entirely on the application's requirements. The code examples shared above are enough to call the absolute stance into question.

It ruins the integrity of the data - e.g., it was not the value that the user entered

Data Integrity: When HTML sanitization happens, the data is intentionally changed by removing malicious parts (like

It assumes that the only output encoding is HTML, but if it is to be used it in JSON, CSV, displayed as just text, or whatever other format, it is already in the incorrect format

If certain fields are guaranteed to only ever be displayed in a single, fixed context such as HTML, would it be acceptable, from an ASVS perspective, to encode once at write time?
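
To make the trade-off in this exchange concrete, here is a small Python sketch using only the standard library. The sample value and the helper names (`to_html`, `to_json`, `to_csv`) are made up for illustration. It shows a raw value encoded per output context, and what happens when an HTML-encoded value is later reused in JSON.

```python
import csv
import html
import io
import json

raw = 'Tom & "Jerry" <script>'  # hypothetical user input, stored unmodified


def to_html(value: str) -> str:
    # Encode for an HTML context at output time.
    return html.escape(value, quote=True)


def to_json(value: str) -> str:
    # JSON has its own escaping rules, handled by the serializer.
    return json.dumps({"name": value})


def to_csv(value: str) -> str:
    # CSV quoting is handled by the csv writer.
    buf = io.StringIO()
    csv.writer(buf).writerow([value])
    return buf.getvalue()


# If the HTML-encoded form had been stored instead of the raw value,
# a later JSON consumer would receive HTML entities rather than the input.
stored_encoded = to_html(raw)
json_from_encoded = json.loads(to_json(stored_encoded))["name"]
json_from_raw = json.loads(to_json(raw))["name"]
```

Here `json_from_raw` round-trips to the original input while `json_from_encoded` does not, which is the risk when a field that was assumed to be HTML-only later gains a second output context.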

The problem to solve here is to use the correct caching mechanism.

With a caching mechanism, the encoded data is what gets stored in the cache. If encoded data belongs in the cache, I would like to know why we state "never store encoded data". Does V1.1's "stored" include caches, or is it database-specific?

The guidance lacks clarity and usability. The team should consider the following points:

Principles need context; blanket statements (like "never store encoded data") can lead to confusion or overcorrection.
Clear use cases help avoid mistakes. When developers see good and bad use cases, they better understand the reason behind a rule, and they are more likely to apply it the right way.
Clarify whether "stored" in the current rule applies to the database only, or also includes caches.
The phrase "not stored in an encoded or escaped" is too absolute, as there are niche cases where storing encoded or transformed values is justified. Consider instead reinforcing "storing raw" as the default and recommended best practice.
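
One possible reading of the database-versus-cache distinction asked about here is sketched below in Python. It is an assumption for illustration, not ASVS wording: the `records` dict stands in for the system of record, and `html_cache`, `render_html`, and `update` are hypothetical names.

```python
import html

records = {1: "Tom & Jerry"}     # hypothetical system of record: raw values only
html_cache: dict[int, str] = {}  # derived, disposable HTML-encoded copies


def render_html(record_id: int) -> str:
    # Encode on first use, then serve the cached encoded copy.
    if record_id not in html_cache:
        html_cache[record_id] = html.escape(records[record_id])
    return html_cache[record_id]


def update(record_id: int, new_raw: str) -> None:
    # Writes go to the raw store; the encoded copy is invalidated, not edited.
    records[record_id] = new_raw
    html_cache.pop(record_id, None)
```

Under this reading the cache holds encoded output, but it can be dropped and rebuilt at any time because the raw value remains the only authoritative copy.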

Jun 25 '25 02:06 ajayojha
