What kind of white space is allowed in GLSL source? Is UTF-8 BOM allowed or not?

Open tksuoran opened this issue 1 year ago • 1 comments

Reading section 3.1 ("Character Set and Phases of Compilation") of the GLSL specification, strictly speaking, no kind of white space is mentioned as allowed. This is probably a mistake? Meanwhile,

The source character set used for the OpenGL Shading Language is Unicode in the UTF-8 encoding scheme.

says the encoding is Unicode. In Unicode, the BOM (U+FEFF) can be considered a "zero width no-break space" (at least according to https://en.wikipedia.org/wiki/Non-breaking_space). This leaves open the question of how a BOM should be treated. I know of at least one OpenGL implementation where shader program linking fails when the shader source contains a BOM.
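Until the specification settles this, one defensive workaround is to strip a leading BOM on the host side before the source reaches the driver. A minimal sketch, assuming the shader file is read as raw bytes; `strip_utf8_bom` is a hypothetical helper name, not part of any OpenGL API:

```python
import codecs


def strip_utf8_bom(source: bytes) -> bytes:
    """Return the shader source without a leading UTF-8 BOM, if one is present.

    Hypothetical helper for illustration; not part of OpenGL or any binding.
    """
    # codecs.BOM_UTF8 is the byte sequence EF BB BF (U+FEFF encoded as UTF-8).
    if source.startswith(codecs.BOM_UTF8):
        return source[len(codecs.BOM_UTF8):]
    return source
```

Only a BOM at the very start of the source is removed; a U+FEFF appearing later in the text is left alone, since it is unclear how an implementation would treat it.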

tksuoran, Oct 01 '24 11:10

White space is handled by the preprocessor, and the output of the preprocessor is a token stream. So the definition of what counts as a space character comes from the C++ specification, which is a normative reference. Reading that spec (2.5.2 lex.pptoken) we get:

Preprocessing tokens can be separated by white space; this consists of comments (2.8), or white-space characters (space, horizontal tab, new-line, vertical tab, and form-feed), or both.

From that, I'd guess that a BOM is not a valid space character.
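That reading can be checked mechanically. A small sketch, assuming the white-space set is exactly the five characters named in the quoted passage (comments aside); `PP_WHITESPACE` and `is_pp_whitespace` are hypothetical names used only for illustration:

```python
# The five white-space characters named in the quoted C++ passage:
# space, horizontal tab, new-line, vertical tab, and form-feed.
PP_WHITESPACE = frozenset({" ", "\t", "\n", "\v", "\f"})

# U+FEFF, the code point used as the byte order mark.
BOM = "\ufeff"


def is_pp_whitespace(ch: str) -> bool:
    """Return True if ch is one of the white-space characters listed above."""
    return ch in PP_WHITESPACE
```

Under that assumption, `is_pp_whitespace(BOM)` is false, which matches the guess that a BOM is not a valid space character.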

dj2, Oct 14 '24 20:10