quakespasm icon indicating copy to clipboard operation
quakespasm copied to clipboard

*Sky_ClipPoly: MAX_CLIP_VERTS* error in `lim_daviddg` map

Open alexey-lysiuk opened this issue 1 year ago • 6 comments

Another limit exceeding issue caused by lim_daviddg map from Liminal Spaces Jam. Originally reported here.

Load the map, and do setpos -256 0 0 in console. It trigger the given fatal error in Sky_ClipPoly(). nump variable is 83 which greater than 62 (MAX_CLIP_VERTS - 2).

Here is bsp file only. It's the same file as in #100.

alexey-lysiuk avatar Jun 26 '24 12:06 alexey-lysiuk

Even with bumping MAX_CLIP_VERTS to 128, when I noclip'ed around after that setpos, I immediately hit the error with nump==157.

@ericwa : Should we bump that MAX_CLIP_VERTS to 256 which seem to work?

CC'ing @vsonnier : vkquake has the same code in place.

sezero avatar Jun 26 '24 12:06 sezero

Thanks @sezero for the head-up ! (Un)Fortunatly, It seems I cannot reproduce the bug on vkQuake mainly because.. Sky_ClipPoly() is never called, guarded by a lot of outer conditions, to the point I wonder if this function is ever called in practice on any map.

Even if I hack around it to force the trigger of Sky_ClipPoly() I only managed nump <= 5 in @alexey-lysiuk example.

Boosting MAX_CLIP_VERTS seems harmless enough though. In any case, I'll follow QS change.

vsonnier avatar Jul 05 '24 15:07 vsonnier

OK @vsonnier, thanks.

@ericwa, @andrei-drexler: thoughts?

sezero avatar Jul 05 '24 17:07 sezero

Raising to 256 seems ok, I will check if there’s a bug on my end in Ericw-tools alpha builds leading to these excessively many-sided faces showing up.

ericwa avatar Jul 05 '24 19:07 ericwa

I will check if there’s a bug on my end in Ericw-tools alpha builds leading to these excessively many-sided faces showing up.

I checked and you would have to specify -maxedges 0 to ericw-tools qbsp to get these faces (default behaviour is to chop faces to 64 sides if they exceed that for some reason)

ericwa avatar Jul 06 '24 18:07 ericwa

I checked and you would have to specify -maxedges 0 to ericw-tools qbsp to get these faces (default behaviour is to chop faces to 64 sides if they exceed that for some reason)

Why not clamping -maxedges arguments to max = 256 for instance, whatever the input value ? Otherwise, it is just a matter of time before another map exceeed the port hardcoded value by using -maxedges 0

Sorry if that sound silly, I don't know any thing of map building :)

vsonnier avatar Jul 08 '24 05:07 vsonnier

Do we have a better solution that doesn't rape the stack?

sezero avatar Jul 18 '24 15:07 sezero

Hunk alloc + FreeToMark

ericwa avatar Jul 19 '24 01:07 ericwa

Hunk alloc + FreeToMark

Fine, for vkQuake I've done so using dynamic allocation (in fact a combination of _alloca and malloc underneath the macros) : https://github.com/Novum/vkQuake/commit/0314204399ce563d22121dc74dc80c4376341ba9

As I said above on vkQuake Sky_ClipPoly() doesn't seem to be called often, if ever.

But what about pratical performance impact of the other QS using Hunk alloc vs. stack ?

vsonnier avatar Jul 19 '24 05:07 vsonnier

@vsonnier: I tried applying the patch inlined below to qs, based on relevant vkquake commits. The alloca path is good. But the malloc path crashes miserably. (Tested by hard-coding on_heap to 1 or 0.) Will try investigating further, but if you see anything before me, please drop a note here.

EDIT: Patch updated according to the +2 comments in the next reply

gl_sky3.patch.txt

diff --git a/Quake/gl_sky.c b/Quake/gl_sky.c
index a347d97..c9b11f4 100644
--- a/Quake/gl_sky.c
+++ b/Quake/gl_sky.c
@@ -495,14 +495,16 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
 	float	*v;
 	qboolean	front, back;
 	float	d, e;
-	float	dists[MAX_CLIP_VERTS];
-	int		sides[MAX_CLIP_VERTS];
-	vec3_t	newv[2][MAX_CLIP_VERTS];
 	int		newc[2];
 	int		i, j;
 
-	if (nump > MAX_CLIP_VERTS-2)
-		Sys_Error ("Sky_ClipPoly: MAX_CLIP_VERTS");
+	const int max_clip_verts = nump + 2;
+	const int on_heap = max_clip_verts > MAX_CLIP_VERTS;
+	int	*sides;
+	float	*dists;
+	vec3_t	*newv_0;
+	vec3_t	*newv_1;
+
 	if (stage == 6) // fully clipped
 	{
 		Sky_ProjectPoly (nump, vecs);
@@ -511,6 +513,10 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
 
 	front = back = false;
 	norm = skyclip[stage];
+
+	sides = (int *) (on_heap ? malloc(max_clip_verts * sizeof(int)) : alloca(max_clip_verts * sizeof(int)));
+	dists = (float *) (on_heap ? malloc(max_clip_verts * sizeof(float)) : alloca(max_clip_verts * sizeof(float)));
+
 	for (i=0, v = vecs ; i<nump ; i++, v+=3)
 	{
 		d = DotProduct (v, norm);
@@ -532,6 +538,10 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
 	if (!front || !back)
 	{	// not clipped
 		Sky_ClipPoly (nump, vecs, stage+1);
+		if (on_heap) {
+			free(dists);
+			free(sides);
+		}
 		return;
 	}
 
@@ -541,22 +551,26 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
 	VectorCopy (vecs, (vecs+(i*3)) );
 	newc[0] = newc[1] = 0;
 
+	// 2-dim vec3_t	 newv[2][MAX_CLIP_VERTS]; as 2 arrays
+	newv_0 = (vec3_t *) (on_heap ? malloc(max_clip_verts * sizeof(vec3_t)) : alloca(max_clip_verts * sizeof(vec3_t)));
+	newv_1 = (vec3_t *) (on_heap ? malloc(max_clip_verts * sizeof(vec3_t)) : alloca(max_clip_verts * sizeof(vec3_t)));
+
 	for (i=0, v = vecs ; i<nump ; i++, v+=3)
 	{
 		switch (sides[i])
 		{
 		case SIDE_FRONT:
-			VectorCopy (v, newv[0][newc[0]]);
+			VectorCopy (v, newv_0[newc[0]]);
 			newc[0]++;
 			break;
 		case SIDE_BACK:
-			VectorCopy (v, newv[1][newc[1]]);
+			VectorCopy (v, newv_1[newc[1]]);
 			newc[1]++;
 			break;
 		case SIDE_ON:
-			VectorCopy (v, newv[0][newc[0]]);
+			VectorCopy (v, newv_0[newc[0]]);
 			newc[0]++;
-			VectorCopy (v, newv[1][newc[1]]);
+			VectorCopy (v, newv_1[newc[1]]);
 			newc[1]++;
 			break;
 		}
@@ -568,16 +582,24 @@ static void Sky_ClipPoly (int nump, vec3_t vecs, int stage)
 		for (j=0 ; j<3 ; j++)
 		{
 			e = v[j] + d*(v[j+3] - v[j]);
-			newv[0][newc[0]][j] = e;
-			newv[1][newc[1]][j] = e;
+			newv_0[newc[0]][j] = e;
+			newv_1[newc[1]][j] = e;
 		}
 		newc[0]++;
 		newc[1]++;
 	}
 
 	// continue
-	Sky_ClipPoly (newc[0], newv[0][0], stage+1);
-	Sky_ClipPoly (newc[1], newv[1][0], stage+1);
+	Sky_ClipPoly (newc[0], newv_0[0], stage+1);
+	Sky_ClipPoly (newc[1], newv_1[0], stage+1);
+
+	if (on_heap)
+	{
+		free(dists);
+		free(sides);
+		free(newv_0);
+		free(newv_1);
+	}
 }
 
 /*
@@ -587,9 +609,6 @@ Sky_ProcessPoly
 */
 void Sky_ProcessPoly (glpoly_t	*p)
 {
-	int			i;
-	vec3_t		verts[MAX_CLIP_VERTS];
-
 	//draw it
 	DrawGLPoly(p);
 	rs_brushpasses++;
@@ -597,9 +616,20 @@ void Sky_ProcessPoly (glpoly_t	*p)
 	//update sky bounds
 	if (!r_fastsky.value)
 	{
-		for (i=0 ; i<p->numverts ; i++)
+		const int max_clip_verts = p->numverts + 2;
+		const int num_verts = p->numverts;
+		const int on_heap = max_clip_verts > MAX_CLIP_VERTS;
+		vec3_t *verts = (vec3_t *) (on_heap ?
+				 malloc(max_clip_verts * sizeof(vec3_t)) :
+				 alloca(max_clip_verts * sizeof(vec3_t)));
+		int i = 0;
+
+		for ( ; i < num_verts; i++) {
 			VectorSubtract (p->verts[i], r_origin, verts[i]);
-		Sky_ClipPoly (p->numverts, verts[0], 0);
+		}
+		Sky_ClipPoly (num_verts, verts[0], 0);
+
+		if (on_heap) free(verts);
 	}
 }

sezero avatar Jul 25 '24 11:07 sezero

OK, as far as I can see, that + 2 needs to be taken into account back in Sky_ProcessPoly, i.e. if I allocate +2 more elements in Sky_ProcessPoly it seems to work smoothly. The relevant change for vkquake should be like the following -- @vsonnier: do you concur?

diff --git a/Quake/gl_sky.c b/Quake/gl_sky.c
index b49c60e..272548b 100644
--- a/Quake/gl_sky.c
+++ b/Quake/gl_sky.c
@@ -681,16 +681,17 @@ void Sky_ProcessPoly (cb_context_t *cbx, glpoly_t *p, float color[3])
 	// update sky bounds
 	if (need_bounds)
 	{
-		const size_t MAX_CLIP_VERTS = p->numverts;
+		const size_t MAX_CLIP_VERTS = p->numverts + 2;
+		const int num_verts = p->numverts;
 
 		TEMP_ALLOC (vec3_t, verts, MAX_CLIP_VERTS);
 
-		for (size_t i = 0; i < MAX_CLIP_VERTS; i++)
+		for (int i = 0; i < num_verts; i++)
 		{
 			poly_vert = &p->verts[0][0] + (i * VERTEXSIZE);
 			VectorSubtract (poly_vert, r_origin, verts[i]);
 		}
-		Sky_ClipPoly (MAX_CLIP_VERTS, verts[0], 0);
+		Sky_ClipPoly (num_verts, verts[0], 0);
 
 		TEMP_FREE (verts);
 	} 

sezero avatar Jul 25 '24 12:07 sezero

Thanks @sezero for your thorough investigation. At first glance your change looks inconsistent, why making verts bigger than num_verts while the iteration looks like:

for (size_t i = 0; i < num_verts; i++)
{
...
	VectorSubtract (poly_vert, r_origin, verts[i]);
}

Well the reason is that VectorSubtract is actually a macro expanding to: (thank you Visual Studio)

do {
    (verts[i])[0] = (poly_vert)[0] - (r_origin)[0]; (verts[i])[1] = (poly_vert)[1] - (r_origin)[1]; (verts[i])[2] = (poly_vert)[2] - (r_origin)[2];
} while (0)

So the farther access of verts array is actually (verts[num_elements])[2] which the compiler happily take and interpret as verts[num_elements + 2] apparently ????

My C is not good enough to see how this could be legal or not. Does the compiler even issue a warning ?

And this explains those magical +2 in the rest of Sky_ClipPoly, this is because the usage of Vector-like macros.

vsonnier avatar Jul 25 '24 17:07 vsonnier

The problem appears not in Sky_ProcessPoly, but Sky_ClipPoly where it does VectorCopy (vecs, (vecs+(i*3)) ); which results in a buffer overrun: I noticed that by running qs under valgrind.

sezero avatar Jul 25 '24 17:07 sezero

I just applied the gl_sky3.patch from above to qs as e35ecad1847de7b2d3382ce180c145015cc1e379

@vsonnier: Waiting for your further comments / replies to apply the vkquake patch and close this.

sezero avatar Jul 25 '24 18:07 sezero

Damn maybe I'm dense, but how that would even work ?

Sky_ClipPoly (int nump, vec3_t vecs, int stage) {

//so `vecs` is just a parameter, passed-by-copy...
//...

for (i = 0, v = vecs; i < nump; i++, v += 3)
{
...
}
// at that point, i = nump ?
// and so then later so we finally have :
//...
VectorCopy (vecs, (vecs + (nump * 3))); // how (vecs + (nump * 3))[n] is even a valid access, for any n ?

vsonnier avatar Jul 25 '24 18:07 vsonnier

The vecs parameter to Sky_ClipPoly is vec3_t and compiler takes that as float *, that's why it works and that's why access in VectorCopy call is not (vecs + (nump * 3))[n] but just (vecs + (nump * 3))

sezero avatar Jul 25 '24 18:07 sezero

Maybe we should just change Sky_ClipPoly to take vecs as float * instead of vec3_t to avoid confusions?

sezero avatar Jul 25 '24 18:07 sezero

Not fully C-aware humans like me (I'm an Ada dev, imagine my trauma :)) would love this, now an appropriate coment in the code like you just explained to me is probably enough. And the QS code will stay close of the other ports that way. Your choice.

vsonnier avatar Jul 25 '24 18:07 vsonnier

OK, applied the +2 change to vkquake. As for any documentation / other changes or improvements: leaving to you -- I will probably port back to qs.

Closing this as fixed now.

sezero avatar Jul 25 '24 18:07 sezero

This is great, thank you for your help and patience !

vsonnier avatar Jul 25 '24 18:07 vsonnier

This is great, thank you for your help and patience !

Well, my thanks to you: I based the qs patch on tour work

sezero avatar Jul 25 '24 18:07 sezero