apriltag icon indicating copy to clipboard operation
apriltag copied to clipboard

[Bug] Visual Studio detects stack corruption in apriltag.c homography_compute2

Open ScottJohnson2718 opened this issue 5 years ago • 5 comments

Visual Studio in Debug is telling me that the stack is getting corrupted in apriltag.c in homography_compute2(). See lines 435, 436.

double max_val = 0; int max_val_idx = -1;

I confirm in the debugger that the invalid index (-1) for max_val_idx survives to be used as an index later in the function. So I simply changed the initial value of max_val to be -1.0 and the problem went away because the first iteration of the loop then assigns both values and max_val_idx is no longer -1.

ScottJohnson2718 avatar Feb 07 '20 21:02 ScottJohnson2718

I encountered the same issue.

Will there be an update of the code that fixes the problem? Can I use the fix mentioned above in the mean time?

hartter avatar May 03 '21 14:05 hartter

@hartter

Could you attach an image saved in a lossless format (.png) to reproduce the issue?

s-trinh avatar May 03 '21 17:05 s-trinh

Unfortunately I was not able to save the image where this happend (because the process crashed). Never the less I'd recommend to implement a check, if max_val_idx is -1. This worked for me:

matd_t* homography_compute2(double c[4][4]) 
{
    	...
	if (max_val < epsilon) {
		fprintf(stderr, "WRN: Matrix is singular.\n");
	}

	if (max_val_idx == -1) {
		fprintf(stderr, "ERR: Invalid max_val_idx.\n");
		return NULL;
	}
        ...
}

int quad_update_homographies(struct quad* quad)
{
    	...

	// XXX Tunable
	quad->H = homography_compute2(corr_arr);

	if (quad->H)
	{
		quad->Hinv = matd_inverse(quad->H);
	}
    	...
}

hartter avatar May 05 '21 07:05 hartter

@hartter

Could you attach an image saved in a lossless format (.png) to reproduce the issue?

I originally used TIF but I guess the error will also show up in the PNGs

Image0 Image1

hartter avatar May 19 '21 08:05 hartter

I was able to reproduce this locally, use image magic to convert to pnm convert bad-homography.png bad-homography.pnm then make sure to use the same parameters as python detect apriltag_demo bad-homography.pnm -x 1

The problem as described above is the max index remaining -1 resulting in writes outside of the valid boundary of A later in the function.

Using a value of -1 is obviously wrong especially when it's multiplied by 9 - this is why we're hitting the guard bytes in stack-protected binaries. However, I'm not sure what brings the algorithm to a better solution yet. The other trap door to watch out for is the DIV later by A. That is what this check is warning us about. I'm doing some reading and experimenting and will propose a PR that at least prevents both crash scenarios.

jrepp avatar Feb 23 '22 20:02 jrepp

I converted the graphics above (https://user-images.githubusercontent.com/83595887/118779876-c8936580-b88b-11eb-947b-48d38a0d3840.png) via pngtopnm to the PNM format and can reproduce the issue.

Converted PNM image: 118779876-c8936580-b88b-11eb-947b-48d38a0d3840.zip

./apriltag_demo -x 1 118779876-c8936580-b88b-11eb-947b-48d38a0d3840.pnm

With AddressSanitizer enabled (-D ASAN=ON), this will show:

apriltag/apriltag_quad_thresh.c:269:25: runtime error: division by zero
apriltag/apriltag_quad_thresh.c:270:25: runtime error: division by zero
apriltag/apriltag.c:761:37: runtime error: -nan is outside the range of representable values of type 'int'
apriltag/apriltag.c:801:29: runtime error: -nan is outside the range of representable values of type 'int'
apriltag/apriltag.c:802:29: runtime error: -nan is outside the range of representable values of type 'int'
apriltag/apriltag.c:844:24: runtime error: division by zero
apriltag/apriltag.c:844:37: runtime error: division by zero
apriltag/apriltag.c:845:26: runtime error: division by zero
apriltag/apriltag.c:846:26: runtime error: division by zero
apriltag/apriltag.c:847:26: runtime error: division by zero

christian-rauch avatar Feb 04 '24 12:02 christian-rauch

👍

jrepp avatar Feb 05 '24 06:02 jrepp

I confirm in the debugger that the invalid index (-1) for max_val_idx survives to be used as an index later in the function. So I simply changed the initial value of max_val to be -1.0 and the problem went away because the first iteration of the loop then assigns both values and max_val_idx is no longer -1.

In the example data by @hartter, this was caused by invalid values in c and setting double max_val = -1; did not prevent max_val_idx from keeping its -1 value. This is fixed by proper checks for division by zero so that all values in c are now finite.

Another case that could lead to a negative max_val_idx is when all values of c are 0. Not sure under which conditions this can happen. I added an assert there to check that max_val_idx is not negative.

christian-rauch avatar Feb 05 '24 20:02 christian-rauch