libtess2 icon indicating copy to clipboard operation
libtess2 copied to clipboard

Crash where winding number invariant is broken

Open sfreilich opened this issue 7 months ago • 10 comments

I'm trying to dig into a crash with this repro case:

TEST_F(Libtess2Test, CrashRepro) {
  AddPolyline(tess, {{{1.f, 0.854620099f},
                      {0.f, -1.f},
                      {1.f, -3.f},
                      {-0.5f, 0.f},
                      {0.301400006f, -1.f},
                      {-1.f, 0.784999669f},
                      {-1.f, 1.f},
                      {1.f, -3.f},
                      {2.15526676f, 1.f},
                      {-0.f, 0.583195567f},
                      {-1.f, -1.f},
                      {-0.f, 0.f},
                      {1.f, 0.f},
                      {-1.f, -1.f},
                      {-0.f, 1.f},
                      {0.5f, 0.265409946f},
                      {0.f, 0.f},
                      {-1.f, -0.f},
                      {1.f, 0.454741627f},
                      {-1.f, 1.f},
                      {1.f, 2.56512165f},
                      {1.f, 1.f},
                      {1.f, 1.f},
                      {0.893085241f, -0.f},
                      {1.f, 0.186138511f},
                      {-0.f, 1.f},
                      {1.f, 0.f},
                      {0.637948692f, -2.65960741f},
                      {-1.f, -0.0874727592f},
                      {0.853441715f, -1.f},
                      {1.f, 1.f},
                      {-1.f, 0.f},
                      {0.5f, -2.98471045f},
                      {-0.f, 0.f},
                      {3.f, 0.34419331f},
                      {1.f, 0.140874222f},
                      {1.f, 0.f},
                      {1.f, 2.f}}});
  EXPECT_EQ(tessTesselate(tess, TESS_WINDING_POSITIVE, TESS_POLYGONS,
                          kNumTriangleVertices, kComponentCount, nullptr),
            0);
}

This is surely a degenerate squiggle, but the size of the example and domain of the values seems very reasonable.

On fastbuild this fails with:

libtess2_test: Source/sweep.c:391: AddRightEdges: Assertion `regPrev->windingNumber - e->winding == reg->windingNumber' failed.

In opt, this crashes. I currently don't know how to get better stacktraces from Googletest in this build. Downstream under ASAN it gets a bit of a stacktrace, about something invalid happening in KillFace, but it doesn't give me a lot of insight where.

==1441441==ERROR: AddressSanitizer: ILL on unknown address 0x5b13b0b01336 (pc 0x5b13b0b01336 bp 0x7ffcbdcf1910 sp 0x7ffcbdcf1890 T0)
SCARINESS: 10 (signal)
    #0 0x5b13b0b01336 in KillFace third_party/libtess2/mesh.c:0
    #1 0x5b13b0b01336 in tessMeshConnect third_party/libtess2/mesh.c:499:3

sfreilich avatar May 05 '25 18:05 sfreilich

It usually helps me to plot the data so that you have a visual guide how things are laid out. Another thing to figure out is to see at which stage the issue happens. For example, have we processed some overlapping edges already.

My gut feeling would be to check what results we get from tesedgeIntersect(), or tesedgeEval(). There's check for gapL + gapR > 0 but what happens if that sum is really really small? Some logging goes a long way.

There's also degenerate edge in the input, which might be the issue:
{1.f, 1.f}, {1.f, 1.f},

If that is the case, then it might make sense to mark that as invalid input too (or skip such edges).

memononen avatar May 05 '25 20:05 memononen

I'll focus on that first. Skipping zero-length edges seems like it would make sense, but is there some strong expectation that every input vertex is used?

sfreilich avatar May 05 '25 22:05 sfreilich

Removing that one zero-length edge doesn't seem to fix the crash / assert failure here, though.

I was able to cut the repro down to:

                      {0.f, -1.f},
                      {1.f, -3.f},
                      {-0.5f, 0.f},
                      {0.301400006f, -1.f},
                      {-1.f, 0.784999669f},
                      {-1.f, 1.f},
                      {1.f, -3.f},
                      {2.15526676f, 1.f},
                      {-0.f, 0.583195567f},
                      {-1.f, -1.f},
                      {-0.f, 0.f},
                      {1.f, 0.f},
                      {-1.f, -1.f},
                      {-0.f, 1.f},
                      {0.5f, 0.265409946f},
                      {0.f, 0.f},
                      {-1.f, -0.f},
                      {1.f, 0.454741627f},
                      {-1.f, 1.f},
                      {1.f, 2.56512165f},
                      {1.f, 1.f},
                      {0.893085241f, -0.f},
                      {1.f, 0.186138511f},
                      {-0.f, 1.f},
                      {1.f, 0.f},
                      {0.637948692f, -2.65960741f},
                      {-1.f, -0.0874727592f},
                      {0.853441715f, -1.f},
                      {1.f, 1.f},
                      {-1.f, 0.f},
                      {0.5f, -2.98471045f},
                      {-0.f, 0.f},
                      {3.f, 0.34419331f},
                      {1.f, 0.140874222f},
                      {1.f, 0.f},

sfreilich avatar May 05 '25 22:05 sfreilich

The data looks like this:

Image

As evident from the data too, there's bunch of traffic on the ones and zeros.

I recommend to check/dump the results of the tesedgeIntersect(). There's some comments next to it what the output is expected to be.

memononen avatar May 06 '25 05:05 memononen

There are a number of things in that shape that can be problematic. One is that the shape visits the same vertices. Another one is edges that intersect at another vertex (yellow below), and consecutive edges that have the same direction (green).

Those cases usually mess up the checks where we try to see how the edges or vertices relate to each other.

Image

If we end up skipping input data we need to make sure that the vertexIndexCounter is updated per the input. That is used to associate the input vertices to the ones that we output. It is fine to skip a vertex (it can happen also when we do poly booleans), but we cannot mess up that order.

memononen avatar May 06 '25 06:05 memononen

I'm trying to better understand the computing of winding numbers in AddRightEdges and getting a bit stuck:

https://github.com/memononen/libtess2/blob/9a450cc9e5b4b79c36b89648f8b92fe65b6dd407/Source/sweep.c#L328

{
        // Initialization ...
	int firstTime = TRUE;

	// Loop until e = eLast...
	regPrev = regUp;
	ePrev = eTopLeft;
	for( ;; ) {
		reg = RegionBelow( regPrev );
		e = reg->eUp->Sym;
		if( e->Org != ePrev->Org ) break;

		if( e->Onext != ePrev ) {
			// Unlink e from its current position, and relink below ePrev ...
		}
		reg->windingNumber = regPrev->windingNumber - e->winding;
		if( ! firstTime && CheckForRightSplice( tess, regPrev )) {
			AddWinding( e, ePrev );
		        // ...
		}
		firstTime = FALSE;
		regPrev = reg;
		ePrev = e;
	}
	assert( regPrev->windingNumber - e->winding == reg->windingNumber );
}

I would expect that assertion to hold given how things are set in the loop. The only case where that shouldn't happen is if the loop exits immediately, which is the case if RegionBelow( regUp)->eUp->Sym->Org != eTopLeft->Org. In that case, the invariant is equivalent to:

regUp->windingNumber - eLast->winding == RegionBelow(regUp)->windingNumber

Is that supposed to be the case? What does it mean when there's only one edge with that origin?

sfreilich avatar May 14 '25 20:05 sfreilich

Unfortunately it's been over a decade since I understood most of the code, the algorithm.md gives a nice outline how things should work.

I think it would be useful to figure out which point or edge in the picture is causing the issue.

One potential issue are the overlapping points. Maybe the sweeping algo get's confused? I wonder if that issue manifests on simpler shapes, like an hourglass:

// Shares middle point, drawn like figure 8
0,0,
2,0,
1,1,
0,2,
2,2,
1,1,

// Shares middle point, but draw as "outline"
0,0,
2,0,
1,1,
2,2,
0,2,
1,1,

memononen avatar May 14 '25 22:05 memononen

Another interesting thing about this example is that it has origin coordinates with different signs for the 0. If the 0, 0 coordinate is removed (leaving the pair of (-0, 0) coordinates), the crash doesn't happen.

The failure doesn't happen at an edge containing that vertex, though:

Winding number mismatch: regPrev->windingNumber=0, e->winding=1, reg->windingNumber=0
  e->Org=(-0.200000, -0.600000), e->Dst=(1.000000, -3.000000)
  regPrev->eUp->Org=(1.000000, -3.000000), regPrev->eUp->Dst=(0.251926, -1.503852)
  reg->eUp->Org=(1.000000, -3.000000), reg->eUp->Dst=(-0.200000, -0.600000)

(Those pairs are coords[0] and coords[1] for those vertices.)

sfreilich avatar May 27 '25 19:05 sfreilich

Neither of the simpler examples you suggest manifest the same crash.

sfreilich avatar May 27 '25 19:05 sfreilich

Changing the origin vertices to match one another in sign (in either direction) doesn't get rid of the crash, though, so maybe it's just that being part of the colinear segments, and nothing to do with the sign of the zeros.

sfreilich avatar May 27 '25 19:05 sfreilich