hnsw icon indicating copy to clipboard operation
hnsw copied to clipboard

Question - Is the test case correct?

Open asheshvidyut opened this issue 10 months ago • 1 comments

Hey Team, Is the following test case correct?

func TestGraph_AddSearch(t *testing.T) {
	t.Parallel()

	g := newTestGraph[int]()

	for i := 0; i < 128; i++ {
		g.Add(
			Node[int]{
				Key:   i,
				Value: Vector{float32(i)},
			},
		)
	}

	al := Analyzer[int]{Graph: g}

	// Layers should be approximately log2(128) = 7
	// Look for an approximate doubling of the number of nodes in each layer.
	require.Equal(t, []int{
		128,
		67,
		28,
		12,
		6,
		2,
		1,
		1,
	}, al.Topography())

	nearest := g.Search(
		[]float32{64.5},
		4,
	)

	require.Len(t, nearest, 4)
	require.EqualValues(
		t,
		[]Node[int]{
			{64, Vector{64}},
			{65, Vector{65}},
			{62, Vector{62}},
			{63, Vector{63}},
		},
		nearest,
	)
}

I think 66 is closer to 64.5 as compared to 62. Or am I missing something here?

asheshvidyut avatar Jan 05 '25 15:01 asheshvidyut

The responses from Search are not guaranteed to be sorted by distance. I figure that overhead is best left to the caller as the library intends on being a low-level implementation.

Also, the results are not guaranteed to the closest N matches—the structure makes some accuracy tradeoffs in the interest of performance. I have not looked at the code in a while and there may be an elegant way to improve the guarantees. It's not top of mind for me but I welcome any contributions.

ammario avatar Jan 06 '25 18:01 ammario