go-mysql-server icon indicating copy to clipboard operation
go-mysql-server copied to clipboard

proposal: Refactor Indexed Table Access

Open andy-wm-arthur opened this issue 1 year ago • 5 comments

This is a design proposal to refactor the way indexed table access works and what abstractions are exposed to Integrators:

The core change comes to sql.IndexLookup which will now be a concrete type constructed by the engine, not integrators:

-type IndexLookup interface {
-	fmt.Stringer
+type IndexLookup struct {
+	Str string
 	// Index returns the index that created this IndexLookup.
-	Index() Index
+	Index Index
 	// Ranges returns each Range that created this IndexLookup.
-	Ranges() RangeCollection
+	Ranges RangeCollection
 }

This implies removing sql.Index.NewLookup, which will be replaced by an integrator method to indicate whether a given index can support a given lookup:

type Index interface {
	// ID returns the identifier of the index.
	ID() string
	// Database returns the database name this index belongs to.
	Database() string
	// Table returns the table name this index belongs to.
	Table() string
	// Expressions returns the indexed expressions. If the result is more than
	// one expression, it means the index has multiple columns indexed. If it's
	// just one, it means it may be an expression or a column.
	Expressions() []string
	// IsUnique returns whether this index is unique
	IsUnique() bool
	// Comment returns the comment for this index
	Comment() string
	// IndexType returns the type of this index, e.g. BTREE
	IndexType() string
	// IsGenerated returns whether this index was generated. Generated indexes
	// are used for index access, but are not displayed (such as with SHOW INDEXES).
	IsGenerated() bool
-	// NewLookup returns a new IndexLookup for the ranges given. Ranges represent filters over columns. Each Range
-	// is ordered by the column expressions (as returned by Expressions) with the RangeColumnExpr representing the
-	// searchable area for each column expression. Each Range given will not overlap with any other ranges. Additionally,
-	// all ranges will have the same length, and may represent a partial index (matching a prefix rather than the entire
-	// index). If an integrator is unable to process the given ranges, then a nil may be returned. An error should be
-	// returned only in the event that an error occurred.
-	NewLookup(ctx *Context, ranges ...Range) (IndexLookup, error)
+	// SupportsLookup returns true if the Index supports |lookup|.
+	SupportsLookup(ctx *Context, lookup IndexLookup) (bool, error)
	// ColumnExpressionTypes returns each expression and its associated Type. Each expression string should exactly
	// match the string returned from Index.Expressions().
	ColumnExpressionTypes(ctx *Context) []ColumnExpressionType
}

Finally, IndexedTable and IndexAddressableTable collapse into a single interface that exposed a set of Index definitions, and an AccessIndex() method that asks for a RowIter given an IndexLookup:

diff --git a/sql/core.go b/sql/core.go
index d0b3ba54..8ed56b1c 100644
--- a/sql/core.go
+++ b/sql/core.go
@@ -430,28 +430,14 @@ type IndexColumn struct {
 // speed up execution of queries that reference those columns. Unlike DriverIndexableTable, IndexedTable doesn't need a
 // separate index driver to function.
 type IndexedTable interface {
-	IndexAddressableTable
 	// GetIndexes returns all indexes on this table.
 	GetIndexes(ctx *Context) ([]Index, error)
-}
-
-// IndexAddressable provides a Table that has its row iteration restricted to only the rows that match the given index
-// lookup.
-type IndexAddressable interface {
-	// WithIndexLookup returns a version of the table that will return only the rows specified by the given IndexLookup,
-	// which was in turn created by a call to Index.Get() for a set of keys for this table.
-	WithIndexLookup(IndexLookup) Table
-}
-
-// IndexAddressableTable is a table that can restrict its row iteration to only the rows that match the given index
-// lookup.
-type IndexAddressableTable interface {
-	Table
-	IndexAddressable
+	// AccessIndex constructs a RowIter from an IndexLookup.
+	AccessIndex(ctx *Context, lookup IndexLookup) (RowIter, error)
 }

andy-wm-arthur avatar Aug 04 '22 23:08 andy-wm-arthur

@max-hoffman @zachmu @Hydrocharged @reltuk interested to hear your thoughts

andy-wm-arthur avatar Aug 04 '22 23:08 andy-wm-arthur

I'm in favor of this.

IndexLookup shouldn't have Str, it should still implement fmt.Stringer

IndexedTable should either embed sql.Table, or else just become IndexAddressable

AccessIndex should be IndexedRowIter

zachmu avatar Aug 04 '22 23:08 zachmu

I’m also in favor of this. Zach’s additional comments also seem reasonable, and I have no additional comments myself.

Hydrocharged avatar Aug 04 '22 23:08 Hydrocharged

My only cons are: I don't know if this will be easy to implement given our index caching and the weirdness with dolt history tables, and I have issues in my queue that are correctness and perf problems, which I would think take precedence over interface refactoring.

max-hoffman avatar Aug 04 '22 23:08 max-hoffman

In general seems reasonable.

Table currently doesn't have a RowIter() method. Seems like this should be IndexedPartitions(*Context, IndexLookup) (PartitionIter, error), and PartitionRows() gets used for both types of partitions?

reltuk avatar Aug 08 '22 23:08 reltuk

addressed in https://github.com/dolthub/go-mysql-server/pull/1183

andy-wm-arthur avatar Aug 23 '22 19:08 andy-wm-arthur