dataloaden icon indicating copy to clipboard operation
dataloaden copied to clipboard

Pass args to dataloader

Open NickDubelman opened this issue 6 years ago • 11 comments

What is the best approach if I need to pass arguments to my dataloader?

I've seen 2 issues on here where people suggest using anonymous structs but that feels pretty janky for a few reasons.

I could also see a solution of just throwing it into context but that also doesn't feel right because we lose type safety.

Is there a better officially supported approach?

NickDubelman avatar Nov 09 '19 00:11 NickDubelman

Ahhh I think I figured out the ideal way to do this.

The basic concept is that when you retrieve the loader, you use a function that binds whatever args you need to the loader that gets created.

Here is my basic setup in case anyone else can benefit from this pattern:

type loaders struct {
        // regular loaders that dont need args
	FamilyByID        *generated.FamilyLoader 
	BugsByProducts    *generated.BugSliceLoader

       // loader that needs args
	BuildsByProjectID func(start, end *time.Time) *generated.BuildSliceLoader
}

func LoaderMiddleware(next http.Handler) gin.HandlerFunc {
	return gin.WrapF(
		http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			ldrs := loaders{}

			// set this to zero what happens without dataloading
			wait := 20 * time.Millisecond

			ldrs.FamilyByID = getFamilyByIDLoader(wait)
			ldrs.BugsByProducts = getBugsByProductsLoader(wait)

			ldrs.BuildsByProjectID = func(
				start, end *time.Time,
			) *generated.BuildSliceLoader {
				return getBuildsByProjectIDLoader(wait, start, end)
			}

			dlCtx := context.WithValue(r.Context(), ctxKey, ldrs)
			next.ServeHTTP(w, r.WithContext(dlCtx))
		}))
}

func getBuildsByProjectIDLoader(
	wait time.Duration,
	start, end *time.Time,
) *generated.BuildSliceLoader {
	return generated.NewBuildSliceLoader(
		generated.BuildSliceLoaderConfig{
			Wait:     wait,
			MaxBatch: 100,
			Fetch: func(projectIDs []int) ([][]models.Build, []error) {
				builds, err := getBuildsBatch(projectIDs, start, end)
				if err != nil {
					return nil, []error{err}
				}
				return builds, nil
			},
		},
	)
}

// GetLoadersFromContext gets the loaders object for the given context
func GetLoadersFromContext(ctx context.Context) loaders {
	return ctx.Value(ctxKey).(loaders)
}

then to pass the args:

func (r *systemResolver) Builds(
	ctx context.Context,
	obj *gqlmodels.System,
	start, end *time.Time,
) (*gqlmodels.BuildsConnection, error) {
	builds, err := dataloaders.
		GetLoadersFromContext(ctx).BuildsByProjectID(start, end).Load(obj.ID)

	ret := &gqlmodels.BuildsConnection{PageInfo: &gqlmodels.PageInfo{}}
	for i, b := range builds {
		ret.Edges = append(ret.Edges, &gqlmodels.BuildsEdge{
			Cursor: gqlmodels.EncodeCursor(i),
			Node:   &gqlmodels.Build{Build: b},
		})
	}
	return ret, err
}

NickDubelman avatar Nov 09 '19 00:11 NickDubelman

Oops, nevermind. With the approach I had, you create a new data loader each time which defeats the point

NickDubelman avatar Nov 09 '19 00:11 NickDubelman

I think you can create a named struct in the same package and use that as your key type?

vektah avatar Nov 09 '19 01:11 vektah

Yeah but even if you can, that feels wrong because the parameters are for the entire request, not for each individual key. You could just use the 0th key's params but still doesn't seem like the right way to do it. I was able to get my above solution to work by doing something like this:

func LoaderMiddleware(next http.Handler) gin.HandlerFunc {
	return gin.WrapF(
		http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			ldrs := loaders{}

			// set this to zero what happens without dataloading
			wait := 50 * time.Millisecond

			ldrs.FamilyByID = getFamilyByIDLoader(wait)
			ldrs.BugsByProducts = getBugsByProductsLoader(wait)
			ldrs.BuildsByProjectID = func(
				start, end *time.Time,
			) *generated.BuildSliceLoader {
				// Check to see if we have already created a loader with the given
				// start and end values for the current request context
				key := ctxKeyType{fmt.Sprintf("BuildsByProjectID-%v-%v", start, end)}

				ldr := r.Context().Value(key)
				if ldr != nil {
					return ldr.(*generated.BuildSliceLoader)
				}

				ldr = getBuildsByProjectIDLoader(wait, start, end)
				r = r.WithContext(context.WithValue(r.Context(), key, ldr))
				return ldr.(*generated.BuildSliceLoader)
			}

			dlCtx := context.WithValue(r.Context(), ctxKey, ldrs)
			next.ServeHTTP(w, r.WithContext(dlCtx))
		}))
}

I think the ideal solution would be for the CLI to include a flag, like -arg (can use it multiple times), that would generate a dataloader with a fetch function signature that includes the passed args (as opposed to just keys). All of the generated functions (like Load()) would include this arg information.

Does this seem reasonable and feasible? If you think so, I could take a stab at it.

NickDubelman avatar Nov 09 '19 01:11 NickDubelman

I don't think that anyone should want to pass in every data source a loader needs with every use of the loader. That should be one time wiring that's done at the same time and place where the context is bound to the loader: per request. It's unfortunate that the only way to pass the loaders into http handlers is through the context object. The context is the only per-request part of http loaders. With gqlgen, resolvers are created per-request, which makes them a natural place to attach loaders.

vikstrous avatar Dec 11 '19 13:12 vikstrous

I’m not sure what you mean by “every data source a loader needs.” The situation I am describing is as simple as “trying to data load something that needs arguments.” For example, imagine you have a field that you want to dataload, but that field has arguments that it needs in order to resolve. Like a user has events and you want to get all the events between now and next month.

Without even considering the feasibility, I think the most desirable dev experience would be for this CLI to allow the user to pass N “-arg flags” and the generated code will include these args.

NickDubelman avatar Dec 11 '19 16:12 NickDubelman

Ah, I misunderstood the use case. I'm not sure if I fully understand yet though. Loaders are designed to be used when there's a known set of keys to query by. A query like "all events between x and y" is not a fixed data set. Trying to use a loader for that would break caching in the current implementation. I'm trying to wrap my head around the idea of using a loader-like pattern for more complex queries. Do you have some ideas for how caching and deduplication of requests might work in the general case? Or maybe in your specific case of a range query?

vikstrous avatar Dec 11 '19 17:12 vikstrous

Being able to pass additional args to the Load() function would enable support for paging or filtering.

@NickDubelman Your suggestion:

Without even considering the feasibility, I think the most desirable dev experience would be for this CLI to allow the user to pass N “-arg flags” and the generated code will include these args.

would certainly help my situation

dcwangmit01 avatar Oct 13 '20 22:10 dcwangmit01

Any updates for this?

Currently I've been doing this, but it's kinda hacky:

go run github.com/vektah/dataloaden ValueLoader 'interface{}' 'modulename/pkg/models.Value'

As interface I use a object with key and custom parameters

lacendarko avatar Jun 14 '21 13:06 lacendarko

Here is my solution for anyone who is still looking (too lazy to make a merge request)

NOTE: didn't exhaustively test or think about this, if there are certain conditions where issues such as race conditions could arise, my bad

EDIT this sample code is not example of the use case, a better use case would be where each key is expected to fetch an array on things, and this array might have a desirable filter tied to it

All this code lives in the dataloader package in my case, keep this in mind why this member capitalization works.

changes i made to the generated code (user.generated.go) are to:

  • The LoaderConfig to allow for Fetch function to accept arguments map[string]interface{} in its arguments
  • The Loader struct to also allow the fetch function to accept arguments map[string]interface{} in its arguments
  • The Loader struct to have a member called arguments map[string]interface{} to handle custom dataloader arguments
  • The end function to send to arguments to the fetch function

// UserLoaderConfig captures the config to create a new UserLoader
type UserLoaderConfig struct {
	// Fetch is a method that provides the data for the loader
	Fetch func(keys []string, arguments map[string]interface{}) ([]*model.User, []error)

	...
}

...

// UserLoader batches and caches requests
type UserLoader struct {
	// this method provides the data for the loader
	fetch func(keys []string, arguments map[string]interface{}) ([]*model.User, []error)

	// how long to done before sending a batch
	wait time.Duration

	// this will limit the maximum number of keys to send in one batch, 0 = no limit
	maxBatch int

	// add arguments to apply to dataloader
	arguments map[string]interface{}

	...
}

...

func (b *userLoaderBatch) end(l *UserLoader) {
	b.data, b.error = l.fetch(b.keys, l.arguments)
	close(b.done)
}

dataloader.go

const LoadersKey = "dataloaders"

type Loaders struct {
    Map map[string]interface{} // contains dataloaders by name
}

func For(ctx context.Context) *Loaders {
    // gets Loaders from context
}

user.loader.go

// fetches a specific dataloader from the Loaders context  
func (l UserLoader) UserLoader (arguments map[string]interface{}) (*UserLoader) {
    if value, ok := l.Map["UserLoader"]; ok {

        // apply arguments
        loader := value.(*GamePlatformLoader)
        loader.arguments = arguments

        return loader
    }

    // create new loader if doesn't exist yet
    loader := UserLoader{
        wait: 2 * time.Millisecond,
        maxBatch: 100,
        fetch: func (keys []string, arguments map[string]interface{}) ([]*model.User, []error) {
            // do your data loading shit here
        },
    }

    // apply arguments to loader
    loader.arguments = arguments

    l.Map["UserLoader"] = &loader

    return &loader
}

this code can then be used externally very similar to the original package as such:

import "dataloader"

...

dataloader.For(context).UserLoader(arguments).Load(userID)

ignore that mention vektah, apologies, the error is resolved with a small change how arguments are applied to the Loader

kim-hetrafi avatar Sep 07 '21 09:09 kim-hetrafi

You could make the key a different struct all together

//go:generate go run github.com/vektah/dataloaden UserSliceLoader  *github.com/vektah/dataloaden/example.UserParams []*github.com/vektah/dataloaden/example.User
import "dataloader"

...
params := &example.UserParams{/* add whatever args you want */}
dataloader.For(context).UserSliceLoader.Load(params)

ericraio avatar Dec 28 '21 16:12 ericraio