lxd icon indicating copy to clipboard operation
lxd copied to clipboard

lxd-generate: Supporting more handwritten queries

Open masnax opened this issue 2 years ago • 2 comments

Background

Considering the comments in #10936, and @markylaing's request for using lxd-generate to create some basic boilerplate functions that we can just pass our own statements to, I've been thinking about how we can improve the GetMany functions produced by lxd-generate.

In my opinion, the GetMany functions should do two things:

  • Reduce the amount of virtually identical code we have to duplicate by hand (copy/pasting a function and changing type/field names).
  • Prevent us from making unexpected queries by labelling arguments with the EntityFilters, and only executing the appropriate query for the non-nil fields.

I didn't want to generate a function that just takes args ...any, as that has no validation. Writing handwritten functions with Filter structs is tedious, however. So it would be nice if we could parse more types of handwritten queries and generate functions for them.

Current Behavior

Currently, lxd-generate can support some handwritten queries, if we don't generate one with the same name:

var <entity>Objects = RegisterStmt(`SELECT...`)

This will get used in the Get<Entities> function, and any lxd-generate directives of the form objects-by-Field1-And-Field2 for that entity will also append a WHERE clause to that original query as a new variable.

This has 2 pitfalls in practice:

  • Naming collisions:
    • We only accept 1 query for each filter. If two queries can use the same filter, the second one needs its own function.
  • Additional parameters not in the Filter structs:
    • If, maybe because of a JOIN, we need more arguments, we need a new function that can take those arguments.

Proposed Changes

Change the naming structure for parsed statements to be more complex, and generate more functions with thelxd-generate GetMany directive based on all statements with that naming scheme for the entity.

The variable naming scheme would be as follows:

var <entity>Objects<Label>With<Arg...>By<Filter...> = RegisterStmt(`SELECT...`)

lxd-generate would detect a variable of this name and produce a function of the form

func GetEntities<Label>With<Arg>(ctx, tx, arg1, filter) 

So that's a lot of hard to read placeholders, I'll define them and then give some examples:

  • <Label> -- This is a unique description that scopes a function and its queries, so we can have multiple statements with the same filter combinations.

  • With<Arg...> -- The And-separated Args will be added as arguments to the resulting lxd-generate function. Anything with the suffix Name, UUID, or Fingerprint will be a string, and ID would be an int. Everything else would be rejected.

Examples

Using Certificate as the entity, here are some examples. All the functions associated to each example are created at the same time with a single invocation of GetMany. Also, there would be no changes to the current generated functions.

This is an example using a <Label> but no extra arguments:

  // A handwritten statement fetching all the certificates that belong to a project.
  var certificateObjectsInAnyProject = RegisterStmt(`
  SELECT * FROM certificates
    JOIN certificates_projects ON certificate.id = certificates_projects.certificate_id
  `)

  // `InAnyProject` is used both in the function name, and all considered statements.
  func GetCertificatesInAnyProject(ctx, tx, filters ...CertificateFilter) ([]Certificate, error)

This is an example with a single extra argument (With<Arg...>), but no <Label>:

  // A handwritten statement fetching all the certificates matching the project and fingerprint.
  var certificateObjectsWithProjectIDByFingerprint = RegisterStmt(`
  SELECT * FROM certificates
    JOIN certificates_projects ON certificate.id = certificates_projects.certificate_id
    WHERE ( certificate_projects.project_id = ? AND certificates.fingerprint = ? )
  `)

  // This function is suffixed with `WithProjectID`, and and uses `projectID` in the query parameters.
  func GetCertificatesWithProjectID(ctx, tx, projectID int, filters ...CertificateFilter) ([]Certificate, error)

This example uses every option. There is a <Label>, an extra arg, and a filter:

  // A handwritten statement fetching all the server type certificates matching the project and fingerprint.
  var certificateObjectsOfTypeServerWithProjectIDByFingerprint = RegisterStmt(`
  SELECT * FROM certificates
    JOIN certificates_projects ON certificate.id = certificates_projects.certificate_id
    WHERE ( certificate_projects.project_id = ? AND certificates.type = 2 AND certificates.fingerprint = ? )
  `)

  // This function is similar to the above example, 
  // but like the `InAnyProject` example, all statements it considers must have the same label.
  func GetCertificatesOfTypeServerWithProjectID(ctx, tx, projectID int, filters ...CertificateFilter) ([]Certificate, error)

masnax avatar Sep 26 '22 05:09 masnax

@tomponline @markylaing @stgraber If any of you have some opinions on this, I would appreciate it!

There's also a draft PR #10946 that shows how this would work in practice using the Certificate struct. certificates.go and certificates.mapper.go have all the relevant changes.

masnax avatar Sep 26 '22 05:09 masnax

@masnax thanks I'll take a look at this ASAP

tomponline avatar Sep 27 '22 09:09 tomponline

  // A handwritten statement fetching all the certificates matching the project and fingerprint.
  var certificateObjectsWithProjectIDByFingerprint = RegisterStmt(`
  SELECT * FROM certificates
    JOIN certificates_projects ON certificate.id = certificates_projects.certificate_id
    WHERE ( certificate_projects.project_id = ? AND certificates.fingerprint = ? )
  `)

  // This function is suffixed with `WithProjectID`, and and uses `projectID` in the query parameters.
  func GetCertificatesWithProjectID(ctx, tx, projectID int, filters ...CertificateFilter) ([]Certificate, error)

This 2nd example lost me a bit. I couldn't figure out where the AND certificates.fingerprint = ? argument would come from? I figured the projectID int would be passed to the query as the first argument, and then the filters would be added afterwards, but I couldn't figure out where the certificates.fingerprint would be added?

tomponline avatar Oct 05 '22 14:10 tomponline

An additional thing I've often wondered about the DB generator (given its aim is to reduce manually written boiler plate code) is why do we need to write define every statement that could be needed by the Get<Entity>(ctx, tx, filters... <Entity>Filter) []<Entity> functions?

If we take instances as an example, there are many statement lines required to support all of the filter scenarios that could be used:

//go:generate mapper stmt -e instance objects
//go:generate mapper stmt -e instance objects-by-ID
//go:generate mapper stmt -e instance objects-by-Project
//go:generate mapper stmt -e instance objects-by-Project-and-Type
//go:generate mapper stmt -e instance objects-by-Project-and-Type-and-Node
//go:generate mapper stmt -e instance objects-by-Project-and-Type-and-Node-and-Name
//go:generate mapper stmt -e instance objects-by-Project-and-Type-and-Name
//go:generate mapper stmt -e instance objects-by-Project-and-Name
//go:generate mapper stmt -e instance objects-by-Project-and-Name-and-Node
//go:generate mapper stmt -e instance objects-by-Project-and-Node
//go:generate mapper stmt -e instance objects-by-Type
//go:generate mapper stmt -e instance objects-by-Type-and-Name
//go:generate mapper stmt -e instance objects-by-Type-and-Name-and-Node
//go:generate mapper stmt -e instance objects-by-Type-and-Node
//go:generate mapper stmt -e instance objects-by-Node
//go:generate mapper stmt -e instance objects-by-Node-and-Name
//go:generate mapper stmt -e instance objects-by-Name

This is left to the developer to write (and likely miss a combination).

Could we instead use the metadata on the entity return type to generate these statements automatically, so I just have to add:

//go:generate mapper method -i -e instance GetMany references=Config,Device

And from that the necessary statements are generated?

tomponline avatar Oct 05 '22 15:10 tomponline

```go
  // A handwritten statement fetching all the certificates matching the project and fingerprint.
  var certificateObjectsWithProjectIDByFingerprint = RegisterStmt(`
  SELECT * FROM certificates
    JOIN certificates_projects ON certificate.id = certificates_projects.certificate_id
    WHERE ( certificate_projects.project_id = ? AND certificates.fingerprint = ? )
  `)

  // This function is suffixed with `WithProjectID`, and and uses `projectID` in the query parameters.
  func GetCertificatesWithProjectID(ctx, tx, projectID int, filters ...CertificateFilter) ([]Certificate, error)

This 2nd example lost me a bit. I couldn't figure out where the AND certificates.fingerprint = ? argument would come from? I figured the projectID int would be passed to the query as the first argument, and then the filters would be added afterwards, but I couldn't figure out where the certificates.fingerprint would be added?

That's just part of the filter struct. That particular statement will only execute if filter.Fingerprint is not nil. The args passed to the query would be []any{projectID, filter.Fingerprint}.

masnax avatar Oct 05 '22 15:10 masnax

That particular statement will only execute if filter.Fingerprint is not nil.

So I would have to define my custom SQL statements for every possible combination of the filters?

tomponline avatar Oct 05 '22 15:10 tomponline

An additional thing I've often wondered about the DB generator (given its aim is to reduce manually written boiler plate code) is why do we need to write define every statement that could be needed by the Get<Entity>(ctx, tx, filters... <Entity>Filter) []<Entity> functions?

If we take instances as an example, there are many statement lines required to support all of the filter scenarios that could be used:

//go:generate mapper stmt -e instance objects
//go:generate mapper stmt -e instance objects-by-ID
//go:generate mapper stmt -e instance objects-by-Project
//go:generate mapper stmt -e instance objects-by-Project-and-Type
//go:generate mapper stmt -e instance objects-by-Project-and-Type-and-Node
//go:generate mapper stmt -e instance objects-by-Project-and-Type-and-Node-and-Name
//go:generate mapper stmt -e instance objects-by-Project-and-Type-and-Name
//go:generate mapper stmt -e instance objects-by-Project-and-Name
//go:generate mapper stmt -e instance objects-by-Project-and-Name-and-Node
//go:generate mapper stmt -e instance objects-by-Project-and-Node
//go:generate mapper stmt -e instance objects-by-Type
//go:generate mapper stmt -e instance objects-by-Type-and-Name
//go:generate mapper stmt -e instance objects-by-Type-and-Name-and-Node
//go:generate mapper stmt -e instance objects-by-Type-and-Node
//go:generate mapper stmt -e instance objects-by-Node
//go:generate mapper stmt -e instance objects-by-Node-and-Name
//go:generate mapper stmt -e instance objects-by-Name

This is left to the developer to write (and likely miss a combination).

Could we instead use the metadata on the entity return type to generate these statements automatically, so I just have to add:

//go:generate mapper method -i -e instance GetMany references=Config,Device

And from that the necessary statements are generated?

I actually recall suggesting this behaviour early on, but we decided against it, as we would have to produce queries for every possible filter combination of nil and non-nil fields, and some dont always make sense. (Like ~getting instances by project and type, but we actually did define this one~, I think a lot of the instance ones can be removed actually. Name and Type is also a weird one.).

So instead we decided to use the -by- declarations to specify exactly which combinations of filter fields we will actually consider, or error out.

masnax avatar Oct 05 '22 15:10 masnax

That particular statement will only execute if filter.Fingerprint is not nil.

So I would have to define my custom SQL statements for every possible combination of the filters?

No, lxd-generate is capable of parsing the non-filtered statement to produce filtered ones. So you would write certificateObjectsWithProjectID and then a generator comment with objects-by-Fingerprint would generate a new statement using your custom statement, but add a filter in the WHERE clause.

You would only have to define "irregular" statements, and the non-filtered one.

masnax avatar Oct 05 '22 15:10 masnax

No, lxd-generate is capable of parsing the non-filtered statement to produce filtered ones. So you would write certificateObjectsWithProjectID and then a generator comment with objects-by-Fingerprint would generate a new statement using your custom statement, but add a filter in the WHERE clause.

You would only have to define "irregular" statements, and the non-filtered one.

Interesting. So this would be a dynamic query directly executed and not using a prepared statement (only the SQL from the static definition)?

As an aside, can you confirm that in that scenario we don't prepare a single custom query, execute it and then throw away the prepared statement?

I'm thinking:

  1. We could use the ability you describe that generates adhoc filtering from the base query to fulfil queries for filter combinations that haven't been explicitly defined.
  2. This would then mean we would only need to static define the ones where we want to gain the performance benefit of having them prepared up front (those that are hit frequently).

And then back to the topic in hand:

If we have the ability to generate filter queries adhoc based on the table info stored in the returned entity type, then could we add the ability to generate that SQL as a separate function for a specified entity type that took the filters and a pointer to a strings.Builder, and populated it with the WHERE statements?

We could then use this in custom functions to add filtering support by writing the initial query by hand inside the custom function and then passing the strings.Builder to this generated function to add the additional WHERE statements based on the filters provided (if any).

This would avoid needing to define any queries up front.

tomponline avatar Oct 05 '22 15:10 tomponline

No, lxd-generate is capable of parsing the non-filtered statement to produce filtered ones. So you would write certificateObjectsWithProjectID and then a generator comment with objects-by-Fingerprint would generate a new statement using your custom statement, but add a filter in the WHERE clause. You would only have to define "irregular" statements, and the non-filtered one.

Interesting. So this would be a dynamic query directly executed and not using a prepared statement (only the SQL from the static definition)?

As an aside, can you confirm that in that scenario we don't prepare a single custom query, execute it and then throw away the prepared statement?

Not quite, we actually create a whole new prepared statement from the original query. The dynamic queries are only used when there's multiple filter structs.

So lets say we have certificateObjects (generated) and certificateObjectsWithProjectID (handwritten).

The objects-by-<FIELD> generator comments parse the syntax tree for any variable defined as a call to RegisterStmt, whose name has the prefix certificateObjects and no By... suffix. lxd-generate then writes new variables that do have a By... suffix for the given <FIELD>.

So for objects-by-Fingerprint, on the certificate entity, we will generate new global variables certificateObjectsByFingerprint and certificateObjectsWithProjectIDByFingerprint, which are prepared statements.

I'm thinking:

1. We could use the ability you describe that generates adhoc filtering from the base query to fulfil queries for filter combinations that haven't been explicitly defined.

2. This would then mean we would only need to static define the ones where we want to gain the performance benefit of having them prepared up front (those that are hit frequently).

And then back to the topic in hand:

If we have the ability to generate filter queries adhoc based on the table info stored in the returned entity type, then could we add the ability to generate that SQL as a separate function for a specified entity type that took the filters and a pointer to a strings.Builder, and populated it with the WHERE statements?

We could then use this in custom functions to add filtering support by writing the initial query by hand inside the custom function and then passing the strings.Builder to this generated function to add the additional WHERE statements based on the filters provided (if any).

This would avoid needing to define any queries up front.

This would certainly be possible, but we would lose the performance benefit of the prepared statements. Additionally we wouldn't actually be able to "see" the queries we're performing.

masnax avatar Oct 05 '22 16:10 masnax

I have a feeling that we could make the generated code more flexible without needing such a large change or complicating the generator any further.

If any of the handwritten statements are grossly incorrect, lxd/lxd-cloud/microcloud etc will fail to start and will fail all tests, so we'll never ship anything completely broken.

The existing GetMany with multiple filters works in > 90% of situations, I've only needed to write custom queries in a few cases:

  1. Where we need a JOIN instead of a LEFT JOIN to ensure a value is not null.
  2. To join on an association table.
  3. Where we need an OR clause in an update or delete statement.

Cases 1 & 2 (SELECT statements)

I think the main thing we are likely to miss is if a column is added to a table and a new field is added to the database representation type. Handwritten statements will then become out of date and omit that column, and the rowFunc that is passed into query.SelectObjects will omit that field.

I think we can get around this by a) generating a string <typeName>Columns for use in handwritten statements containing the columns we need to select fully populate the type, and b) generating an unexported function select<typename>Objects(ctx context.Context, tx *sql.Tx, stmt int, args any...) ([]<type>, error) for internal use. We would then write a statement like this:

var getCustom<typeName>Objects = RegisterStmt(fmt.Sprintf("SELECT %s FROM <tableName> WHERE ...", <typeName>Columns))

and we would also write a function like this:

func GetCustom<typeName>s (ctx, tx, <concretely defined arguments>) ([]<type>, error) {
    return select<typeName>Objects(ctx, tx, <pass arguments directly>)
}

This way, the rowFunc will be kept up to date via the generated generic method, and we will always select all of the columns necessary to populate the type. If a column is added that requires a join and the handwritten statement does not already join that table, the handwritten statement will fail to prepare until it has been updated.

Case 3 (UPDATE / DELETE statements)

The boilerplate here is already quite minimal so I don't think we need to make any changes to handle this case.


With all that being said, my experience using the generator has been primarily with LXD Cloud and if this change will benefit generation elsewhere I'm not opposed to it. If this is the case, my only concern is with the argument specification for custom queries. The proposal has

So that's a lot of hard to read placeholders, I'll define them and then give some examples:

<Label> -- This is a unique description that scopes a function and its queries, so we can have multiple statements with the same filter combinations.

With<Arg...> -- The And-separated Args will be added as arguments to the resulting lxd-generate function. Anything with the suffix Name, UUID, or Fingerprint will be a string, and ID would be an int. Everything else would be rejected.

In LXD Cloud we have types like ClusterState, so to use that in the generated function we'd need to write WithClusterStateID and then cast the ClusterState to an int when calling the generated function which doesn't seem quite right. That being said, I'm not sure there is a way of fully specifying the type using just the name of the registered statement variable.

markylaing avatar Oct 06 '22 08:10 markylaing

I think we can get around this by a) generating a string <typeName>Columns for use in handwritten statements containing the columns we need to select fully populate the type, and b) generating an unexported function select<typename>Objects(ctx context.Context, tx *sql.Tx, stmt int, args any...) ([]<type>, error) for internal use.

I like this idea. As that could be used as part of a custom select statement.

Although I think we should get away from requiring these custom queries to be per-registered as statements.

As this is by definition a custom scenario, we don't know how frequently its going to be called, and I wouldn't want us to end up registering every possible custom query in the database on all cluster members.

Additionally, when using filtering there's a good possibility the pre-registered query isn't going to be used anyway.

Having generated helper functions that custom handwritten functions can use to simplify SELECTing the right columns and generating WHERE statements from the provided filters would be useful, and then that can be passed to the traditional query functions we use.

And in cases where this custom function is restrictive enough and called frequently enough to justify pre-registering it, then this can still be done after the SQL is generated.

tomponline avatar Oct 06 '22 09:10 tomponline

As this is by definition a custom scenario, we don't know how frequently its going to be called, and I wouldn't want us to end up registering every possible custom query in the database on all cluster members.

How costly are prepared statements? I don't think we need to prepare statements for single usage but all the statements I've been writing are for use either in API calls or in response to downstream events. In any case, I think a major benefit of using prepared statements is that they will cause tests to fail if any queries are invalid. So I'd be in favour of using them wherever possible until there is a significant performance cost.

markylaing avatar Oct 06 '22 09:10 markylaing

I don't think we need to prepare statements for single usage but all the statements I've been writing are for use either in API calls or in response to downstream events.

Just to add to this. I wouldn't really call the queries "custom", they are just standard queries that need to be handwritten because they can't be easily generated.

markylaing avatar Oct 06 '22 09:10 markylaing

Sure you can use prepared statements where you feel its worth it. My point being for custom queries we shouldn't force it (by having the generator only support using the helpers with prepared statements). Especially seeing as the DB generator already is using dynamic queries when some filtering combinations are used.

Just to add to this. I wouldn't really call the queries "custom", they are just standard queries that need to be handwritten because they can't be easily generated.

Thats my definition of custom :)

tomponline avatar Oct 06 '22 09:10 tomponline

Ahh ok I get your meaning now, thanks.

markylaing avatar Oct 06 '22 09:10 markylaing

The way I understand it now its getting rather complicated to reason about how the DB generated functions are going to operate:

  • Generally speaking you (as the developer) have to pre-define every valid combination of your based query + single filter combinations.
  • Except here @masnax mentioned that sometimes the DB generator uses the base query during the generation process and generates the pre-defined queries for us. I don't understand why it doesn't do that all the time though :)
  • Sometimes if a more complex set of filters is supplied, the DB generated functions fallback to using dynamic queries (because there is no way to predict all the filter combinations that can arrive).

So before we make it more complex still, I would like us to try and simplify the developer experience so we can all understand how it behaves.

@markylaing I think I agree with you that we should aim to prepare as many of the base queries as possible (for validation purposes, even if the prepared statement isn't actually used during query time).

But when it comes to filtering, it feels like we are in a somewhat unpredictable and complex half-way-house where we have the the developer overhead of having to define the prepared statements we want built for different filter combinations (and yet sometimes the DB generator goes ahead and builds additional ones for us), and yet we may still end up using dynamic queries.

I feel at the moment that I'm not understanding the different scenarios well enough to introduce more complexity.

@masnax have I misunderstood the situation?

I'm wondering if it would be possible to get to a place where:

  • We only have to define the base queries that the DB generator would generate (e.g //go:generate mapper stmt -e instance objects).
  • Then all filtered ones use a dynamically generated query using the base query's SQL OR have the DB generator prepare all single filter combination queries. I'm not sure which is best here.
  • Any other DB generated functions would also use generated prepared statements derived from the base one.

tomponline avatar Oct 06 '22 10:10 tomponline

I feel we're going on a bit of a tangent with the prepared statements, but they're actually a non-issue. Using the AST, if we have a reserved naming scheme for lxd-generate-compatible statement variables as I'm suggesting, we can determine dynamically whether those variables are prepared statements or query strings and handle them appropriately, as long as they're global variables. So we could define plain old strings with <entity>Objects... and other ones as RegisterStmt calls, and lxd-generate would recognize that they're different, but both queries.

Except https://github.com/lxc/lxd/issues/10944#issuecomment-1268637298 @masnax mentioned that sometimes the DB generator uses the base query during the generation process and generates the pre-defined queries for us. I don't understand why it doesn't do that all the time though :)

I'm not sure what the "all the time" that you're referring to is? The only time we have a notion of a "base" query is for objects statements, and this behaviour is consistent for all of those. This is how the -by-FIELD-and-FIELD generator directives all work. The original base query is found in the syntax tree (whether it's generated or handwritten), a WHERE clause is appended to its query string (or an existing one is amended) , and a new variable is declared containing the result.

But when it comes to filtering, it feels like we are in a somewhat unpredictable and complex half-way-house where we have the the developer overhead of having to define the prepared statements we want built for different filter combinations (and yet sometimes the DB generator goes ahead and builds additional ones for us), and yet we may still end up using dynamic queries.

The filter logic is definitely complicated, but I don't see a way around that if we want all the functionality (multiple filters, nil/non-nil filter field comparisons, query-to-filter-combination mappings). We can have a larger discussion about dropping some of these if you think it's too complicated.

My thought here was that as we already have this rigid filtering logic, we might as well reap its benefits and use it where we can. To that end, if we need a custom query, we would just write the base query and lxd-generate can then take over producing all the filters for us. But then also allow the ability to "override" lxd-generate's result if our query is particularly irregular.

I'm wondering if it would be possible to get to a place where:

* We only have to define the base queries that the DB generator would generate (e.g `//go:generate mapper stmt -e instance objects`).

* Then all filtered ones use a dynamically generated query using the base query's SQL OR have the DB generator prepare all single filter combination queries. I'm not sure which is best here.

* Any other DB generated functions would also use generated prepared statements derived from the base one.

If we take a power set of all the fields in a Filter struct, we can generate every combination of "supported" filters automatically without needing the objects-by-FIELD-and-FIELD directives. The only issue here is the memory overhead from registering a whole bunch of prepared statements that we will likely never use even once, but are technically a valid combination. I'll investigate if this is worth it from a performance standpoint, but this would have no bearing on the functionality I'm suggesting here.

masnax avatar Oct 06 '22 15:10 masnax

Thanks. It may be a tangent, but I would value being able to get this straightened out in my head as to specifically what the DB generator requires/does.

I'm still not quite following, you're saying its a non issue because it can generate prepared statements automatically from the base statement if a required query is needed, but at the same time your saying that it doesn't generate all prepared statements it may possible need due for all the filter combinations.

As someone who is not familiar with that part of the code base that feels a bit counter intuitive. Both approaches make sense for different reasons, but I am struggling to understand how/when it does both. :)

In what scenarios does the DB generator choose to generate a missing query vs requiring an explicit definition via //go:generate mapper stmt?

tomponline avatar Oct 06 '22 15:10 tomponline

Also I don't feel its too much of a tangent as we were originally talking about modifying the generator to aid in handwritten queries. If those handwritten queries are dynamically generated at run time (so handwritten logic that generates the query based on some input) then if the solution we come up requires prepared statements to be defined at generation time to work then it won't be useful in those scenarios.

tomponline avatar Oct 06 '22 15:10 tomponline

I suspect we should talk about this together in Prague, it will likely be easier.

tomponline avatar Oct 06 '22 15:10 tomponline

I'll lay out some examples here: A single handwritten query:

// A handwritten query.
var certificateObjectsWithProjectID = RegisterStmt(SELECT...)

// go:generate mapper stmt -e certificates objects-by-Fingerprint
// go:generate mapper stmt -e certificates objects-by-Fingerprint-and-Type

Produces

var certificateObjectsWithProjectIDByFingerprint = RegisterStmt(SELECT...)
var certificateObjectsWithProjectIDByFingerprintAndType = RegisterStmt(SELECT...)

If we also have a base objects directive:

// A handwritten query.
var certificateObjectsWithProjectID = RegisterStmt(SELECT...)

// go:generate mapper stmt -e certificates objects
// go:generate mapper stmt -e certificates objects-by-Fingerprint
// go:generate mapper stmt -e certificates objects-by-Fingerprint-and-Type

Produces:

// Produced for the handwritten query.
var certificateObjectsWithProjectIDByFingerprint = RegisterStmt(SELECT...)
var certificateObjectsWithProjectIDByFingerprintAndType = RegisterStmt(SELECT...)

// Produced for the base "objects" directive.
var certificateObjects = RegisterStmt(SELECT...)
var certificateObjectsByFingerprint = RegisterStmt(SELECT...)
var certificateObjectsByFingerprintAndType = RegisterStmt(SELECT...)

If we define our own filter override:

// A handwritten query and a filtered version.
var certificateObjectsWithProjectID = RegisterStmt(SELECT...)
var certificateObjectsWithProjectIDByFingerprint = RegisterStmt(SELECT...)

// go:generate mapper stmt -e certificates objects
// go:generate mapper stmt -e certificates objects-by-Fingerprint
// go:generate mapper stmt -e certificates objects-by-Fingerprint-and-Type

Produces:

// Produced for the handwritten query.

var certificateObjectsWithProjectIDByFingerprintAndType = RegisterStmt(SELECT...)

// Produced for the base "objects" directive.
var certificateObjects = RegisterStmt(SELECT...)
var certificateObjectsByFingerprint = RegisterStmt(SELECT...)
var certificateObjectsByFingerprintAndType = RegisterStmt(SELECT...)

If we only give a filter override and no base query

// A handwritten query and a filtered version.

var certificateObjectsWithProjectIDByFingerprint = RegisterStmt(SELECT...)

// go:generate mapper stmt -e certificates objects
// go:generate mapper stmt -e certificates objects-by-Fingerprint
// go:generate mapper stmt -e certificates objects-by-Fingerprint-and-Type

Produces:

// Produced for the base "objects" directive.
var certificateObjects = RegisterStmt(SELECT...)
var certificateObjectsByFingerprint = RegisterStmt(SELECT...)
var certificateObjectsByFingerprintAndType = RegisterStmt(SELECT...)

masnax avatar Oct 06 '22 15:10 masnax

As for making prepared statements optional, currently that behaviour is from parsing variable names of the form <entity>Objects... in the current package. We don't actually care if they're a call to RegisterStmt, but we can detect that. If we find that the variable just points to a string, rather than a RegisterStmt call, we can have the produced GetMany function properly handle it like a query string.

masnax avatar Oct 06 '22 15:10 masnax

But they do need to exist at generate time though? Right?

If so then thats not really what im looking for.

tomponline avatar Oct 06 '22 15:10 tomponline

Can you give me an example of such a run-time query?

masnax avatar Oct 06 '22 15:10 masnax

A single handwritten query:

// A handwritten query.
var certificateObjectsWithProjectID = RegisterStmt(SELECT...)

// go:generate mapper stmt -e certificates objects-by-Fingerprint
// go:generate mapper stmt -e certificates objects-by-Fingerprint-and-Type

Produces

var certificateObjectsWithProjectIDByFingerprint = RegisterStmt(SELECT...)
var certificateObjectsWithProjectIDByFingerprintAndType = RegisterStmt(SELECT...)

Great examples thank you!

Straight away what caught my eye in the first example is, as a developer, how would I know to produce a SQL statement named using certificateObjectsWithProjectID? As the // go:generate lines don't mention ProjectID at all.

Would including the var name to use in the // go:generate line make things more clearer to someone who doesn't know the internal name mapping (I'm assuming the WithProjectID part is somehow derived from the properties on the certificates struct type).

tomponline avatar Oct 06 '22 15:10 tomponline

Great examples thank you!

Straight away what caught my eye in the first example is, as a developer, how would I know to produce a SQL statement named using certificateObjectsWithProjectID? As the // go:generate lines don't mention ProjectID at all.

Would including the var name to use in the // go:generate line make things more clearer to someone who doesn't know the internal name mapping (I'm assuming the WithProjectID part is somehow derived from the properties on the certificates struct type).

The whole point of certificateObjectsWithProjectID is that it requires a parameter projectID which isn't part of the struct-table mapping, and so lxd-generate can't pick it up.

Rather, lxd-generate is just picking up that we have a variable defined of the form

var <entity>Objects<Suffix> = RegisterStmt(...)

We would then have a set of possible suffixes, separated by keywords With and By:

  • Immediately after <entity>Objects, we can specify any descriptive text, which will be suffixed to all generated statements and methods based on the base query, so that we can have multiple queries using the same combination of filters.
  • Next, prefixed with With, is additional arguments like projectID, which will become a parameter on the GetMany function.
  • Next, prefixed with By are any filters.

lxd-generate would fail to build if it finds any variables that break this naming structure. Or if we wanted to be very strict, any variable call to RegisterStmt that doesn't have this exact naming structure.

masnax avatar Oct 06 '22 15:10 masnax

Can you give me an example of such a run-time query?

Certainly. Say I want to load all instances using a custom filter (perhaps a series of instance IDs from a previous query extracted based on a different table, perhaps all instances with a device connected to a particular network - something that isn't possible from a join because it needs to have profiles applied), but I also want to combine that filter with the existing filter sets we support for instances.

This presents me with 2 problems:

  1. I would have to implement all of the filter sets logic to generate the WHERE statements - and that risks me introducing a bug where I miss fields (especially ones added in the future).
  2. I would have to select all of the fields needed, which would mean maintaining that list of fields in multiple places (this is @markylaing point above).

It would be great if the DB generator would provide me a function I can use to generate WHERE statements from filters for an entity type, and to generate the column names for the SELECT part for an entity type.

tomponline avatar Oct 06 '22 15:10 tomponline

Great examples thank you! Straight away what caught my eye in the first example is, as a developer, how would I know to produce a SQL statement named using certificateObjectsWithProjectID? As the // go:generate lines don't mention ProjectID at all. Would including the var name to use in the // go:generate line make things more clearer to someone who doesn't know the internal name mapping (I'm assuming the WithProjectID part is somehow derived from the properties on the certificates struct type).

The whole point of certificateObjectsWithProjectID is that it requires a parameter projectID which isn't part of the struct-table mapping, and so lxd-generate can't pick it up.

Rather, lxd-generate is just picking up that we have a variable defined of the form

var <entity>Objects<Suffix> = RegisterStmt(...)

We would then have a set of possible suffixes, separated by keywords With and By:

* Immediately after `<entity>Objects`, we can specify any descriptive text, which will be suffixed to all generated statements and methods based on the base query, so that we can have multiple queries using the same combination of filters.

* Next, prefixed with `With`, is additional arguments like `projectID`, which will become a parameter on the `GetMany` function.

* Next, prefixed with `By` are any filters.

lxd-generate would fail to build if it finds any variables that break this naming structure. Or if we wanted to be very strict, any variable call to RegisterStmt that doesn't have this exact naming structure.

OK I see, so there's a prefix match going on between go:generate mapper stmt -e certificates objects lines and the var certificateObjects... vars.

But has this generated a usable function yet? My understanding is no, that requires another go:generate line.

If true, does it make sense for the DB generator skip/generate on its own the intermediate statements needed for the go:generate lines that actually make the functions that we would use?

I'm trying to understand why we put this intermediate step in place rather than asking DB generator to build the end function to our specification (and let it generate any required statements from the base one matching the prefix).

So something like:

//go:generate mapper method -i -e instance GetMany references=Config,Device source=certificateObjectsWithProjectID

tomponline avatar Oct 06 '22 16:10 tomponline

Having generated helpers for columns and WHERE statements is something we can definitely do, but I don't see the connection with lxd-generate being unable to find the queries before run-time.

Since these helpers are defined from lxd-generate, they would exist before the new query we're making. If we just add some other function than RegisterStmt:

// Use lxd-generate to produce these 2 helpers:
func certificateColumns() string // gets the columns for the `certificates` table.
func certificateFilterFingerprintAndName() string // gets a filter including `fingerprint` and `name`

// Then we would define this ourselves.
var certificateObjectsCustom = CustomStmt(SQL, certificateColumns(), certificateFilterFingerprintAndName())

And lxd-generate would pick up certificateObjectsCustom as a complete query.

masnax avatar Oct 06 '22 16:10 masnax