rover `subgraph new` and `client new` - A strategy for "file new" templates

Description

As a GraphQL developer, it is challenging to get up and running with a new project that consumes or contributes to the supergraph. Most organizations create internal templates to provide a starting point; the goal of this feature is to provide that starting point for any GraphQL developer consuming or contributing to their supergraph. Below is my proposal to add this functionality:

All of the templates exposed through rover can live in https://github.com/apollographql/templates. Both client/subgraph commands would download the templates as a tarball (https://github.com/apollographql/templates/archive/refs/heads/main.tar.gz) and extract the necessary template. Users should be able to provide a schema that can be swapped into the template.

subgraph new /path/to/new-directory --template={TEMPLATES_FOLDER_NAME} --schema={URL or File}
client new /path/to/new-directory --template={TEMPLATES_FOLDER_NAME} --schema={URL or File}

directory (optional): Directory to create the new project in. If omitted, the template is created in the current working directory.
--schema (optional): If a URL is provided, subgraph introspect is used to fetch the schema; a failed result exits the command and displays the reason why. The schema is written to a file in the root of the template's directory.
--template (optional): The folder name of the template living in the remote repository. If omitted, the user is prompted with a list of templates to select from.
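
The download-and-extract step proposed above could look roughly like the Python sketch below. It is only an illustration of the idea, not how rover does or would implement it. The function name extract_template is hypothetical, and the sketch assumes the archive wraps everything in a single top-level directory (as GitHub branch tarballs do, e.g. templates-main/) with one folder per template beneath it. Downloading the archive is left to the caller.

```python
import io
import tarfile
from pathlib import Path, PurePosixPath


def extract_template(archive: bytes, template: str, dest: Path) -> list[Path]:
    """Copy only the files under <wrapper>/<template>/ from a .tar.gz into dest.

    Hypothetical helper: `archive` is the raw bytes of the downloaded tarball.
    """
    written = []
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        for member in tar.getmembers():
            parts = PurePosixPath(member.name).parts
            # parts[0] is the wrapper directory, parts[1] the template folder.
            if len(parts) < 3 or parts[1] != template or not member.isfile():
                continue
            rel_parts = parts[2:]
            if ".." in rel_parts:
                continue  # refuse entries that would escape dest
            target = dest.joinpath(*rel_parts)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(tar.extractfile(member).read())
            written.append(target)
    if not written:
        raise FileNotFoundError(f"template {template!r} not found in archive")
    return written
```

Note that the whole archive still has to be downloaded before the unwanted templates can be skipped, and a missing template is only detected after the download.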

Jul 11 '22 18:07 michael-watson

Hey @michael-watson - thanks for opening this, excited to work on this with you.

re: subgraph new - looks great, my one question is why you need the --schema argument? in what scenario would you want to generate a project from scratch where you already have a schema? i imagine you would want to have a .apollo directory that lives in the template itself that points to a local schema file that could then be iterated on in parallel with the code. especially if we have something like a prisma template, just copying in a schema like that is going to break codegen pretty wildly, would it not?

re: client new - i'm much more hesitant to add this one since rover is not typically used with client projects, there is not a client check command (or plans for one) - i'd be worried that generating client projects w/o any extra tooling might leave a sour taste. if product has plans down the line for other client-based commands i could see this command slotting in nicely but as it stands right now rover can't really help clients out all that much and they're probably better off sticking around with lang-specific tools. (this is my own personal opinion/hunch though and definitely not set in stone - open to hearing counterpoints!).

Jul 11 '22 19:07 EverlastingBugstopper

We will also want a way to list the templates available. I also wonder if "subgraph" is the right command tree for this, especially because we will want a "list" command and it sounds like we will eventually have non-subgraph templates. Maybe even rover template ... to keep it nicely encapsulated.

Jul 11 '22 20:07 ndintenfass

rover template makes sense to me.

as far as template selection, i imagine we should allow for separate templates to be stored in different git repositories, and each template could have a .apollo/template.yaml file that contains some template-specific config (like type = "subgraph"). then after cloning that repo, rover would run rover {type-from-template-yaml} init and ask all the normal onboarding to studio questions, get rid of .apollo/template.yaml and give you a .apollo/config.yaml that is ready to go.

you would also have some sort of template manifest (perhaps it lives at rover.apollo.dev/templates/manifest.yml or it lives in the Studio API or we just commit it to main) that keeps track of official template repos.

rover template list might be a thing you could run to see all of the template slugs, names and types. rover template create could then take the project type from manifest.yaml and run the associated rover {type} init command (at first, likely just rover subgraph init would be supported) so that after pulling down the template you're basically ready to publish to studio.
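
As a rough illustration of the manifest idea, here is a Python sketch of what a manifest entry and the two operations might look like. Everything in it is an assumption made for illustration: the TemplateEntry fields, the example entries and their URLs, and the function names. The rover {type} init command is the one proposed in this comment, not an existing command.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateEntry:
    slug: str          # short identifier a user would type
    name: str          # human-readable name
    project_type: str  # e.g. "subgraph"
    repo: str          # git repository holding the template


# Hypothetical manifest contents; these entries are illustrative only.
MANIFEST = [
    TemplateEntry("subgraph-typescript", "Example TypeScript subgraph", "subgraph",
                  "https://example.com/templates/subgraph-typescript.git"),
    TemplateEntry("client-react", "Example React client", "client",
                  "https://example.com/templates/client-react.git"),
]

# Per the comment above, only subgraph init would be supported at first.
SUPPORTED_INIT_TYPES = {"subgraph"}


def list_templates(manifest: list[TemplateEntry]) -> list[str]:
    """Rows a `template list` style command might print: slug, name, type."""
    return [f"{e.slug}\t{e.name}\t{e.project_type}" for e in manifest]


def init_command(manifest: list[TemplateEntry], slug: str) -> list[str]:
    """Return the argv for the proposed `rover {type} init` step."""
    entry = next((e for e in manifest if e.slug == slug), None)
    if entry is None:
        raise KeyError(f"unknown template: {slug}")
    if entry.project_type not in SUPPORTED_INIT_TYPES:
        raise ValueError(f"init is not supported for type {entry.project_type!r}")
    return ["rover", entry.project_type, "init"]
```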

Jul 11 '22 20:07 EverlastingBugstopper

rover template would work. We should make sure that rover template create also provides an option to allow not running rover subgraph init (or just letting the user eject out at that step).

Supporting templates from separate git repositories is probably something we can incorporate in the future. We want a single place for all our templates and the current plan is to have all of our templates living in a single git repository. I could see a future with "community templates" but ideally these are contributions into the templates repository. If that gets too large, we could have a "manifest file" stored in the templates repo and used in rover. I also don't have a problem with supporting any git repo, we just don't have any plans around this for now.

Jul 11 '22 21:07 michael-watson

I find this to be a very interesting and compelling feature for rover!

rover template

I can definitely see the value in this command. I do like the idea of similar ergonomics for creating a subgraph with or without a template, though, which I think a --template (or --from-template?) flag on an existing command would allow. Anyhow, no strong feelings.

--schema (optional): If a URL is provided, subgraph introspect is used to fetch the schema; a failed result exits the command and displays the reason why. The schema is written to a file in the root of the template's directory.

I can certainly see the value in a --schema argument baking in some subgraph schema content for tutorials.

Would there be a default empty schema? Though I'm curious if we'd want that for some set of base templates where the default user experience to add a subgraph would end up being removing the default template.

The implementation and effects of the --schema argument will vary depending on the subgraph implementation language (e.g., code generation, code-first, or SDL-based). Is the idea to have some sort of pattern or language-specific init script that is invoked after the template is set up? Sketching out a small design with a number of use cases in different languages seems prudent, because I can imagine that merely writing the schema to the root of the extracted tarball is an anti-pattern in some languages. I feel like it has a specific home based on the language, no?

Supporting templates from separate git repositories is probably something we can incorporate in the future. We want a single place for all our templates and the current plan is to have all of our templates living in a single git repository.

Are you committed to this approach? I find this pattern to be challenging in a lot of ways; co-locating a lot of different languages in a single mono-template repository would seem to complicate even simple things (in the way that all monorepos tend to) like:

- Having the templates wired up with the appropriate CI/CD patterns, including tests, etc., so that the template repositories themselves can be easily renovated with Greenkeeper/Renovate/Dependabot, etc.
- Empowering contributors to these repositories who have language specialties, and also applying a matching permission system that doesn't give all contributors access to all templates.
- It would take us in a direction that is less streamlined for what I understand to be a community-focused template pattern.
- Making us clone possibly enormous tarballs (with the weight of all the languages they contain) and locally shaking off what we don't want.
- The path for migrating away from this without introducing a new command seems a bit unclear to me. Seems like we're building ourselves into a corner?
- Issues opened on GitHub for these templates are going to be the totality of every language's implementation, and it's going to be more challenging to assign owners for specific implementations.

This all seems like stuff we'd want to decompose, if not immediately then in the very near term, and from an implementation standpoint it actually seems like it would be more work to operate on a directory inside of a repo/tarball than to just work on a repo.

With a repo-based approach we could just have a pattern that works something like this, where {name} (as in rover template use {name}) expands to a number of places we just look at, depending on its format (just like npm, rust, etc.):

{name}: github.com/apollographql/rover-template-subgraph-{name}
- So, typescript looks at github.com/apollographql/rover-template-subgraph-typescript
- If {name} doesn't exist, we immediately fail (rather than downloading and extracting the tarball first?) and merely tell the user it doesn't exist.
github:abernix/my-template: github.com/abernix/my-template
https://gitlab.com/another/repo.git: is a URL, so we just use it?
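
The expansion rules above can be sketched as a small Python function. This is only a sketch under stated assumptions: the name resolve_template is hypothetical, the https:// scheme is added for illustration (the list above gives only host and path), and the character checks on names are a guess at what a real implementation might accept.

```python
import re

# Prefix taken from the convention proposed above.
OFFICIAL_PREFIX = "https://github.com/apollographql/rover-template-subgraph-"


def resolve_template(name: str) -> str:
    """Expand a template specifier into a repository URL."""
    if "://" in name:
        # Already a URL, so use it as given.
        return name
    if name.startswith("github:"):
        owner_repo = name[len("github:"):]
        if not re.fullmatch(r"[\w.-]+/[\w.-]+", owner_repo):
            raise ValueError(f"expected github:owner/repo, got {name!r}")
        return f"https://github.com/{owner_repo}"
    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", name):
        raise ValueError(f"invalid template name: {name!r}")
    # Bare name: look in the official location.
    return OFFICIAL_PREFIX + name
```

For example, resolve_template("typescript") yields the rover-template-subgraph-typescript repository under the apollographql organization. Checking whether that repository actually exists would be a separate request.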

Beyond the expansion of the {name} token itself, all of these use the same mechanisms behind the scenes, which immediately sets us up for community templates and, I think, nicer ergonomics around the template repositories themselves.

As an extension of this, GitHub has its own "Use template" button that would work better with a repo-based approach when you mark a repository as a template. I'm not suggesting we have to use that, but it seems like a naturally compatible approach?

And lastly, if we wanted a manifest and list command, we could just have Rover "ask" GitHub for repositories which start with rover-template- (e.g., https://api.github.com/orgs/apollographql/repos) — though we'd have to be cognizant of rate-limiting. I do think we could punt the list option.
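
A minimal sketch of that filtering step, in Python, might look like the following. It assumes the caller has already fetched and JSON-decoded the organization's repository list, and it relies only on the name, html_url and archived fields of each repository object. The function name template_repos is hypothetical, and pagination and rate limiting are deliberately left out.

```python
def template_repos(repos: list[dict], prefix: str = "rover-template-") -> dict[str, str]:
    """Map template slug -> repository URL for repos matching the naming convention.

    `repos` is the decoded JSON list of repository objects for the organization.
    """
    found = {}
    for repo in repos:
        name = repo["name"]
        if not name.startswith(prefix):
            continue
        if repo.get("archived", False):
            continue  # assumption: archived templates should not be offered
        found[name[len(prefix):]] = repo["html_url"]
    return found
```

A list command built this way would show whatever happens to match the prefix, which is the trade-off against a curated manifest.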

ideally these are contributions into the templates repository

I am pretty skeptical of this; it feels like it is going to grow unruly quickly. We're going to want the official templates to be regularly maintained and updated as dependencies change, etc., and to be highly trusted. If we have all community templates living in a folder within a single repository, I worry it will be challenging to do security audits, maintenance, etc. using the tooling that exists naturally on GitHub. And again, I do not think we want all issues for all subgraph implementations (or, at some point, community templates) living in a single repository?

re: client new - i'm much more hesitant to add this one since rover is not typically used with client projects, there is not a client check command (or plans for one)

I tend to agree, but I also think we need to plan for this and also need to start somewhere. 😄

rover would run rover {type-from-template-yaml} init and ask all the normal onboarding to studio questions, get rid of .apollo/template.yaml and give you a .apollo/config.yaml that is ready to go.

I like this idea. I'd hope we workshop what lands in a template file specifically at some point though, because we definitely want those to be durable. To some degree I'm a bit allergic to the idea of a template file (that can bit-rot over time and needs to be maintained) if those parameters couldn't just be driven by rover directly though. 🤔

Jul 12 '22 07:07 abernix

Came here to say what @abernix said better - it also makes it really straightforward to connect the rover experience directly back to the repo, a virtuous OSS cycle where a language community can congregate.

Jul 12 '22 08:07 ndintenfass

Love the discussion!

--schema

I was planning on having the default schema for all subgraph templates include @contact and @key defined in a Foo-way. This could be simplified or expanded on.
The code-first/code-generation schemas definitely pose a challenge. The idea behind this was to enable the output of a schema design for a new feature as a starting point for the template, but the file could just be copied over in those scenarios.

separate git repositories

I'm committed to templates, the single repository just was an easier starting point for "minimal templates". Once that effort was complete, we would have to decide on what to do with the templates. I'm curious about thoughts on how we do this for the Apollo created templates:

- Apollo templates live in our current GitHub organization (https://github.com/apollographql/) as separate repos
- Apollo templates live in the Solutions GitHub organization (https://github.com/apollosolutions/)
- Establish a new GitHub org

I imagine we'll have 1-2 templates per subgraph implementation which would be ~20-40 templates based on our current compatibility repo. The proposed convention for {name} makes sense to me as well.

GitHub Templates

I'm going to check this out; we should probably fit into this. I don't think we should rely on the API with rover-template-* naming; I would prefer a curated manifest somewhere so we can ensure templates are meeting a certain threshold for being maintained (to be determined). I definitely agree we can punt on list in favor of a docs page or something.

Potential .apollo template file in template projects

I didn't have any plans here and agree we should workshop whatever this would be.

Jul 12 '22 19:07 michael-watson

--schema

Ok, it seems like some sort of declaration will live inside the .apollo directory of these templates that defines where the schema lives in that template, and then rover dev can rely on that in order to drive composition (and the associated file-watching)? If that makes sense, then we should probably make sure the design is sorted out for whatever subgraph-level manifest needs to exist to define this (maybe it's an extension of the work in https://github.com/apollographql/rover/issues/1173), and then circle back to apply that detail as a finishing touch to the template before we can consider this done, right? If that sounds right, then we don't need to decide anything right this minute to unblock @michael-watson's ability to start creating the templates in the general sense, but we do need to make sure we figure out that mechanism, @EverlastingBugstopper.

separate git repositories I'm committed to templates, the single repository just was an easier starting point for "minimal templates"

In my opinion, the up-front work of creating the extra Git repositories now is worth it, because that separation is relatively cheap to create (repos are free) and because of its natural compatibility with an ecosystem of templates, whether or not we have a manifest that defines specifically what those are. It also seems materially simpler than trying to decompose and disentangle a monolithic multi-language template repository in the future while maintaining compatibility with it, not to mention the technical weight of cloning a much larger repository than we need to. In my mind, it avoids us walking through some doors that seem harder to walk back through later. Put another way, I would be strongly in favor of starting with dedicated Git repositories for each template, since it seems like a very flexible starting point and avoids us needing to special-case a monolithic template repo in the rover code-path (for a while and even potentially indefinitely?) or abandoning a mono-template repository in the mid-term. It seems more natural to make a repository optionally support multiple templates in the future as an iteration than to introduce the opposite later, if that's something we decide we want to do.

Apollo templates live in our current GitHub organization (https://github.com/apollographql/) as separate repos

Apollo templates live in Solutions GitHub organization (https://github.com/apollosolutions/)

Establish a new GitHub org

I imagine we'll have 1-2 templates per subgraph implementation which would be ~20-40 templates based on our current compatibility repo. The proposed convention for {name} makes sense to me as well.

Is our intention to have 20-40 repos for our initial version of this functionality, or is that a more mid/long-term goal? I would feel pretty good about supporting maybe 2-6 templates in the next ~eight weeks and perhaps reacting to our learnings after getting an initial set of templates/functionality out the door. (I would strongly suggest we not do anything more ambitious than that, but maybe there's an investment we've already made or a requirement in this regard that I'm not aware of?)

Assuming that we're optimizing for the near-term (and the lower number of templates I'm suggesting we bound ourselves within), I propose that we not create a new GitHub org at this point in time. Even if it's a good mid-term/next-steps consideration, I think that today it has more overhead than we're prepared for and also begs for administrative oversight and organizational alignment.

In a similar scope-reduction effort, I also think we should punt on the {name} convention (I know, I suggested it, but I think it's an eventual nice to have) for now and start with an approach that requires being explicit about the path to the repo. Meaning, for now, we just pass the full URL like rover subgraph new name --template https://github.com/apollographql/subgraph-template-typescript and in the future we could make a {name} shorthand for that — possibly even driven by our registry. We can even transfer that repo to another org when we do that and maybe at that point it just becomes --template typescript 🤷 . I think this prioritizes accomplishing the story with a series of doors we can walk through after an initial release.

I would prefer a curated manifest somewhere so we can ensure templates are meeting a certain threshold for being maintained

I like the idea of having a manifest, but I'm not sure I see this being an important initial requirement. While I don't disagree with having a registry of templates of sorts that we can validate programmatically, this seems like it could be a follow-up. Today, we have a limited window of opportunity to align our stories here and work toward a common goal on the same foundation. The ecosystem of templates is going to take time to emerge, and I believe we'll have time to figure that out after we get to a minimum viable story.

Thoughts?

Aug 02 '22 13:08 abernix

Agree with @abernix - except maybe for the manifest of short names - we could allow short name OR repo URL, for instance. It's a much nicer experience to get to list out available templates - it also saves us from having to build against a specific git client if ours have simple HTTPS URLs to released zips.

Aug 02 '22 15:08 ndintenfass

Ok, it seems like some sort of declaration will live inside the .apollo directory of these templates that defines where the schema lives in that template and then rover dev can rely on that in order for it to drive composition (and the associated file-watching)?

i'd love if we could just have it be a command! what command do we use to start this subgraph? we can then detect things like the port and do introspection so the schema doesn't even need to be committed, it can be changed directly in code and we're off to the races. this means i only need to implement introspect polling instead of introspect polling AND file watching, and makes integration with other subgraph implementations much cleaner.

if it's not possible to have a command that starts the server and that template requires that you start it up before running rover dev that is fine too, but it should have an endpoint committed to that config file. so essentially the minimal config is like

.apollo/default/subgraph.yaml
---
startup:
  command: npm run start

OR

.apollo/default/subgraph.yaml
---
startup:
  endpoint: http://localhost:4000
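
To make the two shapes concrete, here is a small Python sketch of how such a config could be validated once the YAML has been parsed into a dict. It is an illustration only: the Startup type and parse_startup name are hypothetical, and reading the "OR" above as "exactly one of command or endpoint" is an assumption, not a settled design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Startup:
    command: Optional[str] = None   # e.g. "npm run start"
    endpoint: Optional[str] = None  # e.g. "http://localhost:4000"


def parse_startup(config: dict) -> Startup:
    """Validate the `startup` section of an already-parsed subgraph.yaml."""
    startup = config.get("startup")
    if not isinstance(startup, dict):
        raise ValueError("missing 'startup' section")
    command = startup.get("command")
    endpoint = startup.get("endpoint")
    # Assumption: exactly one of the two keys must be present.
    if (command is None) == (endpoint is None):
        raise ValueError("set exactly one of startup.command or startup.endpoint")
    return Startup(command=command, endpoint=endpoint)
```

With a command, the tool would start the server itself and then introspect it; with an endpoint, it would expect the server to already be running there.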

and agree with @ndintenfass that a small manifest even if it's hard coded in rover is a 👍🏼

Aug 02 '22 16:08 EverlastingBugstopper

I'm merely trying to suggest we can add short names in the future but, sure, we can hard-code something into Rover for now.

In a very general sense, I do not know if we should commit to an introspection-only approach for subgraphs, though we could try starting with that approach and see if we get along with it. Some reasons for that are:

- Requiring subgraphs to be running in order to publish them to the registry is a bit of a heavy ask in my mind. I think it is quite a bit nicer to be able to check and publish against a static artifact when possible.
- Some implementations use SDL as an input and some use code to generate that SDL. I don't know if the response from federation introspection is going to be durably deterministic enough to avoid flapping? In all implementations?
- We made a pretty big commitment a while ago to optimizing for non-introspection based workflows. In fact, for a while we were divesting from introspection as a recommended approach and this seems like a bit of a pivot from that. We should just make sure we're not reversing previous decisions without realizing the impact of that change because it feels like we'd be doing that with an introspection-only/first approach.

Aug 02 '22 17:08 abernix

i'd love if we could just have it be a command! what command do we use to start this subgraph?

Just want to give some perspective from a Java/Rust dev.

For Java I would 100% want to start my subgraph from my IDE. The unfortunate reality is that hot-reload doesn't really work with Java, so the amount of time spent in the debugger is much higher. (JRebel exists but still...)

For Rust the debugger is much less useful, but the compile times are killer. Even for a small local project, any sort of hot-watch compilation is slower than just having a keybinding for kicking off a build.

I guess what I'm saying is that I'd prefer rover just ask me for the URL to watch rather than try to take over my dev workflow. Work with the way I work etc. I can imagine that this would be different in node.js land though.

Aug 08 '22 10:08 BrynCooke

For auto-detection of subgraphs, does a full scan take place, or just well-known ranges?

Aug 08 '22 10:08 BrynCooke

With the initial version of rover template generally available 🎉, I think this issue can be closed! There are definitely some follow-up features to add in the future, but those should be new issues.

Dec 14 '22 22:12 dbanty
