cursorless
cursorless copied to clipboard
Language support tables
The problem
As a contributor, it is hard to know exactly what needs to be implemented for a new language, or what is missing from an existing language. As a user, it is hard to know what our support level is for an existing language.
I really like the support tables in https://github.com/cursorless-dev/cursorless/pull/1911 and https://github.com/cursorless-dev/cursorless/pull/1962. We should formalize these somehow with the following goals:
- make it easier for people to contribute languages
- make it easier for contributors to see what's missing for a given language
- make it harder to forget to write tests
- make it easier for users to know what a given language supports
The solution
There are multiple levels of this solution, in increasing levels of investment
1. Add support tables to our new language PR template
This is the lowest overhead, and has the benefit that it's easy for the contributor and core maintainers to update as a PR progresses.
The drawback is that it's not helpful for users, and doesn't help as much after the initial PR. It's also not formalized in any way
2. Manually add support tables to per-language pages (https://github.com/cursorless-dev/cursorless/issues/1642)
This solution is probably the highest overhead for new language contributors, but doesn't require much investment on our part, and is helpful both for users and for contributors seeing where a language is currently
It also isn't automatically enforced in any way, so could get stale / be wrong just like 1) above
We'd also want to add a checkbox to the PR template saying we've updated this page. Probably just the new-language template but could be the general template
3. Add structured representation of language scope support
We could add types that represent different aspects of scope support. Something like the following
const scopeSupportFacets = [
"list",
"list.interior",
"map",
"map.interior",
"collectionKey",
"namedFunction",
"namedFunction.interior",
"functionName",
"anonymousFunction",
"anonymousFunction.interior",
"name",
"value.assignment",
"value.assignment.removal",
"value.return",
"value.return.removal",
"value.collectionItem",
"value.collectionItem.removal",
"statement",
"ifStatement",
"condition.if",
"condition.while",
"condition.doWhile",
"condition.for",
"condition.ternary",
"branch",
"comment.line",
"comment.block",
"string.singleLine",
"string.multiLine",
"textFragment",
"functionCall",
"functionCallee",
"argumentOrParameter.argument",
"argumentOrParameter.argument.removal",
"argumentOrParameter.parameter",
"argumentOrParameter.parameter.removal",
"class",
"class.interior",
"className",
"type",
] as const;
type ScopeSupportFacet = (typeof scopeSupportFacets)[number];
interface ScopeSupportFacetInfo {
label: string;
description: string;
scopeType: SimpleScopeTypeType;
}
const scopeSupportFacetInfos: Record<ScopeSupportFacet, ScopeSupportFacetInfo> =
{
list: {
label: "List",
description: "A list of items",
scopeType: "list",
},
"list.interior": {
label: "List interior",
description: "Excludes the opening and closing delimiters of the list",
scopeType: "list",
},
map: {
label: "Map",
description: "A map of key-value pairs",
scopeType: "map",
},
"map.interior": {
label: "Map interior",
description: "Excludes the opening and closing delimiters of the map",
scopeType: "map",
},
};
enum ScopeSupportFacetLevel {
supported,
unsupported,
notApplicable,
}
type LanguageScopeSupportFacetMap = Record<
ScopeSupportFacet,
ScopeSupportFacetLevel
>;
const { supported, unsupported, notApplicable } = ScopeSupportFacetLevel;
// Example support table for a language
const typescriptSupport: LanguageScopeSupportFacetMap = {
list: supported,
"list.interior": supported,
map: supported,
"map.interior": supported,
collectionKey: supported,
namedFunction: supported,
"namedFunction.interior": supported,
functionName: supported,
anonymousFunction: supported,
"anonymousFunction.interior": supported,
name: supported,
"value.assignment": supported,
"value.assignment.removal": supported,
"value.return": supported,
"value.return.removal": supported,
"value.collectionItem": supported,
"value.collectionItem.removal": supported,
statement: supported,
ifStatement: supported,
"condition.if": supported,
"condition.while": supported,
"condition.doWhile": supported,
"condition.for": unsupported,
"condition.ternary": notApplicable,
};
The strong typing would force us not to forget any scopes. The above would be the source of truth from which we could generate the following:
- Support tables for #1642
- Language support tables for scope pages in our docs
- Examples of different aspects of support in our docs, using test cases
We could also check that test cases exist by adding an optional supportFacet
field to our recorded test cases, and then checking in CI for a test case that mentions any support face that is marked as supported
Add "example" field with some quick inline example?
One major drawback is that if you add a new scope / facet in the process of developing a language, you then need to add a field to all existing languages. There are a couple of problems with that:
- It increases the barrier to adding new languages / facets
- You might not know whether a facet / scope makes sense for some language that is totally unfamiliar to you
- If a couple such PRs are in-flight at the same time, you could get merge conflicts across all of the language files
A couple possible solutions:
- Make the language support tables
Partial
- Pros: solves all of the problems listed above
- Cons: we no longer get the benefit of being forced to think about the different facets of a given scope for every language
- Add a new
unknown
value to theScopeSupportFacetLevel
enum- Pros: solves 2) above, and still forces you to think about facets of scopes for languages
- Cons: Doesn't solve 1) and 3) above
We're leaning towards implementing the simplest possible version of 3) (structured representations) as our first step. It
- wouldn't be required to declare the support table for every language
- we could make the scope support tables
Partial
Then we'd have the immediate benefit of:
- PRs could signal their scope support in a structured fashion that could be enforced down the road
- For languages which have these tables, we could see at a glance what's supported
Some drawbacks compared to starting with 1) or 2)
- Given it's
Partial
there isn't a list of checkboxes indicating what should commonly be contributed in initial lang support. We could have a template that we use for this purpose, though - There aren't checkboxes for whether tests have been added for the given facet, and there's no enforcement that the tests exist. 1) would have such checkboxes
I guess we could add the scopeSupportFacet
field to our recorded tests, and then only check it for those languages that have support tables. That's a bit more work though, and adding that field to recorded tests will be a bit annoying without also adding some tooling to make it easier.
@josharian I'd love your feedback on this one when you get a minute, either in comments here or at a meet-up. It seems there is a possible short-term win here but don't want to lead ourselves to trouble in the long-term
I can’t quite envision what the long-term trouble would be here.
I would say the primary goals are to (a) get anything at all working, and (b) minimize the effort required to update the table. So any design that requires O(n) entries be added, or altered, is strictly worse than one that requires O(1). So I guess I’m partial to partials.
I am really excited about this idea. I would love to have a place that I could very easily check every month or two to see what scopes Go is now missing, perhaps because something got added that I wasn’t aware of at the time.
Oh, and I’m not too worried about needing checkboxes. I suspect that most people who are implementing languages are going to refer to this table a lot and will be excited to update it.
I added that to discuss label because I might want to take a crack at this sooner than anticipated. I do have some questions about facets.
- If something requires a different removal range or not might be very language dependent. Therefore I do wonder if not every single scope potentially could have a removal? And once we actually do proper scope test we would test the removal range of every scope.
- Should iteration scopes be part of the facets?
-
branch.ternary
that contains two branches. Would that be one or two facets? I would assume a single test would capture both branches.