
New attribute to control UA-provided writing assistance

Open bmathwig opened this issue 3 years ago • 47 comments

The current specification allows the autocomplete attribute on <input>, <textarea>, and <select> elements. With the rise in popularity of rich text controls using contenteditable, we should consider allowing elements that have contenteditable=true to use the autocomplete attribute. While not a common scenario within the scope of form fields, there are applications for text hinting and autofill within contenteditable elements.

One existing example of a form field being replaced by contenteditable exists in the example section of its specification.

bmathwig avatar Mar 22 '23 21:03 bmathwig

cc @whatwg/forms

annevk avatar Mar 23 '23 07:03 annevk

Microsoft is interested in implementing this in Edge and Chromium :)

bmathwig avatar Mar 23 '23 21:03 bmathwig

cc @DimiDL @galich @masayuki-nakano

zcorpan avatar Mar 27 '23 15:03 zcorpan

Autocomplete works differently on different form controls, see https://html.spec.whatwg.org/#inappropriate-for-the-control

Editing hosts don't have a way to signal what kind of input is accepted (e.g. single line vs multiline). Which "groups" should editing hosts be part of? All of them?

zcorpan avatar Mar 28 '23 09:03 zcorpan

WebKit is also interested in this.

(For maximum clarity, Chromium and Edge count as a single implementer for WHATWG purposes.)

annevk avatar Mar 28 '23 15:03 annevk

Autocomplete works differently on different form controls, see https://html.spec.whatwg.org/#inappropriate-for-the-control

Editing hosts don't have a way to signal what kind of input is accepted (e.g. single line vs multiline). Which "groups" should editing hosts be part of? All of them?

I think contenteditable will always be a Text-Multiline host. I can't think of any cases where the other groups would apply. We also may want to expand the Field Name to include additional types of content in the future.

bmathwig avatar Mar 28 '23 16:03 bmathwig

Here is our proposal to adjust the wording of 4.10.18.7.1 Autofill to allow for editing host elements to be eligible for autofill and the autocomplete attribute.

https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md

bmathwig avatar Jul 12 '23 23:07 bmathwig

@annevk, @domenic, @mfreed7, @zcorpan: Curious to hear your thoughts on the proposal Ben shared above.

sanketj avatar Jul 20 '23 17:07 sanketj

@annevk, @domenic, @mfreed7, @zcorpan: Curious to hear your thoughts on the proposal Ben shared above.

Seems reasonable, but I'm not an autocomplete expert. I do worry about the leakage of sensitive information, if autocomplete can more easily be tricked into filling general <div> with PII. But as mentioned, that risk already exists with <input> so I'm not sure why this would be worse. @battre would have better input from Chrome's side.

mfreed7 avatar Aug 03 '23 23:08 mfreed7

What are the considerations around events? contenteditable involves quite a bit more events, do any need to be simulated when autofilling? The solution needs to address that somehow.

cc @johanneswilm

annevk avatar Aug 25 '23 10:08 annevk

@bmathwig What kind of input type are you thinking of using for the corresponding beforeinput and input events? And how does this fit with Microsoft's plans for replacing a lot of contenteditable usage with EditContext, which Microsoft is also working on?

In the examples mentioned in your proposal, where would the suggestions of autocompletion for the code editor come from? Is this one of the existing autocomplete types like address, etc.? Would it work in the middle of an element with other content preceding and following it or will it replace the entire content of the contenteditable element?

johanneswilm avatar Aug 25 '23 10:08 johanneswilm

@bmathwig Also, will the auto-complete text that is stored in the browser contain richtext itself? So if the user fills in their address in one place and uses <b> around the last name, will there be a <b>-tag around the last name when reinserting it somewhere else? If yes, how will that work if inserting into a different website where the editor uses <strong> instead of <b>? And how about code editors where styling is used differently from editor to editor?

johanneswilm avatar Aug 25 '23 10:08 johanneswilm

tl;dr: I have a couple of concerns about this proposal which basically boil down to the point that the proposal endorses the use of <div contenteditable> for something that's semantically a form control but does not feel and behave like a form control anymore. I would prefer if websites used form controls for forms and <div contenteditable> for editable content. Otherwise, I think that autofill may work worse than today.

Here are the details.

We observe that

  • the majority of <input> elements don't have autocomplete attributes,
  • the current autocomplete spec is not expressive enough for addresses in most countries.

Chrome compensates for these problems as well as possible by running heuristics in the browser and crowdsourcing. My concern is that if <div contenteditable> becomes the new <input> (either because libraries use it or because it's the new best practice you find on stackoverflow), we may lose the capability to classify fields.

  1. Semantic grouping: Today most form controls that belong together are semantically grouped via <form> tags. I expect that this would become less the case if people don't think in terms of forms but in terms of <div contenteditable>s because they cannot be associated with a <form> tag and look and feel like layout components, not like form components. With the loss of <form> tags our client-side heuristics would struggle to find the boundaries between semantically unrelated forms (search box, login form, sign-up form, shipping address form, chat box, ...) which can co-exist on a website. (Not a new problem but one that will become more pronounced).

  2. Loss of signals for heuristics: Developer documentation for <input> elements suggests assigning name attributes to fields and we see that developers do this a lot (even though they may submit the data via Fetch - after all the Internet is made of copy&paste from tutorials ;-)). This gives us semantic hints about the meaning of fields. With the loss of <form> tags and name attributes, Chrome would lose the capability to do meaningful crowdsourcing of field semantics (a "form" becomes harder to reference) and the heuristics would lose an important signal that helps assign meaning to a field. (Not a new problem but one that will become more pronounced).

  3. Form submission detection is hard if we don't have a <form> that's POSTed via a submit(). We have built many complex heuristics as proxies for candidates for form submission events, such as checking whether a <form> is taken out of the DOM or made invisible. This, again, would become more brittle if we didn't have <form>s. With the loss of <form> tags it would become increasingly difficult to see submissions, which we use to ask the user whether they want to save their password, credit card, address, ...

In summary, I believe that we are better off if fields that are semantically parts of a form remain form controls.

If <textarea> is not styleable enough, could we introduce <textarea richcontent> or something like it that remains a form control and is associated with a <form> but can have DOM children like a <div contenteditable>? That might be nice from the perspective of posting a form with Fetch and be in line with <selectlist>s, which make <form>s more powerful rather than pushing users to custom solutions built from <div>s.

All that said, @johanneswilm raises a lot of good questions that are also unclear to me and would pertain to such a <textarea richcontent>.

battre avatar Aug 25 '23 20:08 battre

tl;dr: I have a couple of concerns about this proposal which basically boil down to the point that the proposal endorses the use of <div contenteditable> for something that's semantically a form control but does not feel and behave like a form control anymore. I would prefer if websites used form controls for forms and <div contenteditable> for editable content. Otherwise, I think that autofill may work worse than today.

Here are the details.

We observe that

  • the majority of <input> elements don't have autocomplete attributes,
  • the current autocomplete spec is not expressive enough for addresses in most countries.

Chrome compensates for these problems as well as possible by running heuristics in the browser and crowdsourcing. My concern is that if <div contenteditable> becomes the new <input> (either because libraries use it or because it's the new best practice you find on stackoverflow), we may lose the capability to classify fields.

  1. Semantic grouping: Today most form controls that belong together are semantically grouped via <form> tags. I expect that this would become less the case if people don't think in terms of forms but in terms of <div contenteditable>s because they cannot be associated with a <form> tag and look and feel like layout components, not like form components. With the loss of <form> tags our client-side heuristics would struggle to find the boundaries between semantically unrelated forms (search box, login form, sign-up form, shipping address form, chat box, ...) which can co-exist on a website. (Not a new problem but one that will become more pronounced).
  2. Loss of signals for heuristics: Developer documentation for <input> elements suggests assigning name attributes to fields and we see that developers do this a lot (even though they may submit the data via Fetch - after all the Internet is made of copy&paste from tutorials ;-)). This gives us semantic hints about the meaning of fields. With the loss of <form> tags and name attributes, Chrome would lose the capability to do meaningful crowdsourcing of field semantics (a "form" becomes harder to reference) and the heuristics would lose an important signal that helps assign meaning to a field. (Not a new problem but one that will become more pronounced).
  3. Form submission detection is hard if we don't have a <form> that's POSTed via a submit(). We have built many complex heuristics as proxies for candidates for form submission events, such as checking whether a <form> is taken out of the DOM or made invisible. This, again, would become more brittle if we didn't have <form>s. With the loss of <form> tags it would become increasingly difficult to see submissions, which we use to ask the user whether they want to save their password, credit card, address, ...

In summary, I believe that we are better off if fields that are semantically parts of a form remain form controls.

If <textarea> is not styleable enough, could we introduce <textarea richcontent> or something like it that remains a form control and is associated with a <form> but can have DOM children like a <div contenteditable>? That might be nice from the perspective of posting a form with Fetch and be in line with <selectlist>s, which make <form>s more powerful rather than pushing users to custom solutions built from <div>s.

All that said, @johanneswilm raises a lot of good questions that are also unclear to me and would pertain to such a <textarea richcontent>.

Thanks @battre! The intent of this proposal is not to make contenteditable elements targets for form fill (although it technically already can be today - see below). Rather, it is to extend the scope of the autocomplete attribute beyond just form autofill scenarios. For editable regions, the use cases for autocomplete are mainly for writing assistance to allow the user to write faster, not necessarily for filling forms.

There are a couple of subtleties in terms of interactions with form elements:

  • For text control elements (ex. textarea), UAs may use autocomplete for both writing assistance and form fill use cases.
  • It is technically possible for a contenteditable to be form associated if it is part of a form-associated custom element (https://html.spec.whatwg.org/multipage/forms.html#categories). For such cases, similar to other text control elements, UAs may use autocomplete on contenteditables to signal both writing assistance and form fill.

An alternative solution could be to create a new attribute to support "autocomplete for writing assistance" scenarios. However, since these are also autocompletion scenarios, it would be ideal to just reuse the existing autocomplete attribute.

sanketj avatar Aug 29 '23 07:08 sanketj

What kind of input type are you thinking of using for the corresponding beforeinput and input events?

Autocompletion on input elements fires these events today. I would expect them to fire similarly for contenteditables.

In the examples mentioned in your proposal, where would the suggestions of autocompletion for the code editor come from? Is this one of the existing autocomplete types like address, etc.? Would it work in the middle of an element with other content preceding and following it or will it replace the entire content of the contenteditable element?

For existing autocomplete scenarios, the spec doesn't prescribe where the autocompletion comes from. Neither does it specify whether the autofilled content should replace existing content or just be inserted. This is left up to the UA. For writing assistance scenarios, it also seems reasonable to leave this up to the UA.

Also, will the auto-complete text that is stored in the browser contain richtext itself? So if the user fills in their address in one place and uses <b> around the last name, will there be a <b>-tag around the last name when reinserting it somewhere else? If yes, how will that work if inserting into a different website where the editor uses <strong> instead of <b>? And how about code editors where styling is used differently from editor to editor?

Yes, preserving rich text does seem quite tricky to get right and unclear how useful it would be. Do you have scenarios in mind where this might be desirable? Storing and inserting the autocomplete text as plain text seems sufficient.

sanketj avatar Aug 29 '23 07:08 sanketj

For existing autocomplete scenarios, the spec doesn't prescribe where the autocompletion comes from. Neither does it specify whether the autofilled content should replace existing content or just be inserted. This is left up to the UA. For writing assistance scenarios, it also seems reasonable to leave this up to the UA.

In that case, I would think it's a bad idea to try to add this to contenteditable. Contenteditable elements are generally controlled by thousands of lines of JavaScript that try to ensure that similar markup is produced across all platforms and browsers. Firefox was the last browser to remove some major elements that worked differently from the other browsers (table controls). Introducing new issues that work differently doesn't seem like a good idea.

That's different for input[type=text] and textarea elements as they produce simple text. Even if the same text is first edited in one UA and then another, there is generally no problem if the UAs behave somewhat differently (with the exception of line endings in some scenarios, but those can be fixed with a single line of JavaScript).
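To make the "single line of JavaScript" point concrete, here is a minimal sketch of such a line-ending fix for plain text. The function name `normalizeLineEndings` is made up for this example and is not part of any editor or spec.

```javascript
// Hypothetical helper: convert Windows (\r\n) and lone \r line endings to \n.
function normalizeLineEndings(text) {
  return text.replace(/\r\n?/g, "\n");
}

console.log(JSON.stringify(normalizeLineEndings("a\r\nb\rc\nd"))); // "a\nb\nc\nd"
```

Nothing comparable exists for contenteditable output, since there the difference between UAs would be in markup rather than in a single character sequence.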

JS editors being such large programs also means they have plugins that provide auto-completion [1] in ways that are highly specific to a given type of content. It seems like it would be difficult to create a one-size-fits-all model that is UA specific to replace all of these.

[1] For example https://ckeditor.com/cke4/addon/autocomplete or https://github.com/curvenote/editor/tree/main/packages/prosemirror-autocomplete

johanneswilm avatar Aug 29 '23 07:08 johanneswilm

What are the considerations around events? contenteditable involves quite a bit more events, do any need to be simulated when autofilling? The solution needs to address that somehow.

What categories of events are you referring to? It seems like eventing for autocomplete should work similar to the user just replacing/inserting that content via manual input. Events for input, composition, etc. are already fired on text control elements like this, I would expect those events to also work the same way for contenteditables.

sanketj avatar Aug 29 '23 07:08 sanketj

For existing autocomplete scenarios, the spec doesn't prescribe where the autocompletion comes from. Neither does it specify whether the autofilled content should replace existing content or just be inserted. This is left up to the UA. For writing assistance scenarios, it also seems reasonable to leave this up to the UA.

In that case, I would think it's a bad idea to try to add this to contenteditable. Contenteditable elements are generally controlled by thousands of lines of JavaScript that try to ensure that similar markup is produced across all platforms and browsers. Firefox was the last browser to remove some major elements that worked differently from the other browsers (table controls). Introducing new issues that work differently doesn't seem like a good idea.

That's different for input[type=text] and textarea elements as they produce simple text. Even if the same text is first edited in one UA and then another, there is generally no problem if the UAs behave somewhat differently (with the exception of line endings in some scenarios, but those can be fixed with a single line of JavaScript).

JS editors being such large programs also means they have plugins that provide auto-completion [1] in ways that are highly specific to a given type of content. It seems like it would be difficult to create a one-size-fits-all model that is UA specific to replace all of these.

[1] For example https://ckeditor.com/cke4/addon/autocomplete or https://github.com/curvenote/editor/tree/main/packages/prosemirror-autocomplete

I would expect autocomplete to only support plain text content, perhaps that can be made explicit in the spec. Thus, this would work similar to the user manually replacing/inserting that same content via text input methods (ex. typing, composition), which wouldn't be site breaking.

sanketj avatar Aug 29 '23 08:08 sanketj

Yes, preserving rich text does seem quite tricky to get right and unclear how useful it would be. Do you have scenarios in mind where this might be desirable?

Looking at the kind of autocomplete existing richtext editors based on contenteditable do, they would for example put tags around a specific term that was inserted through auto-completion to give it a different color or style. The code editor mentioned in your explainer [1] would likely need to do that if it is supposed to work like other web-based code editors.

I would expect autocomplete to only support plain text content, perhaps that can be made explicit in the spec.

Ok that would remove one potential issue.

But if I understand you correctly, it would be up to the UA whether to replace the entire contents or just add something new, right? So the code editor in your example could work in Safari on Mac in a way where it would just suggest an entire code snippet to the user and then replace everything else in there, whereas in Edge on Windows it may give suggestions for specific terms to be used within the code editor? If that is the case, who would opt for using this feature that only sometimes works as required for some users rather than use one of the existing JavaScript code editors with an existing auto-complete plugin that works all the time and everywhere?

I'm thinking maybe the usecase for this is something else, such as an address input on a simple form field where the user wants to use a contenteditable element instead of a textarea for some reason - maybe because there are situations where there could be richtext in the address? That then carries with it the issues mentioned by @battre above.

[1] https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md

johanneswilm avatar Aug 29 '23 08:08 johanneswilm

But if I understand you correctly, it would be up to the UA whether to replace the entire contents or just add something new, right? So the code editor in your example could work in Safari on Mac in a way where it would just suggest an entire code snippet to the user and then replace everything else in there, whereas in Edge on Windows it may give suggestions for specific terms to be used within the code editor? If that is the case, who would opt for using this feature that only sometimes works as required for some users rather than use one of the existing JavaScript code editors with an existing auto-complete plugin that works all the time and everywhere?

I'm thinking maybe the usecase for this is something else, such as an address input on a simple form field where the user wants to use a contenteditable element instead of a textarea for some reason - maybe because there are situations where there could be richtext in the address? That then carries with it the issues mentioned by @battre above.

[1] https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md

The scenarios for autocomplete on contenteditable are primarily about writing assistance, not form fill. I've updated the use cases section in the explainer, hopefully that helps. For the use cases I can imagine, they all seem to be about inserting content where the user is typing (replacing the user's selected text if necessary). My reasoning for leaving the decision about how to insert content into the DOM up to the UA is that autocomplete is about browser-powered functionality and it is unclear what browsers might come up with in the future. The intent is that regardless of what types of writing assistance UAs add, authors should be able to control it with the autocomplete attribute. The existing spec also does not prescribe how exactly autofill should insert content into the DOM, it just mentions that UAs must act "as if the user had modified the control's data". This allows a wide range of use cases to be supported.
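As a rough illustration of "inserting content where the user is typing (replacing the user's selected text if necessary)", here is a minimal sketch over a plain-text model. The state shape and the `applySuggestion` name are assumptions made for this example; a real UA would operate on the DOM and is free to do this differently.

```javascript
// Hypothetical plain-text editing state: the text plus a selection given as
// character offsets. A collapsed selection has selectionStart === selectionEnd.
function applySuggestion(state, suggestion) {
  const { text, selectionStart, selectionEnd } = state;
  const before = text.slice(0, selectionStart);
  const after = text.slice(selectionEnd);
  const caret = selectionStart + suggestion.length;
  // The suggestion replaces the selected range, or is inserted at the caret
  // when the selection is collapsed; the caret ends up after the new text.
  return { text: before + suggestion + after, selectionStart: caret, selectionEnd: caret };
}

const typed = { text: "Thanks for the ", selectionStart: 15, selectionEnd: 15 };
console.log(applySuggestion(typed, "update").text); // Thanks for the update
```

Content outside the selection is left untouched, which is the same outcome as the user typing the suggested text by hand.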

sanketj avatar Aug 31 '23 17:08 sanketj

The scenarios for autocomplete on contenteditable are primarily about writing assistance, not form fill.

Ok, that makes more sense. So it looks like you are planning for a scenario where a UA or an operating system is providing something like text completion using a large language model (LLM) either by directly completing the text or by figuring out that this would be an appropriate place to insert the user's phone number, credit card number or similar and then offering that as something to easily fill in.

I can see how it would make sense to signal to either the UA or browser extensions (like Grammarly and similar new AI-based offerings) that this is a field where such assistance would be desired or where it should not be offered. This kind of assistance is qualitatively different from spell-checking, so you need to have two distinct keywords.

I wonder then, given that this usage is quite different from the form-filling help that the autocomplete attribute offers, whether it would not make sense to use a different term to make it less confusing. Maybe something like textcompletion?

I also think it should be made clear which input type (for the beforeinput and input events) will be used for this. There is one called insertReplacementText, and its usage is described as "replace existing text by means of a spell checker, auto-correct or similar". I can see from the name of it that it was initially meant to be used for spell- and grammar checkers, but it would still seem like the most appropriate one. Else, maybe we need to add another type to the chart.

Under these circumstances, would it not also make sense to add this to EditContext in parallel? I have seen your notes on that, but the use cases you list, like a Facebook editor, already use a sophisticated and highly complex contenteditable-based editor that will also possibly be replaced by EditContext once shipped.

johanneswilm avatar Aug 31 '23 18:08 johanneswilm

From the explainer:

Many sophisticated editors that could benefit from the EditContext API also integrate their own writing assistance features and thus may opt out of browser-powered autocompletion (ex. Google Docs, Word Online). Therefore, it is unclear whether supporting the autocomplete attribute on EditContext editable hosts will be useful.

Most production level JS richtext editors on the web are quite sophisticated and will consist of thousands of lines of code and have 5-20 years of development behind them. However, a lot of these can be run completely in JavaScript (open source libraries such as CKEditor, ProseMirror, TinyMCE, etc.). And most of the more robust ones already do what EditContext promises in that they diff the DOM after browser-initiated DOM changes and then potentially roll back some of those. Switching to EditContext will in many such cases mean a simplification of the code as one can skip diffing and rolling DOM changes back. So if and when EditContext actually ships everywhere, I would think that a lot of these libraries will eventually switch to it.

However, hosting an LLM on a server is a bit more complicated than serving a JS-based editor on a website. I would therefore think it makes a lot of sense to add both spell checking and this new feature also to EditContext. That would also be consistent with other decisions you made, such as adding the option of using the native selection to EditContext even though Google Docs and other larger online word processors don't make use of it, precisely because it is also meant to be useful for smaller sites.

johanneswilm avatar Aug 31 '23 19:08 johanneswilm

I have proposed to add this to the agenda of the Web Editing Working Group at TPAC.

johanneswilm avatar Sep 01 '23 14:09 johanneswilm

There are two parts to this proposal, and one is less documented than the other. The first is a proposal to change autocomplete to be a global attribute that can be used on any element type. I understand that part. The second, which is not documented in the explainer (unless I missed it?), is about what values the autocomplete attribute may have when it is used on a non-form field. It sounds like the intention for now is simply to allow autocomplete=off to disable UA behaviors. Is that correct? Or are you proposing to allow all of the existing autocomplete values (e.g. autocomplete="street-address")? Or are you proposing new values entirely (e.g. autocomplete=suggestions)?

mfreed7 avatar Sep 01 '23 17:09 mfreed7

I wonder then, given that this usage is quite different from the form-filling help that the autocomplete attribute offers, whether it would not make sense to use a different term to make it less confusing. Maybe something like textcompletion?

This is one of the alternative solutions mentioned in the explainer: https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md#text-prediction-attribute. The main downside with something like textprediction or textcompletion is that it may not cover future use cases. Ex (completely hypothetical): A browser may decide to ship an in-built meme generator in the future. In this case, the autofill suggestions would be images instead of text. In fact, even the input could be more than just text as well, if the UA wanted to allow the user's own images to be turned into memes. This proposal should be easily extensible to cover such future scenarios, and we wouldn't want to create a new attribute each time.

The autocomplete attribute feels like a good fit because it is already used today for "UA-driven autofill". In addition, there is an established pattern with details tokens for authors to hint to the UA on the type of autofill that is desired. So, if needed in the future, it would be easy to support something like autocomplete='text-suggestions' or autocomplete='image-suggestions'.

sanketj avatar Sep 05 '23 06:09 sanketj

I also think it should be made clear which input type (for the beforeinput and input events) will be used for this. There is one called insertReplacementText, and its usage is described as "replace existing text by means of a spell checker, auto-correct or similar". I can see from the name of it that it was initially meant to be used for spell- and grammar checkers, but it would still seem like the most appropriate one. Else, maybe we need to add another type to the chart.

Yes, I agree. Perhaps the insertText input type[1] should be used if it is purely an insertion scenario, such as these use cases, and insertReplacementText should be used if any content (ex. the user's selection) is also being replaced?

[1] https://www.w3.org/TR/input-events-1/#interface-InputEvent-Attributes
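A minimal sketch of that split, assuming the UA knows the selection offsets at the time the suggestion is accepted. The function name `inputTypeForSuggestion` is hypothetical, and the mapping is only what is being floated in this comment, not something any spec defines today.

```javascript
// Hypothetical mapping from selection state to the inputType reported on the
// beforeinput/input events when a suggestion is accepted.
function inputTypeForSuggestion(selectionStart, selectionEnd) {
  // A non-collapsed selection means existing content is being replaced.
  const replacesContent = selectionStart !== selectionEnd;
  return replacesContent ? "insertReplacementText" : "insertText";
}

console.log(inputTypeForSuggestion(5, 5)); // insertText
console.log(inputTypeForSuggestion(5, 9)); // insertReplacementText
```

An editor listening for beforeinput could then branch on event.inputType exactly as it already does for typing and spell-check replacements.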

sanketj avatar Sep 05 '23 06:09 sanketj

From the explainer:

Many sophisticated editors that could benefit from the EditContext API also integrate their own writing assistance features and thus may opt out of browser-powered autocompletion (ex. Google Docs, Word Online). Therefore, it is unclear whether supporting the autocomplete attribute on EditContext editable hosts will be useful.

Most production level JS richtext editors on the web are quite sophisticated and will consist of thousands of lines of code and have 5-20 years of development behind them. However, a lot of these can be run completely in JavaScript (open source libraries such as CKEditor, ProseMirror, TinyMCE, etc.). And most of the more robust ones already do what EditContext promises in that they diff the DOM after browser-initiated DOM changes and then potentially roll back some of those. Switching to EditContext will in many such cases mean a simplification of the code as one can skip diffing and rolling DOM changes back. So if and when EditContext actually ships everywhere, I would think that a lot of these libraries will eventually switch to it.

However, hosting an LLM on a server is a bit more complicated than serving a JS-based editor on a website. I would therefore think it makes a lot of sense to add both spell checking and this new feature also to EditContext. That would also be consistent with other decisions you made, such as adding the option of using the native selection to EditContext even though Google Docs and other larger online word processors don't make use of it, precisely because it is also meant to be useful for smaller sites.

This is good feedback, thanks. I do agree that spellcheck, and with this proposal autocomplete, are potential gaps in the EditContext API. I'll follow up and share additional details on our thinking here. cc: @dandclark @alexkeng

sanketj avatar Sep 05 '23 06:09 sanketj

The second, which is not documented in the explainer (unless I missed it?), is about what values the autocomplete attribute may have when it is used on a non-form field. It sounds like the intention for now is simply to allow autocomplete=off to disable UA behaviors. Is that correct? Or are you proposing to allow all of the existing autocomplete values (e.g. autocomplete="street-address")?

UAs should continue to respect autofill details tokens even in non-form fill scenarios since these are ways for authors to filter to the type of autofill that is desired. So if the author used autocomplete='street-address', the UA should only provide address suggestions. Similarly, as variations of this example, the UA would suggest phone numbers only if the author set autocomplete='tel' or emails only if the author set autocomplete='email'.
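To illustrate the filtering idea, here is a small sketch. The candidate list, its shape, and the `suggestionsFor` name are all invented for this example; only the token names (street-address, tel, email, off) come from the existing autocomplete attribute.

```javascript
// Hypothetical candidates a UA might hold for the user; `kind` reuses
// existing autofill field names.
const candidates = [
  { kind: "street-address", value: "1 Example Street" },
  { kind: "tel", value: "+1 555 0100" },
  { kind: "email", value: "user@example.com" },
];

// Return only the suggestions matching the author's token; "off" suppresses
// all suggestions.
function suggestionsFor(autocompleteToken, available) {
  if (autocompleteToken === "off") return [];
  return available.filter((c) => c.kind === autocompleteToken).map((c) => c.value);
}

console.log(suggestionsFor("tel", candidates)); // [ '+1 555 0100' ]
```

How a UA would treat a missing or unrecognized token on a contenteditable is left open here, as it is in the discussion.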

Or are you proposing new values entirely (e.g. autocomplete=suggestions)?

New values are not currently being proposed, but that would be how I see this evolving. (see this related comment). Since writing assistance scenarios like text predictions are not supported with form fill, we will need new tokens for authors to hint about these new types of autofill.

On the other hand, it might be good practice to add a new token whenever a browser introduces a new type of autofill, in which case perhaps a token like suggestions or text-suggestions should be introduced for these use cases. Curious to hear other perspectives on this.

sanketj avatar Sep 05 '23 07:09 sanketj

The main downside with something like textprediction or textcompletion is that it may not cover future use cases. Ex (completely hypothetical): A browser may decide to ship an in-built meme generator in the future. In this case, the autofill suggestions would be images instead of text. In fact, even the input could be more than just text as well, if the UA wanted to allow the user's own images to be turned into memes. This proposal should be easily extensible to cover such future scenarios, and we wouldn't want to create a new attribute each time.

The issue you are having here is with terms that include the term "text", right? How about picking a term that does not include "text"?

So this will initially be plaintext and then in the future could also produce other markup. Images will be added inline then? Or how do you communicate to the editor that there is an image? If it can contain media, then maybe a way to do it would be to make it a type of paste with a DataTransfer object.

The autocomplete attribute feels like a good fit because it is already used today for "UA-driven autofill". In addition, there is an established pattern with details tokens for authors to hint to the UA on the type of autofill that is desired. So, if needed in the future, it would be easy to support something like autocomplete='text-suggestions' or autocomplete='image-suggestions'.

Maybe I don't fully understand, but it sounds to me like a very different feature than what autocomplete is today, such as:

Current autocomplete:

  • Is tied to form input.
  • Only works for simple input (plaintext/select).
  • Will replace the entire value of the element.

The autocomplete function mentioned in this proposal:

  • Is not related to form input.
  • Is meant to be used for complex richtext content.
  • Will replace parts of or add to the already existing value/contents of the element.

Is that correctly understood? So while you may want to specify details on where the input comes from, none of the existing detail tokens will be useful for what you are trying to achieve, correct? And how will you specify, given the current syntax, that the autocomplete is to provide both image and text suggestions? And maybe you additionally need to specify that this contenteditable field is to use "casual college student style" (or some such thing) to give the LLM a better idea about what kind of text it is to produce?

Perhaps the insertText input type[1] should be used if it is purely an insertion scenario, such as these use cases, and insertReplacementText should be used if any content (ex. the user's selection) is also being replaced?

Don't worry about the names of the types. The important part is the situations they are to be used in according to the specification. "insertText" is defined for use with "insert typed plain text". Given that this isn't text that is being typed, that does not seem like the right one. "insertReplacementText" is to be used for every type of text that originates from "a spell checker, auto-correct or similar". All the types can be used for both replacing existing content and for inserting entirely new content. The point of using the different types is to let the JS editor app know where the text comes from so that it can react differently to it.

For example, a school writing app may allow aid from LLMs, but requires the purely LLM-generated text to be marked in some way. College professors in some places are currently making such requirements, but it's very difficult for students given the current tools to keep track of the parts that are AI-generated. This sort of marking requirement might even become part of legislation in some places with new AI legislation.

That being said, when we wrote this, we had not anticipated that "a spell checker, auto-correct or similar" would create entirely new text without replacing something else, which is why the term "insertReplacementText" was chosen. Solutions for this could be to either add a new term for this kind of content (for example "insertFromGenerator" to also accommodate future options of inserting other types of content) or to simply change the description of "insertReplacementText" to clarify that it is also to be used when adding new text without replacing existing content.
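
A minimal sketch of how an editor could use the input type to track where text came from, for instance to mark generated passages as in the school writing app example. `Origin`, `MarkedRun`, `classifyInputType`, and `recordInsertion` are hypothetical names, and "insertFromGenerator" is only the name floated above; it is not part of the Input Events specification.

```typescript
type Origin = "typed" | "ua-assisted" | "other";

// "insertReplacementText" is a real input type.
// "insertFromGenerator" is NOT: it is the hypothetical name suggested in this thread.
const ASSISTED_TYPES: readonly string[] = [
  "insertReplacementText",
  "insertFromGenerator",
];

// Map an InputEvent.inputType string to where the text originated.
function classifyInputType(inputType: string): Origin {
  if (inputType === "insertText") {
    return "typed";
  }
  if (ASSISTED_TYPES.indexOf(inputType) !== -1) {
    return "ua-assisted";
  }
  return "other";
}

interface MarkedRun {
  text: string;
  origin: Origin;
}

// Append an inserted run together with its origin, so the app can later
// highlight or report the parts that were not typed by the user.
function recordInsertion(
  runs: MarkedRun[],
  inputType: string,
  text: string,
): MarkedRun[] {
  return [...runs, { text, origin: classifyInputType(inputType) }];
}

// In a browser, this would typically be driven from a beforeinput listener:
//   editor.addEventListener("beforeinput", (e) => {
//     runs = recordInsertion(runs, e.inputType, e.data ?? "");
//   });
```

Whichever name is chosen, the editor only needs the type to be distinct from "insertText" to tell typed text apart from UA-provided text.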

[1] https://www.w3.org/TR/input-events-1/#interface-InputEvent-Attributes

johanneswilm avatar Sep 05 '23 08:09 johanneswilm

Maybe I don't fully understand, but it sounds to me like a very different feature than what autocomplete is today, such as:

Current autocomplete:

  • Is tied to form input.
  • Only works for simple input (plaintext/select).
  • Will replace the entire value of the element.

The autocomplete function mentioned in this proposal:

  • Is not related to form input.
  • Is meant to be used for complex richtext content.
  • Will replace parts of or add to the already existing value/contents of the element.

Is that correctly understood? So while you may want to specify details on where the input comes from, none of the existing detail tokens will be useful for what you are trying to achieve, correct? And how will you specify, given the current syntax, that the autocomplete is to provide both image and text suggestions? And maybe you additionally need to specify that this contenteditable field is to use "casual college student style" (or some such thing) to give the LLM a better idea about what kind of text it is to produce?

I agree that these differences are substantial and autocomplete would work quite differently in a form control than in an editable region. The main appeal to use autocomplete is that there is an existing mechanism for details tokens that could be re-used, and the name fits reasonably well. That said, I see limited precedent for re-using attributes in this way, so I'm open to introducing a new attribute instead if we believe that's better. Curious to hear other perspectives on this.

sanketj avatar Sep 07 '23 23:09 sanketj