Proposal: HTML passwordrules attribute
Motivation
Some user agents offer to generate random per-site passwords on behalf of the user. Safari has built-in support for this, and add-on password managers such as 1Password add this functionality. This feature improves user security by guaranteeing high-entropy passwords and avoiding reuse of the same password on multiple sites.
One challenge with this approach is that sites have different rules for valid passwords. Many sites require characters from specific sets to be present, or have other constraints. The best known solution is to have a generator rule that matches the password requirements of many sites, plus a curated list of per-site quirks for sites with unusual requirements.
A better solution would be for the website to express its password requirements in machine-readable form, and in a format that is suited for use with a generation algorithm. While the pattern attribute allows expressing many value constraints, it's very hard to use it to drive a generator. It's also tricky to express many popular password constraints (such as a limit on the number of consecutive repeated characters) in a regexp.
Proposed Solution
We propose a new content attribute on the HTML input element called passwordrules and define a mini syntax for web authors to use to express their requirements (rules). Below, we describe how a user agent will make use of these rules and the minimum requirements for the user agent to honor them.
Extensions to HTML
We propose the following new content attribute be added to the HTML input element:
passwordrules
Using the passwordrules attribute
The passwordrules attribute, when specified, describes the set of extra restrictions on the value of the element's value attribute that a user agent must consider when generating a password and performing client-side form validation. Its value is a semicolon-delimited string of one or more property/value pairs and has the form:
required: (<identifier> | <character-class>), ..., (<identifier> | <character-class>); allowed: (<identifier> | <character-class>), ..., (<identifier> | <character-class>); max-consecutive: <non-negative-integer>
An <identifier> must case-insensitively match one of the following strings: upper, lower, digit, special, ascii-printable, and unicode. These identifiers correspond to the set of ASCII uppercase letters (A-Z), lowercase letters (a-z), digits (0-9), all other ASCII printable characters - including the space character - (-~!@#$%^&*_+=`|(){}[:;"'<>,.? ]), all ASCII printable characters, and all Unicode characters, respectively.
A <character-class> is a custom character class.
A <non-negative-integer> is a valid non-negative integer.
The missing value default for passwordrules is allowed: ascii-printable. There is no invalid value default.
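As a rough illustration of the syntax above, here is a minimal Python sketch of a parser. It is not part of the proposal: the function names, the shape of the returned dictionary, and the choice to silently skip a '-' that is not in first position are all assumptions made for this example. It keeps each required property as its own entry, merges allowed properties, takes the minimum of all max-consecutive values, and treats a ']' placed directly before the closing bracket as a literal.

```python
def parse_character_class(text, start):
    """Parse a custom class starting at text[start] == '['.

    Returns (characters, index just past the closing bracket).
    """
    chars = set()
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "]":
            # A ']' directly before the closing ']' is a literal ']'.
            if i + 1 < len(text) and text[i + 1] == "]":
                chars.add("]")
                return frozenset(chars), i + 2
            return frozenset(chars), i + 1
        if ch == "-" and i != start + 1:
            pass  # assumption: a reserved '-' outside first position is skipped
        elif " " <= ch <= "~":
            chars.add(ch)  # keep ASCII printable characters only
        i += 1
    raise ValueError("unterminated character class")


def parse_passwordrules(value):
    """Return a dict with 'required', 'allowed' and 'max-consecutive' keys."""
    rules = {"required": [], "allowed": [], "max-consecutive": None}
    i = 0
    while i < len(value):
        colon = value.find(":", i)
        if colon == -1:
            break
        name = value[i:colon].strip().lower()
        i = colon + 1
        items, token = [], ""
        while i < len(value) and value[i] != ";":
            if value[i] == "[":
                chars, i = parse_character_class(value, i)
                if chars:  # empty character classes are ignored
                    items.append(chars)
                continue
            if value[i] == ",":
                if token.strip():
                    items.append(token.strip().lower())
                token = ""
            else:
                token += value[i]
            i += 1
        if token.strip():
            items.append(token.strip().lower())
        i += 1  # step over the ';'
        items = list(dict.fromkeys(items))  # drop duplicate values, keep order
        if not items:
            continue  # properties without a value are ignored
        if name == "required":
            rules["required"].append(items)  # one entry per required property
        elif name == "allowed":
            rules["allowed"] += [x for x in items if x not in rules["allowed"]]
        elif (name == "max-consecutive" and isinstance(items[0], str)
              and items[0].isascii() and items[0].isdigit()):
            n = int(items[0])
            current = rules["max-consecutive"]
            rules["max-consecutive"] = n if current is None else min(current, n)
    return rules
```

Identifiers are returned as lowercase strings and custom classes as frozensets, so a caller can tell the two apart before expanding identifiers into character sets.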
The values of multiple ~~required/~~allowed properties are concatenated together and multiple max-consecutive properties behave as if a single max-consecutive property was specified whose value is the minimum of all max-consecutive properties. Duplicate property values are ignored. Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes. Empty character classes are ignored. Properties without a value are ignored. The following examples illustrate the aforementioned equivalences:
~~required: upper; required: lower <=> required: upper, lower~~
allowed: upper; allowed: lower <=> allowed: upper, lower
max-consecutive: 4; max-consecutive: 2 <=> max-consecutive: 2
required: upper, lower, upper <=> required: upper, lower
required: [abc], [def] <=> required: [abcdef]
allowed: upper, [] <=> allowed: upper
required: ; allowed: upper <=> allowed: upper
NOTE: The expression required: upper; required: lower is NOT equivalent to required: upper, lower. See Requiring that a password contain certain characters.
If you do not specify the max-consecutive property then it defaults to being unbounded. That is, the user agent can generate a password with one or more arbitrary length runs of the same character (e.g. ooops).
If you specify the required property and do not specify the allowed property then the user agent will infer the value of the allowed property according to the rules in How a user agent determines the allowed characters.
For example, to require a password have at least 8 characters consisting of a mix of uppercase and lowercase letters, at least one number, and at most two consecutive identical characters, add this to your markup:
<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; max-consecutive: 2">
To require at least one digit or one of -().&@?'#,/"+ (but not necessarily both), add this to your markup:
<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
Or to require at least one of -().&@?'#,/"+, add this to your markup:
<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; required: [-().&@?'#,/&quot;+]; max-consecutive: 2">
Alternatively, to optionally allow one of -().&@?'#,/"+, add this to your markup:
<input type="password" minlength="8" passwordrules="required: upper; required: lower; required: digit; allowed: [-().&@?'#,/&quot;+]; max-consecutive: 2">
Another example, to allow a password to contain an arbitrary mix of letters, numbers, and -().&@?'#,/"+, add this to your markup:
<input type="password" minlength="8" passwordrules="allowed: upper, lower, digit, [-().&@?'#,/&quot;+]">
WARNING: With the exception of the NOTE below, each property/value pair reduces the entropy of a user agent generated password and makes the password more likely to be guessed or brute-forced. The more characters that are required the more likely the user agent generated password can be guessed or brute-forced.
NOTE: Setting the passwordrules attribute to allowed: unicode provides the most entropy for a user agent generated password. Omitting the passwordrules attribute or setting it to the empty string provides the second most entropy for a user agent generated password.
Custom character classes
A custom character class is a list of ASCII characters that are surrounded by square brackets (e.g. [abc]). Any non-ASCII printable characters in the set are ignored. The dash character (-) is reserved as a special character. To list '-' as a literal character it must appear immediately after the opening square bracket '['. The right square bracket (]) is also reserved as a special character. To list ']' as a literal character it must appear immediately before the closing square bracket ']'.
Specifying the characters allowed to be in a password
The value of the allowed property is a comma-separated list of character class identifiers or custom character classes, or both. Each custom character class represents a set of characters that are allowed to be in the generated password. For example, if the allowed property is set to [*]] then the generated password is allowed to contain ']' and '*', but it is not allowed to contain '[' among other non-listed characters. If the allowed property is set to digit, [@!] then the generated password is allowed to contain one or more ASCII digits, one or more '@'s and one or more '!'s, but it is not allowed to contain '[' among other non-listed characters.
Requiring that a password contain certain characters
~~You can require that a password contain certain characters or classes of characters by setting the value of the required property to a comma-separated list of character class identifiers or custom character classes, or both. For example, if the required property is set to upper, digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter and at least one digit. If required is set to upper, [@!] then the user agent MUST generate a password that contains at least one ASCII uppercase letter and either '@' or '!'.~~
A user agent must generate a password that contains at least one character from each required property. For example, if the passwordrules attribute is set to required: upper; required: digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter and at least one digit. If there is a single required property that is set to upper, digit then the user agent MUST generate a password that contains at least one ASCII uppercase letter or at least one digit. If there is a single required property that is set to upper, [@!] then the user agent MUST generate a password that contains at least one ASCII uppercase letter or '@' or '!'.
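The difference between several required properties and a single required property with several values can be made concrete with a small Python sketch. The helper names and the representation (identifiers as strings, custom classes as sets) are assumptions for illustration only; only the upper, lower and digit identifiers are handled here.

```python
import string

# Only three of the proposal's identifiers are modelled in this sketch.
CLASSES = {
    "upper": set(string.ascii_uppercase),
    "lower": set(string.ascii_lowercase),
    "digit": set(string.digits),
}


def expand(items):
    """Union of the characters covered by one required property's values."""
    chars = set()
    for item in items:
        # Identifiers are strings; custom character classes are sets.
        chars |= CLASSES[item] if isinstance(item, str) else set(item)
    return chars


def satisfies_required(password, required_properties):
    """True if the password has at least one character from EACH property."""
    return all(
        any(ch in expand(prop) for ch in password)
        for prop in required_properties
    )
```

With these helpers, `satisfies_required("abcD", [["upper"], ["digit"]])` is false because no digit is present, while `satisfies_required("abcD", [["upper", "digit"]])` is true because one character from the union is enough.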
Limiting the number of consecutive repeated characters
The value of max-consecutive is a non-negative integer that represents the maximum length of a run of consecutive identical characters that can be present in the generated password. For example, set max-consecutive to 2 to disallow a user agent from generating a password that contains a run of more than 2 of the same character (e.g. "ooops" - contains three consecutive o's).
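A check for this property could be as small as the following Python sketch; the function names are made up for this example.

```python
from itertools import groupby


def longest_run(password):
    """Length of the longest run of consecutive identical characters."""
    return max((len(list(group)) for _, group in groupby(password)), default=0)


def respects_max_consecutive(password, max_consecutive):
    """True if no run of identical characters exceeds max_consecutive."""
    return longest_run(password) <= max_consecutive
```

For the example above, `longest_run("ooops")` is 3, so the password fails a max-consecutive of 2, whereas "oops" passes.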
How a user agent determines the allowed characters
The set of required characters MUST always be a subset of the set of allowed characters. If the value of passwordrules violates this constraint then the user agent MUST adjust the value of allowed to satisfy it. The following implications immediately fall out from this constraint:
- If you specify the required property and do not specify the allowed property then the allowed property is inferred to be the value of the required property.
- If you set both the required property and the allowed property then the user agent behaves as if the allowed property were set to the union of the value of the allowed property and the value of the required property. For example, if the required property is set to lower and the allowed property is set to [abc0123] then the user agent MUST behave as if the allowed property were set to lower, [0123]. Another example, if the required property is set to lower and the allowed property is set to upper then the user agent MUST behave as if the allowed property were set to lower, upper.
- If neither the required property nor the allowed property are specified then the user agent behaves as if the allowed property was set to ascii-printable.
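The three cases above reduce to a set union plus a default. Here is a minimal Python sketch under the assumption that identifiers have already been expanded into character sets; the function name and the use of None for "not specified" are choices made for this example.

```python
import string

# U+0020 through U+007E: the 95 ASCII printable characters.
ASCII_PRINTABLE = {chr(code) for code in range(0x20, 0x7F)}


def effective_allowed(required, allowed=None):
    """Compute the allowed set a user agent would act on.

    required: list of character sets, one per required property.
    allowed:  character set, or None when no allowed property is given.
    """
    if not required and allowed is None:
        return set(ASCII_PRINTABLE)
    required_union = set().union(*required)
    return required_union | (set(allowed) if allowed is not None else set())
```

For instance, with required set to the lowercase letters and allowed set to the characters of [abc0123], the result is the lowercase letters plus 0, 1, 2 and 3, matching the first example in the list.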
How a user agent generates a password based on passwordrules
A user agent will generate a password using an algorithm or heuristic of its choice that respects the following attributes of a password element (not necessarily in order): minlength, maxlength, and passwordrules. If the set of constraints imposed by the aforementioned attributes fails to meet the following minimum restrictions then the constraints are considered nonconforming and the user agent is REQUIRED to ignore them:
- The maximum password length cannot be less than 12.
- Allowed characters must consist of at least two of the following character classes: ASCII uppercase letter, ASCII lowercase letters, digits.
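The two minimum restrictions could be checked along the following lines. This Python sketch makes two assumptions that the proposal does not spell out: an absent maxlength counts as "no maximum", and "consist of at least two of the following character classes" is read as the allowed set containing every character of at least two of those classes. The function name is hypothetical.

```python
import string

UPPER = set(string.ascii_uppercase)
LOWER = set(string.ascii_lowercase)
DIGIT = set(string.digits)


def rules_are_conforming(maxlength, allowed):
    """maxlength: int, or None if the attribute is absent.
    allowed: the effective set of allowed characters.
    """
    if maxlength is not None and maxlength < 12:
        return False
    # Count how many of the three classes are fully contained in allowed.
    covered = sum(1 for cls in (UPPER, LOWER, DIGIT) if cls <= set(allowed))
    return covered >= 2
```

Under this reading, a site that allows only digits and a few symbols would have its rules ignored, as would one with maxlength="8".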
Characters in the generated password MUST be expressed in Normalization Form C and must conform to the following UAX31 profile:
- Start := ID_Continue + Pattern_Syntax + Pattern_White_Space, plus all characters from Table 3, Table 3a, and Table 3b except Join_Control characters (i.e. ZWJ, ZWNJ).
- Continue := Start.
- Medial := None.
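The proposal leaves the generation algorithm to the user agent. Purely as an illustration, the Python sketch below shows one simple strategy: pick one character from each required set, fill the rest from the allowed set, shuffle, and retry if a run is too long. The function name, parameters, and the retry cap are assumptions of this example. The retry cap bounds the running time, and the resulting distribution is not uniform over all conforming passwords.

```python
import secrets
from itertools import groupby


def longest_run(text):
    """Length of the longest run of consecutive identical characters."""
    return max((len(list(group)) for _, group in groupby(text)), default=0)


def generate_password(length, allowed, required=(), max_consecutive=None,
                      max_attempts=1000):
    """Generate a password of `length` characters.

    allowed:  set of allowed characters.
    required: iterable of character sets, one per required property.
    """
    required = [sorted(chars) for chars in required]
    if len(required) > length:
        raise ValueError("more required properties than password characters")
    # Required characters are always allowed (see the subset rule above).
    pool = sorted(set(allowed).union(*required))
    rng = secrets.SystemRandom()  # OS-backed random source
    for _ in range(max_attempts):
        chars = [rng.choice(chars) for chars in required]
        chars += [rng.choice(pool) for _ in range(length - len(chars))]
        rng.shuffle(chars)
        candidate = "".join(chars)
        if max_consecutive is None or longest_run(candidate) <= max_consecutive:
            return candidate
    raise RuntimeError("no conforming password found within max_attempts")
```

Seeding the candidate with one character per required set avoids the case where naive random selection keeps missing a narrow required class; only the max-consecutive check can trigger a retry.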
Interaction with client-side form validation
It is not recommended to specify both the pattern attribute and the passwordrules attribute.
The passwordrules attribute participates in constraint validation. If the element's value attribute does not satisfy the criterion specified by the value of the passwordrules attribute then the element is in the "suffering from a passwordrules mismatch" validity state and the element is invalid for the purposes of constraint validation.
Confirmation password field
Some web pages have both a password field ("primary password field") and a confirmation password field. The passwordrules attribute needs only to appear on one of these fields. If both fields have the passwordrules attribute then you must ensure that they have the same value. Otherwise, the user agent will behave as if both fields have set their passwordrules attribute to the result of the union of both fields' required property (if any) and the intersection of both fields' allowed property (if any) after simplifying the passwordrules attribute of both fields according to the rules in Using the passwordrules attribute. For example, if a page contains the following markup:
<input type="password" name="password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
<input type="password" name="confirmation-password" minlength="8" passwordrules="required: upper; allowed: [!]; max-consecutive: 3">
Then the user agent must behave as if the markup was:
<input type="password" name="password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
<input type="password" name="confirmation-password" minlength="8" passwordrules="required: upper, lower, digit, [-().&@?'#,/&quot;+]; max-consecutive: 2">
CC @annevk @zcorpan @tkent-google @rniwa @hober @othermaciej
/cc @battre
possible refinement, since a perverse required could apparently shrink the space a lot: "the password constraints will be ignored if they would reduce the number of possible passwords below 2**60" or something like that -- otherwise I think there are some very low-entropy edge-cases that come about due to too many required elements effectively turning the generated password into a mere permutation of those elements
@bsittler Good catch! I agree that it's bad if perverse password rules limit the number of possibilities to an overly low number.
For an implementation requirement like "the password constraints will be ignored if they would reduce the number of possible passwords below 2**60", the standard would need to include an algorithm to calculate the number of possible passwords to enable interoperable behavior. If browsers calculated it slightly differently, it would be a significant interop problem.
We tried to evade the need for a full entropy calculation by having higher-level rules to ensure a wide enough range of passwords. Specifically, passwordrules must be ignored if the max length is too low, or the set of allowed characters is too small a range. You are right that excessive "required" directives could also overly limit the passwords. In the spirit of an easier-to-determine rule for rejecting overly restrictive "passwordrules", how about setting an upper limit on the number of "required" directives that may be present?
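For a sense of what such a calculation could involve, here is a rough Python sketch (not something the proposal defines) that counts the passwords of a given length containing at least one character from each required set, using inclusion-exclusion. It deliberately ignores max-consecutive and assumes a fixed length, so it only approximates the real search space; the function name and parameters are made up for this example.

```python
import math
import string
from itertools import combinations


def count_passwords(length, allowed, required):
    """Number of strings of `length` characters over `allowed` that contain
    at least one character from each set in `required`.
    """
    alphabet = set(allowed).union(*required)
    total = 0
    # Inclusion-exclusion over which required sets are entirely avoided.
    for k in range(len(required) + 1):
        for avoided in combinations(required, k):
            excluded = set().union(*avoided)
            remaining = len(alphabet) - len(excluded)
            total += (-1) ** k * remaining ** length
    return total


def entropy_bits(length, allowed, required):
    """log2 of the count, i.e. bits of entropy for a uniform choice."""
    return math.log2(count_passwords(length, allowed, required))
```

The number of terms doubles with each required property, which hints at why a simpler rule, such as a cap on the number of "required" directives, may be easier to specify interoperably.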
First of all, I really like this! Giving declarative credential generation more love is great.
My main worry here is the complexity of the attribute and requiring another custom parser for it. Can we consolidate that with something somehow? Perhaps just having more attributes or going full JSON?
Should we also integrate this with https://w3c.github.io/webappsec-credential-management/ somehow? I understand that has adoption due to WebAuthn so presumably it's something that'll stick around and we need to account for?
(The other thing we should include in the examples advocating this technique is autocomplete=current-password and autocomplete=new-password. This is only needed for the latter (and only for the first of its kind on a page, per OP).)
@othermaciej indeed, and I actually considered including such a wrinkle on my original comment but realized that some required values aren't particularly bad this way while others are, and evaluating them this way is a little problematic (approaching the complexity of overall entropy computation.) A very rough approximation might be: maxlength may be no less than 12 + the number of "trivial" required elements. To be considered trivial, a required element must permit no more than 35 possibilities in the printable ASCII range. correction: cutoff was supposed to be 31 - this means allowing punctuation as a non-trivial required element, which should satisfy lots of existing rules without undue penalty
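The arithmetic of this approximation can be sketched in a few lines of Python. The function name is hypothetical, class sizes are passed in by the caller, and the default cutoff of 31 follows the correction above.

```python
def minimum_maxlength(required_class_sizes, trivial_cutoff=31, base=12):
    """Smallest acceptable maxlength under the rough 'trivial' rule.

    required_class_sizes: number of characters each required element permits.
    """
    trivial = sum(1 for size in required_class_sizes if size <= trivial_cutoff)
    return base + trivial
```

For a site requiring one uppercase letter (26 choices), one lowercase letter (26) and one digit (10), all three elements count as trivial, so the rule would demand a maxlength of at least 15.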
Another question: is character class merging the intended behavior for required? It seems like it shouldn't be, but this suggests otherwise:
Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes
Otherwise required only ever has at most one character of influence
@bsittler I think what this proposal says is right for allowed but probably not for required. required: upper, lower should require at least one ASCII alphabetic character, while required: upper; required: lower should require at least one uppercase character and at least one lowercase. I am not sure what @dbates-wk 's intent was when writing this but I think that's how it should work. For allowed, multiple directives and a single directive with commas would be equivalent under any reasonable interpretation.
On the "trivial character class" rule, that makes sense to me as an approach, but the specific proposal would require a minimum length of 15 instead of 12 for passwords with the typical "must include at least one uppercase, at least one lowercase, at least one number" restriction. If in addition a special character is required, that would be a minimum length of 16. That seems excessive, as adequate entropy is possible for 12-character passwords with either of these common restrictions.
@annevk We care more about the capabilities than the syntax. That said:
- Multiple attributes is possible, but it would result in three attributes of which two have (similar) nontrivial syntax, so it would not avoid the need for an extra mini-parser.
- JSON seems like overkill.
- Credential Management is programmatic, while this is declarative (and that's part of the use case). So not clear how they could be integrated. I don't think the parts of Credential Management that aren't required for WebAuthN are likely to get wide traction.
I disagree with the fundamental premise of this. :( Restrictions on passwords beyond minimum length (and maybe a large maximum length) are all fundamentally bad, particularly restricted characters - such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database). Required characters are also generally a bad restriction - it's much better to simply increase the minimum length and let people use whatever characters they (or their pw generators) want.
Do we really want to be adding a feature whose primary use-case is making it easier for already-broken sites to continue being broken?
Restrictions on passwords are indeed bad. I agree it would be best if they went away. But it also seems unlikely they will go away any time soon.
Password generators are extremely good. About the safest thing anyone can do for their online security is to use a unique randomly generated password for each site.
If password generators can't work with the existing password restrictions of websites, then that leads to a bad user experience (user counts on generator, then the site rejects their password) and poor security (user makes up a weak or reused password on the spot). The current state of the art is to maintain a list of site-specific quirks to get the password generator to do its job right. Safari has a pretty extensive set. We'd like password generators (including ours) to be able to do a good job without needing a quirks list.
Thus, even though password restrictions are likely harmful on net (other than minimum length), the most practical harm reduction is for sites with restrictions to make it obvious and machine readable what those restrictions are.
@annevk:
Restrictions on passwords beyond minimum length (and maybe a large maximum length) are all fundamentally bad, particularly restricted characters - such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database). Required characters are also generally a bad restriction - it's much better to simply increase the minimum length and let people use whatever characters they (or their pw generators) want.
If the WHATWG decides to add a passwordrules attribute, the attribute’s specification could include an informative note stressing that password restrictions are Useless and Bad and that storing passwords as plain text is Very Bad. This news still has not percolated through to many IT organizations; any opportunity to forcefully communicate this to them is valuable. As long as password restrictions remain a common practice on the web, for better and for worse, the new attribute could be a good opportunity for the WHATWG to emphatically recommend that web developers not use password restrictions at all.
It seems like many, but not all, use cases in the OP can be covered by the existing pattern attribute. (For example, specifying allowed or disallowed characters.) Could we consider scoping this down to only the use cases that cannot be accomplished with today's technology?
Another question: is character class merging the intended behavior for required? It seems like it shouldn't be, but this suggests otherwise:
Specifying multiple character classes is equivalent to specifying one character class that represents the union of the characters in all character classes
Otherwise required only ever has at most one character of influence
You're right! I updated my proposal to remove this sentence (indicated by a strikethrough).
If the WHATWG decides to add a passwordrules attribute, the attribute’s specification could include an informative note stressing that password restrictions are Useless
I take it you feel that the WARNING paragraph in the proposal is not sufficient?
It seems like many, but not all, use cases in the OP can be covered by the existing pattern attribute. (For example, specifying allowed or disallowed characters.) Could we consider scoping this down to only the use cases that cannot be accomplished with today's technology?
Although some of the use cases could be accomplished with today's technology they cannot be accomplished easily or succinctly. For instance, consider the following common variant of the first example in the proposal that disregards the consecutive character requirement: a password that has at least 8 characters consisting of a mix of uppercase and lowercase letters and at least one number. This can be accomplished with today's technology, but it is non-trivial to do so. The difficulty of this task and its variants is exemplified by the regexps in https://stackoverflow.com/questions/19605150/regex-for-password-must-contain-at-least-eight-characters-at-least-one-number-a.
@js-choi I don't think password restrictions are related to storing passwords in plaintext. They are either because of dumb legacy system limitations (max lengths, very restricted set of allowed characters), actually good (minimum length limit) or well-intentioned attempts to get users to make handmade passwords that are resilient to guessing or offline dictionary attack against a leaked hashed password database (for example, the popular "one letter, one number, one special" requirement).
@othermaciej: I agree insofar that many cases of password restrictions are due to dumb legacy system limitations or well-intentioned encouragement of better handmade passwords. I was mostly responding to @tabatkins’s saying that "such restrictions indicate that the site is storing passwords very badly (in plaintext, with bad escaping practices when interacting with their database)", which may well also be sometimes true.
@dbates-wk: The currently worded warning:
WARNING: With the exception of the NOTE below, each property/value pair reduces the entropy of a user agent generated password and makes the password more likely to be guessed or brute-forced. The more characters that are required the more likely the user agent generated password can be guessed or brute-forced.
…is not quite forceful or emphatic in discouraging password restrictions in general, a discouragement that @tabatkins probably believes ought to be done. I personally am sympathetic to his view, but I am also sympathetic to making usability better for users of password managers. From my own field, bad password restrictions are a particularly pernicious problem in healthcare/clinical applications.
Addressing password restrictions at all may be seen by developers as a general statement from WHATWG on its disposition toward password restrictions, for better or for worse. Care should therefore be taken in how its specification is worded: it probably would not hurt for that warning above to be more forceful and emphatic against password restrictions in general. Such force may somewhat ameliorate @tabatkins’s general reservations against addressing password restrictions at all.
@domenic We thought about just using the pattern attribute, but there are two challenges:
(1) Consider a common limitation like: "must contain at least one letter and at least one number, and may contain !@#$%^&*()_+-=". It's possible to do with a regexp but it's pretty non-obvious.
Here is the clearest regexp I could come up with that implements this rule:
(([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[A-Za-z]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[0-9]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*)|(([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[0-9]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*[A-Za-z]([A-Za-z]|[0-9]|[-!@#$%^&*()_+=])*)
That is a lot harder to write correctly and a lot harder to understand than "required: upper, lower; required: digit; allowed: [-!@#$%^&*()_+=]". Using regexps to represent this rule is likely too hard for web developers to do correctly.
(2) In theory it's possible to use a regexp to drive generation rather than matching, but it's pretty hard. Getting a password generator to produce a uniformly random password that matches an arbitrary regexp is possible in theory, but way harder than getting it to produce a random password that matches rules of the type that passwordrules supports. Also, password generators may try to be clever and make passwords that are easy to type (for cases where you have to log into another device without benefit of autofill), at least when rules are flexible enough to allow it. It's straightforward to do this with the limited kinds of rules that passwordrules supports but infeasible to do with a generator that can be driven by a regexp. You could argue that maybe only a subset of regexps should be supported, but how do you decide what subset? It can take a very complex regexp just to represent a simple rule. It's also not very good for web devs if they are supposed to use pattern but must be very careful what they put in it or it will be ignored.
So even though passwordrules is technically redundant with pattern, it's still a practical addition because it makes the password requirements easier to write, easier to understand, easier to verify, and easier to feed to a generator. This is what made us conclude that we need a new feature and can't just reuse pattern.
@js-choi I am fine with having a more assertive warning. I think the wording in the spec will have very little influence on prevalence of password restrictions one way or the other, but we should do our best to avoid proliferating restrictions even a little.
Are there many sites out there which restrict passwords in this way, yet still receive enough attention from developers who would be likely to add this attribute? It's anecdotal, but the only sites on which I've encountered restrictive password limitations are ones which have not seen updates for years.
I'm also a bit concerned that adding an attribute - despite warnings in the spec - might encourage more sites to introduce restrictions. Do you have data showing how common password restrictions are today, and if their usage is declining? It seems like this might become a smaller problem within a few years, as older systems get replaced.
Getting a password generator to produce a uniformly random password that matches an arbitrary regexp is possible in theory...
It seems like it should be possible to cover 99% of cases by generating ~50 passwords according to different rules and transparently match them against the pattern attribute until you find the most preferable one which is allowed. How long is the list of sites with unusual rules, and how quirky are they?
Unless we foresee other uses for it besides covering the remaining cases for password rules, the added complexity of introducing a unique syntax ought to be avoided.
@Zirro Many sites have password restrictions, including ones that are popular and actively maintained.
For example, here's the restrictions from etrade.com (as stated by the site):
- Needs 8-32 characters with no spaces
- Needs at least one number
- Needs at least one uppercase and one lowercase letter
- Cannot be the same as your user ID
Other sites have hidden restrictions. They don't name any up front, but reject some passwords in practice.
It seems like it should be possible to cover 99% of cases by generating ~50 passwords according to different rules and transparently match them against the pattern attribute until you find the most preferable one which is allowed.
This is inefficient and likely to still fail in edge cases, so I doubt we'd adopt this over a quirks list. Also, the bigger problem with pattern is that it's very hard to write regexps that correctly implement many popular password limitations. Site authors could use pattern today but they don't.
While I am sympathetic to the desire to avoid technically redundant features, I think framing password rules in a more direct way will solve a real practical problem that can't be solved just by pushing existing features harder.
@othermaciej:
@bsittler I think what this proposal says is right for allowed but probably not for required. required: upper, lower should require at least one ASCII alphabetic character, while required: upper; required: lower should require at least one uppercase character and at least one lowercase.
Fixed this up to match your expectation.
I updated the proposal. With the exception of the example sections, I demarcated removals from- and additions to- the original proposal using strikethrough and italic, respectively.
@dbates-wk the updates are improvements from my point of view. A few issues still concern me:
- So far as I can tell there is no limit on how many narrow-character-class required: limitations a site can impose, which means:
  - unacceptable entropy-reduction is possible; I believe target minimum entropy levels for generators should be clearly stated in the proposal, even if no formula is given to compute the level based on the passwordrules
  - in these cases selecting conforming password candidates from a less-restricted candidate list may take too long to terminate - e.g. a list generated naively based on random selection (for each character) from the printable-ASCII subset of allowed:; to overcome this I think the proposal needs to include at least a rough outline or pseudocode for a conforming generator with guaranteed termination
- The character class syntax differs from JS regular expressions; is this intentional? If so, it should be noted more prominently; if not, the gaps should be closed
  - As an example: how would a requirement for one of [, - or ] be expressed? I believe in JS regular expressions it would be [-[\]]
@othermaciej
Many sites have password restrictions, including ones that are popular and actively maintained.
The actively maintained sites could be educated to lessen their technical restrictions. They could still give their users recommendations for choosing a good password without actually reducing the space of possible passwords.
@bsittler
So far as I can tell there is no limit on how many narrow-character-class required: limitations a site can impose, which means: unacceptable entropy-reduction is possible; I believe target minimum entropy levels for generators should be clearly stated in the proposal, even if no formula is given to compute the level based on the passwordrules
Do you have a particular minimum entropy level in mind? Otherwise, I will think about it and get back to you.
in these cases selecting conforming password candidates from a less-restricted candidate list may take too long to terminate - e.g. a list generated naively based on random selection (for each character) from the printable-ASCII subset of allowed:; to overcome this I think the proposal needs to include at least a rough outline or pseudocode for a conforming generator with guaranteed termination
The character class syntax differs from JS regular expressions; is this intentional?
Yes, this is intentional. We do not need to represent arbitrary character ranges, because a custom character class is designed to contain only ASCII printable characters and we expose literals for all the common ASCII printable character ranges (e.g. "lowercase" is equivalent to regex [a-z]+). The current proposal reserves '-' in case we need to support arbitrary character ranges later. See section "Custom character classes" or my reply to your last question for details on how to express '-' using the proposed syntax.
If so, it should be noted more prominently;
OK. I can add a remark about this.
[...] As an example: how would a requirement for one of [, - or ] be expressed? I believe in JS regular expressions it would be [-[\]]
No escaping is necessary to express '[': [[]. The third from the last sentence and last sentence of section "Custom character classes" explain how to express '-' and ']', respectively. Quoting the proposal:
To list '-' as a literal character it must appear immediately after the opening square bracket '['. The right square bracket (]) is also reserved as a special character. To list ']' as a literal character it must appear immediately before the closing square bracket ']'.
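The positional rules quoted above can be sketched as a small Python parser. The function name `parse_custom_class` is hypothetical, raising an error on a misplaced '-' or ']' is an assumption (the proposal may specify different error handling), and the check that every character is ASCII printable is omitted for brevity.

```python
def parse_custom_class(text):
    """Return the set of literal characters in a bracketed class such as "[-[]]"."""
    if len(text) < 2 or text[0] != "[" or text[-1] != "]":
        raise ValueError("class must be wrapped in square brackets")
    body = text[1:-1]
    for i, ch in enumerate(body):
        # '-' is reserved unless it is the first character of the body.
        if ch == "-" and i != 0:
            raise ValueError("'-' is only a literal immediately after '['")
        # ']' is reserved unless it is the last character of the body.
        if ch == "]" and i != len(body) - 1:
            raise ValueError("']' is only a literal immediately before the closing ']'")
    return set(body)
```

With these rules, `parse_custom_class("[-[]]")` yields the three characters '-', '[' and ']', `parse_custom_class("[[]")` yields just '[', and a regex-style range such as "[a-z]" is rejected.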
I think 90 bits is a reasonable minimum, but will happily defer to real cryptographers.
And thank you for addressing the rest of those questions! It might be worth including the [-[]] example as its syntax may be a bit surprising for someone familiar with regular expressions
90 bits of entropy is excessive. For the "repeatedly guess" threat model, a much lower number of bits will stop the attacker (so long as the website has reasonable rate limits and/or an attempt limit). Even 20 bits is reasonably effective for this case (though obviously not ideal). 20 bits is equivalent to a 6-digit numeric passcode.
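For reference, the arithmetic behind the 6-digit comparison can be checked with a few lines of Python. The function name `entropy_bits` is just an illustrative helper, and the formula assumes each character is drawn uniformly and independently from the alphabet.

```python
import math

def entropy_bits(alphabet_size, length):
    """Bits of entropy for a password whose characters are drawn
    uniformly and independently from an alphabet of the given size."""
    return length * math.log2(alphabet_size)

six_digit_passcode = entropy_bits(10, 6)  # about 19.9 bits
```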
For the "offline attack against leaked database" threat model, the number of bits needed depends on the quality of password hashing used by the website. I did the math on this a while ago based on fastest known password cracking and then assuming a few power of two speedups on top of that:
- Strong (bcrypt, PBKDF, scrypt): ~47 bits needed
- Decent (SHA512): ~49 bits needed
- Poor (SHA1): ~66 bits needed
- Terrible (NTLM, DES CRYPT, MD5): just give up
So it's probably not right to have a hard limit significantly higher than 47 bits. Note that if the site allows entropy somewhat below the limit, it's probably still better to know their password rules and make a generated password, instead of ignoring them and forcing the user to make a manual one.
As an extra safety margin, Safari tries to generate passwords with >70 bits of entropy, but we would still want to generate something on sites that won't allow our full format.
Also, entropy calculations are nontrivial, especially in the presence of multiple required character classes. We can't just make it a vague requirement without including the calculation algorithm. Based on this I don't think we should have a direct entropy limit at all. Instead, we should have limitations that are more readily checkable.
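To give a sense of why the calculation is nontrivial, the Python sketch below counts the conforming passwords of a fixed length by inclusion-exclusion over the required sets. The function name `conforming_entropy_bits` is hypothetical, and the result is an upper bound that assumes the generator picks uniformly among all conforming passwords, which a real generator may not do. It also ignores length ranges and consecutive-character limits.

```python
import math
from itertools import combinations

def conforming_entropy_bits(length, allowed, required_sets):
    """log2 of the number of strings of the given length over `allowed`
    that contain at least one character from every required set."""
    allowed = set(allowed)
    required = [set(r) & allowed for r in required_sets]
    count = 0
    # Inclusion-exclusion over which required sets are entirely missed.
    for k in range(len(required) + 1):
        for missed in combinations(required, k):
            excluded = set().union(*missed)
            count += (-1) ** k * len(allowed - excluded) ** length
    return math.log2(count) if count > 0 else float("-inf")
```

For example, with the ten digits allowed and the digit 0 required, a 2-character password has 100 - 81 = 19 conforming values. The number of terms grows as 2 to the power of the number of required sets, which is one reason a more readily checkable limit may be preferable to a direct entropy requirement.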
90 bits is assuming "just give up"-quality hashing (unfortunately still widely used), an offline attack, a well-funded attacker, and cheap hardware on a large scale (e.g. botnet or dedicated cryptomining-style hardware farms)