pebble icon indicating copy to clipboard operation
pebble copied to clipboard

This implements more sleeps when clients are expected to poll

Open squizzling opened this issue 3 years ago • 9 comments

Specifically this adds two additional delay points:

  • Between completing validation but before updating the order state
  • After signing the certificate and before marking it as ready

The first is controlled by the existing PEBBLE_VA_* environment variables, and the second is controlled by equivalent PEBBLE_CA_* environment variables.

It also corrects a minor typo in the README.md (default is 5s, not 15s), and removes shadowing of the internal len function.

squizzling avatar Jun 15 '21 04:06 squizzling

Because this has the potential to add artificial delays in something like a CI environment, I am particularly concerned about the "default enabled" for the new feature. The worst case is up to 5 seconds per CA issuance, although that could be multiplied a bunch depending on what is being done.

I'm not sure what the guiding principal is for this, so I went with consistency of existing behavior, but happy to reverse it.

squizzling avatar Jun 15 '21 04:06 squizzling

This seems reasonable to me! The other approach that comes to mind would be to instead add new PEBBLE_NOSLEEP and PEBBLE_SLEEPTIME env vars, have both the CA and VA reference the same vars (with the VA preferring the new ones, and falling back to the old ones), and then eventually deprecate the _VA_ versions. I'm not sure which is better -- Pebble is more widely used by other folks than Boulder is, but I'm not (personally) aware of other times we've deprecated/removed Pebble features. Might require a 3.0 release.

aarongable avatar Jun 15 '21 15:06 aarongable

I like the idea of consolidating and deprecating. Having 2 things under the VA flag is a bit of a code smell.

Since it's a new interface, I'll collapse it down to just PEBBLE_SLEEPTIME, and handle 0 for disabling.

As a draft proposal, I think VA should control both VA and CA (despite the name, because users in this configuration already have delays), emit deprecation warnings if present, and favor PEBBLE_SLEEPTIME

Summary:

  • PEBBLE_VA_* missing, PEBBLE_SLEEPTIME missing, 5s delay on both
  • PEBBLE_VA_* present, PEBBLE_SLEEPTIME missing, VA delay (possibly 0 if NOSLEEP) on both, emit a deprecated message on startup
  • PEBBLE_VA_* missing, PEBBLE_SLEEPTIME present, specified delay on both
  • PEBBLE_VA_* present, PEBBLE_SLEEPTIME present, use PEBBLE_SLEEPTIME, emit a warning message on startup that both are present and one is deprecated.

All combinations would log their eventual configuration, integer parse errors are handled by treating the env var as missing (PEBBLE_SLEEPTIME=foo falls back to PEBBLE_VA_SLEEPTIME and PEBBLE_VA_SLEEPTIME=foo falls back to 5 second default)

Sound reasonable?

squizzling avatar Jun 15 '21 16:06 squizzling

Yep, sounds like a good plan to me.

aarongable avatar Jun 15 '21 20:06 aarongable

In theory this is ready, however I have two concerns:

  • There's an API change (passing sleepTime to ca.New and va.New), so it may trigger a major version bump. I don't know if this project does versioning based on "application interface" or "developer interface". I could do some refactoring to not do this, but this seems the clean approach.
  • certbot uses PEBBLE_VA_NOSLEEP (ref), so I'm not sure what coordination is required there. We're still backwards compatible though.

squizzling avatar Jun 22 '21 09:06 squizzling

Should not be merged until #351 is handled, as it will increase the polling and make failures more likely.

squizzling avatar Jun 22 '21 09:06 squizzling

Hey folks, noticed that this PR ended up stalling but I am possibly looking for some of these sleeps to test timeouts. Any thoughts on how to proceed? Anything changed since 2 years ago?

I might just end up trying to find a way around this, but I'm was thinking of trying to use this to test post-challenge certificate order timeouts.

vancluever avatar Sep 20 '23 18:09 vancluever

I drifted away from the project, but it looks like #351 is handled, so I can clean up the PR and re-submit it.

edit: wrong account, but it's still me.

tiedotguy avatar Sep 20 '23 23:09 tiedotguy

@tiedotguy I'm going to look to unit testing for the thing I need to test, so no worries if this might be a challenge to implement.

I'm just looking to test vancluever/terraform-provier-acme#349, but can probably just test what it's translated to in lego configuration itself. I haven't reviewed the code too much on the pebble side to know exactly where the sleep would need to be inserted anyway, and I'd need it to be static value (or at least a minimum value, versus the random sleep that I think happens right now).

vancluever avatar Sep 21 '23 16:09 vancluever