llm
llm copied to clipboard
Monotonic ULIDs to fix the tests
I keep running into tests that fail in weird ways because ULIDs in the same ms don't sort reliably.
It turns out the ULID spec covers this: https://github.com/ulid/spec/blob/master/README.md#monotonicity
But the Python library I am using does not (yet) have it:
- https://github.com/mdomke/python-ulid/issues/1
This other Python library does support this, but doesn't have as much development activity or usage (stars etc). https://github.com/itsrainingmani/py-ulid/blob/main/README.md#monotonic-ulids
Trying this out...
llm \
-f https://raw.githubusercontent.com/mdomke/python-ulid/refs/heads/main/ulid/__init__.py \
-f https://raw.githubusercontent.com/ulid/spec/refs/heads/master/README.md \
-s 'Implement a subclass of ULID that adds monotonic generation by persisting state as to the last one it created on the class itself' \
-m o3
https://gist.github.com/simonw/5a20bf58a977d6c96747cbe13a38aa0f
That code might just work!
OK, I knocked it into shape and I think this is good:
NANOSECS_IN_MILLISECS = 1000000
TIMESTAMP_LEN = 6
RANDOMNESS_LEN = 10
_lock: Final = threading.Lock()
_last: bytes | None = None # 16-byte last produced ULID
def monotonic_ulid() -> ULID:
"""
Return a ULID instance that is guaranteed to be *strictly larger* than every
other ULID returned by this function inside the same process.
It works the same way the reference JavaScript `monotonicFactory` does:
* If the current call happens in the same millisecond as the previous
one, the 80-bit randomness part is incremented by exactly one.
* As soon as the system clock moves forward, a brand-new ULID with
cryptographically secure randomness is generated.
* If more than 2**80 ULIDs are requested within a single millisecond
an `OverflowError` is raised (practically impossible).
"""
global _last
now_ms = time.time_ns() // NANOSECS_IN_MILLISECS
with _lock:
# First call
if _last is None:
_last = _fresh(now_ms)
return ULID(_last)
# Decode timestamp from the last ULID we handed out
last_ms = int.from_bytes(_last[:TIMESTAMP_LEN], "big")
# If the millisecond is the same, increment the randomness
if now_ms == last_ms:
rand_int = int.from_bytes(_last[TIMESTAMP_LEN:], "big") + 1
if rand_int >= 1 << (RANDOMNESS_LEN * 8):
raise OverflowError(
"Randomness overflow: > 2**80 ULIDs requested "
"in one millisecond!"
)
randomness = rand_int.to_bytes(RANDOMNESS_LEN, "big")
_last = _last[:TIMESTAMP_LEN] + randomness
return ULID(_last)
# New millisecond, start fresh
_last = _fresh(now_ms)
return ULID(_last)
def _fresh(ms: int) -> bytes:
"""Build a brand-new 16-byte ULID for the given millisecond."""
timestamp = int.to_bytes(ms, TIMESTAMP_LEN, "big")
randomness = os.urandom(RANDOMNESS_LEN)
return timestamp + randomness
This test fails with ulid.ULID() but passes with monotonic_ulid():
def test_monotonic_ulids():
ulids = [monotonic_ulid() for i in range(1000)]
assert ulids == sorted(ulids)