Binary client doesn't obey TTL in some cases
I have a wrapper for the cache layer that enforces a TTL on sets. Under some conditions which we haven't identified (Folsom 1.2.1, Oracle JDK1.8 in all cases, memcached 1.4.15 in Linux and memcached 1.5.13 in Mac), we have identified that TTL doesn't work properly in binary protocol, but works perfectly as expected for same test if I change the connection line from connectBinary to connectAscii. With connectBinary, it works well in some cases (a Ubuntu laptop with memcached 1.5.6) but not in others (Mac with memcached 1.5.13). This is the equivalent pseudo-code:
@Test
public void testExpiredKey() throws Exception {
int ttl = 5;
CacheClient<CachedValue> client = instanceClient(CachedValue.class, ttl);
String key = randomKey();
CachedValue value = randomValue();
client.put(key, value).join();
await().atMost(ttl * 2, TimeUnit.SECONDS).untilAsserted(() -> assertNull(client.get(key).join()));
}
Thanks for the report! I will look into it.
To clarify - what kind of error do you see? Are entries expiring too late or too early?
Entries not expiring at all - or maybe way too late for my tests to identify. My test just sets a key with a TTL, and expects a GET to that key to return null after the TTL passes (plus a substantial margin).
BTW I'm reproducing with pure folsom code also:
public class FolsomExpirationTest {
private MemcacheClient<byte[]> memcacheClient;
@Before
public void setUp() throws Exception {
memcacheClient = MemcacheClientBuilder.newByteArrayClient().withAddress("127.0.0.1").connectBinary();
ConnectFuture.connectFuture(memcacheClient).toCompletableFuture().get(30, TimeUnit.SECONDS);
}
@After
public void tearDown() {
memcacheClient.shutdown();
}
@Test
public void textExpiration() {
int ttl = 5;
String key = randomKey();
byte[] value = randomValue();
memcacheClient.set(key, value, ttl).toCompletableFuture().join();
await().atMost(ttl * 2, TimeUnit.SECONDS)
.untilAsserted(() -> assertNull(memcacheClient.get(key).toCompletableFuture().join()));
}
private static byte[] randomValue() {
byte[] b = new byte[5];
new Random().nextBytes(b);
return b;
}
private static String randomKey() {
return UUID.randomUUID().toString();
}
}
changing connection to connectAscii makes test pass, whereas connectBinary fails consistently here.
Passing consistently on:
- Ubuntu 18.04, Oracle JDK 1.8.0_201, Memcached 1.5.6
- OS X 10.14.2, Oracle JDK 1.8.0_66 (Laptop 2), Memcached 1.5.8 and 1.5.12
Failing consistently on:
- OS X 10.14.2, Oracle JDK 1.8.0_181 (Laptop 1), Memcached 1.5.12
Randomly failing/passing on:
- AWS Linux 4.14.77-70.82.amzn1.x86_64, Oracle JDK 1.8.0_162, Memcached 1.4.15
I could reproduce it by changing the memcached version, see: https://github.com/spotify/folsom/pull/113
It looks more like a regression in memcached itself than a problem in folsom.
I see. It's possible, but it seems to fail too in 1.4.15 randomly... that's why we initially discarded memcache version (1.4.15+Linux failing sometimes, 1.5.6+Linux passing, 1.5.12+Mac failing).
Testing with more versions of memcached, it seems to depend more on type of image than version. Fails on ol-7 but works on debian / regular builds.
According to specification:
Some commands involve a client sending some kind of expiration time
(relative to an item or to an operation requested by the client) to
the server. In all such cases, the actual value sent may either be
Unix time (number of seconds since January 1, 1970, as a 32-bit
value), or a number of seconds starting from current time. In the
latter case, this number of seconds may not exceed 60*60*24*30 (number
of seconds in 30 days); if the number sent by a client is larger than
that, the server will consider it to be real Unix time value rather
than an offset from current time.
A workaround from our side could be to always send it as a unix timestamp if possible (carefully trying to avoid the year 2038 problem)
I see. Thanks for the input. I'll move all my code to use timestamps instead.
Actually in the binary version of the SetRequest command, this is done:
public ByteBuf writeRequest(final ByteBufAllocator alloc, final ByteBuffer dst) {
final int expiration = Utils.ttlToExpiration(ttl);
so timestamps are already in-use...
BTW I'm not sure this code is correct either:
public static int ttlToExpiration(final int ttl) {
return (ttl == 0) ? 0 : (int) (System.currentTimeMillis() / 1000) + ttl;
}
as it doesn't take into account if the provided "ttl" is already a timestamp.
(ASCII version uses TTL as-is, whereas binary enforces passing of TTL. I think the ttlToExpiration should check if the provided ttl is already above 606024*30 before doing that operation)
With ASCII and timestamps obtained thru Utils.tottlToExpiration it also fails...
Good catch! I'll take a look at that.
The provided ttl should not be a timestamp - we could extend the API to be more explicit, like adding .set(key, value, instant) or .set(key, value, duration)
Does this change make sense to start with? Do you think it would solve your issues? https://github.com/spotify/folsom/pull/114
Partially solved in release 1.3.0.