No Server Available on TLS connection

Open paul-mesnilgrente opened this issue 10 months ago • 10 comments

I created a Serverless memcached using my AWS infrastructure. I confirmed I can read and write to the Memcached server from my pods using the openssl command:

# write "hello_world" in the "test_key"
printf "set test_key 0 900 11\r\nhello_world\r\n" | openssl s_client -quiet -connect memcache.........amazonaws.com:11211 2> /dev/null

# get the "test_key"
printf "get test_key\n" | openssl s_client -quiet -connect memcache...........amazonaws.com:11211 2> /dev/null

I also tested a Python script:

import ssl
from pymemcache.client.base import Client

# The ElastiCache endpoint + port
ENDPOINT = 'memcache-serverless-uat-gln3am.serverless.apse2.cache.amazonaws.com'
PORT     = 11211

# Create an SSL context that ignores certificate validation (for simplicity).
# If you want real validation, remove these lines and configure the CA properly.
ssl_context = ssl.create_default_context()
ssl_context.check_hostname = False
ssl_context.verify_mode    = ssl.CERT_NONE

# Initialize the client with ssl_context
client = Client(
    (ENDPOINT, PORT),
    # pymemcache's keyword argument for the SSL context is "tls_context"
    tls_context=ssl_context
)

try:
    client.set('test_key', b'Hello from Python TLS')
    val = client.get('test_key')
    print(f"Memcached read: {val}")
finally:
    client.close()

and it works too.

But I have no luck when I use Dalli. I'm using a standalone script that I run as ./test_memcache_tls.rb, to rule out any Rails misconfiguration. Here's the script:

#!/usr/bin/env ruby

require 'dalli'
require 'openssl'

# 1. Build an SSL context (verification enabled here; see the commented-out VERIFY_NONE line below)
ssl_context = OpenSSL::SSL::SSLContext.new
ssl_context.set_params(
  min_version: :TLS1_2, # Can set max_version as well if desired
  verify_hostname: true, # Skip this line if using JRuby
  verify_mode: OpenSSL::SSL::VERIFY_PEER
  # ca_file: <path to PEM file> # Omit this line to use the default CA files
)
# ssl_context.verify_mode = OpenSSL::SSL::VERIFY_NONE


endpoint = "memcache-serverless-uat-gln3am.serverless.apse2.cache.amazonaws.com:11211"
client = Dalli::Client.new(endpoint, ssl_context: ssl_context)

puts "Setting key..."
client.set("hello", "world")

I tried both ssl_context.set_params and the simpler ssl_context.verify_mode = OpenSSL::SSL::VERIFY_NONE, but both configurations raise:

W, [2025-02-26T23:54:52.881230 #1621] WARN -- : memcache-serverless-uat-gln3am.serverless.apse2.cache.amazonaws.com:11211 failed (count: 0) Timeout::Error: IO timeout: {host: "memcache-serverless-uat-gln3am.serverless.apse2.cache.amazonaws.com", port: 11211, down_retry_delay: 30, socket_timeout: 1, socket_max_failures: 2, socket_failure_delay: 0.1, keepalive: true, ssl_context: #<OpenSSL::SSL::SSLContext:0x00007fbed6ae8200 @verify_mode=0, @verify_hostname=false>} Dalli::RingError => No server available

on the client.set("hello", "world") line. Is there a way to get more logs from Dalli to debug this issue? Or maybe you have a clue about what's going on here?

paul-mesnilgrente avatar Feb 27 '25 00:02 paul-mesnilgrente

I also tested it using Ruby without Dalli, using only 'socket' and 'openssl', it works too. Here's the code if that helps:

#!/usr/bin/env ruby

require 'socket'
require 'openssl'

HOST = 'memcache........amazonaws.com'
PORT = 11211

begin
  # 1. Open a plain TCP socket
  tcp_socket = TCPSocket.new(HOST, PORT)

  # 2. Create an SSL context (turn off cert verification for simplicity)
  ssl_context = OpenSSL::SSL::SSLContext.new
  ssl_context.verify_mode = OpenSSL::SSL::VERIFY_NONE

  # 3. Wrap the TCP socket with TLS
  ssl_socket = OpenSSL::SSL::SSLSocket.new(tcp_socket, ssl_context)
  ssl_socket.sync_close = true

  # 4. Initiate the TLS handshake
  ssl_socket.connect

  # 5. Send a Memcached command in text protocol
  #    (here, a "get" for the key written earlier)
  ssl_socket.puts("get test_key\r\n")

  # 6. Read the server's response line(s); "END" terminates a get reply,
  #    so stop there instead of blocking until the server closes the socket
  while (line = ssl_socket.gets)
    puts "Server response: #{line}"
    break if line.strip == "END"
  end

  # 7. Close the TLS socket
  ssl_socket.close

rescue => e
  puts "Error connecting or reading: #{e.class} - #{e.message}"
end

paul-mesnilgrente avatar Feb 27 '25 00:02 paul-mesnilgrente

Did you adjust any of the timeout options? Is it simply timing out when trying to establish the initial connection?

danmayer avatar Mar 11 '25 04:03 danmayer

@paul-mesnilgrente @danmayer I'm running into the same issue while setting up a serverless Memcached instance in ElastiCache. I was able to write and read successfully using Paul's first command-line approach, but the Ruby script using socket and openssl didn't work for me: the only output I get is Server response: END.

Does Dalli support AWS ElastiCache Memcached when configured as serverless? If not, are there alternative Ruby gems or approaches you’d recommend for integrating serverless Memcached with a Rails application?
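
For context on that output: in the memcached text protocol, a get reply that consists only of END means the key was not found, so it suggests the TLS connection and the text protocol themselves were working and test_key was simply absent. Below is a minimal sketch of a reader for get replies. The helper name read_get_response is hypothetical, and it assumes an IO-like object (such as an OpenSSL::SSL::SSLSocket or a StringIO) that supports gets with a separator and read.

```ruby
require 'stringio'

# Reads one text-protocol "get" reply and returns a Hash of key => value.
# An empty Hash means the server answered with a bare END (a cache miss).
def read_get_response(io)
  values = {}
  while (line = io.gets("\r\n"))
    line = line.chomp("\r\n")
    return values if line == 'END'

    # A hit looks like: VALUE <key> <flags> <bytes>\r\n<data>\r\n
    tag, key, _flags, bytes = line.split(' ')
    raise "unexpected reply: #{line.inspect}" unless tag == 'VALUE'

    values[key] = io.read(Integer(bytes))
    io.read(2) # consume the \r\n that follows the data block
  end
  raise 'connection closed before END'
end

miss = read_get_response(StringIO.new("END\r\n"))
hit  = read_get_response(StringIO.new("VALUE test_key 0 11\r\nhello_world\r\nEND\r\n"))
puts "miss: #{miss.inspect}"
puts "hit:  #{hit.inspect}"
```

Running the set command from the first post before the get should turn the miss into a hit.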

jaredblumer avatar Mar 13 '25 15:03 jaredblumer

I'm experiencing the same issue. I raised socket_timeout from the default of 1 to 10, but nothing improved; it now just waits 10 seconds before failing.

mruhlin avatar May 12 '25 22:05 mruhlin

I will need to set up AWS serverless infrastructure to try this out, as the issue doesn't seem to occur outside of that configuration.

danmayer avatar May 13 '25 18:05 danmayer

Thanks for taking a look. If it helps, I'm setting it up with pretty much all the default settings, using the AWS Terraform provider:

resource "aws_elasticache_serverless_cache" "memcached" {
  engine = "memcached"
  name   = "sa-memcached-${var.name}"
  cache_usage_limits {
    data_storage {
      maximum = 3
      unit    = "GB"
    }
    ecpu_per_second {
      maximum = 5000
    }
  }
  description          = "SA Memcached"
  major_engine_version = "1.6"
  security_group_ids   = [aws_security_group.memcached.id]
  subnet_ids           = module.vpc.private_subnets.ids
}

mruhlin avatar May 13 '25 18:05 mruhlin

Hi, I found the following note.

Supported and restricted Valkey, Memcached, and Redis OSS commands - Amazon ElastiCache https://docs.aws.amazon.com/AmazonElastiCache/latest/dg/SupportedCommands.html

Binary protocol is not supported, as it is officially deprecated in memcached 1.6.

As far as I understand, Dalli only implements the binary protocol. So the above error may happen because Dalli tries to talk to Memcached on ElastiCache Serverless using the binary protocol.

https://github.com/petergoldstein/dalli/blob/c6d85e4e4b440b22bf4a784eca77b234798553e5/3.0-Upgrade.md?plain=1#L22-L27

Is there a plan to implement protocols other than binary in the near future?
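
To illustrate the mismatch described above, here is a rough sketch of how the same set request looks on the wire in the binary protocol versus the text and meta protocols. The helper names are hypothetical. The binary layout follows the 24-byte request header from the memcached binary protocol description, and the meta form assumes the ms <key> <datalen> <flags> syntax of memcached 1.6. This is not Dalli's actual encoder.

```ruby
REQUEST_MAGIC = 0x80 # first byte of every binary-protocol request
OPCODE_SET    = 0x01

# Binary header fields: magic(1) opcode(1) key_len(2) extras_len(1)
# data_type(1) vbucket(2) total_body_len(4) opaque(4) cas(8) = 24 bytes
def binary_set_frame(key, value, ttl: 900, flags: 0)
  extras = [flags, ttl].pack('NN') # 4-byte flags + 4-byte expiry
  body   = extras + key.b + value.b
  header = [REQUEST_MAGIC, OPCODE_SET, key.bytesize, extras.bytesize,
            0, 0, body.bytesize, 0, 0].pack('CCnCCnNNQ>')
  header + body
end

# Classic text protocol, the same shape as the printf/openssl test above
def text_set_command(key, value, ttl: 900, flags: 0)
  "set #{key} #{flags} #{ttl} #{value.bytesize}\r\n#{value}\r\n"
end

# Meta protocol equivalent (T<ttl> sets the expiry)
def meta_set_command(key, value, ttl: 900)
  "ms #{key} #{value.bytesize} T#{ttl}\r\n#{value}\r\n"
end

puts "binary: #{binary_set_frame('hello', 'world').unpack1('H*')}"
puts "text:   #{text_set_command('hello', 'world').inspect}"
puts "meta:   #{meta_set_command('hello', 'world').inspect}"
```

A server that only accepts text and meta commands has no reason to answer a frame that starts with byte 0x80, which would be consistent with the client waiting until its socket timeout expires.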

mikanbox avatar May 29 '25 04:05 mikanbox

I found a way to connect to Memcached on ElastiCache Serverless from Dalli using the meta protocol, with the following command.

meta_client = Dalli::Client.new(<your endpoint>, protocol: :meta, :ssl_context => ssl_context)

I hope this will be helpful for you.
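
A slightly fuller sketch of the same idea, combining the meta protocol with the SSL context from the original script and turning on debug logging through Dalli.logger. The helper name serverless_dalli_options, the MEMCACHED_ENDPOINT environment variable, and the socket_timeout value of 5 are assumptions for illustration; the client code only runs when that variable is set to a host:port endpoint.

```ruby
require 'openssl'
require 'logger'

# Builds Dalli options for a TLS-only endpoint that does not speak the
# binary protocol. ca_file is optional; omit it to use the default CA store.
def serverless_dalli_options(ca_file: nil)
  ssl_context = OpenSSL::SSL::SSLContext.new
  params = { min_version: :TLS1_2, verify_mode: OpenSSL::SSL::VERIFY_PEER }
  params[:ca_file] = ca_file if ca_file
  ssl_context.set_params(params)

  { protocol: :meta, ssl_context: ssl_context, socket_timeout: 5 }
end

# MEMCACHED_ENDPOINT is a hypothetical variable name, e.g. "host:11211"
if (endpoint = ENV['MEMCACHED_ENDPOINT'])
  require 'dalli'

  # Debug-level logging may surface more detail about connection failures
  Dalli.logger = Logger.new($stdout, level: Logger::DEBUG)

  client = Dalli::Client.new(endpoint, serverless_dalli_options)
  client.set('hello', 'world')
  puts "Read back: #{client.get('hello').inspect}"
  client.close
end
```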

mikanbox avatar May 29 '25 08:05 mikanbox

OK, yeah, we haven't removed the binary protocol from Dalli yet and it is still the default... but this makes sense as the cause of the issue. It is unfortunate that it didn't give clearer errors about the problem.

  • We have a branch that removes all binary protocol code, but we are considering the best rollout strategy.
  • I am thinking of a major version bump with the meta protocol as the default, followed by another major bump where we remove binary entirely from the main release.

Alternatively, we could do that cutover as one major release, and if serious bugs come up we could backport the fixes to the binary releases for X amount of time before we drop support for that release branch.

danmayer avatar Jun 13 '25 16:06 danmayer

This sounds like a good reason to start moving forward with the binary cleanup timeline as more folks will start to hit issues like this.

danmayer avatar Jun 13 '25 16:06 danmayer