Key size limit and dump.rdb question
Discussed in https://github.com/dragonflydb/dragonfly/discussions/1389
Originally posted by jaesonrosenfeld1, June 11, 2023

Trying out dragonflydb as a drop-in replacement for Redis and excited to see what it can do.
I have an existing application that writes to a dump.rdb file for persistence and reloads about 25GB of data into memory when redis-server is restarted.
I'm noticing that when I point Dragonfly at the dump.rdb file that Redis wrote to the host, it doesn't load these keys into the db (dbsize remains 0). Is this because I need to change the default snapshot format by setting --df_snapshot_format=False, so that it can also read the existing dump.rdb in Redis format? I tried this and still wasn't able to get the dump.rdb loaded into memory when launching Dragonfly.
Secondly, when I then try to write new keys from Python using the redis package, a few keys write fine, but when it gets to a slightly larger key (about 350 MB of pandas data) I get the message "Error 32 writing to socket. Broken pipe". Is there a key size limitation I should know about that can be modified? I know the limit in Redis is 512MB. Here is the command for launching the dragonflydb container, as well as the code for writing the values from Python:
```
docker run --log-driver awslogs --log-opt awslogs-region=us-east-2 \
  --log-opt awslogs-group=WebServerLogsRFG --log-opt awslogs-stream=DockerLogsRedis \
  --name myredis -p 6380:6380 --network my-network \
  -v /home/ubuntu/redis/data:/data --ulimit memlock=-1 \
  docker.dragonflydb.io/dragonflydb/dragonfly dragonfly --port 6380
```
```python
import io

import redis


def openRedisCon():
    pool = redis.ConnectionPool(host=REDIS_HOST, port=REDIS_PORT, db=0)
    r = redis.Redis(connection_pool=pool)
    return r


r = openRedisCon()


def storeDFInRedis(alias, r, df):
    buffer = io.BytesIO()
    df.reset_index(drop=True).to_feather(buffer, compression="zstd")
    buffer.seek(0)  # reset the pointer to the beginning before reading it back
    res = r.set(alias, buffer.read())
```
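For reference, here is a minimal sketch of the reverse path; the helper name loadDFFromRedis is hypothetical and just illustrates reading the feather-encoded value back:

```python
import io

import pandas as pd


def loadDFFromRedis(alias, r):
    # Fetch the raw bytes written by storeDFInRedis; None means the key is missing.
    raw = r.get(alias)
    if raw is None:
        return None
    # The value was produced by DataFrame.to_feather, so read it back the same way.
    return pd.read_feather(io.BytesIO(raw))
```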
Thanks!
@adiholden see the context in the discussion.
- We should increase the limit to 256MB
- We should introduce a limits page under https://www.dragonflydb.io/docs/managing-dragonfly where we state Dragonfly limits in a clear manner. This can include blob size, max number of elements in an array, etc.
@romange Could you please advise on the rdb part?
I'm particularly interested in a way to determine whether it's loading the dataset from the RDB file or not.
In Redis you can easily get the answer by querying INFO persistence for "loading", or INFO server for "uptime_in_seconds", since uptime only starts counting once loading is done.
What would be the easy way to get the status of DF?
Thank you!
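For illustration, this is how the Redis-side check described above looks with redis-py (host and port here are placeholders); whether Dragonfly exposes the same fields is exactly the open question:

```python
import redis

r = redis.Redis(host="localhost", port=6380)

# INFO persistence exposes a "loading" flag (1 while the RDB is being read),
# and INFO server exposes "uptime_in_seconds", which only starts once loading is done.
print(r.info("persistence").get("loading"))
print(r.info("server").get("uptime_in_seconds"))
```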
@royjacobson do you happen to know the answer?
I've also noticed that it takes longer to start DF from an RDB file than standard Redis. Is there a way to tune this process to make it faster?
Btw, to the topic starter's question: I was able to configure DF to restore from an rdb file with

```
dragonfly --logtostderr --dbfilename dump.rdb --nodf_snapshot_format
```
@eliskovets I suggest switching to the DF format once you have loaded from rdb - it should be much faster to load than rdb.
For that, just use `dragonfly --logtostderr --dbfilename dump`, or alternatively you can run `save df` in redis-cli.
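If you prefer to trigger this from Python rather than redis-cli, a sketch using redis-py's generic command interface (assuming the same `save df` command mentioned above):

```python
import redis

r = redis.Redis(host="localhost", port=6380)

# Ask the server to write a snapshot in Dragonfly's native DF format,
# equivalent to running "save df" in redis-cli as suggested above.
r.execute_command("SAVE", "DF")
```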
> @royjacobson do you happen to know the answer?
The quickest way to do that (and to check in general whether the DB is available), I think, is to PING the server and see if you get a PONG back.
We should add the 'loading: 0' field to INFO PERSISTENCE, though. Will open a separate ticket.
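A minimal readiness probe along those lines (the retry interval and timeout values are arbitrary choices):

```python
import time

import redis


def wait_until_ready(host="localhost", port=6380, timeout=300):
    # Keep PINGing until the server answers with PONG or the timeout expires.
    r = redis.Redis(host=host, port=port)
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            if r.ping():
                return True
        except redis.exceptions.ConnectionError:
            pass
        time.sleep(1)
    return False
```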
> @eliskovets I suggest switching to the DF format once you have loaded from rdb - it should be much faster to load than rdb. For that, just use `dragonfly --logtostderr --dbfilename dump`, or alternatively you can run `save df` in redis-cli.
Thank you! It's way way faster. 🚀
Closing as completed.