Unable to restore from .dfs unless --force_epoll=true

My Dragonfly instance is configured with --cache_mode and --maxmemory=160G, running on a GCP Compute Engine VM with 22 vCPUs and 176 GB of RAM (c3-highmem-22, Intel Sapphire Rapids, x86_64, Debian kernel 6.1.140-1).

I’ve noticed that when my .dfs snapshots are around 50 GB, the restore process works smoothly. However, once the snapshot size grows above ~150 GB, Dragonfly detects the file, starts loading, and then gets stuck almost immediately, typically before even reaching 1 GB of memory usage.

Initially, I suspected snapshot corruption and rebuilt the cache from scratch. But after using the SAVE command and restarting Dragonfly, the issue reappeared with the newly generated snapshot.

I’ve tried several approaches to work around this:

  • Reduced proactor threads to 6, then to 4
  • Changed disk type
  • Recreated the VM from scratch
  • Tuned various runtime parameters

The only configurations that successfully restore large snapshots (~150 GB+) are:

  • Setting --proactor_threads=1
  • Or enabling --force_epoll=true

Currently, I’m using --force_epoll, since restoring with a single thread is too slow. But I understand that Dragonfly prefers the default io_uring I/O engine for performance.

I’d appreciate any insight into why this might be happening and whether there’s a way to restore large .dfs snapshots using multiple threads without disabling io_uring.

For reference, this is how I start my Dragonfly instance:

docker run -d --name "app_cache" \
    --network host \
    --log-driver=gcplogs \
    --log-opt gcp-log-cmd=true \
    -v /data:/data \
    -m "172g" \
    docker.dragonflydb.io/dragonflydb/dragonfly \
    --port "6379" \
    --logtostderr \
    --force_epoll=true \
    --dir /data \
    --cache_mode \
    --maxmemory 160G

Thanks in advance!

Hey @hgorni,

Please take a look at this GitHub issue.

There seems to be a bug in the kernel version you’re using. For Dragonfly Cloud, we run on 6.8.x versions. Is it possible to upgrade on your side?
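To quickly compare your environment against ours, you can check the running kernel version on the VM (the exact command output below is just an illustration; your version string will differ):

```shell
# Print the running kernel release; the hang has been observed on 6.1.x,
# while Dragonfly Cloud runs on 6.8.x kernels.
uname -r
# e.g. 6.1.0-37-cloud-amd64 on your current Debian image

# On GCP, switching the VM to a newer Debian image (e.g. Debian 12 backports
# kernel or a newer OS release) is usually the simplest way to get 6.8.x.
apt list --installed 2>/dev/null | grep linux-image
```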

This could be related to the maxmemory issue you have in the other thread as well.