Hi everyone,
I’m evaluating DragonflyDB for a feature store use case at scale.
I’m trying to understand how cluster management is supposed to work with the Dragonfly operator. From what I understand of Dragonfly’s cluster model, slot ownership and any data movement are driven by DFLYCLUSTER CONFIG, and migrations are explicit (via the migrations section in the config). There isn’t automatic resharding or a “migrate for me” step when a new node is added.
Is there a recommended path for “N → N+1 nodes”?
Thanks!
Hi, I’m not part of the Dragonfly team, but to answer your question: the Dragonfly team currently doesn’t support automatic migration for standalone installations. It’s an important part of their cloud service offering (search for Dragonfly Cloud enterprise), along with automatic failovers, an interface for managing clustering (which is what you want), monitoring, and much more.
Dragonfly’s zero-downtime migration means that two distinct DFLYCLUSTER CONFIG commands need to be sent: one to trigger the migration, and a second one to relinquish slot ownership once the migration is verified as FINISHED (via DFLYCLUSTER SLOT-MIGRATION-STATUS). Automating this without considering all scenarios can lead to data loss, so it makes sense that it’s offered as part of their cloud suite.
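To make the two-step flow concrete, here is a minimal sketch in Python that only builds the two config payloads and sends nothing. The JSON field names (slot_ranges, master, replicas, migrations, node_id) reflect my understanding of the DFLYCLUSTER CONFIG format and should be checked against the docs for your Dragonfly version. The function names, node IDs, and IPs are made up for illustration, and the new node is assumed to already be listed in the config with no slots.

```python
import copy
import json


def subtract_range(ranges, start, end):
    """Remove slots [start, end] from a list of {"start", "end"} ranges."""
    out = []
    for r in ranges:
        if r["end"] < start or r["start"] > end:
            out.append(dict(r))  # no overlap, keep as-is
            continue
        if r["start"] < start:
            out.append({"start": r["start"], "end": start - 1})
        if r["end"] > end:
            out.append({"start": end + 1, "end": r["end"]})
    return out


def plan_migration(shards, source_id, target, start, end):
    """Return (step1, step2) configs for moving slots [start, end].

    step1: source still owns the slots but declares an outgoing migration.
    step2: ownership moves to the target and the migration entry is dropped.
    The input list is not modified.
    """
    step1 = copy.deepcopy(shards)
    step2 = copy.deepcopy(shards)

    for shard in step1:
        if shard["master"]["id"] == source_id:
            shard["migrations"] = [{
                "slot_ranges": [{"start": start, "end": end}],
                "node_id": target["id"],
                "ip": target["ip"],
                "port": target["port"],
            }]

    for shard in step2:
        node_id = shard["master"]["id"]
        if node_id == source_id:
            shard["slot_ranges"] = subtract_range(shard["slot_ranges"], start, end)
            shard["migrations"] = []
        elif node_id == target["id"]:
            shard["slot_ranges"] = shard["slot_ranges"] + [{"start": start, "end": end}]

    return step1, step2


# Hypothetical 1 -> 2 node example: node-b joins with no slots.
shards = [
    {"slot_ranges": [{"start": 0, "end": 16383}],
     "master": {"id": "node-a", "ip": "10.0.0.1", "port": 6379}, "replicas": []},
    {"slot_ranges": [],
     "master": {"id": "node-b", "ip": "10.0.0.2", "port": 6379}, "replicas": []},
]
step1, step2 = plan_migration(shards, "node-a", shards[1]["master"], 8192, 16383)
print(json.dumps(step1, indent=2))
print(json.dumps(step2, indent=2))
```

The step1 JSON would be sent with DFLYCLUSTER CONFIG first, and step2 only after the migration has been verified as FINISHED. Sending step2 early is exactly the kind of shortcut that risks data loss.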
However, if you are looking for migration in standalone environments, you can use the Python script (cluster_mgr.py) that they’ve quite generously provided.
It should more or less satisfy your need for a “migrate for me” option, with the exception that you have to run it once whenever you add or remove a node.
You can write your own automation to trigger that script with the new node details however you’d like, but I’d recommend testing any such automation extensively before using it in a production environment, because, as I said, it can very easily lead to data loss.
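As one example of the kind of guard such automation would want, here is a small Python sketch that refuses to continue until every migration reports FINISHED. The get_states callable is hypothetical: it stands in for whatever code runs DFLYCLUSTER SLOT-MIGRATION-STATUS and extracts one state string per migration. The reply parsing is deliberately left out, since the reply format should be checked against your Dragonfly version.

```python
import time


def wait_until_finished(get_states, timeout_s=300.0, poll_s=1.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll get_states() until every reported migration state is FINISHED.

    Raises TimeoutError if that does not happen within timeout_s. An empty
    status list is treated as "not finished", so a node that reports no
    migration at all never counts as a success.
    """
    deadline = clock() + timeout_s
    while True:
        states = list(get_states())
        if states and all(state == "FINISHED" for state in states):
            return
        if clock() >= deadline:
            raise TimeoutError(f"migration not finished, last states: {states}")
        sleep(poll_s)
```

The sleep and clock parameters exist only so the loop can be tested without real waiting. On a timeout, the safe default is to stop and investigate rather than send the final config anyway.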
If your Dragonfly nodes are in Kubernetes, I’ve created a fork of the dragonfly-operator with some limited automatic clustering and automatic migration for adding and removing nodes. Unlike the official cloud offering from Dragonfly, though, I cannot guarantee that you won’t run into issues such as data loss with it, so use it at your own risk. You can find the code and the instructions to test it out at Dragonfly Operator Cluster.