High CPU Usage When Running As Dockerized Container On Ubuntu 20.04

foxmuldersatya · August 6, 2024, 8:14am

Hello everyone, I have set up Dragonfly as a Dockerized container that is accessed by a couple of Spring Boot services. Dragonfly's CPU usage seems to be higher than Redis's, both when idle and under load. Details below:

1 - Dragonfly container running with no load: CPU usage sits at 25%

2 - Dragonfly container running with some load from the Spring Boot services: CPU usage at 175%

These stats are from Ubuntu 20.04.

When I swapped the Dragonfly image for the Redis image, the same numbers were 5% and 28% respectively. Are there any configuration changes I can make to the Dragonfly container to reduce its CPU usage?

joezhou_df · August 7, 2024, 5:47pm

As suggested by Dragonfly core team member Shahar on the Discord server, limiting the number of threads Dragonfly uses by setting the --proactor_threads=X parameter would help in this situation.

joezhou_df · August 7, 2024, 5:52pm

Personally, I would also like to share that Dragonfly performs best when running on bare metal without noisy neighbours on the same machine. It is also helpful to set the number of threads to match the number of cores on the machine. Not that Dragonfly is high maintenance, but if you think about it, it’s a platform responding to millions of ops/sec constantly within sub-millisecond latency.

However, orchestrated containers definitely ease some operational work.

foxmuldersatya · August 8, 2024, 5:49am

Thank you for the suggestions, will try and update you folks.

foxmuldersatya · August 9, 2024, 2:53pm

A couple of updates:

1 - Setting proactor_threads=1 does help significantly in a local environment. However, what are the repercussions of this in production? Should it even be used in production?
2 - Understood that ideally it should run on bare metal, and that's when we should really be looking at the CPU usage. Thanks!
3 - My code was polling Dragonfly using the KEYS command. I have now refactored it to use SCAN, which brought the CPU usage down by about 20%.

joezhou_df · August 12, 2024, 5:24pm

Good to know!

I agree that proactor_threads=1 is not a very good choice for production. The value should match the number of cores you have on that production machine.
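
To make the thread setting concrete, here is a minimal sketch of how a launch command could be built with the thread count defaulting to the host's core count. The helper name `dragonfly_docker_command` and the port mapping are illustrative assumptions, not something from the Dragonfly team; the image name is the one published in the Dragonfly docs, and `--proactor_threads` is the flag mentioned above.

```python
import os
import shlex

# Image name as published in the Dragonfly docs; pin a specific tag in real deployments.
DRAGONFLY_IMAGE = "docker.dragonflydb.io/dragonflydb/dragonfly"


def dragonfly_docker_command(threads=None, port=6379):
    """Build a `docker run` argument list that sets Dragonfly's thread count.

    Hypothetical helper: `threads` defaults to the number of cores visible
    on the machine that launches the container.
    """
    if threads is None:
        threads = os.cpu_count() or 1
    if threads < 1:
        raise ValueError("threads must be >= 1")
    return [
        "docker", "run", "--rm",
        "-p", f"{port}:6379",
        "--ulimit", "memlock=-1",
        DRAGONFLY_IMAGE,
        f"--proactor_threads={threads}",
    ]


if __name__ == "__main__":
    # Print a copy-pasteable command for a 2-thread local setup.
    print(shlex.join(dragonfly_docker_command(threads=2)))
```

For a local machine shared with other services, passing a small explicit value such as `threads=2` trades throughput for lower idle CPU; on a dedicated production host the default (all cores) follows the advice above.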

In terms of commands, KEYS is definitely discouraged for production usage because it’s very expensive. SCAN is much better, but it’s also not the strongest command in terms of performance. As an in-memory key-value data store, Dragonfly is at its strongest when data is accessed directly by key.
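
For reference, a rough sketch of what cursor-based iteration with SCAN looks like, as opposed to a single blocking KEYS call. It assumes a client object exposing `scan(cursor=..., match=..., count=...)` that returns `(next_cursor, keys)`, which is the shape redis-py uses; the function name `iter_keys` and the default pattern are only illustrative.

```python
def iter_keys(client, pattern="M*", count=100):
    """Yield keys matching `pattern` by walking the SCAN cursor.

    `client` is assumed to expose scan(cursor=, match=, count=) returning
    (next_cursor, keys), as redis-py does. SCAN may return the same key
    more than once, so already-seen keys are skipped.
    """
    seen = set()
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=pattern, count=count)
        for key in keys:
            if key not in seen:
                seen.add(key)
                yield key
        # A cursor of 0 means the server has finished the full iteration.
        if cursor == 0:
            break
```

Each SCAN call does a bounded amount of work, so the server stays responsive between calls, but a full pass still visits the whole keyspace. That is why it helps compared to KEYS and yet remains costly when run in a tight polling loop.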

Maybe you can share your use case, and we can try to find ways to reduce the SCAN command usage.

foxmuldersatya · August 13, 2024, 12:01pm

Hey @joezhou_df, the use case at a high level is described below:

We have multiple sorted sets (which are basically priority queues). They are fairly large, so instead of maintaining in-memory queues, we are maintaining them in Dragonfly.

Each sorted set is identified by a key, and this key itself is dynamic, based on the inputs received.

We need to perform certain actions for each item on the queue stored against each key.

Since we do not know the exact keys or how many there are, we are polling Dragonfly to do 2 things:

1. Get all the keys. Let's say we retrieved 5 keys.

2. Iterate over the 5 keys and, for each key, check whether its sorted set has any elements. If yes, pick the first element of the set for processing.

This process keeps going on indefinitely.

Have a look at the image I attached. Here M1, M2, M3 and M4 are the keys, which I find with the SCAN command using the pattern M*.

That gives me 4 queues (or sorted sets, as they are called in Dragonfly), and I process the elements of each.

I have been thinking about which would be better, polling or publish/subscribe. However, we need to keep in mind that there can be multiple instances of the Spring Boot service running this process.

Let me know your thoughts.

joezhou_df · August 26, 2024, 2:42pm

Hey, thanks for the information.

"Each sorted set is identified by a key, this key itself is dynamic based on the inputs received."
"Since we are not aware of the exact key and the number of keys, we are polling Dragonfly."

I think this is the problem. For Dragonfly, the access pattern should ideally be from key → value, which means that a deterministic key is crucial. Once you have all the keys, operating on them however you like generally won’t be a problem.

The bottleneck is still your access pattern. Maybe you can think of a better way in your application to deal with this logic. If SCAN is not avoidable, then I’d say that Dragonfly is still not used in the best way.
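
One possible way to get a deterministic key, sketched here purely as an assumption and not as an official recommendation: keep a registry set under a fixed, known key that records which queue keys exist, so consumers read that set instead of scanning the keyspace. The names `REGISTRY_KEY`, `enqueue` and `poll_once` are hypothetical, and the client is assumed to expose redis-py style `zadd`, `sadd`, `smembers` and `zpopmin`.

```python
# Hypothetical fixed key holding the names of all queue keys.
REGISTRY_KEY = "queues:active"


def enqueue(client, queue_key, item, priority):
    """Add an item to a sorted-set queue and record the queue in the registry."""
    client.zadd(queue_key, {item: priority})
    client.sadd(REGISTRY_KEY, queue_key)


def poll_once(client):
    """Pop the lowest-score item from every registered queue.

    ZPOPMIN removes the item as it returns it, so two service instances
    polling at the same time should not receive the same element.
    Removing empty queues from the registry is left out on purpose:
    doing that safely needs an atomic step (for example a script).
    """
    picked = []
    for queue_key in sorted(client.smembers(REGISTRY_KEY)):
        popped = client.zpopmin(queue_key, 1)
        if popped:
            member, score = popped[0]
            picked.append((queue_key, member, score))
    return picked
```

With this layout every read goes from a known key to its value: one SMEMBERS on the registry, then one ZPOPMIN per queue. Whether it fits depends on how the dynamic keys are produced, so treat it as a starting point for discussion.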
