LATENCY command and related subcommands

absolutely

so basically Dragonfly does not even bother using its CPU

Also here’s most of the spec sheet from the supplier on that machine (at least the important parts):

1x Dell PowerEdge R660
1x Intel Xeon Platinum 8468 2.1G 48c/96t
2x 960GB SSD vSAS Read Intensive 12Gbps (Raid 1)
16x 32GB DIMM DDR5 4800MT/s Dual Rank
1x Broadcom 10/25GbE SFP NIC

can you please paste the output of `redis-cli INFO ALL`?

it does not contain any data there

yep no worries

# Server
redis_version:6.2.11
dragonfly_version:df-v1.12.1
redis_mode:standalone
arch_bits:64
multiplexing_api:iouring
tcp_port:6379
thread_count:96
uptime_in_seconds:581841
uptime_in_days:6

# Clients
connected_clients:905
client_read_buffer_bytes:234240
blocked_clients:0
dispatch_queue_entries:0

# Memory
used_memory:122685944
used_memory_human:117.00MiB
used_memory_peak:886319928
used_memory_rss:1427558400
used_memory_rss_human:1.33GiB
used_memory_peak_rss:2169511936
comitted_memory:1773666304
maxmemory:427967859916
maxmemory_human:398.58GiB
object_used_memory:18446744072240473152
table_used_memory:116698312
num_buckets:1401120
num_entries:13154
inline_keys:9997
strval_bytes:1637924352
updateval_amount:-791269584
listpack_blobs:0
listpack_bytes:0
small_string_bytes:422192
pipeline_cache_bytes:0
dispatch_queue_bytes:0
dispatch_queue_peak_bytes:0
client_read_buffer_peak_bytes:409600
cache_mode:store
maxmemory_policy:noeviction

# Stats
total_connections_received:1805562
total_commands_processed:25326333240
instantaneous_ops_per_sec:66828
total_pipelined_commands:0
total_net_input_bytes:1304468534320
total_net_output_bytes:232612687595
instantaneous_input_kbps:-1
instantaneous_output_kbps:-1
rejected_connections:-1
expired_keys:1879868192
evicted_keys:0
hard_evictions:0
garbage_checked:1496957
garbage_collected:59133
bump_ups:0
stash_unloaded:91332
oom_rejections:0
traverse_ttl_sec:12
delete_ttl_sec:10
keyspace_hits:2285631900
keyspace_misses:19639946432
total_reads_processed:25258620568
total_writes_processed:25224819403
defrag_attempt_total:0
defrag_realloc_total:0
defrag_task_invocation_total:0
eval_io_coordination_total:0
eval_shardlocal_coordination_total:35617462
eval_squashed_flushes:0
tx_schedule_cancel_total:0

# Tiered
tiered_entries:0
tiered_bytes:0
tiered_reads:0
tiered_writes:0
tiered_reserved:0
tiered_capacity:0
tiered_aborted_write_total:0
tiered_flush_skip_total:0

# Persistence
last_save:1701127319
last_save_duration_sec:0
last_save_file:
loading:0
rdb_changes_since_last_save:3386506308

# Replication
role:master
connected_slaves:0
master_replid:1ebafee12327bb0c44c62a5c8e179c1a2e8aebbd

# Commandstats
cmdstat_CLIENT:calls=40,usec=127,usec_per_call=3.175
cmdstat_COMMAND:calls=9,usec=21307,usec_per_call=2367.44
cmdstat_EVAL:calls=35617462,usec=1078148396,usec_per_call=30.2702
cmdstat_EXPIRE:calls=30289476,usec=1421746,usec_per_call=0.0469386
cmdstat_GET:calls=5962461194,usec=123420008405,usec_per_call=20.6995
cmdstat_HELLO:calls=20,usec=668,usec_per_call=33.4
cmdstat_INCR:calls=35617462,usec=6421336,usec_per_call=0.180286
cmdstat_INFO:calls=28,usec=28611,usec_per_call=1021.82
cmdstat_LATENCY:calls=12,usec=628,usec_per_call=52.3333
cmdstat_LPOP:calls=15897210221,usec=283684521484,usec_per_call=17.8449
cmdstat_PING:calls=35736612,usec=114230685,usec_per_call=3.19646
cmdstat_RPUSH:calls=35617962,usec=787849303,usec_per_call=22.1194
cmdstat_SET:calls=3293782736,usec=69795708123,usec_per_call=21.1901
unknown_LATEHCY:1
unknown_VERSION:1

# Search
search_memory:0
search_num_indices:0
search_num_entries:0

# Errorstats
syntax_error:17
unknown_cmd:2

# Keyspace
db0:keys=13154,expires=3157,avg_ttl=-1

# Cpu
used_cpu_sys:392505.145294
used_cpu_user:222523.791315
used_cpu_sys_children:0.0
used_cpu_user_children:0.0
used_cpu_sys_main_thread:6178.161887
used_cpu_user_main_thread:3333.97309

# Cluster
cluster_enabled:0

ok, according to these stats Dragonfly behaves perfectly well; it does not even feel any load.
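For what it's worth, the `cmdstat_*` lines in the pasted INFO already show server-side per-command latency (`usec / calls`), which is one way to confirm the claim above. A minimal sketch, parsing one line from the output (the helper name is my own, not part of any Dragonfly tooling):

```python
# Sanity-check server-side latency from a pasted INFO cmdstat line.
# If Dragonfly spends tens of microseconds per command while clients
# observe ~1 ms, the gap must come from the network or the client side.

def parse_cmdstat(line: str) -> dict:
    """Parse a 'cmdstat_NAME:calls=...,usec=...,usec_per_call=...' line."""
    name, fields = line.split(":", 1)
    stats = dict(kv.split("=") for kv in fields.split(","))
    return {"cmd": name.removeprefix("cmdstat_"),
            "calls": int(stats["calls"]),
            "usec": int(stats["usec"])}

# Line copied verbatim from the INFO output above.
get_stats = parse_cmdstat(
    "cmdstat_GET:calls=5962461194,usec=123420008405,usec_per_call=20.6995")
server_side_ms = get_stats["usec"] / get_stats["calls"] / 1000
print(f"GET server-side latency: {server_side_ms:.3f} ms")  # ~0.021 ms
```

So GET costs Dragonfly roughly 21 µs per call, about 50x below the ~1 ms the client sees.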

I suspect the latency is either from networking or on the client side

yeah totally, that’s something I’ve observed from the outside, I just can’t verify it from the stats…

I feel as if the latency is just wrong in our case; I can’t imagine 1s latency for anything we run within our datacenter (redis/memcached/kafka), most are very low, 1-5ms on avg

I mean 1ms latency is also high :slightly_smiling_face:

of course it’s not 1s

I saw your memtier run with average latency of 1ms

I think it’s too high

okay

are there common tuning steps? or do you think it’s a networking issue?

I mean, maybe it’s ok; I am not so familiar with on-prem networking standards

the test I ran was from a VM in our VMware cluster, which is in the same rack as, or a few racks down from, this new machine
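One way to split the network from the server in a setup like that is a bare TCP round-trip probe, run from the same VM as the memtier test. This is a self-contained sketch (it measures against a local echo server so it is runnable anywhere; in practice you would point it at the Dragonfly host and send an inline `PING`):

```python
# Minimal TCP round-trip probe: measures wire + stack latency without any
# Redis/Dragonfly protocol work, so it isolates the network contribution.
# Runs against a local echo server here; the host/port are placeholders.

import socket
import threading
import time

def echo_server(listener: socket.socket) -> None:
    """Accept one connection and echo everything back until it closes."""
    conn, _ = listener.accept()
    with conn:
        while data := conn.recv(1024):
            conn.sendall(data)

def measure_rtt(host: str, port: int, samples: int = 100) -> float:
    """Return the average round-trip time in milliseconds."""
    with socket.create_connection((host, port)) as s:
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        start = time.perf_counter()
        for _ in range(samples):
            s.sendall(b"PING\r\n")
            s.recv(64)  # block until the echoed reply arrives
        return (time.perf_counter() - start) / samples * 1000

listener = socket.create_server(("127.0.0.1", 0))
port = listener.getsockname()[1]
threading.Thread(target=echo_server, args=(listener,), daemon=True).start()
rtt_ms = measure_rtt("127.0.0.1", port)
print(f"avg RTT: {rtt_ms:.3f} ms")
```

If this probe already shows ~1 ms between the VM and the new machine, the memtier number is just the network floor; if it shows tens of microseconds, the overhead is in the client or benchmark path.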