Client: 8 vCPU / 16 GB, CentOS 7.6; server: MySQL 5.7, 2 vCPU / 4 GB, 100 GB local SSD
Reference:
Performance overview (MySQL edition), performance white paper, ApsaraDB RDS, Alibaba Cloud
yum install gcc gcc-c++ autoconf automake make libtool bzr mysql-devel git mysql
git clone https://github.com/akopytov/sysbench.git
cd sysbench
git checkout 1.0.20
./autogen.sh
./configure --prefix=/usr --mandir=/usr/share/man
make
make install
# Note: ffffffff selects 32 CPUs (each f covers 4 CPUs). Adjust to your actual configuration; for example, on an 8-core ECS instance, use ff.
sh -c 'for x in /sys/class/net/eth0/queues/rx-*; do echo ff > $x/rps_cpus; done'
sh -c 'for x in /sys/class/net/eth0/queues/rx-*; do echo 8192 > $x/rps_flow_cnt; done' # 32768 / N, where N is the number of RX queues
sh -c "echo 32768 > /proc/sys/net/core/rps_sock_flow_entries"
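The two values written above (the `rps_cpus` mask and the per-queue `rps_flow_cnt`) follow directly from the CPU count and the number of RX queues. A minimal sketch of that arithmetic, assuming at most 32 CPUs; the helper names `rps_mask` and `rps_flow_cnt` are hypothetical and not part of the original notes:

```shell
# Hypothetical helpers for the RPS settings above.

# rps_mask N: hex bitmask with the lowest N bits set (1 <= N <= 32).
rps_mask() {
    n=$1
    if [ "$n" -lt 1 ] || [ "$n" -gt 32 ]; then
        echo "rps_mask: CPU count must be between 1 and 32" >&2
        return 1
    fi
    printf '%x\n' $(( (1 << n) - 1 ))
}

# rps_flow_cnt ENTRIES QUEUES: per-queue flow count (integer division).
rps_flow_cnt() {
    echo $(( $1 / $2 ))
}

rps_mask 8              # prints ff
rps_mask 32             # prints ffffffff
rps_flow_cnt 32768 4    # prints 8192
```

The value 8192 used above corresponds to 4 RX queues. Masks for more than 32 CPUs need comma-separated 32-bit groups, which this sketch does not handle.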
# Prepare data
sysbench --db-driver=mysql --mysql-host=rm-uf6hmibzyjeo00r2j.mysql.rds.aliyuncs.com --mysql-port=3306 --mysql-user=bench --mysql-password=fJyR6gT6RIZn --mysql-db=sbtest --table_size=25000 --tables=250 --events=0 --time=600 oltp_read_write prepare
# Run the workload
sysbench --db-driver=mysql --mysql-host=rm-uf6hmibzyjeo00r2j.mysql.rds.aliyuncs.com --mysql-port=3306 --mysql-user=bench --mysql-password=fJyR6gT6RIZn --mysql-db=sbtest --table_size=25000 --tables=250 --events=0 --time=600 --threads=4 --percentile=95 --report-interval=1 oltp_read_write run
# Clean up data
sysbench --db-driver=mysql --mysql-host=rm-uf6hmibzyjeo00r2j.mysql.rds.aliyuncs.com --mysql-port=3306 --mysql-user=bench --mysql-password=fJyR6gT6RIZn --mysql-db=sbtest --table_size=25000 --tables=250 --events=0 --time=600 --threads=4 --percentile=95 oltp_read_write cleanup
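The three commands above share almost every option, and the later runs only change `--threads`. A possible way to keep them in sync is a small command builder. This is a sketch only: the function name `sysbench_cmd` and the `MYSQL_*` environment variables are assumptions, and for simplicity it passes the same option set to every phase. It prints the command instead of running it.

```shell
# Connection settings are read from the environment. The variable names and
# defaults here are illustrative, not taken from the original notes.
: "${MYSQL_HOST:=127.0.0.1}"
: "${MYSQL_USER:=bench}"
: "${MYSQL_PASSWORD:=changeme}"

# sysbench_cmd PHASE [THREADS]: print the sysbench command line for one phase
# (prepare, run, or cleanup). THREADS defaults to 4.
sysbench_cmd() {
    phase=$1
    threads=${2:-4}
    conn="--db-driver=mysql --mysql-host=$MYSQL_HOST --mysql-port=3306"
    conn="$conn --mysql-user=$MYSQL_USER --mysql-password=$MYSQL_PASSWORD --mysql-db=sbtest"
    load="--table_size=25000 --tables=250 --events=0 --time=600"
    echo "sysbench $conn $load --threads=$threads --percentile=95 --report-interval=1 oltp_read_write $phase"
}

# Dry run: print the run command for each thread count used in these notes.
for t in 4 10 100; do
    sysbench_cmd run "$t"
done
```

Once the printed commands look right, they can be piped to `sh`; a password containing shell metacharacters would need extra quoting first.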
SQL statistics:
queries performed:
read: 1061382
write: 303252
other: 151626
total: 1516260
transactions: 75813 (126.35 per sec.)
queries: 1516260 (2526.93 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0395s
total number of events: 75813
Latency (ms):
min: 26.24
avg: 31.66
max: 237.80
95th percentile: 37.56
sum: 2400015.50
Threads fairness:
events (avg/stddev): 18953.2500/1137.92
execution time (avg/stddev): 600.0039/0.01
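With this many result blocks, comparing runs is easier from a one-line summary. A minimal sketch that pulls TPS, QPS, average latency, and 95th-percentile latency out of one saved report; the function name `sysbench_summary` is hypothetical, and the field positions assume the sysbench 1.0 report layout shown above:

```shell
# sysbench_summary: read one sysbench report on stdin and print a one-line
# summary. Matching on the first field avoids picking up "queries performed:"
# or "events (avg/stddev):".
sysbench_summary() {
    awk '
        $1 == "transactions:" { tps = $3; sub(/^\(/, "", tps) }
        $1 == "queries:"      { qps = $3; sub(/^\(/, "", qps) }
        $1 == "avg:"          { avg = $2 }
        $1 == "95th"          { p95 = $3 }
        END { printf "tps=%s qps=%s avg_ms=%s p95_ms=%s\n", tps, qps, avg, p95 }
    '
}

# Example using lines from the report above.
sysbench_summary <<'EOF'
transactions: 75813 (126.35 per sec.)
queries: 1516260 (2526.93 per sec.)
Latency (ms):
avg: 31.66
95th percentile: 37.56
EOF
```

For the report above this prints `tps=126.35 qps=2526.93 avg_ms=31.66 p95_ms=37.56`.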
# Results after correcting the availability zone
SQL statistics:
queries performed:
read: 2785804
write: 795944
other: 397972
total: 3979720
transactions: 198986 (331.63 per sec.)
queries: 3979720 (6632.51 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0312s
total number of events: 198986
Latency (ms):
min: 7.27
avg: 12.06
max: 503.64
95th percentile: 17.32
sum: 2399889.65
Threads fairness:
events (avg/stddev): 49746.5000/1678.32
execution time (avg/stddev): 599.9724/0.01
# MySQL 5.7, hybrid SSD
SQL statistics:
queries performed:
read: 4303586
write: 1229596
other: 614798
total: 6147980
transactions: 307399 (512.32 per sec.)
queries: 6147980 (10246.41 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0115s
total number of events: 307399
Latency (ms):
min: 5.66
avg: 7.81
max: 344.62
95th percentile: 11.45
sum: 2399684.09
Threads fairness:
events (avg/stddev): 76849.7500/858.93
execution time (avg/stddev): 599.9210/0.00
# x86 general-purpose, 4 threads
SQL statistics:
queries performed:
read: 5946374
write: 1698964
other: 849482
total: 8494820
transactions: 424741 (707.88 per sec.)
queries: 8494820 (14157.70 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0129s
total number of events: 424741
Latency (ms):
min: 4.65
avg: 5.65
max: 208.69
95th percentile: 7.17
sum: 2399569.82
Threads fairness:
events (avg/stddev): 106185.2500/3760.42
execution time (avg/stddev): 599.8925/0.00
# x86, 10 threads
SQL statistics:
queries performed:
read: 13006854
write: 3716244
other: 1858122
total: 18581220
transactions: 929061 (1548.38 per sec.)
queries: 18581220 (30967.63 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0193s
total number of events: 929061
Latency (ms):
min: 4.63
avg: 6.46
max: 211.35
95th percentile: 7.43
sum: 5998875.11
Threads fairness:
events (avg/stddev): 92906.1000/4512.15
execution time (avg/stddev): 599.8875/0.01
# 100 threads
SQL statistics:
queries performed:
read: 5294716
write: 1512776
other: 756388
total: 7563880
transactions: 378194 (628.95 per sec.)
queries: 7563880 (12579.07 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 601.3055s
total number of events: 378194
Latency (ms):
min: 6.43
avg: 158.81
max: 3795.42
95th percentile: 623.33
sum: 60061378.06
Threads fairness:
events (avg/stddev): 3781.9400/139.13
execution time (avg/stddev): 600.6138/0.37
SQL statistics:
queries performed:
read: 4672976
write: 1335136
other: 667568
total: 6675680
transactions: 333784 (556.29 per sec.)
queries: 6675680 (11125.85 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0140s
total number of events: 333784
Latency (ms):
min: 5.78
avg: 7.19
max: 162.70
95th percentile: 7.56
sum: 2399604.13
Threads fairness:
events (avg/stddev): 83446.0000/235.12
execution time (avg/stddev): 599.9010/0.00
# threads = 100
SQL statistics:
queries performed:
read: 5320448
write: 1520128
other: 760064
total: 7600640
transactions: 380032 (633.10 per sec.)
queries: 7600640 (12661.96 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.2722s
total number of events: 380032
Latency (ms):
min: 30.13
avg: 157.91
max: 1405.64
95th percentile: 248.83
sum: 60011428.36
Threads fairness:
events (avg/stddev): 3800.3200/578.23
execution time (avg/stddev): 600.1143/0.06
# MySQL 5.7
# First run
SQL statistics:
queries performed:
read: 6476946
write: 1850556
other: 925278
total: 9252780
transactions: 462639 (771.05 per sec.)
queries: 9252780 (15420.99 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0106s
total number of events: 462639
Latency (ms):
min: 4.60
avg: 5.19
max: 133.18
95th percentile: 5.67
sum: 2399483.60
Threads fairness:
events (avg/stddev): 115659.7500/1057.14
execution time (avg/stddev): 599.8709/0.00
# Second run
SQL statistics:
queries performed:
read: 7181076
write: 2051736
other: 1025868
total: 10258680
transactions: 512934 (854.87 per sec.)
queries: 10258680 (17097.41 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0123s
total number of events: 512934
Latency (ms):
min: 4.23
avg: 4.68
max: 98.30
95th percentile: 5.00
sum: 2399311.81
Threads fairness:
events (avg/stddev): 128233.5000/570.49
execution time (avg/stddev): 599.8280/0.00
# threads 10
SQL statistics:
queries performed:
read: 10031532
write: 2866152
other: 1433076
total: 14330760
transactions: 716538 (1194.18 per sec.)
queries: 14330760 (23883.64 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.0226s
total number of events: 716538
Latency (ms):
min: 4.29
avg: 8.37
max: 103.55
95th percentile: 26.68
sum: 5999048.55
Threads fairness:
events (avg/stddev): 71653.8000/1096.57
execution time (avg/stddev): 599.9049/0.01
# threads 100
SQL statistics:
queries performed:
read: 8147524
write: 2327864
other: 1163932
total: 11639320
transactions: 581966 (969.61 per sec.)
queries: 11639320 (19392.11 per sec.)
ignored errors: 0 (0.00 per sec.)
reconnects: 0 (0.00 per sec.)
General statistics:
total time: 600.2078s
total number of events: 581966
Latency (ms):
min: 5.44
avg: 103.11
max: 300.15
95th percentile: 150.29
sum: 60009011.66
Threads fairness:
events (avg/stddev): 5819.6600/227.54
execution time (avg/stddev): 600.0901/0.05
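A quick consistency check on the sysbench reports above: each sysbench thread issues one transaction at a time, so throughput multiplied by average latency should come out close to the thread count (Little's law). A small sketch of that arithmetic; the function name `implied_concurrency` is hypothetical:

```shell
# implied_concurrency TPS AVG_LATENCY_MS: throughput x latency (Little's law),
# printed with one decimal place.
implied_concurrency() {
    awk -v tps="$1" -v ms="$2" 'BEGIN { printf "%.1f\n", tps * ms / 1000 }'
}

implied_concurrency 126.35 31.66     # first 4-thread run, prints 4.0
implied_concurrency 1194.18 8.37     # "threads 10" run, prints 10.0
implied_concurrency 969.61 103.11    # "threads 100" run, prints 100.0
```

When the product matches the thread count, the report is internally consistent. In the last pair of runs, going from 10 to 100 threads raised latency from 8.37 ms to 103.11 ms while TPS dropped from 1194.18 to 969.61, which would be consistent with the server already being saturated at the lower thread count.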
========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: test1-0001: GNU/Linux
OS: GNU/Linux -- 3.10.0-1062.12.1.el7.x86_64 -- #1 SMP Tue Feb 4 23:02:59 UTC 2020
Machine: x86_64 (x86_64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz (6000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 1: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz (6000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 2: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz (6000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 3: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz (6000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 4: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz (6000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 5: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz (6000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 6: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz (6000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 7: Intel(R) Xeon(R) Gold 6266C CPU @ 3.00GHz (6000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
15:32:04 up 12 min, 1 user, load average: 0.00, 0.01, 0.01; runlevel 3
------------------------------------------------------------------------
Benchmark Run: Sun Aug 16 2020 15:32:04 - 16:00:05
8 CPUs in system; running 1 parallel copy of tests
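# Dhrystone focuses on string handling and has no floating-point operations. It exercises the compiler and linker, code optimization, the memory cache, wait states, and integer data types, so both hardware and software design strongly affect the result.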
Dhrystone 2 using register variables 39735614.9 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4713.5 MWIPS (9.8 s, 7 samples)
Execl Throughput 3817.8 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 1028946.0 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 274095.9 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 3173110.3 KBps (30.0 s, 2 samples)
Pipe Throughput 1801318.7 lps (10.0 s, 7 samples)
Pipe-based Context Switching 73406.7 lps (10.0 s, 7 samples)
Process Creation 12619.6 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 9117.6 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 4310.4 lpm (60.0 s, 2 samples)
System Call Overhead 2410379.4 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 39735614.9 3404.9
Double-Precision Whetstone 55.0 4713.5 857.0
Execl Throughput 43.0 3817.8 887.9
File Copy 1024 bufsize 2000 maxblocks 3960.0 1028946.0 2598.3
File Copy 256 bufsize 500 maxblocks 1655.0 274095.9 1656.2
File Copy 4096 bufsize 8000 maxblocks 5800.0 3173110.3 5470.9
Pipe Throughput 12440.0 1801318.7 1448.0
Pipe-based Context Switching 4000.0 73406.7 183.5
Process Creation 126.0 12619.6 1001.6
Shell Scripts (1 concurrent) 42.4 9117.6 2150.4
Shell Scripts (8 concurrent) 6.0 4310.4 7184.0
System Call Overhead 15000.0 2410379.4 1606.9
========
System Benchmarks Index Score 1648.6
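The INDEX column above is the result divided by the baseline, times 10, and the final score is the geometric mean of the per-test indices. A sketch of both calculations; the function names `ub_index` and `ub_score` are hypothetical. Because the score is recomputed here from indices already rounded to one decimal, it should land at or very near the reported 1648.6 rather than being guaranteed to match exactly.

```shell
# ub_index BASELINE RESULT: index for a single test (result / baseline * 10).
ub_index() {
    awk -v b="$1" -v r="$2" 'BEGIN { printf "%.1f\n", r / b * 10 }'
}

# ub_score INDEX...: geometric mean of the given index values.
ub_score() {
    printf '%s\n' "$@" |
        awk '{ s += log($1); n++ } END { printf "%.1f\n", exp(s / n) }'
}

ub_index 116700.0 39735614.9    # Dhrystone, single copy: prints 3404.9
ub_score 3404.9 857.0 887.9 2598.3 1656.2 5470.9 1448.0 183.5 1001.6 2150.4 7184.0 1606.9
```

Because the score is a geometric mean, one weak test (here Pipe-based Context Switching at 183.5) pulls the overall number down less than it would in an arithmetic mean.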
------------------------------------------------------------------------
Benchmark Run: Sun Aug 16 2020 16:00:05 - 16:28:07
8 CPUs in system; running 8 parallel copies of tests
Dhrystone 2 using register variables 317358744.4 lps (10.0 s, 7 samples)
Double-Precision Whetstone 37639.5 MWIPS (9.8 s, 7 samples)
Execl Throughput 27667.8 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 1007888.4 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 267638.7 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 3234483.4 KBps (30.0 s, 2 samples)
Pipe Throughput 14301960.8 lps (10.0 s, 7 samples)
Pipe-based Context Switching 1921916.8 lps (10.0 s, 7 samples)
Process Creation 80416.8 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 46763.4 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 6647.0 lpm (60.0 s, 2 samples)
System Call Overhead 5942405.4 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 317358744.4 27194.4
Double-Precision Whetstone 55.0 37639.5 6843.6
Execl Throughput 43.0 27667.8 6434.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 1007888.4 2545.2
File Copy 256 bufsize 500 maxblocks 1655.0 267638.7 1617.2
File Copy 4096 bufsize 8000 maxblocks 5800.0 3234483.4 5576.7
Pipe Throughput 12440.0 14301960.8 11496.8
Pipe-based Context Switching 4000.0 1921916.8 4804.8
Process Creation 126.0 80416.8 6382.3
Shell Scripts (1 concurrent) 42.4 46763.4 11029.1
Shell Scripts (8 concurrent) 6.0 6647.0 11078.3
System Call Overhead 15000.0 5942405.4 3961.6
========
System Benchmarks Index Score 6396.4
======= Script description and score comparison completed! =======
# c6s.2xlarge.2
Benchmark Run: Tue Aug 18 2020 12:37:11 - 13:05:23
8 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 40924937.2 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4376.2 MWIPS (10.9 s, 7 samples)
Execl Throughput 3759.5 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 1052875.8 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 282652.4 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 3145030.1 KBps (30.0 s, 2 samples)
Pipe Throughput 1860167.7 lps (10.0 s, 7 samples)
Pipe-based Context Switching 72757.2 lps (10.0 s, 7 samples)
Process Creation 12278.6 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 9092.8 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 3742.6 lpm (60.0 s, 2 samples)
System Call Overhead 2499296.4 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 40924937.2 3506.8
Double-Precision Whetstone 55.0 4376.2 795.7
Execl Throughput 43.0 3759.5 874.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 1052875.8 2658.8
File Copy 256 bufsize 500 maxblocks 1655.0 282652.4 1707.9
File Copy 4096 bufsize 8000 maxblocks 5800.0 3145030.1 5422.5
Pipe Throughput 12440.0 1860167.7 1495.3
Pipe-based Context Switching 4000.0 72757.2 181.9
Process Creation 126.0 12278.6 974.5
Shell Scripts (1 concurrent) 42.4 9092.8 2144.5
Shell Scripts (8 concurrent) 6.0 3742.6 6237.6
System Call Overhead 15000.0 2499296.4 1666.2
========
System Benchmarks Index Score 1631.2
------------------------------------------------------------------------
Benchmark Run: Tue Aug 18 2020 13:05:23 - 13:33:29
8 CPUs in system; running 8 parallel copies of tests
Dhrystone 2 using register variables 222339067.7 lps (10.0 s, 7 samples)
Double-Precision Whetstone 33766.3 MWIPS (10.0 s, 7 samples)
Execl Throughput 22473.1 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 882385.9 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 243283.2 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 2858956.2 KBps (30.0 s, 2 samples)
Pipe Throughput 9234831.8 lps (10.0 s, 7 samples)
Pipe-based Context Switching 1512274.9 lps (10.0 s, 7 samples)
Process Creation 65841.1 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 36440.6 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 5059.0 lpm (60.0 s, 2 samples)
System Call Overhead 7914561.2 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 222339067.7 19052.2
Double-Precision Whetstone 55.0 33766.3 6139.3
Execl Throughput 43.0 22473.1 5226.3
File Copy 1024 bufsize 2000 maxblocks 3960.0 882385.9 2228.2
File Copy 256 bufsize 500 maxblocks 1655.0 243283.2 1470.0
File Copy 4096 bufsize 8000 maxblocks 5800.0 2858956.2 4929.2
Pipe Throughput 12440.0 9234831.8 7423.5
Pipe-based Context Switching 4000.0 1512274.9 3780.7
Process Creation 126.0 65841.1 5225.5
Shell Scripts (1 concurrent) 42.4 36440.6 8594.5
Shell Scripts (8 concurrent) 6.0 5059.0 8431.6
System Call Overhead 15000.0 7914561.2 5276.4
========
System Benchmarks Index Score 5352.9
======= Script description and score comparison completed! =======
[root@ecs-aa16-d080 ~]# time echo "scale=5000; 4*a(1)" | bc -l -q > /dev/null
real 0m17.777s
user 0m17.777s
sys 0m0.000s
[root@ecs-aa16-d080 ~]# time echo "scale=5000; 4*a(1)" | bc -l -q > /dev/null
real 0m17.778s
user 0m17.778s
sys 0m0.000s
[root@ecs-aa16-d080 ~]# time echo "scale=5000; 4*a(1)" | bc -l -q > /dev/null
real 0m17.780s
user 0m17.779s
sys 0m0.001s
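The three `bc` runs above differ only in the third decimal place, so a single averaged figure per host is enough for comparison. A minimal sketch that averages the `real` lines of saved `time` output; the function name `avg_real_seconds` is hypothetical:

```shell
# avg_real_seconds: average the "real" lines of bash `time` output on stdin,
# printed in seconds with three decimals.
avg_real_seconds() {
    awk '$1 == "real" {
             split($2, t, /[ms]/)          # "0m17.777s" -> t[1]=0, t[2]=17.777
             sum += t[1] * 60 + t[2]; n++
         }
         END { if (n) printf "%.3f\n", sum / n }'
}

# The three runs above.
printf 'real 0m17.777s\nreal 0m17.778s\nreal 0m17.780s\n' | avg_real_seconds    # prints 17.778
```

Note that bash's `time` keyword writes to stderr, so capturing it needs something like `{ time cmd; } 2>&1`.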
# Memory test <https://github.com/gregs1104/stream-scaling>
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 140247040 (elements), Offset = 0 (elements)
Memory per array = 1070.0 MiB (= 1.0 GiB).
Total memory required = 3210.0 MiB (= 3.1 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 1
Number of Threads counted = 1
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 97352 microseconds.
(= 97352 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 9651.5 0.233516 0.232498 0.234068
Scale: 11766.1 0.191044 0.190714 0.191289
Add: 12406.7 0.271513 0.271300 0.271743
Triad: 12364.9 0.272600 0.272217 0.273224
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
Number of Threads requested = 2
Function Best Rate MB/s Avg time Min time Max time
Triad: 24298.7 0.138610 0.138523 0.138677
Number of Threads requested = 3
Function Best Rate MB/s Avg time Min time Max time
Triad: 36138.4 0.093440 0.093140 0.093883
Number of Threads requested = 4
Function Best Rate MB/s Avg time Min time Max time
Triad: 27027.1 0.124726 0.124539 0.125044
Number of Threads requested = 5
Function Best Rate MB/s Avg time Min time Max time
Triad: 35105.3 0.101549 0.095881 0.103823
Number of Threads requested = 6
Function Best Rate MB/s Avg time Min time Max time
Triad: 38238.4 0.088223 0.088025 0.088327
Number of Threads requested = 7
Function Best Rate MB/s Avg time Min time Max time
Triad: 43929.0 0.076806 0.076622 0.077039
Number of Threads requested = 8
Function Best Rate MB/s Avg time Min time Max time
Triad: 49254.9 0.068445 0.068337 0.068624
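To see how memory bandwidth scales with thread count, the Triad rates above can be reduced to a speedup over the single-thread rate. A sketch, assuming the stream-scaling output layout shown above; the function name `triad_scaling` is hypothetical:

```shell
# triad_scaling: read stream-scaling output on stdin and print, per thread
# count, the Triad rate (MB/s) and its speedup over the first Triad rate seen.
triad_scaling() {
    awk '
        /Number of Threads requested/ { threads = $NF }
        $1 == "Triad:" {
            if (!base) base = $2
            printf "%d %s %.2f\n", threads, $2, $2 / base
        }
    '
}

# Three of the data points above.
triad_scaling <<'EOF'
Number of Threads requested = 1
Triad: 12364.9 0.272600 0.272217 0.273224
Number of Threads requested = 4
Triad: 27027.1 0.124726 0.124539 0.125044
Number of Threads requested = 8
Triad: 49254.9 0.068445 0.068337 0.068624
EOF
```

For these three points it prints speedups of 1.00, 2.19, and 3.98.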
========================================================================
BYTE UNIX Benchmarks (Version 5.1.3)
System: iZuf6c9w0i9d0wuto87kozZ: GNU/Linux
OS: GNU/Linux -- 3.10.0-957.21.3.el7.x86_64 -- #1 SMP Tue Jun 18 16:35:19 UTC 2019
Machine: x86_64 (x86_64)
Language: en_US.utf8 (charmap="UTF-8", collate="UTF-8")
CPU 0: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (5000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 1: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (5000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 2: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (5000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 3: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (5000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 4: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (5000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 5: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (5000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 6: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (5000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
CPU 7: Intel(R) Xeon(R) Platinum 8269CY CPU @ 2.50GHz (5000.0 bogomips)
Hyper-Threading, x86-64, MMX, Physical Address Ext, SYSENTER/SYSEXIT, SYSCALL/SYSRET
15:37:52 up 3 min, 1 user, load average: 0.21, 0.06, 0.03; runlevel 3
------------------------------------------------------------------------
Benchmark Run: Sun Aug 16 2020 15:37:52 - 16:05:55
8 CPUs in system; running 1 parallel copy of tests
Dhrystone 2 using register variables 39275991.8 lps (10.0 s, 7 samples)
Double-Precision Whetstone 4679.9 MWIPS (9.9 s, 7 samples)
Execl Throughput 3489.2 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 653760.5 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 170862.0 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 2140522.7 KBps (30.0 s, 2 samples)
Pipe Throughput 872162.5 lps (10.0 s, 7 samples)
Pipe-based Context Switching 93645.3 lps (10.0 s, 7 samples)
Process Creation 9767.5 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 8182.2 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 3363.2 lpm (60.0 s, 2 samples)
System Call Overhead 933885.7 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 39275991.8 3365.6
Double-Precision Whetstone 55.0 4679.9 850.9
Execl Throughput 43.0 3489.2 811.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 653760.5 1650.9
File Copy 256 bufsize 500 maxblocks 1655.0 170862.0 1032.4
File Copy 4096 bufsize 8000 maxblocks 5800.0 2140522.7 3690.6
Pipe Throughput 12440.0 872162.5 701.1
Pipe-based Context Switching 4000.0 93645.3 234.1
Process Creation 126.0 9767.5 775.2
Shell Scripts (1 concurrent) 42.4 8182.2 1929.8
Shell Scripts (8 concurrent) 6.0 3363.2 5605.4
System Call Overhead 15000.0 933885.7 622.6
========
System Benchmarks Index Score 1234.5
------------------------------------------------------------------------
Benchmark Run: Sun Aug 16 2020 16:05:55 - 16:34:01
8 CPUs in system; running 8 parallel copies of tests
Dhrystone 2 using register variables 218980623.4 lps (10.0 s, 7 samples)
Double-Precision Whetstone 32928.3 MWIPS (10.0 s, 7 samples)
Execl Throughput 17687.8 lps (30.0 s, 2 samples)
File Copy 1024 bufsize 2000 maxblocks 809622.4 KBps (30.0 s, 2 samples)
File Copy 256 bufsize 500 maxblocks 227996.9 KBps (30.0 s, 2 samples)
File Copy 4096 bufsize 8000 maxblocks 2638306.1 KBps (30.0 s, 2 samples)
Pipe Throughput 5594657.7 lps (10.0 s, 7 samples)
Pipe-based Context Switching 920645.7 lps (10.0 s, 7 samples)
Process Creation 49501.5 lps (30.0 s, 2 samples)
Shell Scripts (1 concurrent) 32753.9 lpm (60.0 s, 2 samples)
Shell Scripts (8 concurrent) 4597.2 lpm (60.0 s, 2 samples)
System Call Overhead 5117448.4 lps (10.0 s, 7 samples)
System Benchmarks Index Values BASELINE RESULT INDEX
Dhrystone 2 using register variables 116700.0 218980623.4 18764.4
Double-Precision Whetstone 55.0 32928.3 5987.0
Execl Throughput 43.0 17687.8 4113.4
File Copy 1024 bufsize 2000 maxblocks 3960.0 809622.4 2044.5
File Copy 256 bufsize 500 maxblocks 1655.0 227996.9 1377.6
File Copy 4096 bufsize 8000 maxblocks 5800.0 2638306.1 4548.8
Pipe Throughput 12440.0 5594657.7 4497.3
Pipe-based Context Switching 4000.0 920645.7 2301.6
Process Creation 126.0 49501.5 3928.7
Shell Scripts (1 concurrent) 42.4 32753.9 7725.0
Shell Scripts (8 concurrent) 6.0 4597.2 7662.1
System Call Overhead 15000.0 5117448.4 3411.6
========
System Benchmarks Index Score 4370.9
======= Script description and score comparison completed! =======
[root@iZbp1bldyjzra1jvby7mpvZ sysbench]# time echo "scale=5000; 4*a(1)" | bc -l -q > /dev/null
real 0m18.307s
user 0m18.305s
sys 0m0.000s
[root@iZbp1bldyjzra1jvby7mpvZ sysbench]# time echo "scale=5000; 4*a(1)" | bc -l -q > /dev/null
real 0m18.297s
user 0m18.296s
sys 0m0.001s
[root@iZbp1bldyjzra1jvby7mpvZ sysbench]# time echo "scale=5000; 4*a(1)" | bc -l -q > /dev/null
real 0m18.276s
user 0m18.274s
sys 0m0.001s
=== Testing up to 8 cores ===
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 140247040 (elements), Offset = 0 (elements)
Memory per array = 1070.0 MiB (= 1.0 GiB).
Total memory required = 3210.0 MiB (= 3.1 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Number of Threads requested = 1
Number of Threads counted = 1
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 100969 microseconds.
(= 100969 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 9779.8 0.230451 0.229448 0.231886
Scale: 11984.5 0.187431 0.187238 0.187529
Add: 12455.4 0.270397 0.270238 0.270519
Triad: 12441.2 0.270694 0.270547 0.271262
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
Number of Threads requested = 2
Function Best Rate MB/s Avg time Min time Max time
Triad: 24899.3 0.135398 0.135182 0.135631
Number of Threads requested = 3
Function Best Rate MB/s Avg time Min time Max time
Triad: 34882.6 0.096724 0.096493 0.097081
Number of Threads requested = 4
Function Best Rate MB/s Avg time Min time Max time
Triad: 41385.6 0.081970 0.081331 0.082727
Number of Threads requested = 5
Function Best Rate MB/s Avg time Min time Max time
Triad: 35742.4 0.108973 0.094172 0.114944
Number of Threads requested = 6
Function Best Rate MB/s Avg time Min time Max time
Triad: 35032.2 0.097299 0.096081 0.098133
Number of Threads requested = 7
Function Best Rate MB/s Avg time Min time Max time
Triad: 38849.6 0.088005 0.086640 0.089738
Number of Threads requested = 8
Function Best Rate MB/s Avg time Min time Max time
Triad: 42364.3 0.079894 0.079452 0.080624
Test tool: memtier_benchmark, read:write ratio 10:1
yum install autoconf automake make gcc-c++ git pcre-devel zlib-devel libmemcached-devel -y
wget https://github.com/libevent/libevent/releases/download/release-2.1.12-stable/libevent-2.1.12-stable.tar.gz
tar xfz libevent-2.1.12-stable.tar.gz
pushd libevent-2.1.12-stable
./configure
make
make install
popd
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig:${PKG_CONFIG_PATH}
git clone https://github.com/RedisLabs/memtier_benchmark.git
cd memtier_benchmark
git checkout 1.3.0
autoreconf -ivf
./configure
make
make install
memtier_benchmark -s redis-f0945b0-dcs-ccd7.dcs.huaweicloud.com -p 6379 -c 400 -d 32 --threads=10 --test-time=600
10 Threads
200 Connections per thread
10000 Requests per client
ALL STATS
=========================================================================
Type Ops/sec Hits/sec Misses/sec Latency KB/sec
-------------------------------------------------------------------------
Sets 18128.80 --- --- 10.25700 1396.23
Gets 181088.74 0.00 181088.74 10.01400 7054.19
Waits 0.00 --- --- 0.00000 ---
Totals 199217.53 0.00 181088.74 10.03600 8450.42
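The ALL STATS table can be reduced to one line per run, which also makes it easy to confirm that the observed GET:SET mix matches the intended 10:1 read:write ratio. A sketch, assuming the column layout shown above (ops/sec in column 2, latency in column 5); the function name `memtier_summary` is hypothetical:

```shell
# memtier_summary: read a memtier_benchmark "ALL STATS" table on stdin and
# print total ops/sec, average latency (ms), and the observed GET:SET ratio.
memtier_summary() {
    awk '
        $1 == "Sets"   { sets = $2 }
        $1 == "Gets"   { gets = $2 }
        $1 == "Totals" { ops = $2; lat = $5 }
        END {
            if (sets > 0)
                printf "ops=%.2f latency_ms=%.3f get_set_ratio=%.2f\n", ops, lat, gets / sets
        }
    '
}

# The table above.
memtier_summary <<'EOF'
Sets 18128.80 --- --- 10.25700 1396.23
Gets 181088.74 0.00 181088.74 10.01400 7054.19
Totals 199217.53 0.00 181088.74 10.03600 8450.42
EOF
```

For the table above this prints `ops=199217.53 latency_ms=10.036 get_set_ratio=9.99`.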
memtier_benchmark -s redis-f0945b0-dcs-ccd7.dcs.huaweicloud.com -p 6379 -c 400 -d 32 --threads=8 --test-time=600
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 600 secs] 0 threads: 113578265 ops, 191478 (avg: 189293) ops/sec, 7.94MB/sec (avg: 7.84MB/sec), 16.64 (avg: 16.90) msec latency
8 Threads
400 Connections per thread
600 Seconds
ALL STATS
=========================================================================
Type Ops/sec Hits/sec Misses/sec Latency KB/sec
-------------------------------------------------------------------------
Sets 17210.96 --- --- 17.08500 1325.87
Gets 172083.02 16.00 172067.02 16.88400 6703.63
Waits 0.00 --- --- 0.00000 ---
Totals 189293.98 16.00 172067.02 16.90200 8029.50
memtier_benchmark -s redis-f0945b0-dcs-ccd7.dcs.huaweicloud.com -p 6379 -c 100 -d 32 --threads=8 --test-time=180
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 180 secs] 0 threads: 36856696 ops, 210027 (avg: 204757) ops/sec, 8.70MB/sec (avg: 8.48MB/sec), 3.81 (avg: 3.91) msec latency
8 Threads
100 Connections per thread
180 Seconds
ALL STATS
=========================================================================
Type Ops/sec Hits/sec Misses/sec Latency KB/sec
-------------------------------------------------------------------------
Sets 18616.45 --- --- 3.91100 1434.13
Gets 186141.98 48.89 186093.09 3.90400 7252.43
Waits 0.00 --- --- 0.00000 ---
Totals 204758.42 48.89 186093.09 3.90500 8686.56
memtier_benchmark -s r-bp186rrp1prgdq4sxb.redis.rds.aliyuncs.com -p 6379 -c 400 -d 32 --threads=10 --test-time=180
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 180 secs] 0 threads: 31112421 ops, 171815 (avg: 172833) ops/sec, 7.13MB/sec (avg: 7.16MB/sec), 23.26 (avg: 23.14) msec latency
10 Threads
400 Connections per thread
180 Seconds
ALL STATS
=========================================================================
Type Ops/sec Hits/sec Misses/sec Latency KB/sec
-------------------------------------------------------------------------
Sets 15726.22 --- --- 24.67500 1211.08
Gets 157107.03 0.00 157107.03 22.98800 6120.37
Waits 0.00 --- --- 0.00000 ---
Totals 172833.25 0.00 157107.03 23.14200 7331.46
memtier_benchmark -s r-bp186rrp1prgdq4sxb.redis.rds.aliyuncs.com -p 6379 -c 400 -d 32 --threads=8 --test-time=180
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 180 secs] 0 threads: 32896314 ops, 177650 (avg: 182748) ops/sec, 7.35MB/sec (avg: 7.57MB/sec), 17.97 (avg: 17.51) msec latency
8 Threads
400 Connections per thread
180 Seconds
ALL STATS
=========================================================================
Type Ops/sec Hits/sec Misses/sec Latency KB/sec
-------------------------------------------------------------------------
Sets 16621.76 --- --- 18.77100 1280.14
Gets 166130.51 0.00 166130.51 17.38200 6471.52
Waits 0.00 --- --- 0.00000 ---
Totals 182752.27 0.00 166130.51 17.50800 7751.65
memtier_benchmark -s r-bp186rrp1prgdq4sxb.redis.rds.aliyuncs.com -p 6379 -c 100 -d 32 --threads=8 --test-time=120
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 100%, 120 secs] 0 threads: 23097679 ops, 192351 (avg: 192478) ops/sec, 7.97MB/sec (avg: 7.97MB/sec), 4.16 (avg: 4.16) msec latency
8 Threads
100 Connections per thread
120 Seconds
ALL STATS
=========================================================================
Type Ops/sec Hits/sec Misses/sec Latency KB/sec
-------------------------------------------------------------------------
Sets 17501.09 --- --- 4.16700 1348.11
Gets 174977.50 20.00 174957.50 4.15300 6816.52
Waits 0.00 --- --- 0.00000 ---
Totals 192478.58 20.00 174957.50 4.15500 8164.63