物理资源性能对比
有幸安装成功了一次greenplum,便想测试一下已安装的机器的性能。故如下对比。gp的版本一致。安装的方式,也是参照相同的安装方式。
性能对比,结论:插入性能相差不大,但是在计算最值,速度相差有10倍以上。
对比
72核
机器的具体配置:
uname -r
3.10.0-957.el7.x86_64
uname -a
Linux sz53011 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
6个固态硬盘,24个seg
内存
| 1 | total | used | free | shared | buff/cache | available | |
|---|---|---|---|---|---|---|---|
| 2 | Mem: | 377G | 9.0G | 197G | 6.2G | 171G | 360G |
| 3 | Swap: | 31G | 64M | 31G |
- 创建表,并插入1千万数据
CREATE TABLE test_demo_gen WITH (
APPENDONLY = TRUE,
COMPRESSLEVEL = 5,
ORIENTATION = COLUMN,
COMPRESSTYPE = ZLIB
) AS
select generate_series(1,10000000) as id ,md5(random()::text) as rnd
DISTRIBUTED BY (id);
SELECT 10000000
Time: 10392.693 ms
- 插入上亿数据测试
insert into test_demo_gen select generate_series(1,100000000) as id ,md5(random()::text) as rnd;
ERROR: Canceling query because of high VMEM usage. Used: 7332MB, available 814MB, red zone: 7372MB (runaway_cleaner.c:202) (seg10 slice1 10.181.86.12:55010 pid=6174) (runaway_cleaner.c:202)
Time: 19431.844 ms
- 插入1千万数据
insert into test_demo_gen select generate_series(1,10000000) as id ,md5(random()::text) as rnd;
INSERT 0 10000000
Time: 10275.093 ms
- 查找性能
select count(*),max(id),min(id),max(rnd),min(rnd) from test_demo_gen;
count | max | min | max | min
----------+----------+-----+----------------------------------+----------------------------------
20000000 | 10000000 | 1 | d684e87496865c1d12093354e9e96b68 | 5ba528f623dc140aa826324dede1f3d9
(1 row)
Time: 2511.471 ms
4核
机器配置
4 vCPU 16 GiB (I/O优化)
ecs.g5.xlarge 50Mbps (峰值)
4个seg,共用一个固态硬盘
- 创建表,并插入1千万数据
CREATE TABLE test_demo_gen WITH (
APPENDONLY = TRUE,
COMPRESSLEVEL = 5,
ORIENTATION = COLUMN,
COMPRESSTYPE = ZLIB
) AS
select generate_series(1,10000000) as id ,md5(random()::text) as rnd
DISTRIBUTED BY (id);
SELECT 10000000
Time: 14633.800 ms
插入速度,相差不大。
- 插入上亿数据测试
insert into test_demo_gen select generate_series(1,100000000) as id ,md5(random()::text) as rnd;
ERROR: Canceling query because of high VMEM usage. Used: 7379MB, available 813MB, red zone: 7372MB (runaway_cleaner.c:202) (seg2 slice1 172.19.241.151:55002 pid=9274) (runaway_cleaner.c:202)
Time: 27627.163 ms
一样会报错,可能是执行的sql内容太大,耗费内存。
- 插入1千万数据
insert into test_demo_gen select generate_series(1,10000000) as id ,md5(random()::text) as rnd;
INSERT 0 10000000
Time: 14438.655 ms
性能相差不大。略多一点
- 查找性能
select count(*),max(id),min(id),max(rnd),min(rnd) from test_demo_gen;
count | max | min | max | min
----------+----------+-----+----------------------------------+----------------------------------
20000000 | 10000000 | 1 | e91532e79ea5fb8dc0edac23fb3f0e23 | 5b1e2a40c78aedc9bd5412a8b92588aa
(1 row)
Time: 33523.427 ms
速度相差10倍以上,貌似差别可能是在cpu上,也有可能在固态硬盘、或者节点上。
gpstate
72核如下
-Starting gpstate with args:
-local Greenplum Version: 'postgres (Greenplum Database) 6.8.1 build commit:7118e8aca825b743dd9477d19406fcc06fa53852'
-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.8.1 build commit:7118e8aca825b743dd9477d19406fcc06fa53852) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Jun 11 2020 03:23:56'
-Obtaining Segment details from master...
-Gathering data from segments...
-Greenplum instance status summary
-----------------------------------------------------
- Master instance = Active
- Master standby = No master standby configured
- Total segment instance count from metadata = 24
-----------------------------------------------------
- Primary Segment Status
-----------------------------------------------------
- Total primary segments = 24
- Total primary segment valid (at master) = 24
- Total primary segment failures (at master) = 0
- Total number of postmaster.pid files missing = 0
- Total number of postmaster.pid files found = 24
- Total number of postmaster.pid PIDs missing = 0
- Total number of postmaster.pid PIDs found = 24
- Total number of /tmp lock files missing = 0
- Total number of /tmp lock files found = 24
- Total number postmaster processes missing = 0
- Total number postmaster processes found = 24
-----------------------------------------------------
- Mirror Segment Status
-----------------------------------------------------
- Mirrors not configured on this array
-----------------------------------------------------
总结
插入速度,其实相差不大,都会遇到耗费大内存的sql,跑不动的现象。但是,在计算性能上相差比较大,可能受节点数、硬盘数、cpu的影响。不确定是哪方面的原因。
之所以,插入性能相差不大,难道是因为插入的时候,只有主节点在工作?貌似不似。
故猜测,可能是因为cpu的原因。