物理资源性能对比

有幸安装成功了一次greenplum,便想测试一下已安装的机器的性能。故如下对比。gp的版本一致。安装的方式,也是参照相同的安装方式。

性能对比,结论:插入性能相差不大,但是在计算最值,速度相差有10倍以上。

对比

72核

机器的具体配置:

uname -r
3.10.0-957.el7.x86_64
uname -a
Linux sz53011 3.10.0-957.el7.x86_64 #1 SMP Thu Oct 4 20:48:51 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

6个固态硬盘,24个seg

内存

1 total used free shared buff/cache available
2 Mem: 377G 9.0G 197G 6.2G 171G 360G
3 Swap: 31G 64M 31G
  • 创建表,并插入1千万数据
CREATE TABLE test_demo_gen WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
) AS 
select generate_series(1,10000000) as id ,md5(random()::text) as rnd
DISTRIBUTED BY (id);
SELECT 10000000
Time: 10392.693 ms
  • 插入上亿数据测试
insert into test_demo_gen select generate_series(1,100000000) as id ,md5(random()::text) as rnd;
ERROR:  Canceling query because of high VMEM usage. Used: 7332MB, available 814MB, red zone: 7372MB (runaway_cleaner.c:202)  (seg10 slice1 10.181.86.12:55010 pid=6174) (runaway_cleaner.c:202)
Time: 19431.844 ms
  • 插入1千万数据
insert into test_demo_gen select generate_series(1,10000000) as id ,md5(random()::text) as rnd;
INSERT 0 10000000
Time: 10275.093 ms
  • 查找性能
select count(*),max(id),min(id),max(rnd),min(rnd) from test_demo_gen;
  count   |   max    | min |               max                |               min                
----------+----------+-----+----------------------------------+----------------------------------
 20000000 | 10000000 |   1 | d684e87496865c1d12093354e9e96b68 | 5ba528f623dc140aa826324dede1f3d9
(1 row)

Time: 2511.471 ms

4核

机器配置

4 vCPU 16 GiB (I/O优化)

ecs.g5.xlarge 50Mbps (峰值)

4个seg,共用一个固态硬盘

  • 创建表,并插入1千万数据
CREATE TABLE test_demo_gen WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
) AS 


select generate_series(1,10000000) as id ,md5(random()::text) as rnd

DISTRIBUTED BY (id);
SELECT 10000000
Time: 14633.800 ms

插入速度,相差不大。

  • 插入上亿数据测试
insert into test_demo_gen select generate_series(1,100000000) as id ,md5(random()::text) as rnd;
ERROR:  Canceling query because of high VMEM usage. Used: 7379MB, available 813MB, red zone: 7372MB (runaway_cleaner.c:202)  (seg2 slice1 172.19.241.151:55002 pid=9274) (runaway_cleaner.c:202)
Time: 27627.163 ms

一样会报错,可能是执行的sql内容太大,耗费内存。

  • 插入1千万数据
insert into test_demo_gen select generate_series(1,10000000) as id ,md5(random()::text) as rnd;
INSERT 0 10000000
Time: 14438.655 ms

性能相差不大。略多一点

  • 查找性能
select count(*),max(id),min(id),max(rnd),min(rnd) from test_demo_gen;
  count   |   max    | min |               max                |               min                
----------+----------+-----+----------------------------------+----------------------------------
 20000000 | 10000000 |   1 | e91532e79ea5fb8dc0edac23fb3f0e23 | 5b1e2a40c78aedc9bd5412a8b92588aa
(1 row)
Time: 33523.427 ms

速度相差10倍以上,貌似差别可能是在cpu上,也有可能在固态硬盘、或者节点上。

gpstate

72核如下

-Starting gpstate with args: 
-local Greenplum Version: 'postgres (Greenplum Database) 6.8.1 build commit:7118e8aca825b743dd9477d19406fcc06fa53852'
-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.8.1 build commit:7118e8aca825b743dd9477d19406fcc06fa53852) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 6.4.0, 64-bit compiled on Jun 11 2020 03:23:56'
-Obtaining Segment details from master...
-Gathering data from segments...
-Greenplum instance status summary
-----------------------------------------------------
-   Master instance                                = Active
-   Master standby                                 = No master standby configured
-   Total segment instance count from metadata     = 24
-----------------------------------------------------
-   Primary Segment Status
-----------------------------------------------------
-   Total primary segments                         = 24
-   Total primary segment valid (at master)        = 24
-   Total primary segment failures (at master)     = 0
-   Total number of postmaster.pid files missing   = 0
-   Total number of postmaster.pid files found     = 24
-   Total number of postmaster.pid PIDs missing    = 0
-   Total number of postmaster.pid PIDs found      = 24
-   Total number of /tmp lock files missing        = 0
-   Total number of /tmp lock files found          = 24
-   Total number postmaster processes missing      = 0
-   Total number postmaster processes found        = 24
-----------------------------------------------------
-   Mirror Segment Status
-----------------------------------------------------
-   Mirrors not configured on this array
-----------------------------------------------------

总结

插入速度,其实相差不大,都会遇到耗费大内存的sql,跑不动的现象。但是,在计算性能上相差比较大,可能受节点数、硬盘数、cpu的影响。不确定是哪方面的原因。

之所以,插入性能相差不大,难道是因为插入的时候,只有主节点在工作?貌似不似。

故猜测,可能是因为cpu的原因。