Inserting Data into Partitions

The task: scan roughly 40 days of data once, filter down to the days of interest, and partition the result by type. Two questions come up. First, how do you create a table that contains only a subset of the source columns and is also partitioned? (There appears to be no direct syntax for this.) Second, can indexes, unlogged tables, and similar tricks improve the speed?

Partitioning

Partition creation test

DROP TABLE IF EXISTS mytestss;
CREATE TABLE mytestss WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
) AS SELECT
    *
FROM
    tb_scan
WHERE
    ins_db_tm >= '2021-09-08 14:23'
    AND ins_db_tm < '2021-09-08 14:25'
DISTRIBUTED BY (ship_id)
partition by range (ins_db_tm) 
(
    PARTITION pn START ('2021-09-07 00:00:00'::timestamp without time zone) END ('2021-09-09 00:00:00'::timestamp without time zone) EVERY ('1 day'::interval), 
    DEFAULT PARTITION pdefault
)
;

The resulting error:

ERROR: Cannot create a partitioned table using CREATE TABLE AS SELECT
HINT: Use CREATE TABLE…LIKE (followed by INSERT…SELECT) instead
Time: 0.415 ms

So a partitioned table cannot be created directly with CREATE TABLE AS. And even following the hint, there does not appear to be any direct syntax that yields a partitioned table with only a subset of the columns.

CREATE TABLE ... LIKE test

DROP TABLE IF EXISTS mytestss;
CREATE TABLE mytestss(
    LIKE tb_scan
) 
WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
)
DISTRIBUTED BY (ship_id)
partition by range (ins_db_tm) 
(
    PARTITION pn START ('2021-09-07 00:00:00'::timestamp without time zone) END ('2021-09-09 00:00:00'::timestamp without time zone) EVERY ('1 day'::interval), 
    DEFAULT PARTITION pdefault
)
;

Creating a partitioned table with LIKE

The statement below does create the partitions correctly, but the new table necessarily carries all of the source table's columns. For a subset of columns, see "Workaround for creating partitions" further down.

DROP TABLE IF EXISTS mytestss;
CREATE TABLE mytestss(
    LIKE tb_scan
) 
WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
)
DISTRIBUTED BY (ship_id)
partition by list (scan_typ) 
(
    partition p24 values ('24'),
    partition p14 values ('14'),
    partition p10 values ('10'),
    partition p29 values ('29'),
    partition p18 values ('18'),
    partition p7 values ('7','07'),
    partition p13 values ('13'),
    partition p5 values ('5','05','03','3'),
    partition p20 values ('20'),
    partition p97 values ('97','96'),
    default partition pdefault
)
;
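
Once created, the partition layout can be sanity-checked through Greenplum's pg_partitions catalog view; a minimal sketch (column names as documented for GPDB, verify on your version):

SELECT partitionname, partitiontablename, partitionlistvalues
FROM pg_partitions
WHERE tablename = 'mytestss';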

insert into mytestss select * from tb_scan WHERE
    ins_db_tm >= '2021-09-07'
    AND ins_db_tm < '2021-09-08';

Even for a single day's worth of data, the row count is very large and the insert is not quick:

   count   
-----------
 440581826
 
160 seconds
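
To see where the rows landed, the per-partition distribution can be checked; a sketch (Greenplum names child partitions <parent>_1_prt_<partitionname>, so the p24 partition should be mytestss_1_prt_p24):

-- Distribution across the list partitions (scan_typ maps 1:1 onto them)
SELECT scan_typ, count(1) FROM mytestss GROUP BY 1 ORDER BY 2 DESC;

-- Or hit one child partition directly
SELECT count(1) FROM mytestss_1_prt_p24;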

Data-moving tests

IN

insert into mytestss select * from tb_scan 
WHERE ins_db_tm >= '2021-08-07'
    AND ins_db_tm < '2021-09-08'
    AND ship_id in (select ship_id from test_scan_qian);

The query plan for this is very complex.

INSERT 0 419074776
Time: 2238218.178 ms
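
At roughly 37 minutes this is slow. The plan can be inspected without actually executing the insert; a sketch:

EXPLAIN
insert into mytestss select * from tb_scan
WHERE ins_db_tm >= '2021-08-07'
    AND ins_db_tm < '2021-09-08'
    AND ship_id in (select ship_id from test_scan_qian);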

INNER JOIN

DROP TABLE IF EXISTS mytestss;
CREATE TABLE mytestss WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
) AS SELECT
    a.*
FROM
    tb_scan as a
INNER JOIN test_scan_qian as q on q.ship_id = a.ship_id
WHERE
    a.ins_db_tm >= '2021-08-07'
    AND a.ins_db_tm < '2021-09-08'
DISTRIBUTED BY (ship_id);


SELECT 419074776
Time: 1851479.388 ms

A slight improvement (about 31 minutes versus 37), but a pass over the whole table still takes half an hour, which leaves little time for anything else.

Time distribution

SELECT
    to_char(ins_db_tm,'YYYY-MM-DD'),
    count(1)
FROM
    mytestss 
group by 1 order by 2;

The idea is to complete the join for most rows using the most recent data first, and widen the range only for rows that remain unmatched. Seven days covers the vast majority of cases: in the distribution below, 2021-09-01 through 2021-09-07 account for 416,046,395 of the 419,074,776 rows, roughly 99.3%.

  to_char   |   count   
------------+-----------
 2021-08-09 |     10184
 2021-08-08 |     11570
 2021-08-07 |     12171
 2021-08-10 |     21328
 2021-08-11 |     31366
 2021-08-17 |     39701
 2021-08-12 |     40683
 2021-08-16 |     42964
 2021-08-15 |     43145
 2021-08-19 |     43449
 2021-08-13 |     44199
 2021-08-18 |     44219
 2021-08-14 |     44689
 2021-08-21 |     45435
 2021-08-20 |     45801
 2021-08-22 |     70617
 2021-08-23 |     92952
 2021-08-25 |    119449
 2021-08-24 |    148483
 2021-08-26 |    157087
 2021-08-27 |    186852
 2021-08-28 |    243529
 2021-08-29 |    308812
 2021-08-30 |    478306
 2021-08-31 |    701390
 2021-09-01 |    985593
 2021-09-02 |   1940210
 2021-09-03 |   5721785
 2021-09-04 |  24154200
 2021-09-05 |  68762933
 2021-09-06 |  92614994
 2021-09-07 | 221866680
(32 rows)
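
That coverage can be verified directly (a sketch; the result should match the 7-day CTAS below, which returned 416,046,395 rows):

SELECT count(1)
FROM mytestss
WHERE ins_db_tm >= '2021-09-01'
    AND ins_db_tm < '2021-09-08';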

Volume-reduction tests

First, try cutting down the columns:

DROP TABLE IF EXISTS mytestss2;
CREATE TABLE mytestss2 WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
) AS SELECT
    a.ship_id,
    a.grp_ship_id,
    a.rmk_id,
    a.rmk_inf,
    a.scan_site,
    a.scan_emp,
    a.fst_scan_site,
    a.pre_scan_site,
    a.nxt_scan_site,
    a.scan_typ,
    a.scan_tm,
    a.ins_db_tm
FROM
    tb_scan as a
INNER JOIN test_scan_qian as q on q.ship_id = a.ship_id
WHERE
    a.ins_db_tm >= '2021-08-07'
    AND a.ins_db_tm < '2021-09-08'
DISTRIBUTED BY (ship_id);

SELECT 419074776
Time: 732309.319 ms

Narrowing the time range

Cutting the columns already more than halved the time (732 s versus 1851 s). Now also shrink the range to 7 days and try again. (For rows that still cannot be resolved, do we then have to go back to the big table?)

DROP TABLE IF EXISTS mytestss2;
CREATE TABLE mytestss2 WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
) AS SELECT
    a.ship_id,
    a.grp_ship_id,
    a.rmk_id,
    a.rmk_inf,
    a.scan_site,
    a.scan_emp,
    a.fst_scan_site,
    a.pre_scan_site,
    a.nxt_scan_site,
    a.scan_typ,
    a.scan_tm,
    a.ins_db_tm
FROM
    tb_scan as a
INNER JOIN test_scan_qian as q on q.ship_id = a.ship_id
WHERE
    a.ins_db_tm >= '2021-09-01'
    AND a.ins_db_tm < '2021-09-08'
DISTRIBUTED BY (ship_id);


SELECT 416046395
Time: 188838.672 ms

Workaround for creating partitions

First, create a view for the SQL to be executed

This step is a bit of a trap: once the view exists, its dependency prevents the tables it references from being dropped and recreated. The advice is to drop the view as soon as the load finishes; then again, frequent create/drop cycles leave garbage in the system catalog...

CREATE VIEW mytestss2_view AS SELECT
    a.ship_id,
    a.grp_ship_id,
    a.rmk_id,
    a.rmk_inf,
    a.scan_site,
    a.scan_emp,
    a.fst_scan_site,
    a.pre_scan_site,
    a.nxt_scan_site,
    a.scan_typ,
    a.scan_tm,
    a.ins_db_tm
FROM
    tb_scan as a
INNER JOIN test_scan_qian as q on q.ship_id = a.ship_id
WHERE
    a.ins_db_tm >= '2021-09-01'
    AND a.ins_db_tm < '2021-09-08';

Create a template table

As long as the view is unchanged, this step does not need to be re-run:

CREATE TABLE mytestss2_tpl AS SELECT * from mytestss2_view limit 0;
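
An equivalent spelling that states the intent more directly is CREATE TABLE AS ... WITH NO DATA (standard SQL, available in PostgreSQL 9.0+; verify it is accepted on your Greenplum version):

CREATE TABLE mytestss2_tpl AS SELECT * FROM mytestss2_view WITH NO DATA;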

Create the partitioned table

This one generally does not need to change either, unless the table structure changes. So in the routine case the workflow is just TRUNCATE + INSERT (see the sketch after the results below).

CREATE TABLE mytestss2(
    LIKE mytestss2_tpl
) 
WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
)
DISTRIBUTED BY (ship_id)
partition by list (scan_typ) 
(
    partition p24 values ('24'),
    partition p14 values ('14'),
    partition p10 values ('10'),
    partition p29 values ('29'),
    partition p18 values ('18'),
    partition p7 values ('7','07'),
    partition p13 values ('13'),
    partition p5 values ('5','05','03','3'),
    partition p20 values ('20'),
    partition p97 values ('97','96'),
    default partition pdefault
)
;
insert into mytestss2 select * from mytestss2_view;

INSERT 0 416046395
Time: 219957.758 ms
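
The routine refresh mentioned above is then simply (a sketch):

TRUNCATE mytestss2;
INSERT INTO mytestss2 SELECT * FROM mytestss2_view;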

The partitioning does seem to cost a little here: about 220 s, versus about 189 s for the unpartitioned 7-day CTAS.

Can adding an index improve speed?

Conclusion:

  • It probably will not improve speed; worse, it can prevent the original SQL from running at all, because the memory limit gets exceeded.

DROP TABLE IF EXISTS mytestss_idx;
CREATE TABLE mytestss_idx(
    LIKE tb_scan
) 
WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
)
DISTRIBUTED BY (ship_id)
partition by list (scan_typ) 
(
    partition p24 values ('24'),
    partition p14 values ('14'),
    partition p10 values ('10'),
    partition p29 values ('29'),
    partition p18 values ('18'),
    partition p7 values ('7','07'),
    partition p13 values ('13'),
    partition p5 values ('5','05','03','3'),
    partition p20 values ('20'),
    partition p97 values ('97','96'),
    default partition pdefault
)
;

Adding the index

On this append-only table, only the plain btree index at the end can actually be created; hash and unique indexes are rejected:

CREATE INDEX idx_ship_id ON test_scan_qian USING HASH(ship_id);
-- ERROR:  hash indexes are not supported


CREATE UNIQUE INDEX idx_ship_id ON test_scan_qian(ship_id);
-- ERROR:  append-only tables do not support unique indexes

CREATE INDEX idx_ship_id ON test_scan_qian(ship_id);
-- CREATE INDEX
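
To confirm what was actually created, the standard pg_indexes view can be queried (a sketch):

SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'test_scan_qian';
-- should list idx_ship_id as a btree index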

Tests

  • Approach 1: IN
insert into mytestss_idx select * from tb_scan 
WHERE ins_db_tm >= '2021-08-08'
    AND ins_db_tm < '2021-09-09'
    AND ship_id in (select ship_id from test_scan_qian);

With the index in place, several of the subsequent computations simply fail:

ERROR:  Canceling query because of high VMEM usage. Used: 7339MB, available 817MB, red zone: 7372MB (runaway_cleaner.c:202)  (seg13 slice1 10.181.86.12:55013 pid=40582) (runaway_cleaner.c:202)

ERROR:  Canceling query because of high VMEM usage. Used: 7314MB, available 819MB, red zone: 7372MB (runaway_cleaner.c:202)  (seg10 slice1 10.181.86.12:55010 pid=52338) (runaway_cleaner.c:202)
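
The "red zone" in these messages derives from the segment-level VMEM protection settings; here 7372 MB is roughly 90% of an 8192 MB limit. The relevant GUCs can be inspected as follows (a sketch; GUC names as documented for Greenplum):

SHOW gp_vmem_protect_limit;                -- per-segment memory limit, in MB
SHOW runaway_detector_activation_percent;  -- red zone = limit * this percentage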

The execution plan for IN is very complex, nowhere near as simple as the JOIN version:

 Insert  (cost=0.00..439.62 rows=1 width=584)
   ->  Result  (cost=0.00..439.52 rows=1 width=588)
         ->  Partition Selector for mytestss_idx  (cost=0.00..439.52 rows=1 width=584)
               ->  Redistribute Motion 24:24  (slice2; segments: 24)  (cost=0.00..439.52 rows=1 width=584)
                     Hash Key: tb_scan.ship_id
                     ->  Result  (cost=0.00..439.52 rows=1 width=584)
                           ->  GroupAggregate  (cost=0.00..439.52 rows=1 width=584)
                                  Group Key: tb_scan.oper_id, tb_scan.btch_id, tb_scan.elec_sig, tb_scan.ship_id, tb_scan.grp_ship_id, tb_scan.sub_ship_id, tb_scan.trans_ship_id, tb_scan.contr_id, tb_scan.frgt_typ, tb_scan.frgt_wgt_rng_cd, tb_scan.frgt_wgt, tb_scan.frgt_len, tb_scan.frgt_wid, tb_scan.frgt_hei, tb_scan.frgt_vol_rng_cd, tb_scan.frgt_vol, tb_scan.frgt_nm, tb_scan.frgt_typ_dtl, tb_scan.trans_typ, tb_scan.seal_id, tb_scan.car_lic_id, tb_scan.car_pos_cd, tb_scan.trans_perd_cd, tb_scan.car_ln_cd, tb_scan.sms_ntfy_typ, tb_scan.sms_ntfy_mbl, tb_scan.trans_tm_lmt, tb_scan.exp_prod_typ, tb_scan.exp_serv_lvl, tb_scan.rmk_id, tb_scan.rmk_inf, tb_scan.scan_site, tb_scan.fst_scan_site, tb_scan.pre_scan_site, tb_scan.nxt_scan_site, tb_scan.nxt_scan_prov, tb_scan.dest_site, tb_scan.dest_site_admn_dist, tb_scan.car_rtng_cd, tb_scan.cust_nm, tb_scan.delv_emp, tb_scan.wgt_emp, tb_scan.wrk_emp, tb_scan.scan_emp, tb_scan.wrk_grp_cd, tb_scan.scan_typ, tb_scan.scan_tm, tb_scan.gun_id, tb_scan.mbl_cd, tb_scan.api_id, tb_scan.api_ver, tb_scan.ins_db_tm, tb_scan.db_patn_id, tb_scan.udf_1, tb_scan.udf_2, tb_scan.udf_3, tb_scan.udf_4, tb_scan.udf_5, tb_scan.udf_6, tb_scan.udf_7, tb_scan.udf_8, tb_scan.udf_9, tb_scan.udf_10, tb_scan.udf_11, tb_scan.udf_12, tb_scan.udf_13, tb_scan.udf_14, tb_scan.udf_15, tb_scan.udf_16, tb_scan.udf_17, tb_scan.udf_18, tb_scan.udf_19, tb_scan.udf_20, tb_scan.ctid, tb_scan.tableoid, tb_scan.gp_segment_id
                                 ->  Sort  (cost=0.00..439.52 rows=1 width=598)
                                       Sort Key: tb_scan.ctid, tb_scan.tableoid, tb_scan.gp_segment_id
                                       ->  Redistribute Motion 24:24  (slice1; segments: 24)  (cost=0.00..439.52 rows=1 width=598)
                                             Hash Key: tb_scan.ctid, tb_scan.tableoid, tb_scan.gp_segment_id
                                             ->  GroupAggregate  (cost=0.00..439.52 rows=1 width=598)
                                                    Group Key: tb_scan.oper_id, tb_scan.btch_id, tb_scan.elec_sig, tb_scan.ship_id, tb_scan.grp_ship_id, tb_scan.sub_ship_id, tb_scan.trans_ship_id, tb_scan.contr_id, tb_scan.frgt_typ, tb_scan.frgt_wgt_rng_cd, tb_scan.frgt_wgt, tb_scan.frgt_len, tb_scan.frgt_wid, tb_scan.frgt_hei, tb_scan.frgt_vol_rng_cd, tb_scan.frgt_vol, tb_scan.frgt_nm, tb_scan.frgt_typ_dtl, tb_scan.trans_typ, tb_scan.seal_id, tb_scan.car_lic_id, tb_scan.car_pos_cd, tb_scan.trans_perd_cd, tb_scan.car_ln_cd, tb_scan.sms_ntfy_typ, tb_scan.sms_ntfy_mbl, tb_scan.trans_tm_lmt, tb_scan.exp_prod_typ, tb_scan.exp_serv_lvl, tb_scan.rmk_id, tb_scan.rmk_inf, tb_scan.scan_site, tb_scan.fst_scan_site, tb_scan.pre_scan_site, tb_scan.nxt_scan_site, tb_scan.nxt_scan_prov, tb_scan.dest_site, tb_scan.dest_site_admn_dist, tb_scan.car_rtng_cd, tb_scan.cust_nm, tb_scan.delv_emp, tb_scan.wgt_emp, tb_scan.wrk_emp, tb_scan.scan_emp, tb_scan.wrk_grp_cd, tb_scan.scan_typ, tb_scan.scan_tm, tb_scan.gun_id, tb_scan.mbl_cd, tb_scan.api_id, tb_scan.api_ver, tb_scan.ins_db_tm, tb_scan.db_patn_id, tb_scan.udf_1, tb_scan.udf_2, tb_scan.udf_3, tb_scan.udf_4, tb_scan.udf_5, tb_scan.udf_6, tb_scan.udf_7, tb_scan.udf_8, tb_scan.udf_9, tb_scan.udf_10, tb_scan.udf_11, tb_scan.udf_12, tb_scan.udf_13, tb_scan.udf_14, tb_scan.udf_15, tb_scan.udf_16, tb_scan.udf_17, tb_scan.udf_18, tb_scan.udf_19, tb_scan.udf_20, tb_scan.ctid, tb_scan.tableoid, tb_scan.gp_segment_id
                                                   ->  Sort  (cost=0.00..439.52 rows=1 width=2290)
                                                         Sort Key: tb_scan.ctid, tb_scan.tableoid, tb_scan.gp_segment_id
                                                         ->  Nested Loop  (cost=0.00..439.52 rows=1 width=2290)
                                                               Join Filter: true
                                                               ->  Sequence  (cost=0.00..431.01 rows=1 width=2290)
                                                                     ->  Partition Selector for tb_scan (dynamic scan id: 1)  (cost=10.00..100.00 rows=5 width=4)
                                                                           Partitions selected: 33 (out of 1099)
                                                                     ->  Dynamic Seq Scan on tb_scan (dynamic scan id: 1)  (cost=0.00..431.01 rows=1 width=2290)
                                                                            Filter: ((ins_db_tm >= '2021-08-08 00:00:00'::timestamp without time zone) AND (ins_db_tm < '2021-09-09 00:00:00'::timestamp without time zone))
                                                               ->  Bitmap Heap Scan on test_scan_qian  (cost=0.00..8.52 rows=1 width=1)
                                                                     Recheck Cond: ((ship_id)::text = (tb_scan.ship_id)::text)
                                                                     ->  Bitmap Index Scan on idx_ship_id  (cost=0.00..0.00 rows=0 width=0)
                                                                           Index Cond: ((ship_id)::text = (tb_scan.ship_id)::text)
 Optimizer: Pivotal Optimizer (GPORCA)
(28 rows)
  • Approach 2: JOIN
insert into mytestss_idx SELECT
    a.*
FROM
    tb_scan as a
INNER JOIN test_scan_qian as q on q.ship_id = a.ship_id
WHERE
    a.ins_db_tm >= '2021-08-07'
    AND a.ins_db_tm < '2021-09-08';

ERROR: Canceling query because of high VMEM usage. Used: 7337MB, available 819MB, red zone: 7372MB (runaway_cleaner.c:202) (seg9 10.181.86.12:55009 pid=9266) (runaway_cleaner.c:202)

The JOIN execution plan:

Insert  (cost=0.00..439.62 rows=1 width=2276)
  ->  Result  (cost=0.00..439.52 rows=1 width=588)
        ->  Partition Selector for mytestss_idx  (cost=0.00..439.52 rows=1 width=2276)
              ->  Nested Loop  (cost=0.00..439.52 rows=1 width=2276)
                    Join Filter: true
                    ->  Sequence  (cost=0.00..431.01 rows=1 width=2276)
                          ->  Partition Selector for tb_scan (dynamic scan id: 1)  (cost=10.00..100.00 rows=5 width=4)
                                Partitions selected: 33 (out of 1099)
                          ->  Dynamic Seq Scan on tb_scan (dynamic scan id: 1)  (cost=0.00..431.01 rows=1 width=2276)
                                Filter: ((ins_db_tm >= '2021-08-07 00:00:00'::timestamp without time zone) AND (ins_db_tm < '2021-09-08 00:00:00'::timestamp without time zone))
                    ->  Bitmap Heap Scan on test_scan_qian  (cost=0.00..8.52 rows=1 width=1)
                          Recheck Cond: ((ship_id)::text = (tb_scan.ship_id)::text)
                          ->  Bitmap Index Scan on idx_ship_id  (cost=0.00..0.00 rows=0 width=0)
                                Index Cond: ((ship_id)::text = (tb_scan.ship_id)::text)
Optimizer: Pivotal Optimizer (GPORCA)

The index route does not seem viable either. Checking the index that was created, it is a btree index; loading it into memory on top of the computation itself is presumably what drives the cost up.

DROP TABLE IF EXISTS marked_ship_unqi;
CREATE TABLE marked_ship_unqi WITH ( 
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB 
) AS 
select ship_id, string_agg(ship_type,'|') as ship_type
from (
    select ship_id, ship_type
    from marked_ship
    where ship_id in (select ship_id from test_scan_qian)
    group by ship_id, ship_type
) as f
group by ship_id
DISTRIBUTED BY (ship_id) ;

The above does not run. Lower the difficulty a notch and try again:

DROP TABLE IF EXISTS marked_ship_unqi;
CREATE TABLE marked_ship_unqi WITH ( 
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB 
) AS 
select ship_id, ship_type
from marked_ship
where ship_id in (select ship_id from test_scan_qian)
group by ship_id, ship_type
DISTRIBUTED BY (ship_id) ;

ERROR: Canceling query because of high VMEM usage. Used: 7320MB, available 819MB, red zone: 7372MB (runaway_cleaner.c:202) (seg8 slice1 10.181.86.12:55008 pid=53334) (runaway_cleaner.c:202)

Still fails in exactly the same way...

DROP INDEX idx_ship_id;

After dropping the index, the computation works again. The index not only did not help; it severely hurt performance:

DROP TABLE IF EXISTS marked_ship_unqi;
CREATE TABLE marked_ship_unqi WITH ( 
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB 
) AS 
select ship_id, string_agg(ship_type,'|') as ship_type
from (
    select ship_id, ship_type
    from marked_ship
    where ship_id in (select ship_id from test_scan_qian)
    group by ship_id, ship_type
) as f
group by ship_id
DISTRIBUTED BY (ship_id);
SELECT 10753244
Time: 64686.757 ms
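
For reference, the inner GROUP BY deduplicates (ship_id, ship_type) pairs, and the outer string_agg then concatenates each ship's distinct types. A minimal, self-contained illustration with made-up values:

select ship_id, string_agg(ship_type,'|') as ship_type
from (
    select ship_id, ship_type
    from (values ('S1','24'), ('S1','24'), ('S1','14'), ('S2','7')) as marked_ship(ship_id, ship_type)
    group by ship_id, ship_type
) as f
group by ship_id;
-- S1 | 24|14   (element order is not guaranteed without an ORDER BY in the aggregate)
-- S2 | 7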

UNLOGGED table performance comparison

  • Simple volume test (the roughly one-minute case)

For a workload dominated by large scans, UNLOGGED does not improve performance much:

DROP TABLE IF EXISTS marked_ship_unqi;
CREATE UNLOGGED TABLE marked_ship_unqi WITH ( 
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB 
) AS 
select  ship_id,string_agg(ship_type,'|') as ship_type from (select ship_id,ship_type from marked_ship where ship_id in (select ship_id from test_scan_qian ) group by ship_id,ship_type ) as f group by  ship_id
DISTRIBUTED BY (ship_id) ;
SELECT 10753244
Time: 60890.819 ms
  • A larger-volume test
DROP TABLE IF EXISTS mytestss_idx;
CREATE UNLOGGED TABLE mytestss_idx(
    LIKE tb_scan
) 
WITH (
    APPENDONLY = TRUE,
    COMPRESSLEVEL = 5,
    ORIENTATION = COLUMN,
    COMPRESSTYPE = ZLIB
)
DISTRIBUTED BY (ship_id)
partition by list (scan_typ) 
(
    partition p24 values ('24'),
    partition p14 values ('14'),
    partition p10 values ('10'),
    partition p29 values ('29'),
    partition p18 values ('18'),
    partition p7 values ('7','07'),
    partition p13 values ('13'),
    partition p5 values ('5','05','03','3'),
    partition p20 values ('20'),
    partition p97 values ('97','96'),
    default partition pdefault
)
;
insert into mytestss_idx SELECT
    a.*
FROM
    tb_scan as a
INNER JOIN test_scan_qian as q on q.ship_id = a.ship_id
WHERE
    a.ins_db_tm >= '2021-08-07'
    AND a.ins_db_tm < '2021-09-09';
    
INSERT 0 428710345
Time: 2006922.094 ms

UNLOGGED does not seem to help much either... the time probably goes into the massive scan itself, not into logging.

Summary

First, converting a large table from one partitioning layout to another is very expensive. Unless the layout is designed properly from the start, this is not something to do casually.

During these experiments I considered adding an index to improve performance; not only did it not work, it drove the processes out of memory. (A thought: could a Bloom filter be used here, loading the 50 million ship_ids into the filter? The join would then become a Bloom-filter lookup.)

Unlogged tables were tried, but since the cost is not in logging, the overall gain was small; arguably nothing changed at all.

In-memory tables were not tried; PostgreSQL apparently only gets there indirectly, by placing the table's storage appropriately (e.g. a tablespace on a RAM disk). A related concept is the temporary table. Both are left for a future test.