PostgreSQL 11 1000亿 tpcb、1000W tpcc 性能测试 - on 阿里云ECS + ESSD (含quorum based 0丢失多副本配置与性能测试)

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
云原生数据库 PolarDB 分布式版,标准版 2核8GB
云数据库 RDS MySQL Serverless,0.5-2RCU 50GB
简介: 标签PostgreSQL , pgbench , tpcb , tpcc , tpch背景https://help.aliyun.com/knowledge_detail/64950.html阿里云ESSD提供了单盘32TB容量,100万IOPS,4GB/s读写吞吐的能力,单台ECS可以挂载16块ESSD盘,组成512 TB的大容量存储。

标签

PostgreSQL , pgbench , tpcb , tpcc , tpch


背景

https://help.aliyun.com/knowledge_detail/64950.html

阿里云ESSD提供了单盘32TB容量,100万IOPS,4GB/s读写吞吐的能力,单台ECS可以挂载16块ESSD盘,组成512 TB的大容量存储。非常适合数据库这类IO密集应用。

PostgreSQL 作为一款优秀的企业级开源数据库产品,阿里云ESSD的加入,可以带给用户什么样的体验呢?

《PostgreSQL 11 100亿 tpcb 性能 on ECS》

《PostgreSQL 11 tpcc 测试(103万tpmC on ECS) - use sysbench-tpcc by Percona-Lab》

《(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen》

ESSD云盘部署

parted -s /dev/vdb mklabel gpt          
parted -s /dev/vdb mkpart primary 1MiB 100%         
mkfs.ext4 /dev/vdb1 -m 0 -O extent,uninit_bg -E lazy_itable_init=1 -T largefile -L data01          
mkdir /data01            
          
vi /etc/fstab            
            
LABEL=data01 /data01     ext4        defaults,noatime,nodiratime,nodelalloc,barrier=0,data=writeback    0 0          
          
          
mount -a      

fsync RT测试

1 ECS本地SSD

使用PostgreSQL提供的pg_test_fsync进行测试

5 seconds per test    
O_DIRECT supported on this platform for open_datasync and open_sync.    
    
Compare file sync methods using one 8kB write:    
(in wal_sync_method preference order, except fdatasync is Linux's default)    
        open_datasync                     46355.824 ops/sec      22 usecs/op    
        fdatasync                         39213.835 ops/sec      26 usecs/op    
        fsync                             35912.478 ops/sec      28 usecs/op    
        fsync_writethrough                              n/a    
        open_sync                         42426.737 ops/sec      24 usecs/op    
    
Compare file sync methods using two 8kB writes:    
(in wal_sync_method preference order, except fdatasync is Linux's default)    
        open_datasync                     17109.945 ops/sec      58 usecs/op    
        fdatasync                         26316.089 ops/sec      38 usecs/op    
        fsync                             24202.679 ops/sec      41 usecs/op    
        fsync_writethrough                              n/a    
        open_sync                         15760.721 ops/sec      63 usecs/op    
    
Compare open_sync with different write sizes:    
(This is designed to compare the cost of writing 16kB in different write    
open_sync sizes.)    
         1 * 16kB open_sync write         29108.820 ops/sec      34 usecs/op    
         2 *  8kB open_sync writes        15674.805 ops/sec      64 usecs/op    
         4 *  4kB open_sync writes         9942.061 ops/sec     101 usecs/op    
         8 *  2kB open_sync writes         5637.484 ops/sec     177 usecs/op    
        16 *  1kB open_sync writes         3076.057 ops/sec     325 usecs/op    
    
Test if fsync on non-write file descriptor is honored:    
(If the times are similar, fsync() can sync data written on a different    
descriptor.)    
        write, fsync, close               32581.863 ops/sec      31 usecs/op    
        write, close, fsync               32512.798 ops/sec      31 usecs/op    
    
Non-sync'ed 8kB writes:    
        write                            350232.219 ops/sec       3 usecs/op    

2 ESSD

5 seconds per test    
O_DIRECT supported on this platform for open_datasync and open_sync.    
    
Compare file sync methods using one 8kB write:    
(in wal_sync_method preference order, except fdatasync is Linux's default)    
        open_datasync                      8395.592 ops/sec     119 usecs/op    
        fdatasync                          7722.692 ops/sec     129 usecs/op    
        fsync                              5619.389 ops/sec     178 usecs/op    
        fsync_writethrough                              n/a    
        open_sync                          5685.669 ops/sec     176 usecs/op    
    
Compare file sync methods using two 8kB writes:    
(in wal_sync_method preference order, except fdatasync is Linux's default)    
        open_datasync                      3858.783 ops/sec     259 usecs/op    
        fdatasync                          5396.356 ops/sec     185 usecs/op    
        fsync                              4214.546 ops/sec     237 usecs/op    
        fsync_writethrough                              n/a    
        open_sync                          3025.366 ops/sec     331 usecs/op    
    
Compare open_sync with different write sizes:    
(This is designed to compare the cost of writing 16kB in different write    
open_sync sizes.)    
         1 * 16kB open_sync write          4506.749 ops/sec     222 usecs/op    
         2 *  8kB open_sync writes         3099.963 ops/sec     323 usecs/op    
         4 *  4kB open_sync writes         1763.684 ops/sec     567 usecs/op    
         8 *  2kB open_sync writes          429.923 ops/sec    2326 usecs/op    
        16 *  1kB open_sync writes          198.005 ops/sec    5050 usecs/op    
    
Test if fsync on non-write file descriptor is honored:    
(If the times are similar, fsync() can sync data written on a different    
descriptor.)    
        write, fsync, close                5393.927 ops/sec     185 usecs/op    
        write, close, fsync                5470.240 ops/sec     183 usecs/op    
    
Non-sync'ed 8kB writes:    
        write                            385505.858 ops/sec       3 usecs/op    

fio 专业IO测试

测试项

vi test    
    
[global]        
thread        
numjobs=64        
ramp_time=6        
size=10g        
exitall        
time_based        
runtime=180        
group_reporting        
randrepeat=0        
norandommap        
bs=8k        
rwmixwrite=35        
        
[rw-rand-libaio-mysql-ext4]        
stonewall        
direct=1        
iodepth=16        
iodepth_batch=8        
iodepth_low=8        
iodepth_batch_complete=8        
rw=randrw        
ioengine=libaio        
filename=/data01/ext4        
        
[rw-seq-libaio-mysql-ext4]        
stonewall        
direct=1        
iodepth=16        
iodepth_batch=8        
iodepth_low=8        
iodepth_batch_complete=8        
rw=rw        
ioengine=libaio        
filename=/data01/ext4        
    
[rw-rand-sync-pgsql-ext4]        
stonewall        
direct=0        
rw=randrw        
ioengine=sync        
filename=/data01/ext4        
        
[rw-seq-sync-pgsql-ext4]        
stonewall        
direct=0        
rw=rw        
ioengine=sync        
filename=/data01/ext4    
fio test --output ./cfq-raw.log    

1 ECS本地SSD

rw-rand-libaio-mysql-ext4: (g=0): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=16    
...    
rw-seq-libaio-mysql-ext4: (g=1): rw=rw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=16    
...    
rw-rand-sync-pgsql-ext4: (g=2): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=sync, iodepth=1    
...    
rw-seq-sync-pgsql-ext4: (g=3): rw=rw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=sync, iodepth=1    
...    
fio-3.1    
Starting 256 threads    
    
rw-rand-libaio-mysql-ext4: (groupid=0, jobs=64): err= 0: pid=27005: Tue Sep 18 15:18:42 2018    
   read: IOPS=146k, BW=1140MiB/s (1196MB/s)(200GiB/180004msec)    
    slat (usec): min=22, max=114828, avg=2266.17, stdev=2520.66    
    clat (nsec): min=1759, max=115061k, avg=2316672.58, stdev=2575502.75    
     lat (usec): min=70, max=164337, avg=4582.93, stdev=3639.95    
    clat percentiles (usec):    
     |  1.00th=[  603],  5.00th=[ 1352], 10.00th=[ 1549], 20.00th=[ 1713],    
     | 30.00th=[ 1811], 40.00th=[ 1876], 50.00th=[ 1942], 60.00th=[ 2008],    
     | 70.00th=[ 2057], 80.00th=[ 2114], 90.00th=[ 2245], 95.00th=[ 2376],    
     | 99.00th=[16581], 99.50th=[19792], 99.90th=[27919], 99.95th=[32900],    
     | 99.99th=[49021]    
   bw (  KiB/s): min= 6000, max=42120, per=1.54%, avg=17968.83, stdev=2125.54, samples=22976    
   iops        : min=  750, max= 5265, avg=2245.75, stdev=265.70, samples=22976    
  write: IOPS=78.6k, BW=614MiB/s (644MB/s)(108GiB/180004msec)    
    slat (usec): min=24, max=114826, avg=2266.44, stdev=2515.63    
    clat (nsec): min=1056, max=114853k, avg=2225812.28, stdev=2478372.79    
     lat (usec): min=43, max=164125, avg=4492.34, stdev=3574.60    
    clat percentiles (usec):    
     |  1.00th=[   19],  5.00th=[ 1123], 10.00th=[ 1500], 20.00th=[ 1696],    
     | 30.00th=[ 1795], 40.00th=[ 1876], 50.00th=[ 1942], 60.00th=[ 1991],    
     | 70.00th=[ 2057], 80.00th=[ 2114], 90.00th=[ 2212], 95.00th=[ 2311],    
     | 99.00th=[16057], 99.50th=[19268], 99.90th=[27132], 99.95th=[31327],    
     | 99.99th=[47449]    
   bw (  KiB/s): min= 3235, max=22265, per=1.54%, avg=9680.55, stdev=1180.58, samples=22976    
   iops        : min=  404, max= 2783, avg=1209.72, stdev=147.58, samples=22976    
  lat (usec)   : 2=0.01%, 4=0.06%, 10=0.04%, 20=0.33%, 50=0.37%    
  lat (usec)   : 100=0.16%, 250=0.21%, 500=0.37%, 750=0.59%, 1000=0.84%    
  lat (msec)   : 2=56.76%, 4=36.91%, 10=0.82%, 20=2.08%, 50=0.44%    
  lat (msec)   : 100=0.01%, 250=0.01%    
  cpu          : usr=0.50%, sys=82.17%, ctx=8285760, majf=0, minf=0    
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=103.4%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=26268555,14152416,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=16    
rw-seq-libaio-mysql-ext4: (groupid=1, jobs=64): err= 0: pid=27075: Tue Sep 18 15:18:42 2018    
   read: IOPS=147k, BW=1150MiB/s (1206MB/s)(202GiB/180004msec)    
    slat (usec): min=22, max=97402, avg=2245.77, stdev=2561.30    
    clat (nsec): min=1597, max=97414k, avg=2299719.84, stdev=2622137.64    
     lat (usec): min=37, max=121819, avg=4545.57, stdev=3688.31    
    clat percentiles (usec):    
     |  1.00th=[  529],  5.00th=[ 1287], 10.00th=[ 1516], 20.00th=[ 1680],    
     | 30.00th=[ 1778], 40.00th=[ 1844], 50.00th=[ 1909], 60.00th=[ 1975],    
     | 70.00th=[ 2040], 80.00th=[ 2114], 90.00th=[ 2212], 95.00th=[ 2376],    
     | 99.00th=[16712], 99.50th=[19792], 99.90th=[28181], 99.95th=[32375],    
     | 99.99th=[47973]    
   bw (  KiB/s): min= 6822, max=34080, per=1.53%, avg=18044.54, stdev=2237.76, samples=22979    
   iops        : min=  852, max= 4260, avg=2255.24, stdev=279.74, samples=22979    
  write: IOPS=79.3k, BW=620MiB/s (650MB/s)(109GiB/180004msec)    
    slat (usec): min=24, max=97396, avg=2247.37, stdev=2563.13    
    clat (nsec): min=965, max=97414k, avg=2198816.49, stdev=2518690.40    
     lat (usec): min=40, max=121817, avg=4446.27, stdev=3625.49    
    clat percentiles (usec):    
     |  1.00th=[   17],  5.00th=[  996], 10.00th=[ 1434], 20.00th=[ 1647],    
     | 30.00th=[ 1762], 40.00th=[ 1844], 50.00th=[ 1909], 60.00th=[ 1958],    
     | 70.00th=[ 2024], 80.00th=[ 2089], 90.00th=[ 2180], 95.00th=[ 2278],    
     | 99.00th=[16188], 99.50th=[19268], 99.90th=[27132], 99.95th=[31327],    
     | 99.99th=[46400]    
   bw (  KiB/s): min= 3073, max=18400, per=1.53%, avg=9720.67, stdev=1243.91, samples=22979    
   iops        : min=  384, max= 2300, avg=1214.75, stdev=155.52, samples=22979    
  lat (nsec)   : 1000=0.01%    
  lat (usec)   : 2=0.01%, 4=0.09%, 10=0.06%, 20=0.50%, 50=0.28%    
  lat (usec)   : 100=0.13%, 250=0.20%, 500=0.54%, 750=0.71%, 1000=0.98%    
  lat (msec)   : 2=61.58%, 4=31.54%, 10=0.78%, 20=2.16%, 50=0.46%    
  lat (msec)   : 100=0.01%    
  cpu          : usr=0.50%, sys=81.14%, ctx=8462673, majf=0, minf=0    
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=103.3%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=26505435,14277757,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=16    
rw-rand-sync-pgsql-ext4: (groupid=2, jobs=64): err= 0: pid=27150: Tue Sep 18 15:18:42 2018    
   read: IOPS=228k, BW=1785MiB/s (1872MB/s)(314GiB/180002msec)    
    clat (nsec): min=945, max=14282k, avg=5002.95, stdev=48977.41    
     lat (nsec): min=1247, max=14282k, avg=5374.56, stdev=49037.52    
    clat percentiles (usec):    
     |  1.00th=[    3],  5.00th=[    3], 10.00th=[    3], 20.00th=[    4],    
     | 30.00th=[    4], 40.00th=[    4], 50.00th=[    4], 60.00th=[    4],    
     | 70.00th=[    4], 80.00th=[    4], 90.00th=[    5], 95.00th=[    5],    
     | 99.00th=[   12], 99.50th=[   90], 99.90th=[  212], 99.95th=[  586],    
     | 99.99th=[ 2180]    
   bw (  KiB/s): min= 4824, max=77689, per=1.55%, avg=28302.49, stdev=3542.33, samples=22979    
   iops        : min=  603, max= 9711, avg=3537.39, stdev=442.79, samples=22979    
  write: IOPS=123k, BW=961MiB/s (1008MB/s)(169GiB/180002msec)    
    clat (usec): min=2, max=216155, avg=504.28, stdev=1314.99    
     lat (usec): min=2, max=216155, avg=504.72, stdev=1315.00    
    clat percentiles (usec):    
     |  1.00th=[    8],  5.00th=[   77], 10.00th=[  285], 20.00th=[  392],    
     | 30.00th=[  408], 40.00th=[  420], 50.00th=[  469], 60.00th=[  519],    
     | 70.00th=[  545], 80.00th=[  562], 90.00th=[  586], 95.00th=[  603],    
     | 99.00th=[  685], 99.50th=[ 2089], 99.90th=[15270], 99.95th=[23462],    
     | 99.99th=[46924]    
   bw (  KiB/s): min= 2661, max=42215, per=1.55%, avg=15240.76, stdev=1812.18, samples=22979    
   iops        : min=  332, max= 5276, avg=1904.70, stdev=226.51, samples=22979    
  lat (nsec)   : 1000=0.01%    
  lat (usec)   : 2=0.01%, 4=56.34%, 10=8.49%, 20=1.04%, 50=0.35%    
  lat (usec)   : 100=0.43%, 250=1.42%, 500=16.05%, 750=15.58%, 1000=0.04%    
  lat (msec)   : 2=0.07%, 4=0.05%, 10=0.07%, 20=0.04%, 50=0.02%    
  lat (msec)   : 100=0.01%, 250=0.01%    
  cpu          : usr=0.93%, sys=89.47%, ctx=5631403, majf=0, minf=0    
  IO depths    : 1=102.3%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=41127965,22146325,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=1    
rw-seq-sync-pgsql-ext4: (groupid=3, jobs=64): err= 0: pid=27221: Tue Sep 18 15:18:42 2018    
   read: IOPS=357k, BW=2786MiB/s (2922MB/s)(490GiB/180001msec)    
    clat (nsec): min=1006, max=20336k, avg=2745.46, stdev=22291.35    
     lat (nsec): min=1307, max=20336k, avg=3110.59, stdev=22301.63    
    clat percentiles (nsec):    
     |  1.00th=[ 1592],  5.00th=[ 1800], 10.00th=[ 1944], 20.00th=[ 2160],    
     | 30.00th=[ 2352], 40.00th=[ 2480], 50.00th=[ 2608], 60.00th=[ 2736],    
     | 70.00th=[ 2864], 80.00th=[ 3024], 90.00th=[ 3216], 95.00th=[ 3440],    
     | 99.00th=[ 4016], 99.50th=[ 8896], 99.90th=[15552], 99.95th=[17280],    
     | 99.99th=[21376]    
   bw (  KiB/s): min=21099, max=151871, per=1.56%, avg=44588.93, stdev=5219.41, samples=22983    
   iops        : min= 2637, max=18983, avg=5573.29, stdev=652.43, samples=22983    
  write: IOPS=192k, BW=1500MiB/s (1573MB/s)(264GiB/180001msec)    
    clat (usec): min=2, max=97210, avg=322.71, stdev=828.04    
     lat (usec): min=2, max=97210, avg=323.13, stdev=828.04    
    clat percentiles (usec):    
     |  1.00th=[    6],  5.00th=[   97], 10.00th=[  182], 20.00th=[  269],    
     | 30.00th=[  297], 40.00th=[  306], 50.00th=[  310], 60.00th=[  314],    
     | 70.00th=[  322], 80.00th=[  326], 90.00th=[  334], 95.00th=[  343],    
     | 99.00th=[  400], 99.50th=[  930], 99.90th=[12911], 99.95th=[19792],    
     | 99.99th=[32113]    
   bw (  KiB/s): min=11433, max=81619, per=1.56%, avg=24008.70, stdev=2707.40, samples=22983    
   iops        : min= 1429, max=10202, avg=3000.77, stdev=338.43, samples=22983    
  lat (usec)   : 2=7.89%, 4=56.48%, 10=1.25%, 20=0.37%, 50=0.25%    
  lat (usec)   : 100=0.57%, 250=4.14%, 500=28.80%, 750=0.06%, 1000=0.03%    
  lat (msec)   : 2=0.05%, 4=0.03%, 10=0.04%, 20=0.03%, 50=0.02%    
  lat (msec)   : 100=0.01%    
  cpu          : usr=1.34%, sys=90.37%, ctx=8645391, majf=0, minf=0    
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=64196338,34566565,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=1    
    
Run status group 0 (all jobs):    
   READ: bw=1140MiB/s (1196MB/s), 1140MiB/s-1140MiB/s (1196MB/s-1196MB/s), io=200GiB (215GB), run=180004-180004msec    
  WRITE: bw=614MiB/s (644MB/s), 614MiB/s-614MiB/s (644MB/s-644MB/s), io=108GiB (116GB), run=180004-180004msec    
    
Run status group 1 (all jobs):    
   READ: bw=1150MiB/s (1206MB/s), 1150MiB/s-1150MiB/s (1206MB/s-1206MB/s), io=202GiB (217GB), run=180004-180004msec    
  WRITE: bw=620MiB/s (650MB/s), 620MiB/s-620MiB/s (650MB/s-650MB/s), io=109GiB (117GB), run=180004-180004msec    
    
Run status group 2 (all jobs):    
   READ: bw=1785MiB/s (1872MB/s), 1785MiB/s-1785MiB/s (1872MB/s-1872MB/s), io=314GiB (337GB), run=180002-180002msec    
  WRITE: bw=961MiB/s (1008MB/s), 961MiB/s-961MiB/s (1008MB/s-1008MB/s), io=169GiB (181GB), run=180002-180002msec    
    
Run status group 3 (all jobs):    
   READ: bw=2786MiB/s (2922MB/s), 2786MiB/s-2786MiB/s (2922MB/s-2922MB/s), io=490GiB (526GB), run=180001-180001msec    
  WRITE: bw=1500MiB/s (1573MB/s), 1500MiB/s-1500MiB/s (1573MB/s-1573MB/s), io=264GiB (283GB), run=180001-180001msec    
    
Disk stats (read/write):    
    dm-0: ios=56484741/52000097, merge=0/0, ticks=55721482/45621907, in_queue=102865767, util=67.65%, aggrios=6930165/6249265, aggrmerge=130427/250746, aggrticks=6648860/4061887, aggrin_queue=10712820, aggrutil=66.53%    
  vdb: ios=6928564/6249448, merge=130186/250641, ticks=6629128/3928950, in_queue=10559055, util=65.93%    
  vdc: ios=6930486/6248991, merge=130413/250950, ticks=6643790/3962244, in_queue=10605847, util=66.02%    
  vdd: ios=6928089/6250855, merge=130732/250764, ticks=6472207/4009640, in_queue=10493342, util=66.18%    
  vde: ios=6929909/6250351, merge=130382/250303, ticks=6661137/4040922, in_queue=10701353, util=66.20%    
  vdf: ios=6932429/6245563, merge=130328/251129, ticks=6681549/4088606, in_queue=10769793, util=66.29%    
  vdg: ios=6930521/6249106, merge=130467/250956, ticks=6697543/4113859, in_queue=10811547, util=66.35%    
  vdh: ios=6930993/6249837, merge=130681/250844, ticks=6698661/4151500, in_queue=10851206, util=66.41%    
  vdi: ios=6930333/6249975, merge=130228/250384, ticks=6706869/4199379, in_queue=10910423, util=66.53%    

2 ESSD

rw-rand-libaio-mysql-ext4: (g=0): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=16    
...    
rw-seq-libaio-mysql-ext4: (g=1): rw=rw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=16    
...    
rw-rand-sync-pgsql-ext4: (g=2): rw=randrw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=sync, iodepth=1    
...    
rw-seq-sync-pgsql-ext4: (g=3): rw=rw, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=sync, iodepth=1    
...    
fio-3.1    
Starting 256 threads    
    
rw-rand-libaio-mysql-ext4: (groupid=0, jobs=64): err= 0: pid=21221: Tue Sep 18 15:19:03 2018    
   read: IOPS=112k, BW=879MiB/s (922MB/s)(154GiB/180008msec)    
    slat (usec): min=16, max=119403, avg=1880.80, stdev=2119.90    
    clat (nsec): min=940, max=512641k, avg=4098774.34, stdev=3181162.93    
     lat (usec): min=158, max=514427, avg=5979.64, stdev=3696.51    
    clat percentiles (usec):    
     |  1.00th=[    3],  5.00th=[  486], 10.00th=[  922], 20.00th=[ 1729],    
     | 30.00th=[ 2474], 40.00th=[ 2933], 50.00th=[ 3425], 60.00th=[ 4080],    
     | 70.00th=[ 5014], 80.00th=[ 6128], 90.00th=[ 7832], 95.00th=[ 9503],    
     | 99.00th=[14091], 99.50th=[16450], 99.90th=[23200], 99.95th=[29230],    
     | 99.99th=[61604]    
   bw (  KiB/s): min= 8615, max=31056, per=1.57%, avg=14172.78, stdev=990.53, samples=23040    
   iops        : min= 1076, max= 3882, avg=1771.17, stdev=123.81, samples=23040    
  write: IOPS=60.6k, BW=473MiB/s (496MB/s)(83.2GiB/180008msec)    
    slat (usec): min=17, max=119400, avg=1881.51, stdev=2122.86    
    clat (nsec): min=1019, max=510353k, avg=3892737.90, stdev=3154201.30    
     lat (usec): min=118, max=512897, avg=5774.32, stdev=3688.63    
    clat percentiles (usec):    
     |  1.00th=[    3],  5.00th=[  388], 10.00th=[  799], 20.00th=[ 1500],    
     | 30.00th=[ 2278], 40.00th=[ 2802], 50.00th=[ 3261], 60.00th=[ 3851],    
     | 70.00th=[ 4752], 80.00th=[ 5866], 90.00th=[ 7504], 95.00th=[ 9241],    
     | 99.00th=[13829], 99.50th=[16188], 99.90th=[22676], 99.95th=[28181],    
     | 99.99th=[58459]    
   bw (  KiB/s): min= 4137, max=17712, per=1.57%, avg=7634.48, stdev=590.18, samples=23040    
   iops        : min=  517, max= 2214, avg=953.86, stdev=73.79, samples=23040    
  lat (nsec)   : 1000=0.01%    
  lat (usec)   : 2=0.71%, 4=1.40%, 10=0.17%, 20=0.04%, 50=0.11%    
  lat (usec)   : 100=0.20%, 250=0.80%, 500=2.06%, 750=2.89%, 1000=3.31%    
  lat (msec)   : 2=12.56%, 4=35.84%, 10=36.00%, 20=3.73%, 50=0.17%    
  lat (msec)   : 100=0.01%, 250=0.01%, 750=0.01%    
  cpu          : usr=0.31%, sys=49.41%, ctx=17838909, majf=0, minf=0    
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=103.3%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=20248219,10907423,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=16    
rw-seq-libaio-mysql-ext4: (groupid=1, jobs=64): err= 0: pid=21285: Tue Sep 18 15:19:03 2018    
   read: IOPS=133k, BW=1037MiB/s (1087MB/s)(182GiB/180005msec)    
    slat (usec): min=10, max=40443, avg=861.52, stdev=1493.93    
    clat (nsec): min=953, max=508067k, avg=4239522.11, stdev=2801559.16    
     lat (usec): min=188, max=510200, avg=5101.11, stdev=3015.85    
    clat percentiles (usec):    
     |  1.00th=[  420],  5.00th=[  914], 10.00th=[ 1319], 20.00th=[ 2024],    
     | 30.00th=[ 2671], 40.00th=[ 3261], 50.00th=[ 3851], 60.00th=[ 4490],    
     | 70.00th=[ 5211], 80.00th=[ 6194], 90.00th=[ 7635], 95.00th=[ 8979],    
     | 99.00th=[11731], 99.50th=[12911], 99.90th=[15795], 99.95th=[17171],    
     | 99.99th=[21890]    
   bw (  KiB/s): min=  602, max=22672, per=1.53%, avg=16220.02, stdev=2446.89, samples=23020    
   iops        : min=   75, max= 2834, avg=2027.07, stdev=305.86, samples=23020    
  write: IOPS=71.4k, BW=558MiB/s (585MB/s)(98.1GiB/180005msec)    
    slat (usec): min=12, max=40442, avg=871.28, stdev=1501.73    
    clat (nsec): min=913, max=509492k, avg=3964521.95, stdev=2769947.35    
     lat (usec): min=129, max=511537, avg=4835.88, stdev=2992.88    
    clat percentiles (usec):    
     |  1.00th=[  223],  5.00th=[  791], 10.00th=[ 1172], 20.00th=[ 1827],    
     | 30.00th=[ 2409], 40.00th=[ 2999], 50.00th=[ 3589], 60.00th=[ 4228],    
     | 70.00th=[ 4883], 80.00th=[ 5866], 90.00th=[ 7242], 95.00th=[ 8455],    
     | 99.00th=[11207], 99.50th=[12387], 99.90th=[15008], 99.95th=[16319],    
     | 99.99th=[20317]    
   bw (  KiB/s): min=  309, max=12471, per=1.53%, avg=8733.75, stdev=1342.51, samples=23020    
   iops        : min=   38, max= 1558, avg=1091.29, stdev=167.81, samples=23020    
  lat (nsec)   : 1000=0.01%    
  lat (usec)   : 2=0.31%, 4=0.36%, 10=0.02%, 20=0.01%, 50=0.01%    
  lat (usec)   : 100=0.01%, 250=0.08%, 500=0.82%, 750=2.09%, 1000=2.82%    
  lat (msec)   : 2=14.32%, 4=33.12%, 10=43.55%, 20=2.47%, 50=0.01%    
  lat (msec)   : 100=0.01%, 250=0.01%, 750=0.01%    
  cpu          : usr=0.41%, sys=9.90%, ctx=21948046, majf=0, minf=0    
  IO depths    : 1=0.0%, 2=0.0%, 4=0.0%, 8=0.1%, 16=102.1%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=0.0%, 8=100.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=23881029,12859729,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=16    
rw-rand-sync-pgsql-ext4: (groupid=2, jobs=64): err= 0: pid=21355: Tue Sep 18 15:19:03 2018    
   read: IOPS=304k, BW=2374MiB/s (2489MB/s)(417GiB/180001msec)    
    clat (nsec): min=659, max=20344k, avg=4937.11, stdev=53635.32    
     lat (nsec): min=830, max=20344k, avg=5147.67, stdev=53639.04    
    clat percentiles (nsec):    
     |  1.00th=[   1928],  5.00th=[   2096], 10.00th=[   2192],    
     | 20.00th=[   2352], 30.00th=[   2448], 40.00th=[   2544],    
     | 50.00th=[   2640], 60.00th=[   2736], 70.00th=[   2864],    
     | 80.00th=[   2992], 90.00th=[   3248], 95.00th=[   3504],    
     | 99.00th=[   8640], 99.50th=[  16512], 99.90th=[ 585728],    
     | 99.95th=[1122304], 99.99th=[2375680]    
   bw (  KiB/s): min= 8661, max=312564, per=1.57%, avg=38145.72, stdev=5837.53, samples=23040    
   iops        : min= 1082, max=39070, avg=4767.84, stdev=729.68, samples=23040    
  write: IOPS=164k, BW=1278MiB/s (1340MB/s)(225GiB/180001msec)    
    clat (usec): min=2, max=517110, avg=377.98, stdev=1501.97    
     lat (usec): min=2, max=517110, avg=378.25, stdev=1501.97    
    clat percentiles (usec):    
     |  1.00th=[    6],  5.00th=[   99], 10.00th=[  265], 20.00th=[  330],    
     | 30.00th=[  343], 40.00th=[  351], 50.00th=[  355], 60.00th=[  363],    
     | 70.00th=[  371], 80.00th=[  379], 90.00th=[  396], 95.00th=[  416],    
     | 99.00th=[  545], 99.50th=[ 1045], 99.90th=[11469], 99.95th=[16581],    
     | 99.99th=[28967]    
   bw (  KiB/s): min= 4709, max=168193, per=1.57%, avg=20539.14, stdev=3068.53, samples=23040    
   iops        : min=  588, max=21024, avg=2567.01, stdev=383.55, samples=23040    
  lat (nsec)   : 750=0.01%, 1000=0.01%    
  lat (usec)   : 2=1.33%, 4=62.24%, 10=1.91%, 20=0.50%, 50=0.18%    
  lat (usec)   : 100=0.32%, 250=1.63%, 500=31.34%, 750=0.27%, 1000=0.06%    
  lat (msec)   : 2=0.07%, 4=0.04%, 10=0.06%, 20=0.03%, 50=0.01%    
  lat (msec)   : 100=0.01%, 250=0.01%, 500=0.01%, 750=0.01%    
  cpu          : usr=0.53%, sys=89.73%, ctx=7630423, majf=0, minf=0    
  IO depths    : 1=102.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=54689028,29447045,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=1    
rw-seq-sync-pgsql-ext4: (groupid=3, jobs=64): err= 0: pid=21420: Tue Sep 18 15:19:03 2018    
   read: IOPS=400k, BW=3128MiB/s (3280MB/s)(550GiB/180002msec)    
    clat (nsec): min=655, max=77256k, avg=1908.86, stdev=16100.37    
     lat (nsec): min=818, max=77256k, avg=2119.98, stdev=16102.12    
    clat percentiles (nsec):    
     |  1.00th=[  988],  5.00th=[ 1144], 10.00th=[ 1256], 20.00th=[ 1416],    
     | 30.00th=[ 1576], 40.00th=[ 1704], 50.00th=[ 1816], 60.00th=[ 1928],    
     | 70.00th=[ 2040], 80.00th=[ 2192], 90.00th=[ 2384], 95.00th=[ 2576],    
     | 99.00th=[ 3184], 99.50th=[ 5472], 99.90th=[13504], 99.95th=[15424],    
     | 99.99th=[18304]    
   bw (  KiB/s): min=27335, max=205560, per=1.57%, avg=50224.99, stdev=4880.87, samples=23040    
   iops        : min= 3416, max=25695, avg=6277.74, stdev=610.11, samples=23040    
  write: IOPS=216k, BW=1684MiB/s (1766MB/s)(296GiB/180002msec)    
    clat (nsec): min=1980, max=344807k, avg=289811.56, stdev=750064.44    
     lat (usec): min=2, max=344807, avg=290.08, stdev=750.06    
    clat percentiles (usec):    
     |  1.00th=[    5],  5.00th=[  106], 10.00th=[  194], 20.00th=[  258],    
     | 30.00th=[  273], 40.00th=[  277], 50.00th=[  285], 60.00th=[  289],    
     | 70.00th=[  293], 80.00th=[  297], 90.00th=[  306], 95.00th=[  310],    
     | 99.00th=[  351], 99.50th=[  799], 99.90th=[ 8356], 99.95th=[12387],    
     | 99.99th=[24249]    
   bw (  KiB/s): min=13581, max=112633, per=1.57%, avg=27046.57, stdev=2507.74, samples=23040    
   iops        : min= 1697, max=14079, avg=3380.43, stdev=313.47, samples=23040    
  lat (nsec)   : 750=0.01%, 1000=0.74%    
  lat (usec)   : 2=42.27%, 4=21.80%, 10=1.17%, 20=0.26%, 50=0.13%    
  lat (usec)   : 100=0.33%, 250=4.51%, 500=28.59%, 750=0.03%, 1000=0.02%    
  lat (msec)   : 2=0.05%, 4=0.04%, 10=0.05%, 20=0.02%, 50=0.01%    
  lat (msec)   : 100=0.01%, 500=0.01%    
  cpu          : usr=0.63%, sys=92.37%, ctx=9426487, majf=0, minf=0    
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%    
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%    
     issued rwt: total=72069547,38810440,0, short=0,0,0, dropped=0,0,0    
     latency   : target=0, window=0, percentile=100.00%, depth=1    
    
Run status group 0 (all jobs):    
   READ: bw=879MiB/s (922MB/s), 879MiB/s-879MiB/s (922MB/s-922MB/s), io=154GiB (166GB), run=180008-180008msec    
  WRITE: bw=473MiB/s (496MB/s), 473MiB/s-473MiB/s (496MB/s-496MB/s), io=83.2GiB (89.4GB), run=180008-180008msec    
    
Run status group 1 (all jobs):    
   READ: bw=1037MiB/s (1087MB/s), 1037MiB/s-1037MiB/s (1087MB/s-1087MB/s), io=182GiB (196GB), run=180005-180005msec    
  WRITE: bw=558MiB/s (585MB/s), 558MiB/s-558MiB/s (585MB/s-585MB/s), io=98.1GiB (105GB), run=180005-180005msec    
    
Run status group 2 (all jobs):    
   READ: bw=2374MiB/s (2489MB/s), 2374MiB/s-2374MiB/s (2489MB/s-2489MB/s), io=417GiB (448GB), run=180001-180001msec    
  WRITE: bw=1278MiB/s (1340MB/s), 1278MiB/s-1278MiB/s (1340MB/s-1340MB/s), io=225GiB (241GB), run=180001-180001msec    
    
Run status group 3 (all jobs):    
   READ: bw=3128MiB/s (3280MB/s), 3128MiB/s-3128MiB/s (3280MB/s-3280MB/s), io=550GiB (590GB), run=180002-180002msec    
  WRITE: bw=1684MiB/s (1766MB/s), 1684MiB/s-1684MiB/s (1766MB/s-1766MB/s), io=296GiB (318GB), run=180002-180002msec    
    
Disk stats (read/write):    
  vdb: ios=32558186/26432830, merge=13683969/3967228, ticks=103112470/71513817, in_queue=174719639, util=60.48%    

PostgreSQL 11 测试

1、参数

listen_addresses = '0.0.0.0'      
port = 1921      
max_connections = 2000      
superuser_reserved_connections = 3      
unix_socket_directories = '., /var/run/postgresql, /tmp'      
tcp_keepalives_idle = 60      
tcp_keepalives_interval = 10      
tcp_keepalives_count = 10      
shared_buffers = 64GB      
max_prepared_transactions = 2000      
work_mem = 8MB      
maintenance_work_mem = 2GB      
dynamic_shared_memory_type = posix      
vacuum_cost_delay = 0      
bgwriter_delay = 10ms      
bgwriter_lru_maxpages = 1000      
bgwriter_lru_multiplier = 10.0      
effective_io_concurrency = 0      
max_worker_processes = 128      
max_parallel_maintenance_workers = 64      
max_parallel_workers_per_gather = 0      
parallel_leader_participation = on      
max_parallel_workers = 64      
wal_level = minimal        
synchronous_commit = off      
wal_writer_delay = 10ms      
checkpoint_timeout = 35min      
max_wal_size = 128GB      
min_wal_size = 32GB      
checkpoint_completion_target = 0.1      
max_wal_senders = 0      
effective_cache_size = 400GB      
log_destination = 'csvlog'      
logging_collector = on      
log_directory = 'log'      
log_filename = 'postgresql-%a.log'      
log_truncate_on_rotation = on      
log_rotation_age = 1d      
log_rotation_size = 0      
log_checkpoints = on       
log_connections = on      
log_disconnections = on      
log_error_verbosity = verbose       
log_line_prefix = '%m [%p] '      
log_timezone = 'PRC'      
log_autovacuum_min_duration = 0      
autovacuum_max_workers = 16      
autovacuum_freeze_max_age = 1200000000      
autovacuum_multixact_freeze_max_age = 1400000000      
autovacuum_vacuum_cost_delay = 0ms      
vacuum_freeze_table_age = 1150000000      
vacuum_multixact_freeze_table_age = 1150000000      
datestyle = 'iso, mdy'      
timezone = 'PRC'      
lc_messages = 'C'      
lc_monetary = 'C'      
lc_numeric = 'C'      
lc_time = 'C'      
default_text_search_config = 'pg_catalog.english'      
jit = off      
cpu_tuple_cost=0.00018884145574257426        
cpu_index_tuple_cost = 0.00433497085216479990        
cpu_operator_cost = 0.00216748542608239995        
seq_page_cost=0.014329        
random_page_cost = 0.016     
parallel_tuple_cost = 0      
parallel_setup_cost = 0     
min_parallel_table_scan_size = 0    
min_parallel_index_scan_size = 0    

1000W tpcc 测试

16072 * 60 = 96.4万 tpmC    

详细结果

......    
[ 2993s ] thds: 64 tps: 15107.81 qps: 431171.53 (r/w/o: 196624.50/204331.41/30215.62) lat (ms,95%): 10.65 err/s 51.00 reconn/s: 0.00    
[ 2994s ] thds: 64 tps: 15454.20 qps: 434439.71 (r/w/o: 198195.61/205335.70/30908.41) lat (ms,95%): 10.46 err/s 77.00 reconn/s: 0.00    
[ 2995s ] thds: 64 tps: 15480.57 qps: 438798.81 (r/w/o: 200298.44/207538.23/30962.14) lat (ms,95%): 10.46 err/s 72.00 reconn/s: 0.00    
[ 2996s ] thds: 64 tps: 15341.97 qps: 434496.22 (r/w/o: 198027.64/205784.63/30683.94) lat (ms,95%): 10.65 err/s 73.00 reconn/s: 0.00    
[ 2997s ] thds: 64 tps: 15208.54 qps: 433973.96 (r/w/o: 197975.05/205581.82/30417.08) lat (ms,95%): 10.65 err/s 75.01 reconn/s: 0.00    
[ 2998s ] thds: 64 tps: 15300.14 qps: 431763.76 (r/w/o: 196862.95/204300.53/30600.28) lat (ms,95%): 10.65 err/s 84.00 reconn/s: 0.00    
[ 2999s ] thds: 64 tps: 15108.49 qps: 426253.35 (r/w/o: 194171.57/201864.80/30216.98) lat (ms,95%): 10.65 err/s 56.99 reconn/s: 0.00    
[ 3000s ] thds: 64 tps: 15046.89 qps: 428187.32 (r/w/o: 195463.56/202631.98/30091.78) lat (ms,95%): 10.84 err/s 70.00 reconn/s: 0.00    
SQL statistics:    
    queries performed:    
        read:                            625427855    
        write:                           649118720    
        other:                           96478628    
        total:                           1371025203    
    transactions:                        48223282 (16072.47 per sec.)    
    queries:                             1371025203 (456952.75 per sec.)    
    ignored errors:                      210005 (69.99 per sec.)    
    reconnects:                          0      (0.00 per sec.)    
    
General statistics:    
    total time:                          3000.3636s    
    total number of events:              48223282    
    
Latency (ms):    
         min:                                    0.28    
         avg:                                    3.98    
         max:                                  912.95    
         95th percentile:                        9.91    
         sum:                            191859179.61    
    
Threads fairness:    
    events (avg/stddev):           753488.7812/3072.19    
    execution time (avg/stddev):   2997.7997/0.05    

1000亿 tpcb 测试

1、生成1000亿数据

nohup pgbench -i -s 1000000 -I dtg >./pgbench_ins.log 2>&1 &    
    
    
99998900000 of 100000000000 tuples (99%) done (elapsed 93180.83 s, remaining 1.03 s)    
99999000000 of 100000000000 tuples (99%) done (elapsed 93181.05 s, remaining 0.93 s)    
99999100000 of 100000000000 tuples (99%) done (elapsed 93181.13 s, remaining 0.84 s)    
99999200000 of 100000000000 tuples (99%) done (elapsed 93181.21 s, remaining 0.75 s)    
99999300000 of 100000000000 tuples (99%) done (elapsed 93181.30 s, remaining 0.65 s)    
99999400000 of 100000000000 tuples (99%) done (elapsed 93182.01 s, remaining 0.56 s)    
99999500000 of 100000000000 tuples (99%) done (elapsed 93182.09 s, remaining 0.47 s)    
99999600000 of 100000000000 tuples (99%) done (elapsed 93182.17 s, remaining 0.37 s)    
99999700000 of 100000000000 tuples (99%) done (elapsed 93182.25 s, remaining 0.28 s)    
99999800000 of 100000000000 tuples (99%) done (elapsed 93182.33 s, remaining 0.19 s)    
99999900000 of 100000000000 tuples (99%) done (elapsed 93182.42 s, remaining 0.09 s)    
100000000000 of 100000000000 tuples (100%) done (elapsed 93182.50 s, remaining 0.00 s)    
done.    

生成1000亿数据耗时: 93182 秒。 (约25小时 52分钟。)

2、给1000亿的单表创建索引(64 parallel)

postgres=# analyze pgbench_accounts ;    
ANALYZE    
postgres=# alter table pgbench_accounts set (parallel_workers =64);    
ALTER TABLE    
nohup pgbench -i -s 1000000 -I p > ./pk.log 2>&1 &    

1000亿单表创建索引耗时: 10小时 50分钟。

1000亿单表、索引容量大小

1000亿单表:12.5 TB。

1000亿单表索引: 2 TB。

postgres=# \di+ pgbench*  
                                      List of relations  
 Schema |         Name          | Type  |  Owner   |      Table       |  Size   | Description   
--------+-----------------------+-------+----------+------------------+---------+-------------  
 public | pgbench_accounts_pkey | index | postgres | pgbench_accounts | 2092 GB |   
 public | pgbench_branches_pkey | index | postgres | pgbench_branches | 21 MB   |   
 public | pgbench_tellers_pkey  | index | postgres | pgbench_tellers  | 214 MB  |   
(3 rows)  
  
postgres=# \dt+ pgbench*  
                          List of relations  
 Schema |       Name       | Type  |  Owner   |  Size   | Description   
--------+------------------+-------+----------+---------+-------------  
 public | pgbench_accounts | table | postgres | 12 TB   |   
 public | pgbench_branches | table | postgres | 35 MB   |   
 public | pgbench_history  | table | postgres | 0 bytes |   
 public | pgbench_tellers  | table | postgres | 422 MB  |   
(4 rows)  

索引深度

1、1000亿行,INT8类型索引,深度为4(不包括ROOT PAGE)。

postgres=# select * from bt_metap('pgbench_accounts_pkey');  
 magic  | version |   root   | level | fastroot | fastlevel | oldest_xact | last_cleanup_num_tuples   
--------+---------+----------+-------+----------+-----------+-------------+-------------------------  
 340322 |       3 | 23149704 |     4 | 23149704 |         4 |           0 |                      -1  
(1 row)  

2、索引查询,索引需要搜索5个BLOCK。

postgres=# explain (analyze,verbose,timing,costs,buffers) select * from pgbench_accounts where aid=10000000;  
                                                                   QUERY PLAN                                                                      
-------------------------------------------------------------------------------------------------------------------------------------------------  
 Index Scan using pgbench_accounts_pkey on public.pgbench_accounts  (cost=0.62..0.66 rows=1 width=101) (actual time=0.020..0.021 rows=1 loops=1)  
   Output: aid, bid, abalance, filler  
   Index Cond: (pgbench_accounts.aid = 10000000)  
   Buffers: shared hit=6  -- 5个index block, 1个heap block  
 Planning Time: 0.049 ms  
 Execution Time: 0.033 ms  
(6 rows)  

《深入浅出PostgreSQL B-Tree索引结构》

tpcb 1000亿 性能测试

使用高斯分布,生成测试数据。

《生成泊松、高斯、指数、随机分布数据 - PostgreSQL 9.5 new feature - pgbench improve, gaussian (standard normal) & exponential distribution》

只读

vi test.sql  
  
\set aid random_gaussian(1, :range, 10.0)  
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

读写

vi rw.sql  
  
\set aid random_gaussian(1, :range, 10.0)  
\set bid random(1, 1 * :scale)  
\set tid random(1, 10 * :scale)  
\set delta random(-5000, 5000)  
BEGIN;  
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
END;  

tpcb 1000亿 只读测试

1、活跃数据10亿

QPS: 998818

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 360 -D range=1000000000  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 359606534  
latency average = 0.064 ms  
latency stddev = 0.046 ms  
tps = 998777.462686 (including connections establishing)  
tps = 998818.121681 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set aid random_gaussian(1, :range, 10.0)  
         0.062  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

2、活跃数据100亿

QPS: 597877

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 360 -D range=10000000000  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 215257932  
latency average = 0.107 ms  
latency stddev = 0.526 ms  
tps = 597861.125133 (including connections establishing)  
tps = 597877.469245 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set aid random_gaussian(1, :range, 10.0)  
         0.105  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

前期IO

Total DISK READ :       2.32 G/s | Total DISK WRITE :       0.00 B/s  
Actual DISK READ:       2.32 G/s | Actual DISK WRITE:       0.00 B/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
46798 be/4 postgres   31.72 M/s    0.00 B/s  0.00 % 72.60 % postgres: postgres postgres [local] BIND    
46774 be/4 postgres   37.42 M/s    0.00 B/s  0.00 % 71.91 % postgres: postgres postgres [local] SELECT  
46792 be/4 postgres   35.54 M/s    0.00 B/s  0.00 % 71.89 % postgres: postgres postgres [local] idle    
46708 be/4 postgres   35.08 M/s    0.00 B/s  0.00 % 71.59 % postgres: postgres postgres [local] SELECT  
46730 be/4 postgres   46.84 M/s    0.00 B/s  0.00 % 70.99 % postgres: postgres postgres [local] SELECT  
46704 be/4 postgres   34.51 M/s    0.00 B/s  0.00 % 70.84 % postgres: postgres postgres [local] SELECT  
46716 be/4 postgres   46.05 M/s    0.00 B/s  0.00 % 70.84 % postgres: postgres postgres [local] SELECT  
46788 be/4 postgres   33.83 M/s    0.00 B/s  0.00 % 70.84 % postgres: postgres postgres [local] SELECT  
46807 be/4 postgres   33.78 M/s    0.00 B/s  0.00 % 70.41 % postgres: postgres postgres [local] SELECT  
46815 be/4 postgres   35.21 M/s    0.00 B/s  0.00 % 70.33 % postgres: postgres postgres [local] SELECT  
46812 be/4 postgres   45.95 M/s    0.00 B/s  0.00 % 70.18 % postgres: postgres postgres [local] SELECT  
46752 be/4 postgres   34.21 M/s    0.00 B/s  0.00 % 70.09 % postgres: postgres postgres [local] SELECT  

加热后IO

... ...    
  
Total DISK READ :     527.32 M/s | Total DISK WRITE :       0.00 B/s  
Actual DISK READ:     527.24 M/s | Actual DISK WRITE:      30.77 K/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
47108 be/4 postgres   11.77 M/s    0.00 B/s  0.00 %  4.71 % postgres: postgres postgres [local] SELECT  
47025 be/4 postgres   10.55 M/s    0.00 B/s  0.00 %  4.39 % postgres: postgres postgres [local] SELECT  
47115 be/4 postgres    9.28 M/s    0.00 B/s  0.00 %  4.30 % postgres: postgres postgres [local] SELECT  
47061 be/4 postgres   13.27 M/s    0.00 B/s  0.00 %  4.23 % postgres: postgres postgres [local] SELECT  
47082 be/4 postgres   10.49 M/s    0.00 B/s  0.00 %  4.21 % postgres: postgres postgres [local] SELECT  
47111 be/4 postgres    6.54 M/s    0.00 B/s  0.00 %  4.18 % postgres: postgres postgres [local] idle    
47071 be/4 postgres    6.46 M/s    0.00 B/s  0.00 %  4.15 % postgres: postgres postgres [local] idle    
47018 be/4 postgres    9.13 M/s    0.00 B/s  0.00 %  4.11 % postgres: postgres postgres [local] idle    
47087 be/4 postgres    5.77 M/s    0.00 B/s  0.00 %  4.09 % postgres: postgres postgres [local] idle    
47105 be/4 postgres    8.89 M/s    0.00 B/s  0.00 %  4.08 % postgres: postgres postgres [local] BINDCT  
47069 be/4 postgres    8.46 M/s    0.00 B/s  0.00 %  4.05 % postgres: postgres postgres [local] SELECT  
47106 be/4 postgres    8.89 M/s    0.00 B/s  0.00 %  3.91 % postgres: postgres postgres [local] idle    
47053 be/4 postgres    6.48 M/s    0.00 B/s  0.00 %  3.91 % postgres: postgres postgres [local] SELECT  
47028 be/4 postgres    9.71 M/s    0.00 B/s  0.00 %  3.83 % postgres: postgres postgres [local] idle    
47112 be/4 postgres    6.72 M/s    0.00 B/s  0.00 %  3.82 % postgres: postgres postgres [local] SELECT  
47039 be/4 postgres    7.63 M/s    0.00 B/s  0.00 %  3.81 % postgres: postgres postgres [local] BIND    

3、活跃数据500亿

QPS: 66678

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 360 -D range=50000000000  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 3600 s  
number of transactions actually processed: 240046184  
latency average = 0.960 ms  
latency stddev = 1.660 ms  
tps = 66678.433880 (including connections establishing)  
tps = 66678.672147 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.001  \set aid random_gaussian(1, :range, 10.0)  
         0.958  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

IO

Total DISK READ :       2.45 G/s | Total DISK WRITE :       0.00 B/s  
Actual DISK READ:       2.45 G/s | Actual DISK WRITE:       0.00 B/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
47230 be/4 postgres   39.87 M/s    0.00 B/s  0.00 % 87.93 % postgres: postgres postgres [local] SELECT  
47218 be/4 postgres   32.12 M/s    0.00 B/s  0.00 % 87.85 % postgres: postgres postgres [local] SELECT  
47196 be/4 postgres   38.54 M/s    0.00 B/s  0.00 % 87.77 % postgres: postgres postgres [local] SELECT  
47250 be/4 postgres   32.52 M/s    0.00 B/s  0.00 % 87.73 % postgres: postgres postgres [local] SELECT  
47210 be/4 postgres   35.25 M/s    0.00 B/s  0.00 % 87.64 % postgres: postgres postgres [local] SELECT  
47173 be/4 postgres   35.29 M/s    0.00 B/s  0.00 % 87.63 % postgres: postgres postgres [local] SELECT  
47220 be/4 postgres   36.14 M/s    0.00 B/s  0.00 % 87.63 % postgres: postgres postgres [local] SELECT  
47243 be/4 postgres   44.79 M/s    0.00 B/s  0.00 % 87.61 % postgres: postgres postgres [local] SELECT  
47149 be/4 postgres   48.33 M/s    0.00 B/s  0.00 % 87.55 % postgres: postgres postgres [local] SELECT  
47245 be/4 postgres   44.83 M/s    0.00 B/s  0.00 % 87.54 % postgres: postgres postgres [local] SELECT  
47254 be/4 postgres   29.74 M/s    0.00 B/s  0.00 % 87.53 % postgres: postgres postgres [local] SELECT  
47253 be/4 postgres   41.24 M/s    0.00 B/s  0.00 % 87.50 % postgres: postgres postgres [local] SELECT  
47162 be/4 postgres   30.31 M/s    0.00 B/s  0.00 % 87.50 % postgres: postgres postgres [local] SELECT  
47229 be/4 postgres   29.40 M/s    0.00 B/s  0.00 % 87.50 % postgres: postgres postgres [local] SELECT  
47234 be/4 postgres   37.08 M/s    0.00 B/s  0.00 % 87.50 % postgres: postgres postgres [local] SELECT  
47242 be/4 postgres   40.28 M/s    0.00 B/s  0.00 % 87.46 % postgres: postgres postgres [local] SELECT  
47186 be/4 postgres   36.05 M/s    0.00 B/s  0.00 % 87.44 % postgres: postgres postgres [local] SELECT  
47165 be/4 postgres   33.66 M/s    0.00 B/s  0.00 % 87.43 % postgres: postgres postgres [local] SELECT  

4、活跃数据1000亿

QPS: 67295

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 360 -D range=100000000000  
  
  
transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 3600 s  
number of transactions actually processed: 242265704  
latency average = 0.951 ms  
latency stddev = 2.313 ms  
tps = 67295.523254 (including connections establishing)  
tps = 67295.778158 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set aid random_gaussian(1, :range, 10.0)  
         0.949  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

IO

Total DISK READ :       2.24 G/s | Total DISK WRITE :       0.00 B/s  
Actual DISK READ:       2.24 G/s | Actual DISK WRITE:      54.79 K/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
47932 be/4 postgres   33.54 M/s    0.00 B/s  0.00 % 88.36 % postgres: postgres postgres [local] SELECT  
48010 be/4 postgres   33.41 M/s    0.00 B/s  0.00 % 88.27 % postgres: postgres postgres [local] SELECT  
48021 be/4 postgres   34.10 M/s    0.00 B/s  0.00 % 88.21 % postgres: postgres postgres [local] SELECT  
48049 be/4 postgres   32.14 M/s    0.00 B/s  0.00 % 88.20 % postgres: postgres postgres [local] SELECT  
48048 be/4 postgres   33.34 M/s    0.00 B/s  0.00 % 88.18 % postgres: postgres postgres [local] SELECT  
47988 be/4 postgres   31.79 M/s    0.00 B/s  0.00 % 88.11 % postgres: postgres postgres [local] SELECT  
48007 be/4 postgres   26.25 M/s    0.00 B/s  0.00 % 88.07 % postgres: postgres postgres [local] SELECT  
48013 be/4 postgres   35.37 M/s    0.00 B/s  0.00 % 88.07 % postgres: postgres postgres [local] SELECT  
47949 be/4 postgres   36.25 M/s    0.00 B/s  0.00 % 88.04 % postgres: postgres postgres [local] SELECT  
47979 be/4 postgres   44.90 M/s    0.00 B/s  0.00 % 88.02 % postgres: postgres postgres [local] SELECT  
48047 be/4 postgres   39.64 M/s    0.00 B/s  0.00 % 87.97 % postgres: postgres postgres [local] SELECT  
48038 be/4 postgres   39.24 M/s    0.00 B/s  0.00 % 87.93 % postgres: postgres postgres [local] SELECT  
48034 be/4 postgres   38.02 M/s    0.00 B/s  0.00 % 87.89 % postgres: postgres postgres [local] SELECT  
48019 be/4 postgres   35.99 M/s    0.00 B/s  0.00 % 87.88 % postgres: postgres postgres [local] SELECT  
48046 be/4 postgres   32.00 M/s    0.00 B/s  0.00 % 87.88 % postgres: postgres postgres [local] SELECT  

tpcb 1000亿 读写测试

1、活跃数据10亿

TPS: 95119

QPS: 475595

pgbench -M prepared -n -r -P 1 -f ./rw.sql -c 64 -j 64 -T 360 -D scale=1000000 -D range=1000000000   
  
transaction type: ./rw.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 34244287  
latency average = 0.673 ms  
latency stddev = 0.394 ms  
tps = 95116.186279 (including connections establishing)  
tps = 95119.886927 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.003  \set aid random_gaussian(1, :range, 10.0)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         0.046  BEGIN;  
         0.133  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.077  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.104  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.088  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.074  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         0.146  END;  

2、活跃数据100亿

TPS: 85278

QPS: 426390

pgbench -M prepared -n -r -P 1 -f ./rw.sql -c 64 -j 64 -T 360 -D scale=1000000 -D range=10000000000  
  
transaction type: ./rw.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 30702466  
latency average = 0.750 ms  
latency stddev = 1.518 ms  
tps = 85275.759706 (including connections establishing)  
tps = 85278.402619 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.003  \set aid random_gaussian(1, :range, 10.0)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         0.047  BEGIN;  
         0.193  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.082  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.108  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.093  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.078  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         0.144  END;  

IO

Total DISK READ :     124.77 M/s | Total DISK WRITE :     846.78 M/s  
Actual DISK READ:     124.01 M/s | Actual DISK WRITE:     820.10 M/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
31430 be/4 postgres   27.03 K/s   44.03 M/s  0.00 % 22.13 % postgres: walwriter  
49767 be/4 postgres 1629.40 K/s    8.68 M/s  0.00 %  3.03 % postgres: postgres postgres [local] UPDATE  
49771 be/4 postgres    2.73 M/s    8.63 M/s  0.00 %  2.34 % postgres: postgres postgres [local] UPDATE               
49742 be/4 postgres    2.77 M/s    8.26 M/s  0.00 %  2.31 % postgres: postgres postgres [local] UPDATE               
49787 be/4 postgres 1343.68 K/s    9.27 M/s  0.00 %  2.29 % postgres: postgres postgres [local] UPDATE               
49785 be/4 postgres    3.05 M/s    8.64 M/s  0.00 %  2.27 % postgres: postgres postgres [local] UPDATE               
49776 be/4 postgres 1783.85 K/s   11.64 M/s  0.00 %  2.05 % postgres: postgres postgres [local] UPDATE               
49774 be/4 postgres    4.22 M/s    8.13 M/s  0.00 %  2.05 % postgres: postgres postgres [local] UPDATE  
49775 be/4 postgres  671.84 K/s    8.79 M/s  0.00 %  2.04 % postgres: postgres postgres [local] UPDATE               
49786 be/4 postgres 1220.12 K/s    8.15 M/s  0.00 %  2.04 % postgres: postgres postgres [local] UPDATE               
49772 be/4 postgres 1003.90 K/s    8.77 M/s  0.00 %  2.02 % postgres: postgres postgres [local] UPDATE               
49697 be/4 postgres    2.56 M/s    8.69 M/s  0.00 %  2.01 % postgres: postgres postgres [local] UPDATE  
49803 be/4 postgres  733.62 K/s    8.22 M/s  0.00 %  2.00 % postgres: postgres postgres [local] UPDATE  
49806 be/4 postgres    2.84 M/s   10.23 M/s  0.00 %  1.99 % postgres: postgres postgres [local] UPDATE               
49804 be/4 postgres 1783.85 K/s    8.60 M/s  0.00 %  1.98 % postgres: postgres postgres [local] UPDATE               
49766 be/4 postgres  478.78 K/s   14.62 M/s  0.00 %  1.97 % postgres: postgres postgres [local] UPDATE               
49770 be/4 postgres    2.29 M/s    8.82 M/s  0.00 %  1.96 % postgres: postgres postgres [local] UPDATE  
49715 be/4 postgres    3.20 M/s    8.37 M/s  0.00 %  1.96 % postgres: postgres postgres [local] UPDATE               
49810 be/4 postgres    3.35 M/s    9.68 M/s  0.00 %  1.93 % postgres: postgres postgres [local] UPDATE  
49780 be/4 postgres    2.56 M/s    8.22 M/s  0.00 %  1.92 % postgres: postgres postgres [local] UPDATE               
49784 be/4 postgres    3.39 M/s    8.35 M/s  0.00 %  1.92 % postgres: postgres postgres [local] UPDATE               
49734 be/4 postgres    2.31 M/s    8.66 M/s  0.00 %  1.91 % postgres: postgres postgres [local] UPDATE               
49800 be/4 postgres 2023.24 K/s    8.55 M/s  0.00 %  1.90 % postgres: postgres postgres [local] UPDATE  
49794 be/4 postgres 1629.40 K/s    8.88 M/s  0.00 %  1.85 % postgres: postgres postgres [local] UPDATE               
49765 be/4 postgres    3.24 M/s    8.67 M/s  0.00 %  1.80 % postgres: postgres postgres [local] UPDATE               
49724 be/4 postgres 1173.79 K/s   10.08 M/s  0.00 %  1.79 % postgres: postgres postgres [local] UPDATE               
49728 be/4 postgres 1413.18 K/s    8.63 M/s  0.00 %  1.78 % postgres: postgres postgres [local] UPDATE  
49781 be/4 postgres 1436.34 K/s    7.93 M/s  0.00 %  1.76 % postgres: postgres postgres [local] UPDATE               
49790 be/4 postgres 1096.56 K/s    8.60 M/s  0.00 %  1.76 % postgres: postgres postgres [local] UPDATE               

3、活跃数据500亿

TPS: 38301

QPS: 191505

pgbench -M prepared -n -r -P 1 -f ./rw.sql -c 64 -j 64 -T 360 -D scale=1000000 -D range=50000000000   
  
transaction type: ./rw.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 13790704  
latency average = 1.671 ms  
latency stddev = 2.620 ms  
tps = 38299.935890 (including connections establishing)  
tps = 38301.102322 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set aid random_gaussian(1, :range, 10.0)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         0.031  BEGIN;  
         1.274  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.068  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.103  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.076  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.058  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         0.056  END;  

IO

Total DISK READ :    1508.55 M/s | Total DISK WRITE :     618.92 M/s  
Actual DISK READ:    1507.72 M/s | Actual DISK WRITE:     450.33 M/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
49510 be/4 postgres   21.45 M/s    5.40 M/s  0.00 % 61.29 % postgres: postgres postgres [local] UPDATE               
49507 be/4 postgres   26.85 M/s    5.61 M/s  0.00 % 60.79 % postgres: postgres postgres [local] UPDATE               
49456 be/4 postgres   28.59 M/s    5.67 M/s  0.00 % 60.42 % postgres: postgres postgres [local] UPDATE               
49436 be/4 postgres   23.76 M/s    5.23 M/s  0.00 % 60.31 % postgres: postgres postgres [local] UPDATE               
49516 be/4 postgres   21.82 M/s    5.40 M/s  0.00 % 59.84 % postgres: postgres postgres [local] UPDATE               
49414 be/4 postgres   20.77 M/s    5.44 M/s  0.00 % 59.84 % postgres: postgres postgres [local] UPDATE               
49503 be/4 postgres   20.13 M/s    6.15 M/s  0.00 % 59.81 % postgres: postgres postgres [local] UPDATE               
49410 be/4 postgres   29.23 M/s    5.52 M/s  0.00 % 59.73 % postgres: postgres postgres [local] UPDATE               
49427 be/4 postgres   18.61 M/s    5.18 M/s  0.00 % 59.71 % postgres: postgres postgres [local] idle in transaction  
49501 be/4 postgres   17.22 M/s    5.60 M/s  0.00 % 59.70 % postgres: postgres postgres [local] UPDATE               
49493 be/4 postgres   24.60 M/s    7.22 M/s  0.00 % 59.66 % postgres: postgres postgres [local] UPDATE               
49512 be/4 postgres   23.08 M/s    5.53 M/s  0.00 % 59.65 % postgres: postgres postgres [local] UPDATE               
49509 be/4 postgres   24.04 M/s    5.64 M/s  0.00 % 59.55 % postgres: postgres postgres [local] UPDATE               
49490 be/4 postgres   17.89 M/s    5.62 M/s  0.00 % 59.55 % postgres: postgres postgres [local] UPDATE             

4、活跃数据1000亿

TPS: 35189

QPS: 175945

pgbench -M prepared -n -r -P 1 -f ./rw.sql -c 64 -j 64 -T 360 -D scale=1000000 -D range=100000000000   
  
  
transaction type: ./rw.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 64  
number of threads: 64  
duration: 360 s  
number of transactions actually processed: 12670591  
latency average = 1.818 ms  
latency stddev = 3.928 ms  
tps = 35188.224787 (including connections establishing)  
tps = 35189.625697 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.003  \set aid random_gaussian(1, :range, 10.0)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         0.032  BEGIN;  
         1.392  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.072  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.112  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.086  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.061  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         0.059  END;  

IO

Total DISK READ :    1824.52 M/s | Total DISK WRITE :     241.53 M/s  
Actual DISK READ:    1821.44 M/s | Actual DISK WRITE:     237.31 M/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
48918 be/4 postgres   27.19 M/s    0.00 B/s  0.00 % 67.37 % postgres: postgres postgres [local] UPDATE  
48893 be/4 postgres   32.32 M/s    0.00 B/s  0.00 % 67.29 % postgres: postgres postgres [local] UPDATE               
48914 be/4 postgres   23.47 M/s    0.00 B/s  0.00 % 67.29 % postgres: postgres postgres [local] UPDATE  
48889 be/4 postgres   33.45 M/s    0.00 B/s  0.00 % 67.23 % postgres: postgres postgres [local] UPDATE               
48904 be/4 postgres   34.80 M/s    0.00 B/s  0.00 % 67.21 % postgres: postgres postgres [local] UPDATE  
48861 be/4 postgres   30.88 M/s    7.81 K/s  0.00 % 67.06 % postgres: postgres postgres [local] UPDATE  
48910 be/4 postgres   27.16 M/s  328.11 K/s  0.00 % 67.02 % postgres: postgres postgres [local] UPDATE               
48821 be/4 postgres   27.54 M/s    7.81 K/s  0.00 % 67.01 % postgres: postgres postgres [local] UPDATE  
48825 be/4 postgres   35.88 M/s    0.00 B/s  0.00 % 66.89 % postgres: postgres postgres [local] UPDATE               
48930 be/4 postgres   31.68 M/s    7.81 K/s  0.00 % 66.82 % postgres: postgres postgres [local] UPDATE  
48867 be/4 postgres   26.99 M/s    7.81 K/s  0.00 % 66.81 % postgres: postgres postgres [local] idle in transaction  
48929 be/4 postgres   25.61 M/s    7.81 K/s  0.00 % 66.77 % postgres: postgres postgres [local] UPDATE  
48894 be/4 postgres   24.08 M/s    0.00 B/s  0.00 % 66.67 % postgres: postgres postgres [local] UPDATE               
48921 be/4 postgres   32.90 M/s  640.60 K/s  0.00 % 66.66 % postgres: postgres postgres [local] UPDATE  
48925 be/4 postgres   27.30 M/s    0.00 B/s  0.00 % 66.63 % postgres: postgres postgres [local] UPDATE  
48829 be/4 postgres   24.85 M/s    0.00 B/s  0.00 % 66.63 % postgres: postgres postgres [local] idle                 
48901 be/4 postgres   29.57 M/s    0.00 B/s  0.00 % 66.62 % postgres: postgres postgres [local] UPDATE  
48927 be/4 postgres   24.33 M/s    7.81 K/s  0.00 % 66.59 % postgres: postgres postgres [local] UPDATE  
48933 be/4 postgres   27.85 M/s    7.81 K/s  0.00 % 66.57 % postgres: postgres postgres [local] BINDTE  
48890 be/4 postgres   27.79 M/s    0.00 B/s  0.00 % 66.56 % postgres: postgres postgres [local] UPDATE  
48931 be/4 postgres   30.29 M/s    0.00 B/s  0.00 % 66.55 % postgres: postgres postgres [local] UPDATE  

其他测试

同步多副本环境

《PostgreSQL 一主多从(多副本,强同步)简明手册 - 配置、压测、监控、切换、防脑裂、修复、0丢失 - 珍藏级》

其中一个备库使用zfs存储,开启lz4压缩,提供闪回,备份能力。

创建备库,单个备库的创建速度约500MB/s,15TB的库,需要9个半小时创建完成。

[root@pg11 ~]# dstat
You did not select any stats, using -cdngy by default.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw 
  4   7  85   4   0   0| 289M  164M|   0     0 |   0     0 |  41k   80k
  0   1  99   0   0   0| 508M    0 |1218k 1007M|   0     0 |  40k 1559 
  0   1  99   0   0   0| 500M    0 |1233k 1004M|   0     0 |  41k 1673 
  0   1  99   0   0   0| 492M    0 |1206k  994M|   0     0 |  40k 1576 
  0   1  99   0   0   0| 508M    0 |1245k 1015M|   0     0 |  41k 1601 
  0   1  99   0   0   0| 516M    0 |1257k 1021M|   0     0 |  42k 1576 
  0   2  98   0   0   0| 520M    0 |1300k 1044M|   0     0 |  44k 1891 

[root@pg11 ~]# top -c -u postgres

top - 15:09:33 up 2 days,  4:48,  2 users,  load average: 0.41, 3.98, 22.70
Tasks: 516 total,   2 running, 514 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.1 us,  0.8 sy,  0.0 ni, 99.1 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 52819500+total, 29681988 free,  3079916 used, 49543308+buff/cache
KiB Swap:        0 total,        0 free,        0 used. 52057548+avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                                                                  
50168 postgres  20   0 66.436g  14264  12248 R  35.1  0.0   0:37.57 postgres: walsender postgres 172.17.20.28(48412) sending backup "pg_basebackup base backup"                                                                              
50176 postgres  20   0 66.436g  20416  18400 S  24.2  0.0   0:18.35 postgres: walsender postgres 172.17.20.29(65032) sending backup "pg_basebackup base backup"                                                                              
50154 postgres  20   0 66.433g 1.979g 1.978g S   0.0  0.4   0:01.41 /usr/pgsql-11/bin/postgres                                                                                                                                               
50155 postgres  20   0  245148   2036    596 S   0.0  0.0   0:00.00 postgres: logger                                                                                                                                                         
50157 postgres  20   0 66.434g 531212 529720 S   0.0  0.1   0:00.43 postgres: checkpointer                                                                                                                                                   
50158 postgres  20   0 66.434g 530592 529120 S   0.0  0.1   0:00.54 postgres: background writer                                                                                                                                              
50159 postgres  20   0 66.433g 526884 525420 S   0.0  0.1   0:00.28 postgres: walwriter                                                                                                                                                      
50160 postgres  20   0 66.436g   3224   1392 S   0.0  0.0   0:00.00 postgres: autovacuum launcher                                                                                                                                            
50161 postgres  20   0  247404   2252    692 S   0.0  0.0   0:00.00 postgres: stats collector                                                                                                                                                
50162 postgres  20   0 66.436g   2844   1096 S   0.0  0.0   0:00.00 postgres: logical replication launcher                                                                                                                                   
50169 postgres  20   0 66.436g   4124   1944 S   0.0  0.0   0:00.01 postgres: walsender postgres 172.17.20.28(48414) streaming 101/44000140                                                                                                  
50177 postgres  20   0 66.436g   3796   1872 S   0.0  0.0   0:00.00 postgres: walsender postgres 172.17.20.29(65034) streaming 101/44000140                                                                                                  

配置步骤

1、配置pg_hba.conf

host replication all xxx.xxx.xxx.xxx/32 md5

2、创建replication角色用户

create role repxxx login replication encrypted password 'xxx';

3、pg_basebackup拉取数据,注意开启SLOT,否则对于很大的数据库,可能数据备份结束后,WAL已经在主库被清除了。(有WAL归档的情况下,可以不开启SLOT。开启SLOT后,未被备库拉取的WAL不会被清除。)

export PGPASSWORD=xxx

nohup pg_basebackup -F p -D $PGDATA -h 172.17.20.30 -p 1921 -U repxxx -X stream -C -S standby1 >/dev/null 2>&1 &
  
  
nohup pg_basebackup -F p -D $PGDATA -h 172.17.20.30 -p 1921 -U repxxx -X stream -C -S standby2 >/dev/null 2>&1 &

4、配置recovery.conf

recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.20.30 port=1921 user=repxxx password=xxx'          # e.g. 'host=localhost port=5432'
primary_slot_name = 'standby1'


recovery_target_timeline = 'latest'
standby_mode = on
primary_conninfo = 'host=172.17.20.30 port=1921 user=repxxx password=xxx'          # e.g. 'host=localhost port=5432'
primary_slot_name = 'standby2'

5、启动备库

6、主库配置多副本参数

synchronous_standby_names = 'ANY 1 (*)'
synchronous_commit = remote_write 


postgres=# show synchronous_commit ;
 synchronous_commit 
--------------------
 remote_write
(1 row)

postgres=# show synchronous_standby_names ;
 synchronous_standby_names 
---------------------------
 ANY 1 (*)
(1 row)

7、tpcb 1000亿(活跃10亿) rw 测试, QPS 5.34万。

transaction type: ./rw.sql
scaling factor: 1
query mode: prepared
number of clients: 64
number of threads: 64
duration: 360 s
number of transactions actually processed: 19239053
latency average = 1.198 ms
latency stddev = 1.208 ms
tps = 53432.922774 (including connections establishing)
tps = 53435.051257 (excluding connections establishing)
statement latencies in milliseconds:
         0.003  \set aid random_gaussian(1, :range, 10.0)
         0.001  \set bid random(1, 1 * :scale)
         0.001  \set tid random(1, 10 * :scale)
         0.001  \set delta random(-5000, 5000)
         0.043  BEGIN;
         0.122  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
         0.068  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
         0.089  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
         0.080  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
         0.064  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
         0.726  END;

8、tpcc 1000W , tps : 11000 , tpmC : 66W.

SQL statistics:
    queries performed:
        read:                            513698640
        write:                           533163795
        other:                           79242254
        total:                           1126104689
    transactions:                        39605095 (11000.27 per sec.)
    queries:                             1126104689 (312774.25 per sec.)
    ignored errors:                      172646 (47.95 per sec.)
    reconnects:                          0      (0.00 per sec.)

General statistics:
    total time:                          3600.3729s
    total number of events:              39605095

Latency (ms):
         min:                                    0.29
         avg:                                    5.81
         max:                                 1388.75
         95th percentile:                       11.87
         sum:                            230276426.25

Threads fairness:
    events (avg/stddev):           618829.6094/1712.80
    execution time (avg/stddev):   3598.0692/0.06

flashback 闪回

《PostgreSQL 最佳实践 - 块级增量备份(ZFS篇)验证 - recovery test script for zfs snapshot clone + postgresql stream replication + archive》

主备切换

《PostgreSQL 一主多从(多副本,强同步)简明手册 - 配置、压测、监控、切换、防脑裂、修复、0丢失 - 珍藏级》

修复主库, pg_rewind

《PostgreSQL primary-standby failback tools : pg_rewind》

《PostgreSQL 9.5 new feature - pg_rewind fast sync Split Brain Primary & Standby》

《PostgreSQL 9.5 add pg_rewind for Fast align for PostgreSQL unaligned primary & standby》

小结

1、8K fsync IO RT

1 ecs本地ssd

22 us    

2 essd

119 us    

2、8K directIO

1 ecs本地ssd

离散读

IOPS=146k, BW=1140MiB/s (1196MB/s)(200GiB/180004msec)    

离散写

IOPS=78.6k, BW=614MiB/s (644MB/s)(108GiB/180004msec)    

顺序读

IOPS=147k, BW=1150MiB/s (1206MB/s)(202GiB/180004msec)    

顺序写

IOPS=79.3k, BW=620MiB/s (650MB/s)(109GiB/180004m    

2 essd

离散读

IOPS=112k, BW=879MiB/s (922MB/s)(154GiB/180008msec)    

离散写

IOPS=60.6k, BW=473MiB/s (496MB/s)(83.2GiB/180008msec)    

顺序读

IOPS=133k, BW=1037MiB/s (1087MB/s)(182GiB/180005msec)    

顺序写

IOPS=71.4k, BW=558MiB/s (585MB/s)(98.1GiB/180005msec)    

3、tpcc 1000W (ESSD)

96.4万 tpmC

4、tpcb 1000亿 只读 (ESSD)

1、活跃数据10亿

QPS: 998818

2、活跃数据100亿

QPS: 597877

3、活跃数据500亿

QPS: 66678

4、活跃数据1000亿

QPS: 67295

5、tpcb 1000亿 读写 (ESSD)

1、活跃数据10亿

TPS: 95119

QPS: 475595

2、活跃数据100亿

TPS: 85278

QPS: 426390

3、活跃数据500亿

TPS: 38301

QPS: 191505

4、活跃数据1000亿

TPS: 35189

QPS: 175945

性能小结

环境:阿里云 ECS + 32T ESSD

表SIZE: 12.5 TB 写入耗时 25h52min

索引SIZE: 2 TB 创建耗时 10h50min

索引深度: 5级

单表数据量 TEST CASE QPS TPS
10 * 100W tpcc 1000W - 96.4万 tpmC
10 * 100W tpcc 1000W(同步多副本) - 66万 tpmC
1000亿 tpcb 活跃数据10亿 只读 998818 998818
1000亿 tpcb 活跃数据100亿 只读 597877 597877
1000亿 tpcb 活跃数据500亿 只读 66678 66678
1000亿 tpcb 活跃数据1000亿 只读 67295 67295
1000亿 tpcb 活跃数据10亿 读写 475595 95119
1000亿 tpcb 活跃数据10亿 读写(同步多副本) 267160 53432
1000亿 tpcb 活跃数据100亿 读写 426390 85278
1000亿 tpcb 活跃数据500亿 读写 191505 38301
1000亿 tpcb 活跃数据1000亿 读写 175945 35189

阿里云ESSD的引入,结合PostgreSQL企业级开源数据库(良好的性能、可管理海量数据、功能对齐Oracle,不仅ESSD层面提供多副本,同时数据库层面也支持通过quorum based replication多副本提供金融级的可靠性,使用ZFS秒级快照,闪回等企业特性),给企业大容量关系数据库上云提供了便利。

对于PG企业用户,可以选择阿里云PG企业版PPAS,拥有以上所有特性的同时,提供ORACLE兼容性。

参考

《fio测试IO性能》

ECS 本地SSD云盘(8*1.8TB条带)测试:

《PostgreSQL 100亿 tpcb 性能 on ECS》

《PostgreSQL 11 tpcc 测试(103万tpmC on ECS) - use sysbench-tpcc by Percona-Lab》

《(TPC-H测试 SF=10,SF=200) PostgreSQL 11 vs 10 vs Deepgreen》

《PostgreSQL 10 on ECS 实施 流复制备库镜像+自动快照备份+自动备份验证+自动清理备份与归档 - 珍藏级》

https://help.aliyun.com/knowledge_detail/64950.html

相关实践学习
一小时快速掌握 SQL 语法
本实验带您学习SQL的基础语法,快速入门SQL。
7天玩转云服务器
云服务器ECS(Elastic Compute Service)是一种弹性可伸缩的计算服务,可降低 IT 成本,提升运维效率。本课程手把手带你了解ECS、掌握基本操作、动手实操快照管理、镜像管理等。了解产品详情: https://www.aliyun.com/product/ecs
相关文章
|
20天前
|
关系型数据库 分布式数据库 数据库
成都晨云信息技术完成阿里云PolarDB数据库产品生态集成认证
近日,成都晨云信息技术有限责任公司(以下简称晨云信息)与阿里云PolarDB PostgreSQL版数据库产品展开产品集成认证。测试结果表明,晨云信息旗下晨云-站群管理系统(V1.0)与阿里云以下产品:开源云原生数据库PolarDB PostgreSQL版(V11),完全满足产品兼容认证要求,兼容性良好,系统运行稳定。
|
21天前
|
编解码 对象存储
阿里云视频转码转码模板-配置工作流
阿里云视频转码转码模板-配置工作流
14 0
|
20天前
|
弹性计算
2024年阿里云服务器不同实例规格与配置实时优惠价格整理与分享
2024年阿里云服务器的优惠价格新鲜出炉,有特惠云服务器也有普通优惠价格,本文为大家整理汇总了2024年阿里云服务器的优惠价格,包含特惠云服务器和其他配置云服务器的优惠价格。以便大家了解自己想购买的云服务器选择不同实例规格和带宽情况下的价格,仅供参考。
2024年阿里云服务器不同实例规格与配置实时优惠价格整理与分享
|
21天前
阿里云配置dcoker镜像仓库
阿里云配置dcoker镜像仓库
70 0
|
3天前
|
存储 弹性计算 安全
阿里云服务器2核2G、2核4G配置最新租用收费标准及活动价格参考
2核2G、2核4G配置是很多个人和企业建站以及部署中小型的web应用等场景时首选的云服务器配置,这些配置的租用价格也是用户非常关心的问题,本文为大家整理汇总了2024年阿里云服务器2核2G、2核4G配置不同实例规格及地域之间的收费标准,同时整理了这些配置最新活动价格,以供大家参考和选择。
阿里云服务器2核2G、2核4G配置最新租用收费标准及活动价格参考
|
5天前
|
域名解析 网络协议 应用服务中间件
阿里云服务器配置免费https服务
阿里云服务器配置免费https服务
|
26天前
|
弹性计算 固态存储 调度
2024年阿里云服务器配置选择指南,新手整理
阿里云服务器配置选择指南:个人用户推荐轻量应用服务器或ECS通用算力型u1,适合小型网站和轻量应用。企业用户应选择企业级独享型如ECS计算型c7、通用型g7,保证高性能计算需求。配置选择要考虑CPU内存比例、公网带宽和系统盘。轻量服务器提供2核2G3M和2核4G4M选项,ECS实例则有多种规格以适应不同业务场景。公网带宽建议至少5M,系统盘可选高效云盘、SSD或ESSD。详细信息见[阿里云服务器产品页](https://www.aliyun.com/product/ecs)。
61 3
|
29天前
|
存储 弹性计算 运维
阿里云轻量应用服务器与标准型阿里云服务器ECS全面对比(配置、价格)
随着云计算技术的蓬勃发展,阿里云作为业界的佼佼者,推出了多样化的云服务器产品以满足不同用户群体的需求。在这些产品中,阿里云轻量应用服务器与标准云服务器(ECS)因其各自的特点而备受关注。下面,我们将从多个角度对这两款产品进行深入剖析,以帮助您更好地选择适合自身需求的云服务器。
649 2
|
29天前
|
存储 弹性计算 安全
阿里云2核8G配置服务器租用价格多少?
随着云计算技术的飞速发展,越来越多的企业和个人开始青睐云服务器,将其视为数据存储和运算的理想平台。而在这其中,阿里云作为国内领先的云服务提供商,其ECS云服务器以其卓越的性能和稳定的服务,赢得了广大用户的信赖与喜爱。那么,对于许多用户来说,他们可能好奇的是,阿里云2核8G服务器一年的费用究竟是多少呢?
97 1
|
29天前
|
弹性计算
2024阿里云服务器购买、续费、升级配置价格表新鲜出炉!
2024年阿里云服务器购买、续费、升级优惠政策整理,阿里云服务器优惠价格表:轻量2核2G3M服务器61元一年、2核4G4M带宽165元1年,云服务器4核16G10M带宽26元1个月、149元半年,阿里云ECS云服务器2核2G3M新老用户均可99元一年续费不涨价,企业用户2核4G5M带宽199元一年。
390 2

相关产品

  • 云原生数据库 PolarDB