GBase 8s 数据库中count(字段) ,count(主键) ,count(1)和count(*)的区别

首先要弄清楚 count() 的语义。count() 是一个聚合函数,对于返回的结果集,一行行地判断,如果 count 函数的参数不是 NULL,累计值就加 1,否则不加。最后返回累计值。
所以,count(*)、count(1)和count(主键) 都表示返回满足条件的结果集的总行数;而 count(字段),则表示返回满足条件的数据行里面,参数 "字段" 不为 NULL 的总个数。

以测试表为例,表内加载了10万行记录,其中允许null值的字段name存在3个null值;

create table tabcount 
  (
    id integer,
    name varchar(40),
    age integer not null ,
    primary key (id) 
  );

count(可空字段)

扫描全表,判断字段可空,拿出该字段所有值,判断每一个值是否为空,不为空则累加

select count(name) from tabcount;

         (count) 

           99997

1 row(s) retrieved.

查看其执行计划,完全的扫描整表

QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:21:26)
------
select count(name) from tabcount

Estimated Cost: 4312
Estimated # of Rows Returned: 1

  1) gbasedbt.tabcount: SEQUENTIAL SCAN


Query statistics:
-----------------

  Table map :
  ----------------------------
  Internal name     Table name
  ----------------------------
  t1                tabcount

  type     table  rows_prod  est_rows  rows_scan  time       est_cost
  -------------------------------------------------------------------
  scan     t1     100000     100000    100000     00:00.01   4312    

  type     rows_prod  est_rows  rows_cons  time
  -------------------------------------------------
  group    1          1         100000     00:00.01

count(非空字段)

扫描全表,判断字段不可空,按行累加。

select count(age) from tabcount;

         (count) 

          100000

1 row(s) retrieved.

查看其执行计划,完全的扫描整表

QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:24:32)
------
select count(age) from tabcount

Estimated Cost: 4312
Estimated # of Rows Returned: 1

  1) gbasedbt.tabcount: SEQUENTIAL SCAN


Query statistics:
-----------------

  Table map :
  ----------------------------
  Internal name     Table name
  ----------------------------
  t1                tabcount

  type     table  rows_prod  est_rows  rows_scan  time       est_cost
  -------------------------------------------------------------------
  scan     t1     100000     100000    100000     00:00.01   4312    

  type     rows_prod  est_rows  rows_cons  time
  -------------------------------------------------
  group    1          1         100000     00:00.01

count(主键)

扫描全表,判断字段不可空,按行累加。

select count(id) from tabcount;

         (count) 

          100000

1 row(s) retrieved.

查看其执行计划,执行的是索引键扫描

QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:26:45)
------
select count(id) from tabcount

Estimated Cost: 3653
Estimated # of Rows Returned: 1

  1) gbasedbt.tabcount: INDEX PATH

    (1) Index Name: gbasedbt. 100_1
        Index Keys: id   (Key-Only)  (Serial, fragments: ALL)


Query statistics:
-----------------

  Table map :
  ----------------------------
  Internal name     Table name
  ----------------------------
  t1                tabcount

  type     table  rows_prod  est_rows  rows_scan  time       est_cost
  -------------------------------------------------------------------
  scan     t1     100000     100000    100000     00:00.02   3653    

  type     rows_prod  est_rows  rows_cons  time
  -------------------------------------------------
  group    1          1         100000     00:00.02

count(1)

扫描全表,但不取值,收到的每一行都是1,判断不可能是null,按值累加。

select count(1) from tabcount;

         (count) 

          100000

1 row(s) retrieved.

查看其执行计划,完全的扫描整表

QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:30:36)
------
select count(1) from tabcount

Estimated Cost: 4312
Estimated # of Rows Returned: 1

  1) gbasedbt.tabcount: SEQUENTIAL SCAN


Query statistics:
-----------------

  Table map :
  ----------------------------
  Internal name     Table name
  ----------------------------
  t1                tabcount

  type     table  rows_prod  est_rows  rows_scan  time       est_cost
  -------------------------------------------------------------------
  scan     t1     100000     100000    100000     00:00.02   4312

  type     rows_prod  est_rows  rows_cons  time
  -------------------------------------------------
  group    1          1         100000     00:00.03

count(*)

GBase 8s执行count(*)在优化器做了专门优化。如果无where条件,将直接从系统表中返回值。有where条件的话扫描全表,但是不取值,按行累加。

select count(*) from tabcount;   

      (count(*)) 

          100000

1 row(s) retrieved.

查看其执行计划,直接返回结果(从内部系统表)

QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:35:28)
------
select count(*) from tabcount

Estimated Cost: 1
Estimated # of Rows Returned: 1

  1) gbasedbt.tabcount: INDEX PATH

    (1) Index Name: (count)
        Index Keys: (count)


Query statistics:
-----------------

  Table map :
  ----------------------------
  Internal name     Table name
  ----------------------------

  type     rows_prod  est_rows  rows_cons  time
  -------------------------------------------------
  group    1          1         0          00:00.00

如果带where条件的话,则严格遵循限制条件

select count(*) from tabcount where name like 'test%';

      (count(*)) 

           99997

1 row(s) retrieved.

查看其执行计划,完全的扫描整表

QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:36:02)
------
select count(*) from tabcount where name like 'test%'

Estimated Cost: 4312
Estimated # of Rows Returned: 1

  1) gbasedbt.tabcount: SEQUENTIAL SCAN

        Filters: gbasedbt.tabcount.name LIKE 'test%'


Query statistics:
-----------------

  Table map :
  ----------------------------
  Internal name     Table name
  ----------------------------
  t1                tabcount

  type     table  rows_prod  est_rows  rows_scan  time       est_cost
  -------------------------------------------------------------------
  scan     t1     100000     100000    100000     00:00.06   4312

  type     rows_prod  est_rows  rows_cons  time
  -------------------------------------------------
  group    1          1         100000     00:00.06

性能对比

通过对比各个查询执行计划中的Estimated Cost值
count(*) 无限制条件 > count(主键) > count(1) ~= count(非空字段) ~= count(可空字段)

标签: gbase 8s, GBase, 主键, count, 优化器

仅有一条评论

  1. liujie liujie
    学习了

添加新评论