GBase 8s 数据库中count(字段) ,count(主键) ,count(1)和count(*)的区别
首先要弄清楚 count() 的语义。count() 是一个聚合函数,对于返回的结果集,一行行地判断,如果 count 函数的参数不是 NULL,累计值就加 1,否则不加。最后返回累计值。
所以,count(*)、count(1)和count(主键) 都表示返回满足条件的结果集的总行数;而 count(字段),则表示返回满足条件的数据行里面,参数 "字段" 不为 NULL 的总个数。
以测试表为例,表内加载了10万行记录,其中允许null值的字段name存在3个null值;
create table tabcount
(
id integer,
name varchar(40),
age integer not null ,
primary key (id)
);
count(可空字段)
扫描全表,判断字段可空,拿出该字段所有值,判断每一个值是否为空,不为空则累加
select count(name) from tabcount;
(count)
99997
1 row(s) retrieved.
查看其执行计划,完全的扫描整表
QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:21:26)
------
select count(name) from tabcount
Estimated Cost: 4312
Estimated # of Rows Returned: 1
1) gbasedbt.tabcount: SEQUENTIAL SCAN
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 tabcount
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 100000 100000 100000 00:00.01 4312
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 100000 00:00.01
count(非空字段)
扫描全表,判断字段不可空,按行累加。
select count(age) from tabcount;
(count)
100000
1 row(s) retrieved.
查看其执行计划,完全的扫描整表
QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:24:32)
------
select count(age) from tabcount
Estimated Cost: 4312
Estimated # of Rows Returned: 1
1) gbasedbt.tabcount: SEQUENTIAL SCAN
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 tabcount
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 100000 100000 100000 00:00.01 4312
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 100000 00:00.01
count(主键)
扫描全表,判断字段不可空,按行累加。
select count(id) from tabcount;
(count)
100000
1 row(s) retrieved.
查看其执行计划,执行的是索引键扫描
QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:26:45)
------
select count(id) from tabcount
Estimated Cost: 3653
Estimated # of Rows Returned: 1
1) gbasedbt.tabcount: INDEX PATH
(1) Index Name: gbasedbt. 100_1
Index Keys: id (Key-Only) (Serial, fragments: ALL)
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 tabcount
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 100000 100000 100000 00:00.02 3653
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 100000 00:00.02
count(1)
扫描全表,但不取值,收到的每一行都是1,判断不可能是null,按值累加。
select count(1) from tabcount;
(count)
100000
1 row(s) retrieved.
查看其执行计划,完全的扫描整表
QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:30:36)
------
select count(1) from tabcount
Estimated Cost: 4312
Estimated # of Rows Returned: 1
1) gbasedbt.tabcount: SEQUENTIAL SCAN
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 tabcount
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 100000 100000 100000 00:00.02 4312
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 100000 00:00.03
count(*)
GBase 8s执行count(*)在优化器做了专门优化。如果无where条件,将直接从系统表中返回值。有where条件的话扫描全表,但是不取值,按行累加。
select count(*) from tabcount;
(count(*))
100000
1 row(s) retrieved.
查看其执行计划,直接返回结果(从内部系统表)
QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:35:28)
------
select count(*) from tabcount
Estimated Cost: 1
Estimated # of Rows Returned: 1
1) gbasedbt.tabcount: INDEX PATH
(1) Index Name: (count)
Index Keys: (count)
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 0 00:00.00
如果带where条件的话,则严格遵循限制条件
select count(*) from tabcount where name like 'test%';
(count(*))
99997
1 row(s) retrieved.
查看其执行计划,完全的扫描整表
QUERY: (OPTIMIZATION TIMESTAMP: 01-12-2022 09:36:02)
------
select count(*) from tabcount where name like 'test%'
Estimated Cost: 4312
Estimated # of Rows Returned: 1
1) gbasedbt.tabcount: SEQUENTIAL SCAN
Filters: gbasedbt.tabcount.name LIKE 'test%'
Query statistics:
-----------------
Table map :
----------------------------
Internal name Table name
----------------------------
t1 tabcount
type table rows_prod est_rows rows_scan time est_cost
-------------------------------------------------------------------
scan t1 100000 100000 100000 00:00.06 4312
type rows_prod est_rows rows_cons time
-------------------------------------------------
group 1 1 100000 00:00.06
性能对比
通过对比各个查询执行计划中的Estimated Cost值
count(*) 无限制条件 > count(主键) > count(1) ~= count(非空字段) ~= count(可空字段)
仅有一条评论