Tuesday, May 13

select count(*) vs. select count(1)

What is the difference between select count(*) and select count(1) ?

This was a long pending question in my mind so, I did a little bit of search and study to convince myself that ... there is a difference, may be because SQL Query Optimizer would expand '*' into the column list.

But if we compare, SQL query plans and performance is same for both of them.

Then, I thought of finding the answer using the way Sybase ASE tends to think.
Depending on that, it seems there is no difference in 'select count(1)' and 'select count(*)'
Even, 'select count(100)' would give the same performance.

Here are my findings ...


Consider a table
create table TAB_A
(
col_1 int null,
col_2 varchar(20) null,
col_3 char(1) null
)
Data:
insert into TAB_A values ( 1, 'Seabiscuit' , 'H')
insert into TAB_A values ( 2, 'Schindlers List' , 'H')
insert into TAB_A values ( 3, 'Trueman Show' , 'H')
insert into TAB_A values ( NULL, 'Office Space' , 'H')
insert into TAB_A values ( NULL, NULL, NULL)

Here is how 'count' function works.
1. count ( column_name ) : finds the total number of 'non null' rows in the table.
Exa.

select count(col_1) from TAB_A
---------
3

and

select count(col_2) from TAB_A
---------
4

2. count(*) : finds the total number of rows in the table, irrespective of null values.

Exa.

select count(*) from TAB_A
---------
5

Now, if we consider that Sybase expands '*' into the column list, then above two definitions of 'count' function does not allow this.

A. Suppose, sybase has expanded the '*' into column list, then the SQL will become

select count( col_1, col_2, col_3 ) from TAB_A
---------
Server Message : Number 102, Severity 15
Incorrect syntax near ',' .

This is not allowed, you can not pass comma separated column names to 'count' function.

B. Even if, we argue that point A is internal matter to optimizer and any how, it might have been worked around internally, in that case according to first definition, null values will be ignored while counting number. This is not correct, as we know 'count(*) ' always returns total number of rows, even if there are nulls.

So, from A and B we can say '*' can not get expanded internally.

Then, why is '*' here ?
I think, as opposed to "all columns" in general 'select query', here '*' indicates "all rows" of the table, including NULL values.

Also, after this analysis, I feel comfortable to say that, any number or string other than the column name might be getting converted to '*' , before executing the SQL to count the total number rows available in table.

Example :
select count( 1947) from TAB_A
---------
5

or

select count('someString') from TAB_A
---------
5

So, the only overhead, I can think of is to convert 1947 or 'someString' to '*' ,
which should be far less than negligible .

Related Posts:
Get number of rows without using count() function


0 comments:





Disclaimer :
Unless, otherwise mentioned, TF's SQL tips/tricks are for Sybase ASE 12.5.x