In Oracle, the ability to do aggregates are a very powerful feature. However, at times it can be confusing when you have several layers of aggregation or nulls in the columns that are being grouped by. For instance, take the following query:
- select department_id, sum(salary)
- from employees
- group by rollup (department_id);
Which yields the following results:
DEPARTMENT_ID SUM(SALARY)
---------------------- ----------------------
10 4400
20 19000
30 24900
40 6500
50 156400
60 28800
70 10000
80 304500
90 58000
100 51600
110 20300
7000
691400
13 rows selected
Notice I want to rollup the results to get a sum total, however I have 2 rows with null, which one is the rollup value? While its very easy to spot for a human, when attempting to present the data to a user in a meaningful fashion, knowing which row that is the total is very handy. To help with this problem, Oracle has something called a grouping_id, which identifies the level of aggregation. Take the following query:
- select department_id, sum(salary), grouping_id(department_id)
- from employees
- group by rollup (department_id);
Which yields the following results...
DEPARTMENT_ID SUM(SALARY) GROUPING_ID(DEPARTMENT_ID)
---------------------- ---------------------- --------------------------
10 4400 0
20 19000 0
30 24900 0
40 6500 0
50 156400 0
60 28800 0
70 10000 0
80 304500 0
90 58000 0
100 51600 0
110 20300 0
7000 0
691400 1
13 rows selected
Notice in the grouping_id column, the total is indicated by the number 1, which identifies the level of aggregation. Let's look at a more complex example that gives more explicitly indicates what is taking place.
- select department_id, job_id, sum(salary),
- grouping_id(department_id) dept_g, grouping_id(job_id) job_g,
- case when grouping_id(department_id)||grouping_id(job_id) = '00'
- then 'Detail Both'
- when grouping_id(department_id)||grouping_id(job_id) = '10'
- then 'Agg over department'
- when grouping_id(department_id)||grouping_id(job_id) = '01'
- then 'Agg over job'
- when grouping_id(department_id)||grouping_id(job_id) = '11'
- then 'Agg over both'
- end
- description
- from employees
- group by rollup (department_id, job_id);
Which yields the following:
DEPT JOB_ID SUM(SALARY) DEPT_G JOB_G DESCRIPTION
---------------------- ---------- ---------------------- ---------------------- ---------------------- -------------------
SA_REP 7000 0 0 Detail Both
7000 0 1 Agg over job
10 AD_ASST 4400 0 0 Detail Both
10 4400 0 1 Agg over job
20 MK_MAN 13000 0 0 Detail Both
20 MK_REP 6000 0 0 Detail Both
20 19000 0 1 Agg over job
30 PU_MAN 11000 0 0 Detail Both
30 PU_CLERK 13900 0 0 Detail Both
30 24900 0 1 Agg over job
40 HR_REP 6500 0 0 Detail Both
40 6500 0 1 Agg over job
50 ST_MAN 36400 0 0 Detail Both
50 SH_CLERK 64300 0 0 Detail Both
50 ST_CLERK 55700 0 0 Detail Both
50 156400 0 1 Agg over job
60 IT_PROG 28800 0 0 Detail Both
60 28800 0 1 Agg over job
70 PR_REP 10000 0 0 Detail Both
70 10000 0 1 Agg over job
80 SA_MAN 61000 0 0 Detail Both
80 SA_REP 243500 0 0 Detail Both
80 304500 0 1 Agg over job
90 AD_VP 34000 0 0 Detail Both
90 AD_PRES 24000 0 0 Detail Both
90 58000 0 1 Agg over job
100 FI_MGR 12000 0 0 Detail Both
100 FI_ACCOUNT 39600 0 0 Detail Both
100 51600 0 1 Agg over job
110 AC_MGR 12000 0 0 Detail Both
110 AC_ACCOUNT 8300 0 0 Detail Both
110 20300 0 1 Agg over job
691400 1 1 Agg over both
33 rows selected
With the rollup clause, it clearly identifies which rows are the aggregate rows for job_id and which row is the grand total row. Just for completeness, lets look at cubing our querying, which provides aggregate totals for each column.
- select department_id dept, job_id, sum(salary),
- grouping_id(department_id) dept_g, grouping_id(job_id) job_g,
- case when grouping_id(department_id)||grouping_id(job_id) = '00'
- then 'Detail Both'
- when grouping_id(department_id)||grouping_id(job_id) = '10'
- then 'Agg over department'
- when grouping_id(department_id)||grouping_id(job_id) = '01'
- then 'Agg over job'
- when grouping_id(department_id)||grouping_id(job_id) = '11'
- then 'Agg over both'
- end
- description
- from employees
- group by cube (department_id, job_id);
Which yields the following:
DEPT JOB_ID SUM(SALARY) DEPT_G JOB_G DESCRIPTION
---------------------- ---------- ---------------------- ---------------------- ---------------------- -------------------
7000 0 1 Agg over job
691400 1 1 Agg over both
AD_VP 34000 1 0 Agg over department
AC_MGR 12000 1 0 Agg over department
FI_MGR 12000 1 0 Agg over department
HR_REP 6500 1 0 Agg over department
MK_MAN 13000 1 0 Agg over department
MK_REP 6000 1 0 Agg over department
PR_REP 10000 1 0 Agg over department
PU_MAN 11000 1 0 Agg over department
SA_MAN 61000 1 0 Agg over department
SA_REP 7000 0 0 Detail Both
SA_REP 250500 1 0 Agg over department
ST_MAN 36400 1 0 Agg over department
AD_ASST 4400 1 0 Agg over department
AD_PRES 24000 1 0 Agg over department
IT_PROG 28800 1 0 Agg over department
PU_CLERK 13900 1 0 Agg over department
SH_CLERK 64300 1 0 Agg over department
ST_CLERK 55700 1 0 Agg over department
AC_ACCOUNT 8300 1 0 Agg over department
FI_ACCOUNT 39600 1 0 Agg over department
10 4400 0 1 Agg over job
10 AD_ASST 4400 0 0 Detail Both
20 19000 0 1 Agg over job
20 MK_MAN 13000 0 0 Detail Both
20 MK_REP 6000 0 0 Detail Both
30 24900 0 1 Agg over job
30 PU_MAN 11000 0 0 Detail Both
30 PU_CLERK 13900 0 0 Detail Both
40 6500 0 1 Agg over job
40 HR_REP 6500 0 0 Detail Both
50 156400 0 1 Agg over job
50 ST_MAN 36400 0 0 Detail Both
50 SH_CLERK 64300 0 0 Detail Both
50 ST_CLERK 55700 0 0 Detail Both
60 28800 0 1 Agg over job
60 IT_PROG 28800 0 0 Detail Both
70 10000 0 1 Agg over job
70 PR_REP 10000 0 0 Detail Both
80 304500 0 1 Agg over job
80 SA_MAN 61000 0 0 Detail Both
80 SA_REP 243500 0 0 Detail Both
90 58000 0 1 Agg over job
90 AD_VP 34000 0 0 Detail Both
90 AD_PRES 24000 0 0 Detail Both
100 51600 0 1 Agg over job
100 FI_MGR 12000 0 0 Detail Both
100 FI_ACCOUNT 39600 0 0 Detail Both
110 20300 0 1 Agg over job
110 AC_MGR 12000 0 0 Detail Both
110 AC_ACCOUNT 8300 0 0 Detail Both
52 rows selected
Notice now I can't just assume the aggregates come in the order I expect them, i.e., the grand total is at the bottom. If you were like me, rollups and cubes held more mystery that help in most cases, however with grouping_ids it becomes more apparent the meaning of each row and what exactly Oracle is aggregating for you. (Note: All queries were run against the HR schema that comes with Oracle 10g Express Edition (XE)).
| DEPARTMENT_ID | JOB_ID | SUM(SALARY) | DEPT_G | JOB_G | DESCRIPTION |
|---|---|---|---|---|---|
| SA_REP | 7000 | 0 | 0 | Detail Both | |
| 7000 | 0 | 1 | Agg over manager | ||
| 10 | AD_ASST | 4400 | 0 | 0 | Detail Both |
| 10 | 4400 | 0 | 1 | Agg over manager | |
| 20 | MK_MAN | 13000 | 0 | 0 | Detail Both |
| 20 | MK_REP | 6000 | 0 | 0 | Detail Both |
| 20 | 19000 | 0 | 1 | Agg over manager | |
| 30 | PU_MAN | 11000 | 0 | 0 | Detail Both |
| 30 | PU_CLERK | 13900 | 0 | 0 | Detail Both |
| 30 | 24900 | 0 | 1 | Agg over manager | |
| 40 | HR_REP | 6500 | 0 | 0 | Detail Both |
| 40 | 6500 | 0 | 1 | Agg over manager | |
| 50 | ST_MAN | 36400 | 0 | 0 | Detail Both |
| 50 | SH_CLERK | 64300 | 0 | 0 | Detail Both |
| 50 | ST_CLERK | 55700 | 0 | 0 | Detail Both |
| 50 | 156400 | 0 | 1 | Agg over manager | |
| 60 | IT_PROG | 28800 | 0 | 0 | Detail Both |
| 60 | 28800 | 0 | 1 | Agg over manager | |
| 70 | PR_REP | 10000 | 0 | 0 | Detail Both |
| 70 | 10000 | 0 | 1 | Agg over manager | |
| 80 | SA_MAN | 61000 | 0 | 0 | Detail Both |
| 80 | SA_REP | 243500 | 0 | 0 | Detail Both |
| 80 | 304500 | 0 | 1 | Agg over manager | |
| 90 | AD_VP | 34000 | 0 | 0 | Detail Both |
| 90 | AD_PRES | 24000 | 0 | 0 | Detail Both |
| 90 | 58000 | 0 | 1 | Agg over manager | |
| 100 | FI_MGR | 12000 | 0 | 0 | Detail Both |
| 100 | FI_ACCOUNT | 39600 | 0 | 0 | Detail Both |
| 100 | 51600 | 0 | 1 | Agg over manager | |
| 110 | AC_MGR | 12000 | 0 | 0 | Detail Both |
| 110 | AC_ACCOUNT | 8300 | 0 | 0 | Detail Both |
| 110 | 20300 | 0 | 1 | Agg over manager | |
| 691400 | 1 | 1 | Agg over both |