Tuesday, September 23, 2008

Group By and Grouping Id - Determining Levels of Aggregation

In Oracle, the ability to do aggregates are a very powerful feature. However, at times it can be confusing when you have several layers of aggregation or nulls in the columns that are being grouped by. For instance, take the following query:



  1. select department_id, sum(salary)  
  2. from employees  
  3. group by rollup (department_id);  


Which yields the following results:


DEPARTMENT_ID SUM(SALARY)
---------------------- ----------------------
10 4400
20 19000
30 24900
40 6500
50 156400
60 28800
70 10000
80 304500
90 58000
100 51600
110 20300
7000
691400

13 rows selected

Notice I want to rollup the results to get a sum total, however I have 2 rows with null, which one is the rollup value? While its very easy to spot for a human, when attempting to present the data to a user in a meaningful fashion, knowing which row that is the total is very handy. To help with this problem, Oracle has something called a grouping_id, which identifies the level of aggregation. Take the following query:

  1. select department_id, sum(salary), grouping_id(department_id)
  2. from employees
  3. group by rollup (department_id);


Which yields the following results...

DEPARTMENT_ID SUM(SALARY) GROUPING_ID(DEPARTMENT_ID)
---------------------- ---------------------- --------------------------
10 4400 0
20 19000 0
30 24900 0
40 6500 0
50 156400 0
60 28800 0
70 10000 0
80 304500 0
90 58000 0
100 51600 0
110 20300 0
7000 0
691400 1

13 rows selected



Notice in the grouping_id column, the total is indicated by the number 1, which identifies the level of aggregation. Let's look at a more complex example that gives more explicitly indicates what is taking place.

  1. select department_id, job_id, sum(salary),
  2. grouping_id(department_id) dept_g, grouping_id(job_id) job_g,
  3. case when grouping_id(department_id)||grouping_id(job_id) = '00'
  4. then 'Detail Both'
  5. when grouping_id(department_id)||grouping_id(job_id) = '10'
  6. then 'Agg over department'
  7. when grouping_id(department_id)||grouping_id(job_id) = '01'
  8. then 'Agg over job'
  9. when grouping_id(department_id)||grouping_id(job_id) = '11'
  10. then 'Agg over both'
  11. end
  12. description
  13. from employees
  14. group by rollup (department_id, job_id);


Which yields the following:


DEPT JOB_ID SUM(SALARY) DEPT_G JOB_G DESCRIPTION
---------------------- ---------- ---------------------- ---------------------- ---------------------- -------------------
SA_REP 7000 0 0 Detail Both
7000 0 1 Agg over job
10 AD_ASST 4400 0 0 Detail Both
10 4400 0 1 Agg over job
20 MK_MAN 13000 0 0 Detail Both
20 MK_REP 6000 0 0 Detail Both
20 19000 0 1 Agg over job
30 PU_MAN 11000 0 0 Detail Both
30 PU_CLERK 13900 0 0 Detail Both
30 24900 0 1 Agg over job
40 HR_REP 6500 0 0 Detail Both
40 6500 0 1 Agg over job
50 ST_MAN 36400 0 0 Detail Both
50 SH_CLERK 64300 0 0 Detail Both
50 ST_CLERK 55700 0 0 Detail Both
50 156400 0 1 Agg over job
60 IT_PROG 28800 0 0 Detail Both
60 28800 0 1 Agg over job
70 PR_REP 10000 0 0 Detail Both
70 10000 0 1 Agg over job
80 SA_MAN 61000 0 0 Detail Both
80 SA_REP 243500 0 0 Detail Both
80 304500 0 1 Agg over job
90 AD_VP 34000 0 0 Detail Both
90 AD_PRES 24000 0 0 Detail Both
90 58000 0 1 Agg over job
100 FI_MGR 12000 0 0 Detail Both
100 FI_ACCOUNT 39600 0 0 Detail Both
100 51600 0 1 Agg over job
110 AC_MGR 12000 0 0 Detail Both
110 AC_ACCOUNT 8300 0 0 Detail Both
110 20300 0 1 Agg over job
691400 1 1 Agg over both

33 rows selected



With the rollup clause, it clearly identifies which rows are the aggregate rows for job_id and which row is the grand total row. Just for completeness, lets look at cubing our querying, which provides aggregate totals for each column.

  1. select department_id dept, job_id, sum(salary),
  2. grouping_id(department_id) dept_g, grouping_id(job_id) job_g,
  3. case when grouping_id(department_id)||grouping_id(job_id) = '00'
  4. then 'Detail Both'
  5. when grouping_id(department_id)||grouping_id(job_id) = '10'
  6. then 'Agg over department'
  7. when grouping_id(department_id)||grouping_id(job_id) = '01'
  8. then 'Agg over job'
  9. when grouping_id(department_id)||grouping_id(job_id) = '11'
  10. then 'Agg over both'
  11. end
  12. description
  13. from employees
  14. group by cube (department_id, job_id);



Which yields the following:


DEPT JOB_ID SUM(SALARY) DEPT_G JOB_G DESCRIPTION
---------------------- ---------- ---------------------- ---------------------- ---------------------- -------------------
7000 0 1 Agg over job
691400 1 1 Agg over both
AD_VP 34000 1 0 Agg over department
AC_MGR 12000 1 0 Agg over department
FI_MGR 12000 1 0 Agg over department
HR_REP 6500 1 0 Agg over department
MK_MAN 13000 1 0 Agg over department
MK_REP 6000 1 0 Agg over department
PR_REP 10000 1 0 Agg over department
PU_MAN 11000 1 0 Agg over department
SA_MAN 61000 1 0 Agg over department
SA_REP 7000 0 0 Detail Both
SA_REP 250500 1 0 Agg over department
ST_MAN 36400 1 0 Agg over department
AD_ASST 4400 1 0 Agg over department
AD_PRES 24000 1 0 Agg over department
IT_PROG 28800 1 0 Agg over department
PU_CLERK 13900 1 0 Agg over department
SH_CLERK 64300 1 0 Agg over department
ST_CLERK 55700 1 0 Agg over department
AC_ACCOUNT 8300 1 0 Agg over department
FI_ACCOUNT 39600 1 0 Agg over department
10 4400 0 1 Agg over job
10 AD_ASST 4400 0 0 Detail Both
20 19000 0 1 Agg over job
20 MK_MAN 13000 0 0 Detail Both
20 MK_REP 6000 0 0 Detail Both
30 24900 0 1 Agg over job
30 PU_MAN 11000 0 0 Detail Both
30 PU_CLERK 13900 0 0 Detail Both
40 6500 0 1 Agg over job
40 HR_REP 6500 0 0 Detail Both
50 156400 0 1 Agg over job
50 ST_MAN 36400 0 0 Detail Both
50 SH_CLERK 64300 0 0 Detail Both
50 ST_CLERK 55700 0 0 Detail Both
60 28800 0 1 Agg over job
60 IT_PROG 28800 0 0 Detail Both
70 10000 0 1 Agg over job
70 PR_REP 10000 0 0 Detail Both
80 304500 0 1 Agg over job
80 SA_MAN 61000 0 0 Detail Both
80 SA_REP 243500 0 0 Detail Both
90 58000 0 1 Agg over job
90 AD_VP 34000 0 0 Detail Both
90 AD_PRES 24000 0 0 Detail Both
100 51600 0 1 Agg over job
100 FI_MGR 12000 0 0 Detail Both
100 FI_ACCOUNT 39600 0 0 Detail Both
110 20300 0 1 Agg over job
110 AC_MGR 12000 0 0 Detail Both
110 AC_ACCOUNT 8300 0 0 Detail Both

52 rows selected



Notice now I can't just assume the aggregates come in the order I expect them, i.e., the grand total is at the bottom. If you were like me, rollups and cubes held more mystery that help in most cases, however with grouping_ids it becomes more apparent the meaning of each row and what exactly Oracle is aggregating for you. (Note: All queries were run against the HR schema that comes with Oracle 10g Express Edition (XE)).












































































































































































































































































































































































































































































































DEPARTMENT_ID JOB_ID SUM(SALARY) DEPT_G JOB_G DESCRIPTION
SA_REP700000Detail Both
700001Agg over manager
10AD_ASST440000Detail Both
10440001Agg over manager
20MK_MAN1300000Detail Both
20MK_REP600000Detail Both
201900001Agg over manager
30PU_MAN1100000Detail Both
30PU_CLERK1390000Detail Both
302490001Agg over manager
40HR_REP650000Detail Both
40650001Agg over manager
50ST_MAN3640000Detail Both
50SH_CLERK6430000Detail Both
50ST_CLERK5570000Detail Both
5015640001Agg over manager
60IT_PROG2880000Detail Both
602880001Agg over manager
70PR_REP1000000Detail Both
701000001Agg over manager
80SA_MAN6100000Detail Both
80SA_REP24350000Detail Both
8030450001Agg over manager
90AD_VP3400000Detail Both
90AD_PRES2400000Detail Both
905800001Agg over manager
100FI_MGR1200000Detail Both
100FI_ACCOUNT3960000Detail Both
1005160001Agg over manager
110AC_MGR1200000Detail Both
110AC_ACCOUNT830000Detail Both
1102030001Agg over manager
69140011Agg over both


No comments: