Advanced SQL Functions in Oracle 10g, Part 4

Chapter 3
The Analytical Functions in Oracle (Analytical Functions I)

The value of nr here is 20 (20 rows). Row by row, the CUME_DIST calculation is:

    CNAME         TEMP  RANK  rownum  cr  calculation     CD
    Binghamton      20     1       1   1  (1/20)        .050
    New Milford     24     2       2   2  (2/20)        .100
    Provo           44     6       6   6  (6/20)        .300
    Reston          47     7       7   9  (9/20)        .450
    Alexandria      47     7       8   9  (9/20)        .450
    Idaho Falls     47     7       9   9  (9/20)        .450
    Grass Valley    55    10      10  10  (10/20)       .500
    Baton Rouge     58    11      11  13  (13/20)       .650
    Starkville      58    11      12  13  (13/20)       .650
    Carrboro        58    11      13  13  (13/20)       .650
    Brewton         72    17      17  17  (17/20)       .850
    Gulf Breeze     77    18      18  19  (19/20)       .950
    Davenport       77    18      19  19  (19/20)       .950
    Orlando         79    20      20  20  (20/20)      1.000

The cr value of 9 for row 7 occurs because the rank of 7 was given to all rows up to the ninth row, and hence rows 7, 8, and 9 get the same value of 9 for cr, the numerator in the function calculation.

The PERCENT_RANK and CUME_DIST functions are very specialized and far less common than RANK or ROW_NUMBER. Also, in our examples we have depicted only one grouping, that is, one partition. A PARTITION BY clause may be added to the analytic clause of the function, and sub-groupings with their own PERCENT_RANK and CUME_DIST values may also be reported.

For example, using our Employee table with PERCENT_RANK and CUME_DIST:

    SELECT empno, ename, region,
      RANK() OVER(PARTITION BY region ORDER BY curr_salary) RANK,
      PERCENT_RANK() OVER(PARTITION BY region ORDER BY curr_salary) PR,
      CUME_DIST() OVER(PARTITION BY region ORDER BY curr_salary) CD
    FROM employee

Gives:

    EMPNO ENAME      REGION  RANK         PR         CD
      108 David      E          1          0 .333333333
      111 Katie      E          2         .5 .666666667
      122 Lindsey    E          3          1          1
      101 John       W          1          0        .25
      102 Stephanie  W          2 .333333333        .75
      106 Chloe      W          2 .333333333        .75
      104 Christina  W          4          1          1

In this result, first note the partitioning by region: the result set acts like two different sets of data based on the partition. Within each region, we see the calculation of PERCENT_RANK and CUME_DIST as per the previous algorithms.
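The partitioned PERCENT_RANK and CUME_DIST values above can be checked by hand. The following is a minimal Python sketch of that arithmetic, not Oracle's implementation: it assumes PERCENT_RANK is (rank - 1)/(rows in partition - 1) and CUME_DIST is (rows at or below the current value)/(rows in partition), and the helper name rank_stats is hypothetical. The salaries are the curr_salary values from the Employee table used in this chapter.

```python
from collections import defaultdict


def rank_stats(values):
    """Return (rank, percent_rank, cume_dist) for each value, ascending order.

    Hypothetical helper: mirrors the hand calculation, not Oracle internals.
    """
    n = len(values)
    stats = []
    for v in values:
        smaller = sum(1 for x in values if x < v)       # rows strictly below v
        at_or_below = sum(1 for x in values if x <= v)  # the "cr" numerator
        rank = smaller + 1                              # ties share a rank
        percent_rank = 0.0 if n == 1 else (rank - 1) / (n - 1)
        cume_dist = at_or_below / n
        stats.append((rank, percent_rank, cume_dist))
    return stats


# (empno, region, curr_salary) taken from the tables in this chapter
rows = [
    (108, "E", 39000), (111, "E", 49000), (122, "E", 52000),
    (101, "W", 39000), (102, "W", 44000), (106, "W", 44000),
    (104, "W", 55000),
]

# PARTITION BY region: group the rows, then rank inside each group
by_region = defaultdict(list)
for empno, region, salary in rows:
    by_region[region].append((empno, salary))

result = {}
for region, members in by_region.items():
    stats = rank_stats([salary for _, salary in members])
    for (empno, _), st in zip(members, stats):
        result[empno] = st

for empno in (108, 111, 122, 101, 102, 106, 104):
    rank, pr, cd = result[empno]
    print(empno, rank, round(pr, 9), round(cd, 9))
```

Grouping first and ranking inside each group is what makes the two regions behave like separate data sets: employee 101 gets rank 1 in region W even though employee 108 in region E has the same salary.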
References

SQL for Analysis in Data Warehouses, Oracle Corporation, Redwood Shores, CA, Oracle9i Data Warehousing Guide, Release 2 (9.2), Part Number A96520-01.

For an excellent discussion of how Oracle 10g has improved querying, see "DSS Performance in Oracle Database 10g," an Oracle white paper, September 2003. This article shows how the Optimizer has been improved in 10g.

Chapter 4
Aggregate Functions Used as Analytical Functions (Analytical Functions II)

The Use of Aggregate Functions in SQL

Many of the common aggregate functions can be used as analytical functions: SUM, AVG, COUNT, STDDEV, VARIANCE, MAX, and MIN. The aggregate functions used as analytical functions offer the advantage of partitioning and ordering as well. As an example, say you want to display each person's employee number, name, original salary, and the average salary of all employees. This cannot be done with a query like the following because you cannot mix aggregates and row-level results.

    SELECT empno, ename, orig_salary,
      AVG(orig_salary)
    FROM employee
    ORDER BY ename

Gives:

    SELECT empno, ename, orig_salary,
           *
    ERROR at line 1:
    ORA-00937: not a single-group group function

But we can use a Cartesian product/virtual table like this:

    SELECT e.empno, e.ename, e.orig_salary,
      x.aos "Avg. salary"
    FROM employee e,
      (SELECT AVG(orig_salary) aos FROM employee) x
    ORDER BY ename

Which gives:

    EMPNO ENAME      ORIG_SALARY Avg. salary
      101 John             35000  38285.7143
      106 Chloe            33000  38285.7143
      104 Christina        43000  38285.7143
      108 David            37000  38285.7143
      111 Kate             45000  38285.7143
      122 Lindsey          40000  38285.7143
      102 Stephanie        35000  38285.7143

This type of query is borderline cumbersome and may be done far more easily using AVG in an analytical function:

    SELECT empno, ename, orig_salary,
      AVG(orig_salary) OVER() "Avg. salary"
    FROM employee
    ORDER BY ename

Giving:

    EMPNO ENAME      ORIG_SALARY Avg. salary
      101 John             35000  38285.7143
      106 Chloe            33000  38285.7143
      104 Christina        43000  38285.7143
      108 David            37000  38285.7143
      111 Kate             45000  38285.7143
      122 Lindsey          40000  38285.7143
      102 Stephanie        35000  38285.7143

This display looks off-balance due to the decimal points in the average salary. We can modify the displayed result using the analytical function nested inside an ordinary row-level function; a better version of the query with a ROUND function added would be:

    SELECT empno, ename, orig_salary,
      ROUND(AVG(orig_salary) OVER()) "Avg. salary"
    FROM employee
    ORDER BY ename

Giving:

    EMPNO ENAME      ORIG_SALARY Avg. salary
      101 John             35000       38286
      106 Chloe            33000       38286
      104 Christina        43000       38286
      108 David            37000       38286
      111 Kate             45000       38286
      122 Lindsey          40000       38286
      102 Stephanie        35000       38286

The aggregate/analytical function uses an argument to specify which column is aggregated/analyzed (orig_salary). It should also be noted that there is a null OVER clause. When the OVER clause is null as it is here, it is said to be a reporting function and applies to the entire dataset.

We can use partitioning in the OVER clause of the aggregate-analytical function like this:

    SELECT empno, ename, orig_salary, region,
      ROUND(AVG(orig_salary) OVER(PARTITION BY region)) "Avg. Salary"
    FROM employee
    ORDER BY region, ename

Giving:

    EMPNO ENAME      ORIG_SALARY REGION Avg. Salary
      108 David            37000 E            40667
      111 Kate             45000 E            40667
      122 Lindsey          40000 E            40667
      101 John             35000 W            36500
      106 Chloe            33000 W            36500
      104 Christina        43000 W            36500
      102 Stephanie        35000 W            36500

In this version of the query, we now have the average by region reported along with the other ordinary row data for an individual.

The result of the row-level reporting may be used in arithmetic in the result set. Suppose we wanted to see the difference between a person's salary and the average for his or her region. This example shows that query:

    SELECT empno, ename, region, curr_salary, orig_salary,
      ROUND(AVG(orig_salary) OVER(PARTITION BY region)) "Avg-group",
      ROUND(orig_salary - AVG(orig_salary) OVER(PARTITION BY region)) "Diff."
    FROM employee
    ORDER BY region, ename

Giving:

    EMPNO ENAME      REGION CURR_SALARY ORIG_SALARY Avg-group  Diff.
      108 David      E            39000       37000     40667  -3667
      111 Kate       E            49000       45000     40667   4333
      122 Lindsey    E            52000       40000     40667   -667
      101 John       W            39000       35000     36500  -1500
      106 Chloe      W            44000       33000     36500  -3500
      104 Christina  W            55000       43000     36500   6500
      102 Stephanie  W            44000       35000     36500  -1500

RATIO-TO-REPORT

Returning to the example of using an aggregate in a calculation, here we want to know what fraction of the total salary budget goes to which individual.
We can find this result with a script like this:

    COLUMN portion FORMAT 99.9999
    SELECT ename, curr_salary,
      curr_salary/SUM(curr_salary) OVER() Portion
    FROM employee
    ORDER BY curr_salary

Giving:

    ENAME      CURR_SALARY  PORTION
    John             39000    .1211
    David            39000    .1211
    Stephanie        44000    .1366
    Chloe            44000    .1366
    Kate             49000    .1522
    Lindsey          52000    .1615
    Christina        55000    .1708

Notice that the PORTION column adds up to 100%:

    COLUMN total FORMAT 9.9999
    SELECT sum(o.portion) Total
    FROM
      (SELECT i.ename, i.curr_salary,
         i.curr_salary/SUM(i.curr_salary) OVER() Portion
       FROM employee i
       ORDER BY i.curr_salary) o

Gives:

      TOTAL
     1.0000

The above query showing the fraction of salary apportioned to each individual can be done in one step with an analytical function called RATIO_TO_REPORT, which is used like this:

    COLUMN portion2 LIKE portion
    SELECT ename, curr_salary,
      curr_salary/SUM(curr_salary) OVER() Portion,
      RATIO_TO_REPORT(curr_salary) OVER() Portion2
    FROM employee
    ORDER BY curr_salary

Giving:

    ENAME      CURR_SALARY  PORTION PORTION2
    John             39000    .1211    .1211
    David            39000    .1211    .1211
    Stephanie        44000    .1366    .1366
    Chloe            44000    .1366    .1366
    Kate             49000    .1522    .1522
    Lindsey          52000    .1615    .1615
    Christina        55000    .1708    .1708

The RATIO_TO_REPORT (and the SUM analytical function) can easily be partitioned as well. For example:

    SELECT ename, curr_salary, region,
      curr_salary/SUM(curr_salary) OVER(PARTITION BY Region) Portion,
      RATIO_TO_REPORT(curr_salary) OVER(PARTITION BY Region) Portion2
    FROM employee
    ORDER BY region, curr_salary

Gives:

    ENAME      CURR_SALARY RE  PORTION PORTION2
    David            39000 E     .2786    .2786
    Kate             49000 E     .3500    .3500
    Lindsey          52000 E     .3714    .3714
    John             39000 W     .2143    .2143
    Stephanie        44000 W     .2418    .2418
    Chloe            44000 W     .2418    .2418
    Christina        55000 W     .3022    .3022

[...]
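The PORTION figures above are simply each salary divided by the total for the whole report, or for the region when partitioned. As a minimal sketch in Python (an illustration of the arithmetic only; the helper name ratio_to_report is hypothetical and merely echoes the Oracle function's name), the same numbers can be reproduced like this:

```python
from collections import defaultdict


def ratio_to_report(values):
    """Divide each value by the sum of all values in the same group."""
    total = sum(values)
    return [v / total for v in values]


# (ename, region, curr_salary) from the Employee table in this chapter
salaries = [
    ("John", "W", 39000), ("David", "E", 39000), ("Stephanie", "W", 44000),
    ("Chloe", "W", 44000), ("Kate", "E", 49000), ("Lindsey", "E", 52000),
    ("Christina", "W", 55000),
]

# OVER(): one group covering the whole report
overall = ratio_to_report([salary for _, _, salary in salaries])

# OVER(PARTITION BY region): one group per region
by_region = defaultdict(list)
for ename, region, salary in salaries:
    by_region[region].append((ename, salary))

regional = {}
for region, members in by_region.items():
    ratios = ratio_to_report([salary for _, salary in members])
    for (ename, _), ratio in zip(members, ratios):
        regional[ename] = ratio

for (ename, region, salary), portion in zip(salaries, overall):
    print(ename, salary, region, round(portion, 4), round(regional[ename], 4))
```

Within any one group the ratios sum to 1 (up to floating-point rounding), which is the same check the TOTAL query performs for the unpartitioned case.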
    Kate             49000 E     .3500    .3500
    Lindsey          52000 E     .3714    .3714
                           E    1.0000   1.0000
    John             39000 W     .2143    .2143
    Chloe            44000 W     .2418    .2418
    Stephanie        44000 W     .2418    .2418
    Christina        55000 W     .3022    .3022
                           W    1.0000   1.0000

In this query, the TO_NUMBER(null) is provided to make the data types compatible.

A similar report can be had without the UNION workaround with the following SQL*Plus formatting commands included in a script:

[...] BREAKS

Giving:

    ENAME      CURR_SALARY REGION    PORTION   PORTION2
    David            39000 E      .278571429 .278571429
    Kate             49000               .35        .35
    Lindsey          52000        .371428571 .371428571
                           ******
                           sum             1

    John             39000 W      .214285714 .214285714
    Stephanie        44000        .241758242 .241758242
    Chloe            44000        .241758242 .241758242
    Christina        55000        .302197802 .302197802
                           ******
                           sum             1

Windowing Subclauses with Physical Offsets in Aggregate Analytical Functions

A windowing subclause is a way of capturing several rows of a result set (i.e., a "window") and reporting the result in one "window row." An example of this technique would be in applications where one wants to smooth data by finding a moving average. Moving averages are most often [...] taking an average using n physical rows above and below each row. A moving average will operate in a window so that if the moving average is based on, say, three numbers (n = 3), the windows and their reported window rows would be:

Window 1:

    Original time  Original value  Windowed (smoothed) value
    0              12
    1              10              12 = [(12 + 10 + 14)/3]
    2              14

Window 2:

    Original time  Original value  Windowed (smoothed) value
    1              10
    2              14              11 = [(10 + 14 + 9)/3]
    3              9

Window 3:

    Original time  Original value  Windowed (smoothed) value
    2              14
    3              9               10 = [(14 + 9 + 7)/3]
    4              7

These calculations result in this display of the data:

    Time  Value  Moving Average
    0     12
    1     10     12
    2     14     11
    3     9      10
    4     7

In this calculation, the end points (time = 0 and time = 4) usually are not reported because there are no values beyond the end points with which [...]

[...]

    Chloe        44000
    Christina    55000
    David
    John
    Kate         49000
    Lindsey      52000
    Stephanie    44000
    The average  48800

Note that 48800 = (44000 + 55000 + 49000 + 52000 + 44000)/5, and that the rows containing nulls are simply ignored in the calculation.

Returning to our simple example and the moving averages we have computed thus far:

    Time  Value  Moving Average
    0     12
    1     10     12
    2     14     11
    3     9      10
    4     7

The end points would be calculated as follows:

Window 0:

    Original time  Original [...]

[...]

    14-JAN-06 15-JAN-06 15-JAN-06 16-JAN-06 16-JAN-06 17-JAN-06 17-JAN-06
    18-JAN-06 18-JAN-06 19-JAN-06 19-JAN-06 20-JAN-06 20-JAN-06 21-JAN-06
    21-JAN-06 22-JAN-06 22-JAN-06 23-JAN-06 23-JAN-06 24-JAN-06 24-JAN-06

    928.37 217.26 664.9 16.13 694.51 421.59 413.12 403.95 645.78 831.12
    678.41 783.57 491.05 878.15 635.75 968.89 378.25 351 882.51 975.73
    24.52 191 542.2 462.92 294.19 707.57 729.92 919.61 272.24 [...]

[...] location

Giving:

    Date       LOCATION  RECEIPTS  Running total
    07-JAN-06  MOBILE       724.6         724.60
    07-JAN-06  PROVO       969.61       1,694.21
    08-JAN-06  MOBILE       88.76       1,782.97
    08-JAN-06  PROVO       662.45       2,445.42
    09-JAN-06  MOBILE      705.47       3,150.89
    09-JAN-06  PROVO       928.37       4,079.26

UNBOUNDED FOLLOWING

The clause UNBOUNDED FOLLOWING is used for [...]

[...]

    LOCATION  RECEIPTS  Running total
    PROVO       969.61         969.61
    PROVO       662.45       1,632.06
    PROVO       928.37       2,560.43
    PROVO       664.90       3,225.33
    PROVO       694.51       3,919.84
    PROVO       413.12       4,332.96
    PROVO       645.78       4,978.74
    PROVO       678.41       5,657.15
    PROVO       491.05       5,178.59
    PROVO       635.75       5,151.89
    PROVO       378.25       4,601.77

In this example, it may be noted that, while it takes seven days for the summing to "get started," the sums are quite useful after that [...]

    Date       LOCATION  RECEIPTS  Running total
    07-JAN-06  MOBILE      724.60         724.60
    09-JAN-06  MOBILE      705.47       1,430.07
    14-JAN-06  MOBILE      831.12       2,261.19
    15-JAN-06  MOBILE      783.57       2,320.16
    16-JAN-06  MOBILE      878.15       3,198.31
    17-JAN-06  MOBILE      968.89       3,461.73
    19-JAN-06  MOBILE      975.73       4,437.46
    22-JAN-06  MOBILE      707.57       4,313.91
    23-JAN-06  MOBILE      919.61       4,449.95

Upon careful examination of the data, it may be [...]
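Both kinds of window shown in these excerpts can be reproduced with a few lines of arithmetic. The Python sketch below is an illustration only, and its two helper names are hypothetical. It makes two assumptions that are not stated in the surviving text: the MOBILE running total is assumed to cover the current date plus the seven calendar days before it (this is consistent with every total in the MOBILE table), and the dates are assumed to fall in 2006.

```python
from datetime import date, timedelta
from decimal import Decimal


def centered_moving_average(values, n=3):
    """Average each value with its neighbours in a window of n rows (n odd).

    End points that lack a full window are reported as None.
    """
    half = n // 2
    averages = []
    for i in range(len(values)):
        if i - half < 0 or i + half >= len(values):
            averages.append(None)  # no full window at the end points
        else:
            window = values[i - half:i + half + 1]
            averages.append(sum(window) / n)
    return averages


def trailing_range_sum(rows, days=7):
    """Sum amounts whose date lies within `days` days before each row's date.

    Assumption: the window is [current date - days, current date], inclusive.
    """
    totals = []
    for current, _ in rows:
        start = current - timedelta(days=days)
        totals.append(sum(amount for day, amount in rows if start <= day <= current))
    return totals


# The simple example: times 0..4 with values 12, 10, 14, 9, 7
moving = centered_moving_average([12, 10, 14, 9, 7])

# MOBILE receipts from the last table; the year 2006 is an assumption
mobile = [
    (date(2006, 1, 7), Decimal("724.60")), (date(2006, 1, 9), Decimal("705.47")),
    (date(2006, 1, 14), Decimal("831.12")), (date(2006, 1, 15), Decimal("783.57")),
    (date(2006, 1, 16), Decimal("878.15")), (date(2006, 1, 17), Decimal("968.89")),
    (date(2006, 1, 19), Decimal("975.73")), (date(2006, 1, 22), Decimal("707.57")),
    (date(2006, 1, 23), Decimal("919.61")),
]
mobile_totals = trailing_range_sum(mobile)

print(moving)
for (day, amount), total in zip(mobile, mobile_totals):
    print(day, amount, total)
```

The first helper counts rows (a physical offset), while the second compares date values, so a gap in the dates shrinks the number of rows in the second window but never in the first. Decimal is used for the receipts so that the sums match the printed totals exactly.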
