欧美bbbwbbbw肥妇,免费乱码人妻系列日韩,一级黄片

Oracle中的分析函數(shù)匯總

 更新時(shí)間:2022年05月05日 11:38:03   作者:springsnow  
本文詳細(xì)講解了Oracle中的分析函數(shù),文中通過(guò)示例代碼介紹的非常詳細(xì)。對(duì)大家的學(xué)習(xí)或工作具有一定的參考借鑒價(jià)值,需要的朋友可以參考下

一、概述

OLAP的系統(tǒng)(即Online Aanalyse Process)一般用于系統(tǒng)決策使用。通常和數(shù)據(jù)倉(cāng)庫(kù)、數(shù)據(jù)分析、數(shù)據(jù)挖掘等概念聯(lián)系在一起。這些系統(tǒng)的特點(diǎn)是數(shù)據(jù)量大,對(duì)實(shí)時(shí)響應(yīng)的要求不高或者根本不關(guān)注這方面的要求,以查詢(xún)、統(tǒng)計(jì)操作為主。

我們來(lái)看看下面的幾個(gè)典型例子: 
①查找上一年度各個(gè)銷(xiāo)售區(qū)域排名前10的員工 
②按區(qū)域查找上一年度訂單總額占區(qū)域訂單總額20%以上的客戶(hù) 
③查找上一年度銷(xiāo)售最差的部門(mén)所在的區(qū)域 
④查找上一年度銷(xiāo)售最好和最差的產(chǎn)品

我們看看上面的幾個(gè)例子就可以感覺(jué)到這幾個(gè)查詢(xún)和我們?nèi)粘S龅降牟樵?xún)有些不同,具體有:

  • 需要對(duì)同樣的數(shù)據(jù)進(jìn)行不同級(jí)別的聚合操作
  • 需要在表內(nèi)將多條數(shù)據(jù)和同一條數(shù)據(jù)進(jìn)行多次的比較
  • 需要在排序完的結(jié)果集上進(jìn)行額外的過(guò)濾操作

1、分析函數(shù)和聚合函數(shù)的不同之處是什么?

普通的聚合函數(shù)用group by分組,每個(gè)分組返回一個(gè)統(tǒng)計(jì)值,而分析函數(shù)采用partition by分組,并且每組每行都可以返回一個(gè)統(tǒng)計(jì)值。

2、分析函數(shù)的形式

分析函數(shù)帶有一個(gè)開(kāi)窗函數(shù)over(),包含三個(gè)分析子句:分組(partition by), 排序(order by), 窗口(rows),他們的使用形式如下:

function_name(<argument>,<argument>...) over(<partition_Clause><order by_Clause><windowing_Clause>);
  • function_name():函數(shù)名稱(chēng)
  • argument:參數(shù)
  • over( ):開(kāi)窗函數(shù)
  • partition_Clause:分區(qū)子句,數(shù)據(jù)記錄集分組,group by...
  • order by_Clause:排序子句,數(shù)據(jù)記錄集排序,order by...
  • windowing_Clause:開(kāi)窗子句,定義分析函數(shù)在操作行的集合,三種開(kāi)窗方式:rows、range、Specifying

注:使用開(kāi)窗子句時(shí)一定要有排序子句!??!

3、OVER解析

OVER解析作用是告訴SQL引擎:按區(qū)域?qū)?shù)據(jù)進(jìn)行分區(qū),然后累積每個(gè)區(qū)域每個(gè)客戶(hù)的訂單總額(sum(sum(o.tot_sales)))。

①Over函數(shù)指明在那些字段上做分析,其內(nèi)跟Partition by表示對(duì)數(shù)據(jù)進(jìn)行分組。注意Partition by可以有多個(gè)字段。 
②Over函數(shù)可以和其它聚集函數(shù)、分析函數(shù)搭配,起到不同的作用。例如這里的SUM,還有諸如Rank,Dense_rank等

4、Oracle分析函數(shù)簡(jiǎn)單實(shí)例:

-- 按區(qū)域查找上一年度訂單總額占區(qū)域訂單總額20%以上的客戶(hù) table : orders_tmp

select * from orders_tmp;

image

select cust_nbr, region_id, cust_sales, region_sales,  -- 此處可以用tmptb.* , 但不能用 *
100 * round(cust_sales / region_sales, 2) || '%' Percent from 
 (select cust_nbr, region_id,
    sum(TOT_SALES) cust_sales,
    sum(sum(tot_sales)) over(partition by REGION_ID) as region_sales
  from orders_tmp where o.year = 2001 group by CUST_NBR, REGION_ID order by REGION_ID) tmptb
 where cust_sales > region_sales * 0.2;

image

二、分析函數(shù):Rank, Dense_rank, row_number,Ntile() 排列

形式:

Rank() Over ([Partition by ] [Order by ] [Nulls First/Last])
Dense_rank() Over ([Patition by ] [Order by ] [Nulls First/Last])
Row_number() Over ([Partitionby ] [Order by ] [Nulls First/Last])
Ntile() Over ([Partition by ] [Order by ])

rank,dense_rank,row_number函數(shù)為每條記錄產(chǎn)生一個(gè)從1開(kāi)始至n的自然數(shù),n的值可能小于等于記錄的總數(shù)。這3個(gè)函數(shù)的唯一區(qū)別在于當(dāng)碰到相同數(shù)據(jù)時(shí)的排名策略。

  1. row_number: 返回一個(gè)唯一的值,當(dāng)碰到相同數(shù)據(jù)時(shí),排名按照記錄集中記錄的順序依次遞增。
  2. dense_rank: 返回一個(gè)唯一的值,當(dāng)碰到相同數(shù)據(jù)時(shí),此時(shí)所有相同數(shù)據(jù)的排名都是一樣的。first、last :從DENSE_RANK返回的集合中取出排在最后面的一個(gè)值的行
  3. rank: 返回一個(gè)唯一的值,當(dāng)碰到相同的數(shù)據(jù)時(shí),此時(shí)所有相同數(shù)據(jù)的排名是一樣的,同時(shí)會(huì)在最后一條相同記錄和下一條不同記錄的排名之間空出排名。

①ROW_NUMBER:12345

②DENSE_RANK:12223

③RANK:12225

-- ①對(duì)所有客戶(hù)按訂單總額進(jìn)行排名 
-- ②按區(qū)域和客戶(hù)訂單總額進(jìn)行排名 
-- ③找出訂單總額排名前13位的客戶(hù) 
-- ④找出訂單總額最高、最低的客戶(hù) 
-- ⑤找出訂單總額排名前25%的客戶(hù)

-- 篩選排名前12位的客戶(hù), table : user_order 
-- 1.對(duì)所有客戶(hù)按訂單總額進(jìn)行排名, 使用rownum , rownum = 13,14 的數(shù)據(jù)跟 12 的數(shù)據(jù)一樣, 但是被漏掉了

select rownum, tmptb.* from 
 (select * from user_order order by CUSTOMER_sales desc) tmptb
where rownum <= 12;

-- 2.按區(qū)域和客戶(hù)訂單總額進(jìn)行排名 Rank, Dense_rank, row_number

select region_id, customer_id, 
  sum(customer_sales) total,
  rank() over(partition by region_id order by sum(customer_sales) desc) rank,
  dense_rank() over(partition by region_id order by sum(customer_sales) desc) dense_rank,
  row_number() over(partition by region_id order by sum(customer_sales) desc) row_number
from user_order
group by region_id, customer_id;

三、分析函數(shù):Top/Bottom N、First/Last、NTile

-- ①對(duì)所有客戶(hù)按訂單總額進(jìn)行排名 
-- ②按區(qū)域和客戶(hù)訂單總額進(jìn)行排名 
-- ③找出訂單總額排名前13位的客戶(hù) 
-- ④找出訂單總額最高、最低的客戶(hù) 
-- ⑤找出訂單總額排名前25%的客戶(hù)

-- 此處 null 被排到第一位 , 可以加 nulls last 把null的數(shù)據(jù)放到最后

select region_id, customer_id,
  sum(customer_sales) cust_sales,
  sum(sum(customer_sales)) over(partition by region_id) ran_total,
  rank() over(partition by region_id order by sum(customer_sales) desc /* nulls last */) rank
from user_order
group by region_id, customer_id;

-- 找出所有訂單總額排名前3的大客戶(hù)

select * from 
(select region_id,
    customer_id,
    sum(customer_sales) cust_total,
    rank() over(order by sum(customer_sales) desc NULLS LAST) rank
  from user_order
  group by region_id, customer_id)
 where rank <= 3;

-- 找出每個(gè)區(qū)域訂單總額排名前3的大客戶(hù)

select *
from (select region_id,
    customer_id,
    sum(customer_sales) cust_total,
    sum(sum(customer_sales)) over(partition by region_id) reg_total,
    rank() over(partition by region_id order by sum(customer_sales) desc NULLS LAST) rank
 from user_order
 group by region_id, customer_id)
where rank <= 3;

四、匯總

  • 匯總
  • 滾動(dòng)匯總
  • 分區(qū)滾動(dòng)匯總
  • 當(dāng)前記錄和后一條記錄
  • 分區(qū)匯總
Sum() Over ([Partition by ] [Order by ])
Sum() Over ([Partition by ] [Order by ]     Rows Between  Preceding And  Following)   
Sum() Over ([Partition by ] [Order by ]     Rows Between  Preceding And Current Row)
Sum() Over ([Partition by ] [Order by ]     Range Between Interval '' 'Day' Preceding    And Interval '' 'Day' Following )

五、Min()/Max():最大值/最小值

形式:

Min()/Max() Keep (Dense_rank First/Last [Partition by ] [Order by ])
  • -- min keep first last 找出訂單總額最高、最低的客戶(hù)
  • -- Min只能用于 dense_rank
  • -- min 函數(shù)的作用是用于當(dāng)存在多個(gè)First/Last情況下保證返回唯一的記錄, 去掉會(huì)出錯(cuò)
  • -- keep的作用。告訴Oracle只保留符合keep條件的記錄。
select 
   min(customer_id) keep (dense_rank first order by sum(customer_sales) desc) first,
   min(customer_id) keep (dense_rank last order by sum(customer_sales) desc) last
from user_order
group by customer_id;

-- 出訂單總額排名前1/5的客戶(hù) ntile 
-- 1.將數(shù)據(jù)分成5塊

select region_id,customer_id,
 sum(customer_sales) sales,
 ntile(5) over(order by sum(customer_sales) desc nulls last) tile
from user_order
group by region_id, customer_id;

-- 2.提取 tile=1 的數(shù)據(jù)

select * from 
(select region_id,customer_id,
   sum(customer_sales) sales,
   ntile(5) over(order by sum(customer_sales) desc nulls last) tile
 from user_order
 group by region_id, customer_id)
where tile = 1;

-- cust_nbr,month 為主鍵, 去重,只留下month最大的記錄 
-- 查找 cust_nbr 相同, month 最大的記錄

select cust_nbr,
 max(month) keep(dense_rank first order by month desc) max_month
from orders_tmp 
group by cust_nbr;

-- 去重, cust_nbr,month 為主鍵, cust_nbr 相同,只留下month最大的記錄

delete from orders_tmp2 where (cust_nbr, month) not in 
 (select cust_nbr, max(month) keep(dense_rank first order by month desc) max_month
from orders_tmp2 tb 
group by cust_nbr)

五、first_value/last_value:首記錄/末記錄

形式:

First_value / Last_value(Sum() Over ([Patition by ] [Order by ] Rows Between Preceding And Following ))

六、lag()與lead():相鄰記錄

Lag(Sum(), 1) Over([Patition by ] [Order by ])

lag和lead函數(shù)可以在一次查詢(xún)中取出同一字段的前n行的數(shù)據(jù)和后n行的值。這種操作可以使用對(duì)相同表的表連接來(lái)實(shí)現(xiàn),不過(guò)使用lag和lead有更高的效率。

lag(arg1,arg2,arg3)

第一個(gè)參數(shù)是列名,

第二個(gè)參數(shù)是偏移的offset,

第三個(gè)參數(shù)是超出記錄窗口時(shí)的默認(rèn)值。

-- ①列出每月的訂單總額以及全年的訂單總額 
-- ②列出每月的訂單總額以及截至到當(dāng)前月的訂單總額 
-- ③列出上個(gè)月、當(dāng)月、下一月的訂單總額以及全年的訂單總額 
-- ④列出每天的營(yíng)業(yè)額及一周來(lái)的總營(yíng)業(yè)額 
-- ⑤列出每天的營(yíng)業(yè)額及一周來(lái)每天的平均營(yíng)業(yè)額

-- ①通過(guò)指定一批記錄:例如從當(dāng)前記錄開(kāi)始直至某個(gè)部分的最后一條記錄結(jié)束 
-- ②通過(guò)指定一個(gè)時(shí)間間隔:例如在交易日之前的前30天 
-- ③通過(guò)指定一個(gè)范圍值:例如所有占到當(dāng)前交易量總額5%的記錄

-- 列出每月的訂單總額以及全年的訂單總額 
1.實(shí)現(xiàn)方法1

select month,
 sum(tot_sales) month_sales,
 sum(sum(tot_sales)) over (order by month rows between unbounded preceding and unbounded following) total_sales
from orders
group by month;

2.實(shí)現(xiàn)方法2

select month,
 sum(tot_sales) month_sales,
 sum(sum(tot_sales)) over(/*order by month*/) all_sales  -- 加上Order by month , 則數(shù)逐條記錄遞增
from orders group by month;

-- 列出每月的訂單總額以及截至到當(dāng)前月的訂單總額 
1.實(shí)現(xiàn)方法1

select month,
 sum(tot_sales) month_sales,
 sum(sum(tot_sales)) over(order by month rows between unbounded preceding and current row) current_total_sales
from orders 
group by month;

2.實(shí)現(xiàn)方法2

select month,
 sum(tot_sales) month_sales,
 sum(sum(tot_sales)) over(order by month) all_sales  -- 加上Order by month , 則是前面記錄累加到當(dāng)前記錄
from orders 
group by month;

-- 有時(shí)可能是針對(duì)全年的數(shù)據(jù)求平均值,有時(shí)會(huì)是針對(duì)截至到當(dāng)前的所有數(shù)據(jù)求平均值。很簡(jiǎn)單,只需要將: 
-- sum(sum(tot_sales))換成avg(sum(tot_sales))即可。

-- 統(tǒng)計(jì)當(dāng)天銷(xiāo)售額和五天內(nèi)的平均銷(xiāo)售額 range between interval

select trunc(order_dt) day,
 sum(sale_price) daily_sales,
 avg(sum(sale_price)) over (order by trunc(order_dt) range between interval '2' day preceding and interval '2' day following) five_day_avg
from cust_order
where sale_price is not null and order_dt between to_date('01-jul-2001','dd-mon-yyyy') and to_date('31-jul-2001','dd-mon-yyyy')

-- 顯示當(dāng)前月、上一個(gè)月、后一個(gè)月的銷(xiāo)售情況,以及每3個(gè)月的銷(xiāo)售平均值

select month,
  first_value(sum(tot_sales)) over (order by month rows between 1 preceding and 1 following) prev_month,
  sum(tot_sales) monthly_sales,
  last_value(sum(tot_sales)) over (order by month rows between 1 preceding and 1 following) next_month,
  avg(sum(tot_sales)) over  (order by month rows between 1 preceding and 1 following) rolling_avg
from orders_tmp
where year = 2001 and region_id = 6
group by month order by month;

-- 顯示當(dāng)月的銷(xiāo)售額和上個(gè)月的銷(xiāo)售額 
-- first_value(sum(tot_sales) over (order by month rows between 1 precedingand 0 following)) 
-- lag(sum(tot_sales),1)中的1表示以1月為間隔基準(zhǔn), 對(duì)應(yīng)為lead

select  month,            
 sum(tot_sales) monthly_sales,
 lag(sum(tot_sales), 1) over (order by month) prev_month_sales
from orders_tmp
where year = 2001 and region_id = 6
group by month order by month;

七、rollup()、cube()和grouping():排列組合分組

1)、group by rollup(a, b, c): 
首先會(huì)對(duì)(a、b、c)進(jìn)行g(shù)roup by,然后再對(duì)(a、b)進(jìn)行g(shù)roup by,其后再對(duì)(a)進(jìn)行g(shù)roup by,最后對(duì)全表進(jìn)行匯總操作。

2)、group by cube(a, b, c): 
則首先會(huì)對(duì)(a、b、c)進(jìn)行g(shù)roup by,然后依次是(a、b),(a、c),(a),(b、c),(b),(c),最后對(duì)全表進(jìn)行匯總操作。

八、ratio_to_report ():計(jì)算每條記錄在其對(duì)應(yīng)記錄集或其子集中所占的比例。

ratio_to_report(a) over(partition by b) :求按照b分組后a的值在所屬分組中總值的占比,a的值必須為數(shù)值或數(shù)值型字段。

Ratio_to_report() 括號(hào)中就是分子,over() 括號(hào)中就是分母 分母缺省就是整個(gè)占比

eg:列出上一年度每個(gè)月的銷(xiāo)售總額、年底銷(xiāo)售額以及每個(gè)月的銷(xiāo)售額占全年總銷(xiāo)售額的比例:

select region_id, salesperson_id,
  sum(tot_sales) sp_sales,
  round(ratio_to_report(sum(tot_sales)) over (partition by region_id), 2) sp_ratio
from orders
where year = 2001
group by region_id, salesperson_id
order by region_id, salesperson_id;

到此這篇關(guān)于Oracle分析函數(shù)的文章就介紹到這了。希望對(duì)大家的學(xué)習(xí)有所幫助,也希望大家多多支持腳本之家。

相關(guān)文章

最新評(píng)論