快捷導(dǎo)航

Oracle數(shù)據(jù)庫使用 listagg去重刪除重復(fù)數(shù)據(jù)的方法匯總

更新時間：2025年01月20日 14:38:17 作者：???傀儡師

文章介紹了在Oracle數(shù)據(jù)庫中使用LISTAGG和XMLAGG函數(shù)進行字符串聚合并去重的方法,包括去重聚合、使用XML解析和CLOB類型處理大字符串、以及根據(jù)具體需求選擇合適的方法,感興趣的朋友跟隨小編一起看看吧

案例表

create table "dept_tag_info"
(
"tag_id" bigint not null,
"tag_code" varchar(200),
"tag_name" varchar(500),
"tag_level" varchar(200),
"parent_id" bigint,
"gmt_create" datetime(6),
"create_by" varchar(50),
"update_by" varchar(50),
"gmt_modified" datetime(6),
"del_flag" char(1),
"sort" integer,
"multiple_choice" char(1),
"relation_dept_category" varchar(500),
"tips_text" varchar(2000),
"remarks" varchar(500),
"enabled" char(1),
constraint "dept_tag_info_new_pk" not cluster primary key("tag_id")) storage(on "ctbiyi_data_v3", clusterbtr) ;
comment on table "dept_tag_info" is '企業(yè)標簽基礎(chǔ)信息表';
comment on column "dept_tag_info"."tag_id" is '主鍵';
comment on column "dept_tag_info"."tag_code" is '標簽編碼';
comment on column "dept_tag_info"."tag_name" is '標簽名稱';
comment on column "dept_tag_info"."tag_level" is '標簽層級';
comment on column "dept_tag_info"."parent_id" is '父節(jié)點編碼id';
comment on column "dept_tag_info"."gmt_create" is '創(chuàng)建時間';
comment on column "dept_tag_info"."create_by" is '創(chuàng)建人';
comment on column "dept_tag_info"."update_by" is '修改人';
comment on column "dept_tag_info"."gmt_modified" is '修改時間';
comment on column "dept_tag_info"."del_flag" is '刪除標記 0-未刪除 1-已刪除';
comment on column "dept_tag_info"."sort" is '排序';
comment on column "dept_tag_info"."multiple_choice" is '多選（1是 0否）';
comment on column "dept_tag_info"."relation_dept_category" is '關(guān)聯(lián)主體';

為了方便大家看所以所有小寫

    select 
    t.tag_code,
    t.tag_name,
    listagg(t.tag_level, ',') within group(order by t.tag_level) as tag_levels
from 
    dept_tag_info t
group by 
    t.tag_code, 
    t.tag_name;

第一種：使用wm_concat() + distinct去重聚合

select 
    t.tag_code,
    t.tag_name,
    wm_concat(distinct t.tag_level) as tag_levels
from 
    dept_tag_info t
group by 
    t.tag_code, 
    t.tag_name;

第二種：使用listagg，先去重，再聚合

select 
    t.tag_code,
    t.tag_name,
    listagg(t.tag_level, ',') within group(order by t.tag_level) as tag_levels
from 
    (select distinct s.tag_code, s.tag_name, s.tag_level
     from dept_tag_info s) t
group by 
    t.tag_code, 
    t.tag_name;

第三種：xmlagg(xmlparse(content t.tag_level || ‘,’ wellformed) order by t.tag_level)：

使用 xmlagg 和 xmlparse 函數(shù)將 tag_level 字段聚合為一個用逗號分隔的字符串，并按 tag_level 排序。
getclobval()：將 xml 類型的結(jié)果轉(zhuǎn)換為 clob（character large object）。
rtrim(…, ‘,’)：去掉聚合結(jié)果末尾的逗號。
內(nèi)部子查詢 select distinct s.tag_code, s.tag_name, s.tag_level from dynamic_ctbiyi_v3.dept_tag_info s：
選擇唯一的 tag_code、tag_name 和 tag_level

select 
    t.tag_code,
    t.tag_name,
    rtrim(
        xmlagg(
            xmlparse(content t.tag_level || ',' wellformed)
            order by t.tag_level
        ).getclobval(), 
        ','
    ) as tag_levels
from 
    (select distinct s.tag_code, s.tag_name, s.tag_level
     from dept_tag_info s) t
group by 
    t.tag_code, 
    t.tag_name;

listagg 的優(yōu)缺點

優(yōu)點：
簡潔和易用：LISTAGG 語法簡單，易于理解和使用。
性能較好：在許多情況下，LISTAGG 的執(zhí)行速度會快于 XMLAGG，尤其是在處理較少數(shù)據(jù)量時。
排序：支持在聚合過程中對字符串進行排序，使用 WITHIN GROUP 子句。
缺點：
字符串長度限制：LISTAGG 生成的字符串長度不能超過 4000 字符，如果超過這個限制，會拋出錯誤。
無格式化功能：LISTAGG 僅限于字符串連接，不支持更復(fù)雜的格式化。
xmlagg 的優(yōu)缺點
優(yōu)點：
字符串長度更大：XMLAGG 可以處理比 LISTAGG 更大的字符串，因為生成的結(jié)果是 CLOB 類型，不受 4000 字符的限制。
靈活性：支持更復(fù)雜的 XML 處理和格式化功能，適合需要復(fù)雜字符串操作的場景。
缺點：
性能問題：在處理大量數(shù)據(jù)時，XMLAGG 可能比 LISTAGG 慢，因為涉及到 XML 解析和處理。
復(fù)雜性：語法相對復(fù)雜，使用起來不如 LISTAGG 簡單。
使用 LISTAGG：當(dāng)聚合后的字符串長度不超過 4000 字符時，并且只需要簡單的字符串連接和排序。
使用 XMLAGG：當(dāng)聚合后的字符串長度可能超過 4000 字符，或者需要更復(fù)雜的格式化和處理時。
根據(jù)具體需求選擇合適的函數(shù)可以在保證代碼簡潔性和執(zhí)行效率的同時，滿足業(yè)務(wù)需求。

手動處理重復(fù)數(shù)據(jù)的一種快捷安全的方式

-- 查找重復(fù)記錄
select "tag_id", count(*) as cnt
from dept_tag_info
group by "tag_id"
having count(*) > 1
order by cnt desc;

主刪除語句：

delete from dept_tag_info t
where t.rowid in (
    select rid
    from (
        select t1.rowid as rid, row_number() over (partition by t1.tag_code, t1.tag_name order by 1) as rn
        from dept_tag_info t1
    ) t2
    where t2.rn > 1
);

如何在Oracle SQL中使用XMLAGG和LISTAGG函數(shù)進行字符串聚合。
產(chǎn)品工廠聚合
場景：你有一個名為product_details的表，里面有一個列product_factory，你希望將所有不同的產(chǎn)品工廠聚合成一個以逗號分隔的列表。

SELECT   
    RTRIM(XMLAGG(XMLPARSE(content = dd.product_factory || ',' wellformed)   
    ORDER BY dd.product_factory).getclobval(), ',') AS productFactory  
FROM   
    product_details dd;

產(chǎn)品名稱聚合
場景：你有另一個表product_changes，你想要聚合在特定日期后發(fā)生變化的產(chǎn)品名稱。

SELECT   
    LISTAGG(dd.change_after_part_name, ',') WITHIN GROUP (ORDER BY dd.change_after_part_name) AS productName  
FROM   
    product_changes dd  
WHERE   
    dd.change_date > '2023-01-01';

同時使用兩者的聚合
場景：你希望在一個查詢中獲取產(chǎn)品工廠和其相關(guān)產(chǎn)品名稱的列表。

SELECT   
    RTRIM(XMLAGG(XMLPARSE(content = dd.product_factory || ',' wellformed)  
    ORDER BY dd.product_factory).getclobval(), ',') AS productFactory,  
    LISTAGG(cc.change_after_part_name, ',') WITHIN GROUP (ORDER BY cc.change_after_part_name) AS productName  
FROM   
    product_details dd  
LEFT JOIN   
    product_changes cc ON dd.product_id = cc.product_id  
WHERE   
    cc.change_date > '2023-01-01';

到此這篇關(guān)于Oracle 系列數(shù)據(jù)庫使用 listagg去重，刪除重復(fù)數(shù)據(jù)的幾種方法的文章就介紹到這了,更多相關(guān)Oracle listagg去重內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家！

您可能感興趣的文章: