Oracle批量投入數(shù)據(jù)方法總結(jié)

更新時間：2025年01月14日 10:32:26 作者：fengyehongWorld

這篇文章主要介紹了Oracle批量投入數(shù)據(jù)方法總結(jié),文中通過代碼示例講解的非常詳細(xì),對大家的學(xué)習(xí)或工作有的一定的幫助,需要的朋友可以參考下

零. 待投入數(shù)據(jù)的表結(jié)構(gòu)

create table DB_USER."PERSON_TABLE" (
  ID NUMBER not null
  , NAME VARCHAR2(50)
  , AGE NUMBER
  , EMAIL VARCHAR2(100)
  , CREATED_DATE DATE
)

一. INSERT INTO ... SELECT投入數(shù)據(jù)

INSERT INTO ... SELECT的這種方式相當(dāng)于把數(shù)據(jù)加載到內(nèi)存中之后再插入數(shù)據(jù)庫，只適合投入小規(guī)模的數(shù)據(jù)。

1.1 普通的方式投入數(shù)據(jù)

當(dāng)數(shù)據(jù)量不是很多的時候，可以使用這種方式

先從DUAL虛擬表中檢索后造出指定條數(shù)的數(shù)據(jù)后，再插入到指定的表中。
除了主鍵之類的關(guān)鍵字段之外，其余字段寫固定值即可。

INSERT INTO PERSON_TABLE
	SELECT
		-- 因為該字段為字符串形式,所以使用TO_CHAR轉(zhuǎn)換
		-- TO_CHAR(100000000 + LEVEL) || 'TEST_ID' AS id,
		LEVEL AS id,
		'Name_' || ROWNUM AS name,
        TRUNC(DBMS_RANDOM.VALUE(18, 60)) AS age,
        'user' || ROWNUM || '@example.com' AS email,
        SYSDATE - DBMS_RANDOM.VALUE(0, 365) AS created_date
	FROM
		DUAL 
	CONNECT BY LEVEL <= 1000000;

1.2 并行插入（Parallel Insert）投入數(shù)據(jù)

ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ PARALLEL(PERSON_TABLE, 4) */ INTO PERSON_TABLE
SELECT LEVEL AS id,
       'Name_' || ROWNUM AS name,
       TRUNC(DBMS_RANDOM.VALUE(18, 60)) AS age,
       'user' || ROWNUM || '@example.com' AS email,
       SYSDATE - DBMS_RANDOM.VALUE(0, 365) AS created_date
FROM DUAL
CONNECT BY LEVEL <= 1000000;

二. PL/SQL 循環(huán)投入數(shù)據(jù)

2.1 腳本介紹

靈活，支持動態(tài)生成數(shù)據(jù)，適合中小數(shù)據(jù)量。
數(shù)據(jù)量大時性能較差，容易導(dǎo)致上下文切換開銷。

BEGIN
    FOR i IN 1..5000000 LOOP
        INSERT INTO PERSON_TABLE (id, name, age, email, created_date)
        VALUES (
            i,
            'Name_' || i,
            -- 隨機(jī)年齡
            TRUNC(DBMS_RANDOM.VALUE(18, 60)),
            'user' || i || '@example.com',
            -- 隨機(jī)日期
            SYSDATE - DBMS_RANDOM.VALUE(0, 365)
        );

        -- 每 100000 條提交一次
        IF MOD(i, 100000) = 0 THEN
            COMMIT;
        END IF;
        
    END LOOP;
    
    COMMIT;
END;
/

2.2 效果

投入500萬條數(shù)據(jù)，耗時5分鐘。

三. PL/SQL FORALL 批量操作

3.1 腳本介紹

這種方式可以減少上下文切換，性能比普通的循環(huán)插入要好。

DECLARE
    TYPE person_array IS TABLE OF PERSON_TABLE%ROWTYPE;
    v_data person_array := person_array();
BEGIN
    FOR i IN 1..5000000 LOOP
        v_data.EXTEND;
        v_data(v_data.COUNT).id := i;
        v_data(v_data.COUNT).name := 'Name_' || i;
        v_data(v_data.COUNT).age := TRUNC(DBMS_RANDOM.VALUE(18, 60));
        v_data(v_data.COUNT).email := 'user' || i || '@example.com';
        v_data(v_data.COUNT).created_date := SYSDATE - DBMS_RANDOM.VALUE(0, 365);

        -- 每 100000 條批量插入一次
        IF MOD(i, 100000) = 0 THEN
            FORALL j IN 1..v_data.COUNT
                INSERT INTO PERSON_TABLE VALUES v_data(j);
            COMMIT;
            v_data.DELETE; -- 清空數(shù)組
        END IF;
    END LOOP;

    -- 插入剩余數(shù)據(jù)
    FORALL j IN 1..v_data.COUNT
        INSERT INTO PERSON_TABLE VALUES v_data(j);
    COMMIT;
END;
/

3.2 效果

投入500萬條數(shù)據(jù)，耗時1分鐘18秒。

四. SQL*Loader 工具加載外部文件

寫一個PowerShell腳本，根據(jù)數(shù)據(jù)庫的表結(jié)構(gòu)來生成csv文件

該腳本執(zhí)行后，會在桌面上生成一個csv文件。

# 文件名稱
$file_name = 'person_data.csv'
# 路徑
$outputFile = "$Home\Desktop\$file_name"
# csv 文件的總行數(shù)
$rows = 5000000
# 并行線程數(shù)
$threadCount = 4
# 每個線程生成的記錄數(shù)量
$chunkSize = [math]::Ceiling($rows / $threadCount)

# 判斷文件是否存在，存在的話就刪除
if (Test-Path -Path $outputFile) {
    Remove-Item -Path $outputFile -Force
}

# 寫入 CSV 表頭
# "`"ID`",`"NAME`",`"AGE`",`"EMAIL`",`"CREATED_DATE`"" | Out-File -FilePath $outputFile -Encoding UTF8 -Append

# 定義腳本塊
$scriptblock = {
    
    param($startRow, $endRow, $tempFile)

    # 在后臺作業(yè)中定義 Generate-Chunk 函數(shù)
    function Generate-Chunk {
        
        param (
            [int]$startRow,
            [int]$endRow,
            [string]$filePath
        )

        $random = [System.Random]::new()
        $currentDate = Get-Date
        $sb = [System.Text.StringBuilder]::new()

		# 循環(huán)生成csv數(shù)據(jù)
        for ($i = $startRow; $i -le $endRow; $i++) {
			
			# =========================對應(yīng)數(shù)據(jù)庫的各字段值=========================
            $id = $i
            $name = "Name_$i"
            $age = $random.Next(18, 60)
            $email = "user$i@example.com"
            $createdDate = $currentDate.AddDays(- $random.Next(0, 365)).ToString("yyyy/MM/dd HH:mm:ss")
            # =========================對應(yīng)數(shù)據(jù)庫的各字段值=========================
			
			# =========================一行csv=========================
            $line = "`"$id`",`"$name`",`"$age`",`"$email`",`"$createdDate`""
            # =========================一行csv=========================

            $sb.AppendLine($line) | Out-Null
        }

        <# 
            將生成的內(nèi)容寫入文件
            -NoNewline 的作用是為了防止csv文件的最后一行被追加空行
        #>
        $sb.ToString() | Out-File -FilePath $filePath -Encoding UTF8 -Append -NoNewline
    }

    # 調(diào)用 Generate-Chunk 函數(shù)，多線程生成臨時csv文件
    Generate-Chunk -startRow $startRow -endRow $endRow -filePath $tempFile
}

# CSV文件合成
function Merge-CSV {
        
	param (
		[string]$outputFile,
		[bool]$IsReadAllDataToMemory
	)
	
	# 獲取所有分段文件，按名稱排序
    $partFiles = Get-ChildItem -Path "$outputFile.*.part" | Sort-Object Name
	
	if ($IsReadAllDataToMemory) {
		
		# 將所有內(nèi)容加載到內(nèi)存中，然后一次性寫入
		$partFiles | ForEach-Object { Get-Content $_.FullName } | Out-File -FilePath $outputFile -Encoding UTF8 -Force
		# 刪除所有分段文件
		$partFiles | ForEach-Object { Remove-Item $_.FullName }
		
		return;
	}
	
	$partFiles | ForEach-Object {
        Get-Content -Path $_.FullName | Out-File -FilePath $outputFile -Encoding UTF8 -Append
        Remove-Item -Path $_.FullName
    }
}
    
try {
	# 定義job數(shù)組
    $jobs = @()
	
	# 組裝job
    1..$threadCount | ForEach-Object {
		
        $startRow = ($_ - 1) * $chunkSize + 1
        $endRow = [math]::Min($_ * $chunkSize, $rows)
		
		# 臨時csv文件
        $tempFile = "$outputFile.$_.part"

        $jobs += Start-Job -ScriptBlock $scriptblock -ArgumentList $startRow, $endRow, $tempFile
    }
	
	# 統(tǒng)計生成csv文件所消耗的時間
    $exec_time = Measure-Command {

        Write-Host "臨時csv文件開始生成..."

        # 執(zhí)行job，等待并收集所有執(zhí)行結(jié)果
        $jobs | ForEach-Object { Wait-Job -Job $_; Receive-Job -Job $_; Remove-Job -Job $_ }

        # 合并所有并發(fā)生成的csv臨時文件,組裝成最終的總csv文件
        Write-Host "臨時csv文件生成完畢,開啟合并..."
        Merge-CSV -outputFile $outputFile -IsReadAllDataToMemory $False
    }

    Write-Host "csv文件生成完畢,共消耗$($exec_time.TotalSeconds)秒: $outputFile" -ForegroundColor Red
	
} catch {

    # 當(dāng)異常發(fā)生時,清空桌面上的臨時csv文件
    if (Test-Path -Path "$outputFile.*.part") {
        Remove-Item -Path "$outputFile.*.part" -Force
    }

    Write-Host "腳本運行時發(fā)生異常: $_" -ForegroundColor Red
    Write-Host "詳細(xì)信息: $($_.Exception.Message)" -ForegroundColor Yellow
    Write-Host "堆棧跟蹤: $($_.Exception.StackTrace)" -ForegroundColor Gray
}

Read-Host "按 Enter 鍵退出..."

創(chuàng)建控制文件control_file.ctl

LOAD DATA
INFILE 'person_data.csv'
INTO TABLE PERSON_TABLE
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
(id, name, age, email, created_date "TO_DATE(:created_date, 'YYYY/MM/DD HH24:MI:SS')")

使用 SQL*Loader 執(zhí)行加載

性能極高，適合大規(guī)模數(shù)據(jù)插入。
支持多線程和并行加載。

sqlldr db_user/oracle@SERVICE_XEPDB1_CLIENT control=control_file.ctl direct=true

4.1 效果

投入500萬條數(shù)據(jù)，耗時居然不到10秒！

到此這篇關(guān)于Oracle批量投入數(shù)據(jù)方法總結(jié)的文章就介紹到這了,更多相關(guān)Oracle投入數(shù)據(jù)內(nèi)容請搜索腳本之家以前的文章或繼續(xù)瀏覽下面的相關(guān)文章希望大家以后多多支持腳本之家！

您可能感興趣的文章:

欧美bbbwbbbw肥妇,免费乱码人妻系列日韩,一级黄片