
Problems encountered when using Sqoop to import data into Hive

Updated: April 12, 2024, 11:22:53   Author: Meepoljd
This article describes problems encountered while using Sqoop to import data into Hive. It is offered as a reference in the hope that it will be useful; corrections for any errors or omissions are welcome.

Problems encountered importing data into Hive with Sqoop

Prerequisites

1. The source database is MySQL.

2. The target is a Hive database, accessed through Beeline 1.1.0.

3. The Hive table was created with bucketing and the stored as orc clause; it was set up this way so that operations such as DELETE would be supported (see the sketch below).
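For context, a table of the kind described in point 3 would be created roughly like this. This is only a minimal sketch with a hypothetical table name and columns; the real table's schema differs:

-- Minimal sketch of a bucketed, transactional ORC table (hypothetical name).
CREATE TABLE demo_acid (
    obj_id STRING,
    val    DECIMAL(12,4)
)
CLUSTERED BY (obj_id) INTO 16 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- With the ACID properties above (and the Hive transaction manager enabled),
-- row-level operations such as DELETE become possible:
DELETE FROM demo_acid WHERE obj_id = 'some_id';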

The process

My first idea was to let Sqoop create the table and import the data in one step through HCatalog. The command looked like this:

sqoop import --connect jdbc:mysql://XX.XX.XX.XX:19100/pms_scyw --username root --password ******** \
--table t_sb_zwyc_xl --null-string '\\N' --null-non-string '\\N' \
--create-hcatalog-table \
--hcatalog-database test1 \
--hcatalog-table t_sb_zwyc_xl_521 \
--hcatalog-storage-stanza "clustered by (obj_id) into 16 buckets stored as orc TBLPROPERTIES('transactional'='true')"
# The documentation I found says the hcatalog-storage-stanza argument is appended to the generated CREATE statement at table-creation time, which is why it is written this way.

I fully expected this to finish the job in one go, but it failed with the following error:

19/05/21 10:03:57 INFO hcat.SqoopHCatUtilities: Executing external HCatalog CLI process with args :-f,/tmp/hcat-script-1558404237184
19/05/21 10:04:00 INFO hcat.SqoopHCatUtilities: FAILED: IllegalArgumentException Decimal precision out of allowed range [1,38]
19/05/21 10:04:00 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: HCat exited with status 64
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.executeExternalHCatProgram(SqoopHCatUtilities.java:1148)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.launchHCatCli(SqoopHCatUtilities.java:1097)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.createHCatTable(SqoopHCatUtilities.java:644)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:340)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:802)
at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

The message suggests that a precision exceeded the allowed limit. How could that be? Looking at the CREATE statement Sqoop generated revealed the problem:

    `dqtz` varchar(6),
    `dszll` decimal(12,4),
    `dxmpyxkid` varchar(42),
    `dycs` decimal(65),

One column was given the type decimal(65); MySQL allows a DECIMAL precision of up to 65, but Hive only accepts 1 to 38, so the generated type is clearly out of range. What now? I resorted to Google, and after much searching found only this one post:

Post link: https://kb.informatica.com/solution/23/pages/64/524288.aspx

The original post offers the following solution:

Solution
To resolve this issue, we need to ensure that the Precision and the Scale of the Number datatype changes from 0 to a valid value.
To achieve this, add the following arguments in the JDBC parameters of the Oracle connection.
CatalogOptions=0;NumberPrecisionScale=1
Add above to the Connection string of the JDBC connection used for metadata access.
After modifying the JDBC connection string, re-import the metadata and verify if the right precision is getting imported.

In short, it says to append "CatalogOptions=0;NumberPrecisionScale=1" to the JDBC connection string. But that post deals with an Oracle source, while mine is MySQL, so the suggestion does not apply here. With no better option, I decided to spend some time creating the table myself and then import the data into it (a sketch of what that DDL might look like follows).
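For illustration, the hand-written DDL might look roughly like the following. The column list is abbreviated to the columns shown above, the exact types are assumptions, and the key change is narrowing the out-of-range column to a precision Hive accepts:

-- Manually created target table (column list abbreviated; types assumed).
CREATE TABLE test1.t_sb_zwyc_xl_ljd (
    obj_id    STRING,          -- bucketing key
    dqtz      VARCHAR(6),
    dszll     DECIMAL(12,4),
    dxmpyxkid VARCHAR(42),
    dycs      DECIMAL(38,0)    -- narrowed from the generated decimal(65)
)
CLUSTERED BY (obj_id) INTO 16 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');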

After some fiddling the table was ready, so I tried the import again, this time with a slightly different command:

sqoop import --connect jdbc:mysql://XX.XX.XX.XX:19100/pms_scyw --username root --password ******** \
 --table t_sb_zwyc_xl --null-string '\\N' --null-non-string '\\N'  \
 --fields-terminated-by '~@!'   -m 2 \
 --hive-overwrite \
 --hcatalog-database test1 \
 --hcatalog-table t_sb_zwyc_xl_ljd \
 --verbose ; 

It failed again, but at least not with the same error as before. Let's see what the problem is:

ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hive.hcatalog.common.HCatException : 2016 : Error operation not supported : Store into a partition with bucket definition from Pig/Mapreduce is not supported
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:109)
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:346)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:768)
at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:249)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:665)

Lovely. This time it says outright that storing into a table with a bucket definition is not supported. A bit of research suggests that Sqoop does not yet support bucketed tables, but the data can be relayed through a staging table. So I created a staging table, this time without the "clustered by (obj_id) into 16 buckets" clause, only appending stored as orc TBLPROPERTIES('transactional'='true') to the CREATE statement. The data is first imported into the staging table and then inserted from there into the target table (see the sketch below).
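A minimal sketch of that relay, assuming a staging table named t_sb_zwyc_xl_stage (the name and the abbreviated column list are placeholders):

-- Staging table: same columns, ORC + transactional, but without the bucket
-- definition, so the HCatalog-based import can write into it.
CREATE TABLE test1.t_sb_zwyc_xl_stage (
    obj_id    STRING,
    dqtz      VARCHAR(6),
    dszll     DECIMAL(12,4),
    dxmpyxkid VARCHAR(42),
    dycs      DECIMAL(38,0)
)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- After sqoop import has loaded the staging table, copy the rows into the
-- bucketed target (older Hive releases may also need hive.enforce.bucketing=true).
INSERT INTO TABLE test1.t_sb_zwyc_xl_ljd
SELECT * FROM test1.t_sb_zwyc_xl_stage;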

After all these steps, I checked the data in the target table and it was correct.

The goal was achieved, but I noticed another problem: if a SELECT on the target table includes a LIMIT clause, the Hive service crashes outright, with nothing at ERROR level in the logs. It is quite strange, and I will have to look into the cause later.

Summary

The above is my personal experience. I hope it can serve as a useful reference, and I hope you will continue to support 脚本之家.
