Problems Encountered Using Sqoop to Import Data into Hive
Preconditions
1. The source database is MySQL.
2. The target is a Hive database; the beeline version is 1.1.0.
3. The Hive table was created with bucketing and the stored as orc clause; it was originally created this way to support delete and other row-level operations (a DDL sketch follows).
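For reference, here is a minimal sketch of the kind of target-table DDL described in point 3. The table name and the first few column names are taken from later in this post; the column list is abbreviated and everything else is illustrative:

CREATE TABLE test1.t_sb_zwyc_xl_521 (
  obj_id STRING,
  dqtz   VARCHAR(6),
  dszll  DECIMAL(12,4)
  -- ... remaining columns ...
)
CLUSTERED BY (obj_id) INTO 16 BUCKETS   -- bucketing is required for ACID operations in Hive 1.x
STORED AS ORC                           -- transactional tables must be stored as ORC
TBLPROPERTIES ('transactional'='true'); -- enables update/delete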
Process
My first idea was to use HCatalog from Sqoop to create the table and import the data in one shot. The command was written like this:
sqoop import \
  --connect jdbc:mysql://XX.XX.XX.XX:19100/pms_scyw \
  --username root --password ******** \
  --table t_sb_zwyc_xl \
  --null-string '\\N' --null-non-string '\\N' \
  --create-hcatalog-table \
  --hcatalog-database test1 \
  --hcatalog-table t_sb_zwyc_xl_521 \
  --hcatalog-storage-stanza "clustered by (obj_id) into 16 buckets stored as orc TBLPROPERTIES('transactional'='true')"
# The material I found says the --hcatalog-storage-stanza value is appended to the generated CREATE statement at table-creation time, hence writing it this way.
I fully expected this to finish the job in one go, but it failed with the following error:
19/05/21 10:03:57 INFO hcat.SqoopHCatUtilities: Executing external HCatalog CLI process with args :-f,/tmp/hcat-script-1558404237184
19/05/21 10:04:00 INFO hcat.SqoopHCatUtilities: FAILED: IllegalArgumentException Decimal precision out of allowed range [1,38]
19/05/21 10:04:00 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: HCat exited with status 64
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.executeExternalHCatProgram(SqoopHCatUtilities.java:1148)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.launchHCatCli(SqoopHCatUtilities.java:1097)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.createHCatTable(SqoopHCatUtilities.java:644)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:340)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:802)
at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:259)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
The message suggests the decimal precision exceeds the allowed limit. How could that be? I looked at the CREATE TABLE statement Sqoop generated and found the culprit:
`dqtz` varchar(6), `dszll` decimal(12,4), `dxmpyxkid` varchar(42), `dycs` decimal(65),
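One column here is declared as decimal(65). To confirm where that comes from on the MySQL side, a query against information_schema can list any source columns whose precision exceeds Hive's limit (a sketch; the schema and table names are taken from the import command above):

-- find decimal columns whose precision Hive cannot represent
SELECT column_name, column_type
FROM information_schema.columns
WHERE table_schema = 'pms_scyw'
  AND table_name = 't_sb_zwyc_xl'
  AND data_type = 'decimal'
  AND numeric_precision > 38;   -- Hive allows decimal precision 1..38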
So decimal(65) was the problem: the allowed range is only 1 to 38, which this clearly exceeds. What to do? I resorted to Google, and after much searching found only this one post:
Link: https://kb.informatica.com/solution/23/pages/64/524288.aspx
The post offered one solution:
Solution
To resolve this issue, we need to ensure that the Precision and the Scale of the Number datatype changes from 0 to a valid value.
To achieve this, add the following arguments in the JDBC parameters of the Oracle connection.
CatalogOptions=0;NumberPrecisionScale=1
Add above to the Connection string of the JDBC connection used for metadata access.
After modifying the JDBC connection string, re-import the metadata and verify if the right precision is getting imported.
In short, it says to append "CatalogOptions=0;NumberPrecisionScale=1" to the JDBC connection string. But that post's source database is Oracle, and mine is MySQL, so the fix doesn't apply. Frustrating, but with no other lead, I had to spend some time creating the table by hand and then importing the data into it.
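In hindsight, before falling back to a hand-built table it might have been worth trying Sqoop's --map-column-hive option, which overrides the Hive type generated for individual columns. I did not test it against this setup, so treat the following as an unverified sketch (older Sqoop releases also had trouble parsing commas inside the mapped type, hence decimal(38) rather than decimal(38,0)):

sqoop import ... \
  --create-hcatalog-table \
  --map-column-hive 'dycs=decimal(38)'   # unverified: cap the oversized column at Hive's maximum precision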
After fiddling for a while, the table was built. OK, try the import again, this time with a slightly different command:
sqoop import \
  --connect jdbc:mysql://XX.XX.XX.XX:19100/pms_scyw \
  --username root --password ******** \
  --table t_sb_zwyc_xl \
  --null-string '\\N' --null-non-string '\\N' \
  --fields-terminated-by '~@!' -m 2 \
  --hive-overwrite \
  --hcatalog-database test1 \
  --hcatalog-table t_sb_zwyc_xl_ljd \
  --verbose
It failed again, though fortunately not with the same error. Let's see what the problem was this time:
ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hive.hcatalog.common.HCatException : 2016 : Error operation not supported : Store into a partition with bucket definition from Pig/Mapreduce is not supported
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:109)
at org.apache.hive.hcatalog.mapreduce.HCatOutputFormat.setOutput(HCatOutputFormat.java:70)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureHCat(SqoopHCatUtilities.java:346)
at org.apache.sqoop.mapreduce.hcat.SqoopHCatUtilities.configureImportOutputFormat(SqoopHCatUtilities.java:768)
at org.apache.sqoop.mapreduce.ImportJobBase.configureOutputFormat(ImportJobBase.java:98)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:249)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:665)
Lovely — this time it says outright that storing into a table with a bucket definition is not supported. From what I could find, Sqoop does not currently support importing into bucketed tables, but the data can be staged through a temporary table. So: create a staging table without the "clustered by (obj_id) into 16 buckets" clause, keeping only stored as orc TBLPROPERTIES('transactional'='true') after the CREATE statement; first Sqoop the data into the staging table, then INSERT it from the staging table into the target table. A sketch of the two Hive-side steps follows.
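A minimal sketch of the staging approach, assuming a staging table with the same columns as the target (the staging table name t_sb_zwyc_xl_tmp is illustrative):

-- staging table: same columns as the target, but no bucketing clause
CREATE TABLE test1.t_sb_zwyc_xl_tmp (
  obj_id STRING,
  dqtz   VARCHAR(6)
  -- ... remaining columns ...
)
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- after Sqoop has loaded the staging table, copy into the bucketed target
INSERT INTO TABLE test1.t_sb_zwyc_xl_521
SELECT * FROM test1.t_sb_zwyc_xl_tmp;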
After all these steps, I verified the data in the target table and it checked out.
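For the verification, a simple row-count comparison between source and target is usually enough (a sketch; run the first statement against MySQL and the second in beeline):

-- MySQL side
SELECT COUNT(*) FROM t_sb_zwyc_xl;
-- Hive side
SELECT COUNT(*) FROM test1.t_sb_zwyc_xl_521;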
Although the goal was achieved, I found one remaining problem: when querying the target table, if the SELECT carries a LIMIT clause, the Hive service crashes outright, and the logs show nothing at ERROR level. Very odd — I will have to look into the cause later.
Summary
The above is my personal experience; I hope it can serve as a useful reference for others.