How to Deploy the Big-Data Time-Series Database Apache Druid on Win10 with Docker Desktop
Deploying the Latest Big-Data Time-Series Database Apache Druid 32.0.0 on Win10 with Docker Desktop
Preface
Time-series data is a common scenario in big-data analytics. In short, once a record is produced it is never modified, and every point in time brings a new state value as time goes by. The volume of this kind of data is often staggering, for example the raw monitoring data from sensors:
https://lizhiyong.blog.csdn.net/article/details/114898620
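A single simple accelerometer produces about 3.1 billion data points a year ("31e" in the original, i.e. 31 亿)!!! If manufacturing sensor data is not preprocessed by lower-level devices such as PLCs and is pushed straight to the edge-computing gateway, even MQTT will come under enormous load!!!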
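As a rough sanity check on that figure, here is a minimal back-of-the-envelope sketch. The 100 Hz sampling rate is an assumption made for illustration only; this article does not state the rate, and the linked post should be consulted for the actual numbers.

```python
# Back-of-the-envelope estimate of how many samples one sensor emits per year.
# NOTE: the 100 Hz rate used below is an assumed value, not taken from the article.

SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_year(sample_rate_hz: int, days: int = 365) -> int:
    """Return the number of samples produced in `days` days at a fixed rate."""
    return sample_rate_hz * SECONDS_PER_DAY * days


if __name__ == "__main__":
    total = samples_per_year(100)
    print(f"{total:,} samples per year")  # about 3.15 billion at the assumed 100 Hz
```

At the assumed rate the result is on the same order as the figure quoted above, which is why even one sensor adds up quickly.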
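Raw server monitoring data is similar, for example from the widely used Prometheus and Zabbix. With many clusters there are many monitored items, and once you count the monitoring agents that may be deployed inside virtualized containers and virtual machines, the data volume becomes very considerable!!! Tens of billions of records per hour are commonplace!!!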
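However, storing all of the raw monitoring data is frighteningly expensive, and its information density is extremely low. More often we only care about the full set of recent hot data, for example to train models online. Having a person look through several thousand records per second is not realistic either. In practice, a simple per-second or per-minute aggregation satisfies most analysis scenarios, and cold data older than one day has little timeliness left.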
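To make the idea of per-minute pre-aggregation concrete, here is a minimal plain-Python sketch. It only illustrates the concept of rolling raw points up into one summary row per minute; it is not how Druid implements rollup, and the function name and the `(epoch_seconds, value)` input shape are assumptions chosen for the example.

```python
# Illustrative per-minute rollup of raw (timestamp, value) points.
# This is a conceptual sketch only, not Druid's rollup implementation.

def rollup_by_minute(events):
    """Aggregate (epoch_seconds, value) pairs into per-minute summaries.

    Returns a dict keyed by the epoch second at which each minute starts,
    with count / sum / min / max for the values that fell into that minute.
    """
    buckets = {}
    for ts, value in events:
        minute_start = int(ts) - int(ts) % 60  # truncate to the minute
        agg = buckets.get(minute_start)
        if agg is None:
            buckets[minute_start] = {
                "count": 1, "sum": value, "min": value, "max": value,
            }
        else:
            agg["count"] += 1
            agg["sum"] += value
            agg["min"] = min(agg["min"], value)
            agg["max"] = max(agg["max"], value)
    return buckets


if __name__ == "__main__":
    raw = [(0, 1.0), (30, 3.0), (61, 5.0)]
    for minute_start, agg in sorted(rollup_by_minute(raw).items()):
        print(minute_start, agg)
```

Three raw points collapse into two summary rows here; with thousands of points per second the reduction in stored rows is correspondingly larger.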
This kind of scenario calls for a database with high throughput and pre-aggregation. After stress testing Apache Druid, ClickHouse, and Kylin, I chose the first... Specialized work should be left to specialized components!!!
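Business developers who are not working on the engine internals or on customized forks should mostly focus on the API, the features, and the usage, and should not spend much effort on deployment!!! I have already deployed Docker Desktop earlier: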
https://lizhiyong.blog.csdn.net/article/details/145580868
Today I will set up the latest version of Apache Druid on Win10 to play with.
Choosing a version
Official site:
https://druid.apache.org/
Note that this is not Druid, the Alibaba database connection pool!!!
As of 2025-02-13, the latest version of Apache Druid is 32.0.0.
Preparing the resources
See the official site:
https://druid.apache.org/docs/latest/tutorials/docker
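The official documentation provides a tutorial that orchestrates the containers with docker-compose.yml. As a real-time component, Druid needs plenty of memory!!! Still, starting 7 containers [Zookeeper + PostgreSQL + 5 Druid services], each using up to 7GB of memory, is no big deal!!!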
https://raw.githubusercontent.com/apache/druid/32.0.0/distribution/docker/docker-compose.yml
This gives the following resource file:
```yaml
version: "2.2"

volumes:
  metadata_data: {}
  middle_var: {}
  historical_var: {}
  broker_var: {}
  coordinator_var: {}
  router_var: {}
  druid_shared: {}

services:
  postgres:
    container_name: postgres
    image: postgres:latest
    ports:
      - "5432:5432"
    volumes:
      - metadata_data:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=FoolishPassword
      - POSTGRES_USER=druid
      - POSTGRES_DB=druid

  # Need 3.5 or later for container nodes
  zookeeper:
    container_name: zookeeper
    image: zookeeper:3.5.10
    ports:
      - "2181:2181"
    environment:
      - ZOO_MY_ID=1

  coordinator:
    image: apache/druid:32.0.0
    container_name: coordinator
    volumes:
      - druid_shared:/opt/shared
      - coordinator_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
    ports:
      - "8081:8081"
    command:
      - coordinator
    env_file:
      - environment

  broker:
    image: apache/druid:32.0.0
    container_name: broker
    volumes:
      - broker_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8082:8082"
    command:
      - broker
    env_file:
      - environment

  historical:
    image: apache/druid:32.0.0
    container_name: historical
    volumes:
      - druid_shared:/opt/shared
      - historical_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8083:8083"
    command:
      - historical
    env_file:
      - environment

  middlemanager:
    image: apache/druid:32.0.0
    container_name: middlemanager
    volumes:
      - druid_shared:/opt/shared
      - middle_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "8091:8091"
      - "8100-8105:8100-8105"
    command:
      - middleManager
    env_file:
      - environment

  router:
    image: apache/druid:32.0.0
    container_name: router
    volumes:
      - router_var:/opt/druid/var
    depends_on:
      - zookeeper
      - postgres
      - coordinator
    ports:
      - "3012:8888"  # changed to 3012 by the author so it does not hog a useful port
    command:
      - router
    env_file:
      - environment
```
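Because this compose file publishes quite a few host ports, it can help to check that nothing is already listening on them before starting the stack. The following is a minimal sketch under stated assumptions: the port list is copied from the compose file above (including the author's 3012 mapping), and the helper name `port_in_use` is made up for this example. It only detects listeners reachable on 127.0.0.1.

```python
# Pre-flight check: is anything already listening on the host ports
# that the compose file above wants to publish?
import socket

# Host-side ports taken from the compose file above (3012 is the author's
# replacement for the router's default 8888).
HOST_PORTS = [5432, 2181, 8081, 8082, 8083, 8091, *range(8100, 8106), 3012]


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    busy = [p for p in HOST_PORTS if port_in_use(p)]
    if busy:
        print("Ports already in use:", busy)
    else:
        print("All ports published by the compose file look free.")
```

If a port is reported as busy, the host side of the corresponding `ports` mapping can be changed, in the same way the router mapping was changed to 3012.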
See also another page on the official site:
https://druid.apache.org/docs/latest/configuration/
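If you are just playing around, you can leave these runtime settings unchanged for now. Since everything runs in containers, redeploying later is very easy!!!
One more file is needed: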
https://raw.githubusercontent.com/apache/druid/32.0.0/distribution/docker/environment
Use it to create the second configuration file:
```
# Java tuning
#DRUID_XMX=1g
#DRUID_XMS=1g
#DRUID_MAXNEWSIZE=250m
#DRUID_NEWSIZE=250m
#DRUID_MAXDIRECTMEMORYSIZE=6172m
DRUID_SINGLE_NODE_CONF=micro-quickstart

druid_emitter_logging_logLevel=debug

druid_extensions_loadList=["druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "postgresql-metadata-storage", "druid-multi-stage-query"]

druid_zk_service_host=zookeeper

druid_metadata_storage_host=
druid_metadata_storage_type=postgresql
druid_metadata_storage_connector_connectURI=jdbc:postgresql://postgres:5432/druid
druid_metadata_storage_connector_user=druid
druid_metadata_storage_connector_password=FoolishPassword

druid_indexer_runner_javaOptsArray=["-server", "-Xmx1g", "-Xms1g", "-XX:MaxDirectMemorySize=3g", "-Duser.timezone=UTC", "-Dfile.encoding=UTF-8", "-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager"]
druid_indexer_fork_property_druid_processing_buffer_sizeBytes=256MiB

druid_storage_type=local
druid_storage_storageDirectory=/opt/shared/segments
druid_indexer_logs_type=file
druid_indexer_logs_directory=/opt/shared/indexing-logs

druid_processing_numThreads=2
druid_processing_numMergeBuffers=2

DRUID_LOG4J=<?xml version="1.0" encoding="UTF-8" ?><Configuration status="WARN"><Appenders><Console name="Console" target="SYSTEM_OUT"><PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/></Console></Appenders><Loggers><Root level="info"><AppenderRef ref="Console"/></Root><Logger name="org.apache.druid.jetty.RequestLog" additivity="false" level="DEBUG"><AppenderRef ref="Console"/></Logger></Loggers></Configuration>
```
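The lowercase `druid_...` entries in this file correspond to Druid runtime properties, with the dots of the property name written as underscores. The sketch below shows that basic mapping only. It is a simplification for illustration: the function name is made up, and the image's startup script may treat special cases (such as escaped underscores) differently, so check the official Docker documentation before relying on it.

```python
# Simplified view of how a `druid_`-prefixed environment variable name
# relates to a Druid runtime property name (underscores become dots).
# Special cases handled by the real startup script are NOT modelled here.
from typing import Optional


def env_to_property(name: str) -> Optional[str]:
    """Map e.g. 'druid_zk_service_host' to 'druid.zk.service.host'.

    Names without the lowercase 'druid_' prefix (such as DRUID_XMX) are
    container-level settings rather than runtime properties, so None is
    returned for them.
    """
    if not name.startswith("druid_"):
        return None
    return name.replace("_", ".")


if __name__ == "__main__":
    for key in (
        "druid_zk_service_host",
        "druid_metadata_storage_type",
        "druid_processing_numThreads",
        "DRUID_SINGLE_NODE_CONF",
    ):
        print(f"{key} -> {env_to_property(key)}")
```

Reading the file this way makes it easier to look up each entry on the configuration reference page linked earlier.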
The deployment files are small, but they have everything they need!!!
Deployment
```
PS C:\Users\zhiyong> cd E:\dockerData\volume\druid1
PS E:\dockerData\volume\druid1> ls

    Directory: E:\dockerData\volume\druid1

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        2025-02-13     23:26           2980 docker-compose.yml
-a----        2025-02-13     23:33           1576 environment

PS E:\dockerData\volume\druid1> docker compose up -d
time="2025-02-13T23:34:39+08:00" level=warning msg="E:\\dockerData\\volume\\druid1\\docker-compose.yml: the attribute `version` is obsolete, it will be ignored, please remove it to avoid potential confusion"
[+] Running 72/15
 ✔ router Pulled                              230.7s
 ✔ coordinator Pulled                         230.7s
 ✔ postgres Pulled                            181.0s
 ✔ historical Pulled                          230.7s
 ✔ broker Pulled                              230.7s
 ✔ middlemanager Pulled                       230.7s
 ✔ zookeeper Pulled                            85.7s
[+] Running 15/15
 ✔ Network druid1_default            Created    0.1s
 ✔ Volume "druid1_druid_shared"      Created    0.0s
 ✔ Volume "druid1_historical_var"    Created    0.0s
 ✔ Volume "druid1_middle_var"        Created    0.0s
 ✔ Volume "druid1_router_var"        Created    0.0s
 ✔ Volume "druid1_metadata_data"     Created    0.0s
 ✔ Volume "druid1_coordinator_var"   Created    0.0s
 ✔ Volume "druid1_broker_var"        Created    0.0s
 ✔ Container postgres                Started    2.4s
 ✔ Container zookeeper               Started    2.4s
 ✔ Container coordinator             Started    1.6s
 ✔ Container router                  Started    2.5s
 ✔ Container broker                  Started    2.3s
 ✔ Container historical              Started    2.5s
 ✔ Container middlemanager           Started    2.8s
PS E:\dockerData\volume\druid1>
```
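Once the images have been pulled, the containers come up quickly.
Well... the ports of the other components get published to the host along the way...
So we also get a PostgreSQL and a Zookeeper for free!!!
Verification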
http://localhost:3012/unified-console.html#
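Besides opening the console in a browser, the router can be probed from a script. The sketch below assumes the router is published on host port 3012 as in the compose file above and calls Druid's `/status/health` endpoint, which reports `true` when the process is up. The helper names are made up for this example.

```python
# Probe the Druid router's health endpoint from a script.
# Assumes the router is reachable on host port 3012 (the author's mapping).
import http.client
import urllib.error
import urllib.request

ROUTER_URL = "http://localhost:3012"


def is_healthy_body(body: str) -> bool:
    """Interpret the response body of /status/health ('true' means healthy)."""
    return body.strip().lower() == "true"


def router_is_healthy(base_url: str = ROUTER_URL, timeout: float = 3.0) -> bool:
    """Return True if GET {base_url}/status/health answers 200 with 'true'."""
    try:
        with urllib.request.urlopen(f"{base_url}/status/health", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy_body(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, malformed response, etc.
        return False


if __name__ == "__main__":
    print("router healthy:", router_is_healthy())
```

The services may need a little time after `docker compose up -d` before the check succeeds, so a failed first probe does not necessarily mean the deployment is broken.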
Very nice, we now have the latest Apache Druid 32.0.0!!!
Please credit the source when reposting: https://lizhiyong.blog.csdn.net/article/details/145622903