Alerting on Kubernetes Events in an IoT Edge Cluster: A Worked Example
Background
An edge cluster (Raspberry Pi + K3S) needs basic alerting capability.
Edge cluster constraints
- CPU, memory, and storage are tight. A full Prometheus-based monitoring stack needs at least 2 GB of RAM plus substantial storage, which the hardware cannot support (not even with Prometheus Agent); any extra storage and compute consumption must be avoided.
- Network conditions cannot sustain a monitoring stack either: monitoring typically pushes data every minute (or continuously), and the data volume is not small.
- Some sites run on metered 5G, where destination addresses must be explicitly whitelisted, traffic is billed by volume, bandwidth is limited, and the link is unstable (nodes may go offline for stretches of time).
Key requirements
In summary, the key requirements are:
- Timely alerting on edge-cluster anomalies: we need to know what is going wrong on the cluster as it happens;
- Network: conditions are poor, traffic must stay minimal, only a very small number of destination addresses can be opened up, and intermittent offline periods must be tolerated;
- Resources: extra storage and compute consumption should be avoided as far as possible.
Solution
Given the above, the following approach is used:
alert notifications driven by Kubernetes Events.
Architecture diagram

Technical plan
- Collect Events from the various Kubernetes resources, e.g.:
  - pod
  - node
  - kubelet
  - crd
  - ...
- Use the kubernetes-event-exporter component to collect the Kubernetes Events;
- Forward only Warning-level Events as alert notifications (the filter conditions can be refined later);
- Send alerts through messaging tools such as a Feishu webhook (more channels can be added later).
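Before deploying anything, the same filter the exporter will apply can be sanity-checked by hand with standard kubectl flags. A minimal sketch (the script only prints the command so it can be inspected first):

```shell
#!/bin/sh
# Build the kubectl invocation that lists exactly the Events the exporter will
# forward: Warning-type Events across all namespaces, oldest first.
KUBECTL_ARGS="get events -A --field-selector type=Warning --sort-by=.lastTimestamp"
# Print the command for inspection; run "kubectl ${KUBECTL_ARGS}" on the cluster.
echo "kubectl ${KUBECTL_ARGS}"
```

If this listing is empty on a healthy cluster, that is expected; Warning Events only appear when something goes wrong.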
Implementation steps
Manual approach:
On the edge cluster, run the following:
1. Create the RBAC objects
As follows:
cat << _EOF_ | kubectl apply -f -
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: event-exporter-extra
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
      - list
      - watch
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: monitoring
  name: event-exporter
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: event-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: ServiceAccount
    namespace: monitoring
    name: event-exporter
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: event-exporter-extra
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: event-exporter-extra
subjects:
  - kind: ServiceAccount
    # must match the ServiceAccount's namespace ("monitoring", not "kube-event-export")
    namespace: monitoring
    name: event-exporter
_EOF_
2. Create the kubernetes-event-exporter config
As follows:
cat << _EOF_ | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: event-exporter-cfg
  namespace: monitoring
data:
  config.yaml: |
    logLevel: error
    logFormat: json
    route:
      routes:
        - match:
            - receiver: "dump"
        - drop:
            - type: "Normal"
          match:
            - receiver: "feishu"
    receivers:
      - name: "dump"
        stdout: {}
      - name: "feishu"
        webhook:
          endpoint: "https://open.feishu.cn/open-apis/bot/v2/hook/..."
          headers:
            Content-Type: application/json
          layout:
            msg_type: interactive
            card:
              config:
                wide_screen_mode: true
                enable_forward: true
              header:
                title:
                  tag: plain_text
                  content: XXX IoT K3S cluster alerts
                template: red
              elements:
                - tag: div
                  text:
                    tag: lark_md
                    content: "**EventType:** {{ .Type }}\n**EventKind:** {{ .InvolvedObject.Kind }}\n**EventReason:** {{ .Reason }}\n**EventTime:** {{ .LastTimestamp }}\n**EventMessage:** {{ .Message }}"
_EOF_
Note:
- `endpoint: "https://open.feishu.cn/open-apis/bot/v2/hook/..."`: replace with your own webhook endpoint, and never publish it!
- `content`: adjust the card title to a name that identifies the cluster at a glance, e.g. "Home test K3S cluster alerts".
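The webhook can be smoke-tested by hand before wiring it into the exporter. A minimal sketch: the endpoint is a placeholder, and the payload uses Feishu's simple text-message shape rather than the interactive card above; the script prints the curl command instead of sending it, so it is safe to dry-run:

```shell
#!/bin/sh
# Smoke-test a Feishu bot webhook before putting it in the exporter config.
# WEBHOOK is a placeholder -- substitute your real (secret) endpoint.
WEBHOOK="https://open.feishu.cn/open-apis/bot/v2/hook/REPLACE_ME"
# Feishu bots also accept a plain text payload, which is enough for a test ping.
PAYLOAD='{"msg_type":"text","content":{"text":"event-exporter webhook smoke test"}}'
# Dry run: print the request. Remove the leading "echo" to send it for real.
echo curl -sS -X POST -H "Content-Type: application/json" -d "${PAYLOAD}" "${WEBHOOK}"
```

A successful real send returns a JSON response from Feishu; an invalid token is rejected, which catches copy-paste mistakes before they silently swallow alerts.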
3. Create the Deployment
cat << _EOF_ | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: event-exporter
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: event-exporter
      version: v1
  template:
    metadata:
      labels:
        app: event-exporter
        version: v1
    spec:
      volumes:
        - name: cfg
          configMap:
            name: event-exporter-cfg
            defaultMode: 420
        - name: localtime
          hostPath:
            path: /etc/localtime
            type: ''
        - name: zoneinfo
          hostPath:
            path: /usr/share/zoneinfo
            type: ''
      containers:
        - name: event-exporter
          image: ghcr.io/opsgenie/kubernetes-event-exporter:v0.11
          args:
            - '-conf=/data/config.yaml'
          env:
            - name: TZ
              value: Asia/Shanghai
          volumeMounts:
            - name: cfg
              mountPath: /data
            - name: localtime
              readOnly: true
              mountPath: /etc/localtime
            - name: zoneinfo
              readOnly: true
              mountPath: /usr/share/zoneinfo
          imagePullPolicy: IfNotPresent
      serviceAccount: event-exporter
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/controlplane
                    operator: In
                    values:
                      - 'true'
            - weight: 100
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/control-plane
                    operator: In
                    values:
                      - 'true'
            - weight: 100
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: In
                    values:
                      - 'true'
      tolerations:
        - key: node-role.kubernetes.io/controlplane
          value: 'true'
          effect: NoSchedule
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
_EOF_
Notes:
- The `event-exporter-cfg` settings load the configuration file stored as a ConfigMap;
- The `localtime`, `zoneinfo`, and `TZ` settings switch the pod's time zone to `Asia/Shanghai`, so notifications are rendered in CST;
- The `affinity` and `tolerations` settings ensure the pod is preferentially scheduled onto a master node. Adjust as needed; here the master typically doubles as the edge cluster's gateway, has the best specs, and stays online the longest.
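Because the "dump" receiver writes every Event to the pod's stdout as JSON, the pod log doubles as a cheap local audit trail. The sketch below uses a hypothetical, abbreviated sample of such a log line to show the kind of filtering that works on the real log:

```shell
#!/bin/sh
# SAMPLE is a hypothetical, abbreviated stand-in for one JSON line emitted by
# the "dump" receiver; field names mirror the Kubernetes Event object.
SAMPLE='{"type":"Warning","reason":"BackOff","involvedObject":{"kind":"Pod"},"message":"Back-off restarting failed container"}'
# Filter for Warning-type lines, exactly as you would on the live pod log.
printf '%s\n' "$SAMPLE" | grep '"type":"Warning"'
# On the cluster, the equivalent is:
# kubectl -n monitoring logs deploy/event-exporter | grep '"type":"Warning"'
```

This is handy during the offline stretches the requirements anticipate: alerts that could not be delivered can still be reviewed from the log afterwards.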
Automated deployment
Result: the exporter is deployed automatically when K3S is installed.
On the K3S server node, create event-exporter.yaml in the /var/lib/rancher/k3s/server/manifests/ directory (create the directory first if it does not exist):
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: event-exporter-extra
rules:
  - apiGroups:
      - ""
    resources:
      - nodes
    verbs:
      - get
      - list
      - watch
---
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: monitoring
  name: event-exporter
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: event-exporter
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view
subjects:
  - kind: ServiceAccount
    namespace: monitoring
    name: event-exporter
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: event-exporter-extra
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: event-exporter-extra
subjects:
  - kind: ServiceAccount
    # must match the ServiceAccount's namespace ("monitoring", not "kube-event-export")
    namespace: monitoring
    name: event-exporter
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: event-exporter-cfg
  namespace: monitoring
data:
  config.yaml: |
    logLevel: error
    logFormat: json
    route:
      routes:
        - match:
            - receiver: "dump"
        - drop:
            - type: "Normal"
          match:
            - receiver: "feishu"
    receivers:
      - name: "dump"
        stdout: {}
      - name: "feishu"
        webhook:
          # replace with your own webhook endpoint, and keep it secret
          endpoint: "https://open.feishu.cn/open-apis/bot/v2/hook/..."
          headers:
            Content-Type: application/json
          layout:
            msg_type: interactive
            card:
              config:
                wide_screen_mode: true
                enable_forward: true
              header:
                title:
                  tag: plain_text
                  content: xxx K3S cluster alerts
                template: red
              elements:
                - tag: div
                  text:
                    tag: lark_md
                    content: "**EventType:** {{ .Type }}\n**EventKind:** {{ .InvolvedObject.Kind }}\n**EventReason:** {{ .Reason }}\n**EventTime:** {{ .LastTimestamp }}\n**EventMessage:** {{ .Message }}"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: event-exporter
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: event-exporter
      version: v1
  template:
    metadata:
      labels:
        app: event-exporter
        version: v1
    spec:
      volumes:
        - name: cfg
          configMap:
            name: event-exporter-cfg
            defaultMode: 420
        - name: localtime
          hostPath:
            path: /etc/localtime
            type: ''
        - name: zoneinfo
          hostPath:
            path: /usr/share/zoneinfo
            type: ''
      containers:
        - name: event-exporter
          image: ghcr.io/opsgenie/kubernetes-event-exporter:v0.11
          args:
            - '-conf=/data/config.yaml'
          env:
            - name: TZ
              value: Asia/Shanghai
          volumeMounts:
            - name: cfg
              mountPath: /data
            - name: localtime
              readOnly: true
              mountPath: /etc/localtime
            - name: zoneinfo
              readOnly: true
              mountPath: /usr/share/zoneinfo
          imagePullPolicy: IfNotPresent
      serviceAccount: event-exporter
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/controlplane
                    operator: In
                    values:
                      - 'true'
            - weight: 100
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/control-plane
                    operator: In
                    values:
                      - 'true'
            - weight: 100
              preference:
                matchExpressions:
                  - key: node-role.kubernetes.io/master
                    operator: In
                    values:
                      - 'true'
      tolerations:
        - key: node-role.kubernetes.io/controlplane
          value: 'true'
          effect: NoSchedule
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
K3S will then deploy it automatically on startup.
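The copy step can be scripted. A sketch, assuming the combined manifest above has been saved locally as event-exporter.yaml; MANIFEST_DIR defaults to a local directory so the script is safe to dry-run, and on a real K3S server it should be set to /var/lib/rancher/k3s/server/manifests:

```shell
#!/bin/sh
set -e
# Stage a manifest for K3S auto-deployment. On a K3S server, run with
# MANIFEST_DIR=/var/lib/rancher/k3s/server/manifests; the default below is a
# safe local directory for dry runs.
MANIFEST_DIR="${MANIFEST_DIR:-./manifests}"
mkdir -p "${MANIFEST_DIR}"
# Stand-in content; in practice this is the combined event-exporter manifest.
cat > "${MANIFEST_DIR}/event-exporter.yaml" << '_EOF_'
# combined event-exporter manifest goes here
_EOF_
ls "${MANIFEST_DIR}"
```

K3S watches this directory and applies any manifest placed there, which is what makes the exporter part of the base install rather than a post-install step.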
Reference: Auto-Deploying Manifests and Helm Charts | Rancher docs
Final result
As shown below:

References
- opsgenie/kubernetes-event-exporter: Export Kubernetes events to multiple destinations with routing and filtering (github.com)
- AliyunContainerService/kube-eventer: kube-eventer emit kubernetes events to sinks (github.com)
- kubesphere/kube-events: K8s Event Exporting, Filtering and Alerting in Multi-Tenant Environment (github.com)
- kubesphere/notification-manager: K8s native notification management with multi-tenancy support (github.com)
This concludes the worked example of alerting on Kubernetes Events in an IoT edge cluster.