
Stabilizing High-Altitude Aerial Video with OpenCV

Updated: 2022-10-24 10:07:05  Author: js君

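This article describes how to stabilize high-altitude aerial video with OpenCV. The first step extracts the first and last frames of the video. Why these two frames? Frame alignment should rely on keypoints that lie on static objects, and comparing the first and last frames is how those static keypoints are identified. Readers who need this may find it a useful reference.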

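1. Problem Background

When a drone shoots video, factors such as wind inevitably make it shift and rotate, so the footage contains frame-to-frame translation and rotation, i.e. "jitter". Jitter changes the coordinates of target objects (vehicles, pedestrians), introduces extra error into downstream detection and tracking tasks, and can make the dataset unusable.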

Original footage

Target result

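In an ideal jitter-free video, a background point that corresponds to the same real-world location keeps the same coordinates in every frame. The coordinates of vehicles, pedestrians, and other targets then change only because the objects themselves move, not because the camera moves. Jitter can therefore be described by the coordinate transformation of corresponding background points across frames.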

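2. Quantitative Metrics

Jitter can be described by four quantities measured between adjacent frames: the x-direction translation in pixels dx, the y-direction translation in pixels dy, the rotation angle da, and the scale factor s. Plotting each quantity as a line chart shows how severe the jitter is from the trend of the curves. In an ideal jitter-free video, dx, dy, and da stay almost always at 0, and s stays almost always at 1.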
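The article does not show how the four metrics are computed. As a minimal sketch, assuming the inter-frame motion is available as a 2x3 similarity matrix (for instance one estimated with OpenCV's `cv2.estimateAffinePartial2D`), the values can be read off the matrix entries. The helper names `decompose_similarity` and `make_similarity` are hypothetical and used only for illustration.

```python
import math
import numpy as np

def decompose_similarity(M):
    """Split a 2x3 similarity transform into (dx, dy, da_degrees, s).

    Assumes M has the form [[s*cos(a), -s*sin(a), dx],
                            [s*sin(a),  s*cos(a), dy]].
    """
    M = np.asarray(M, dtype=float)
    dx, dy = float(M[0, 2]), float(M[1, 2])
    s = math.hypot(M[0, 0], M[1, 0])                   # scale factor
    da = math.degrees(math.atan2(M[1, 0], M[0, 0]))    # rotation angle
    return dx, dy, da, s

def make_similarity(dx, dy, da_degrees, s):
    """Build the 2x3 matrix for a given shift, rotation and scale."""
    a = math.radians(da_degrees)
    return np.array([[s * math.cos(a), -s * math.sin(a), dx],
                     [s * math.sin(a),  s * math.cos(a), dy]])

# A frame pair with a 3 px / -2 px shift, 5 degree rotation and 10% zoom.
print(decompose_similarity(make_similarity(3.0, -2.0, 5.0, 1.1)))
```

Collecting these four values for every adjacent frame pair gives the four curves described above.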

3. Technical Approach

Our final implementation aligns every frame of the video to the first frame in order to remove the jitter. The logic is shown in the figure below.

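(1) First, extract the first and last frames of the video. Why two frames? When aligning frames we want to use keypoints that lie on static objects. If a feature point comes from a moving object, the aligned frame will be distorted. We therefore extract feature points from the first and last frames and keep only those found in both. These are the static feature points, which we call the feature template. Applying the feature template to every frame then gives an effective alignment.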

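(2) Common feature point detectors:

SIFT: proposed in 2004 and widely used in tracking and recognition algorithms. Highly expressive, but computationally expensive.

SURF: proposed in 2006 as a successor to SIFT. It keeps strong expressiveness while greatly reducing the amount of computation.

BRISK: a successor to BRIEF that uses a compact feature representation and speeds up matching.

ORB: known for its speed. It combines FAST keypoints with BRIEF descriptors as a fast alternative to SIFT and SURF, and is often used in real-time applications.

GFTT: an improved version of the original Harris corner detector, often referred to together as Harris-Shi-Tomasi corners.

SimpleBlob: extracts feature points using the concept of blobs, a departure from corner-based detection.

FAST: yields the most feature points compared with the other methods, but tends to return points that are too close together, so non-maximum suppression (NMS) is needed.

Star: originally used for visual odometry, and later adopted as a general-purpose feature point detector.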
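The FAST entry above mentions that closely spaced points need NMS. The following is a minimal sketch of one common variant, greedy radius-based suppression, written in plain NumPy. The function name `radius_nms` and its parameters are assumptions for illustration and are not part of the article's pipeline or of OpenCV.

```python
import numpy as np

def radius_nms(points, scores, radius):
    """Greedy NMS: keep the strongest point, drop neighbours within `radius`.

    points: (N, 2) array of (x, y) coordinates
    scores: (N,) array of detector responses (higher is stronger)
    Returns the indices of the kept points, strongest first.
    """
    points = np.asarray(points, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores)                  # strongest first
    suppressed = np.zeros(len(points), dtype=bool)
    keep = []
    for i in order:
        if suppressed[i]:
            continue
        keep.append(int(i))
        dist = np.linalg.norm(points - points[i], axis=1)
        suppressed |= dist < radius              # also marks point i itself
    return keep

# Two points 1 px apart compete; the far-away third point is unaffected.
kept = radius_nms([(0, 0), (1, 0), (10, 10)], [0.5, 0.9, 0.3], radius=3)
print(kept)
```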

We use the SURF feature point detector here.

Feature points extracted from the first frame

Feature points extracted from the last frame

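(3) In the figures above, some of the extracted feature points lie on vehicle bodies. Because the vehicles are moving, those points cannot be used. We match the feature points of the first frame against those of the last frame to produce a static feature template. In the figure below, all remaining feature points lie on static objects.

Static feature point template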
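As a rough sketch of how such a template can be applied, the idea is to remember which first-frame keypoint indices survived the first-to-last matching and to discard later matches that fall outside that set. Here matches are modeled as `(query_idx, train_idx)` tuples, mirroring the `queryIdx` and `trainIdx` fields of OpenCV's `DMatch`. The function names are hypothetical, and storing indices in a set is an assumption made for fast lookup; it is not necessarily how the article's code stores the template.

```python
def build_template(first_to_last_matches):
    """Indices of first-frame keypoints that also matched the last frame."""
    return {query_idx for query_idx, _ in first_to_last_matches}

def filter_by_template(matches, template):
    """Keep only matches whose first-frame keypoint is in the template."""
    return [(q, t) for q, t in matches if q in template]

# First-frame keypoints 0, 2 and 5 also appear in the last frame, so they
# are treated as static. Keypoint 3 (say, on a car) is not.
template = build_template([(0, 7), (2, 1), (5, 4)])

# Matches between the first frame and some intermediate frame.
frame_matches = [(0, 3), (3, 8), (5, 2)]
print(filter_by_template(frame_matches, template))
```

Set membership is a constant-time check, which matters when thousands of keypoints are tested for every frame.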

(4) Match each frame against the static feature template. We use the FLANN algorithm here. The matching result is shown below.

Feature matching
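To make the matching step concrete, here is a small sketch of the ratio test that typically follows a 2-nearest-neighbor search. It uses an exact brute-force search in NumPy as a stand-in for FLANN, which is an approximate nearest-neighbor method, so this is not the matcher the article uses. The function name `ratio_test_matches` is hypothetical; the default ratio of 0.5 is the value from the article's configuration.

```python
import numpy as np

def ratio_test_matches(des_a, des_b, ratio=0.5):
    """Return (index_in_a, index_in_b) pairs that pass the ratio test.

    des_a, des_b: 2-D arrays of descriptors, one row per keypoint.
    des_b must contain at least two descriptors.
    """
    des_a = np.asarray(des_a, dtype=float)
    des_b = np.asarray(des_b, dtype=float)
    # Pairwise Euclidean distances, shape (len(des_a), len(des_b)).
    dists = np.linalg.norm(des_a[:, None, :] - des_b[None, :, :], axis=2)
    matches = []
    for i, row in enumerate(dists):
        best, second = np.argsort(row)[:2]
        # Accept only if the best match is clearly better than the runner-up.
        if row[best] < ratio * row[second]:
            matches.append((i, int(best)))
    return matches

des_first = [[0.0, 0.0], [5.0, 5.0]]
des_frame = [[0.1, 0.0], [5.0, 5.1], [20.0, 20.0]]
print(ratio_test_matches(des_first, des_frame))
```

A lower ratio rejects more ambiguous matches at the cost of keeping fewer point pairs.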

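(5) Use the two sets of successfully matched feature points to estimate the perspective transformation between the two frames, that is, the 3x3 matrix H that maps (x_i, y_i) to (x_i′, y_i′) in homogeneous coordinates, where (x_i, y_i) and (x_i′, y_i′) are the matched feature points in the two frames.

First frame

Last frame aligned to the first frame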
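For reference, the perspective transformation relates matched points through s_i · (x_i′, y_i′, 1)ᵀ = H · (x_i, y_i, 1)ᵀ, where s_i is a per-point scale that is divided out. The sketch below estimates H with the basic direct linear transform (DLT) in NumPy and applies it to points. It is an illustration only: it assumes exact, outlier-free correspondences and does no coordinate normalization, unlike the robust estimator in OpenCV's `cv2.findHomography`. The function names are hypothetical.

```python
import numpy as np

def estimate_homography(src, dst):
    """Basic DLT: find H such that dst ~ H @ src (needs >= 4 point pairs)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each pair contributes two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]          # assumes H[2, 2] is not zero

def apply_homography(H, pts):
    """Map (N, 2) points through H, including the homogeneous division."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]

# Synthetic check: recover a known transform from five point pairs.
H_true = np.array([[1.0, 0.02, 5.0],
                   [-0.01, 1.0, -3.0],
                   [1e-4, 2e-4, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 50], [0, 50], [30, 20]], dtype=float)
dst = apply_homography(H_true, src)
H_est = estimate_homography(src, dst)
print(np.allclose(apply_homography(H_est, src), dst))
```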

4. Implementation Code

Runtime environment and versions, followed by the install commands:
Python: 3.x
opencv-python: 3.4.2.16
opencv-contrib-python: 3.4.2.16

First uninstall any previously installed OpenCV packages:
pip uninstall opencv-python
pip uninstall opencv-contrib-python

Then install the pinned versions:
pip install opencv-python==3.4.2.16
pip install opencv-contrib-python==3.4.2.16

The code is implemented in Python, as shown below:

import cv2
import numpy as np
from tqdm import tqdm
import argparse
import os
 
# Parse command-line arguments
parser = argparse.ArgumentParser(description='')
parser.add_argument('-v', type=str, default='')  # input video path (required)
parser.add_argument('-o', type=str, default='')  # output video path (required)
parser.add_argument('-n', type=int, default=-1)  # number of frames to process (optional); defaults to the video's actual frame count

# e.g.: python3 stable.py -v=video/01.mp4 -o=video/01_stable.mp4 -n=100
 
args = parser.parse_args()
 
input_path = args.v
output_path = args.o
number = args.n
 
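class Stable:
    # Input video path
    __input_path = None

    # Output video path
    __output_path = None

    # Number of frames to process
    __number = number

    # SURF feature extraction state
    __surf = {
        # SURF detector instance
        'surf': None,
        # Keypoints extracted from the first frame
        'kp': None,
        # Descriptors of those keypoints
        'des': None,
        # Filtered keypoints that form the static feature template
        'template_kp': None
    }

    # Video capture state
    __capture = {
        # cv2.VideoCapture reader
        'cap': None,
        # Frame size (width, height)
        'size': None,
        # Total number of frames
        'frame_count': None,
        # Frame rate
        'fps': None,
        # cv2.VideoWriter for the output video
        'video': None,
    }

    # Configuration
    __config = {
        # Passed as the first argument of SURF_create, i.e. the Hessian threshold
        # (despite the key name, it does not cap the number of keypoints)
        'key_point_count': 5000,
        # FLANN matcher parameters
        'index_params': dict(algorithm=0, trees=5),
        'search_params': dict(checks=50),
        # Ratio-test threshold: best distance must be below ratio * second-best
        'ratio': 0.5,
    }

    # Feature extraction list
    __surf_list = []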
 
    def __init__(self):
        pass
 
    # Initialize the video reader and writer
    def __init_capture(self):
        self.__capture['cap'] = cv2.VideoCapture(self.__video_path)
        self.__capture['size'] = (int(self.__capture['cap'].get(cv2.CAP_PROP_FRAME_WIDTH)),
                                  int(self.__capture['cap'].get(cv2.CAP_PROP_FRAME_HEIGHT)))

        self.__capture['fps'] = self.__capture['cap'].get(cv2.CAP_PROP_FPS)

        # The output keeps the input's frame rate and size
        self.__capture['video'] = cv2.VideoWriter(self.__output_path, cv2.VideoWriter_fourcc(*"mp4v"),
                                                  self.__capture['fps'], self.__capture['size'])

        self.__capture['frame_count'] = int(self.__capture['cap'].get(cv2.CAP_PROP_FRAME_COUNT))

        # -1 means "process every frame"; otherwise never exceed the real frame count
        if self.__number == -1:
            self.__number = self.__capture['frame_count']
        else:
            self.__number = min(self.__number, self.__capture['frame_count'])
 
    # Initialize SURF and build the static feature template
    def __init_surf(self):

        # Read the first frame
        self.__capture['cap'].set(cv2.CAP_PROP_POS_FRAMES, 0)
        state, first_frame = self.__capture['cap'].read()

        # Read the last frame
        self.__capture['cap'].set(cv2.CAP_PROP_POS_FRAMES, self.__capture['frame_count'] - 1)
        state, last_frame = self.__capture['cap'].read()

        self.__surf['surf'] = cv2.xfeatures2d.SURF_create(self.__config['key_point_count'])

        # Keypoints and descriptors of the first frame (kept) and the last frame
        self.__surf['kp'], self.__surf['des'] = self.__surf['surf'].detectAndCompute(first_frame, None)
        kp, des = self.__surf['surf'].detectAndCompute(last_frame, None)

        # FLANN nearest-neighbor matching between the first and last frames
        flann = cv2.FlannBasedMatcher(self.__config['index_params'], self.__config['search_params'])
        matches = flann.knnMatch(self.__surf['des'], des, k=2)

        # Ratio test: keep a match only if it is clearly better than the runner-up
        good_match = []
        for m, n in matches:
            if m.distance < self.__config['ratio'] * n.distance:
                good_match.append(m)

        # First-frame keypoints that also match the last frame form the static template
        self.__surf['template_kp'] = []
        for f in good_match:
            self.__surf['template_kp'].append(self.__surf['kp'][f.queryIdx])
 
    # Release the reader and writer
    def __release(self):
        self.__capture['video'].release()
        self.__capture['cap'].release()

    # Process the video frame by frame
    def __process(self):

        current_frame = 1

        # Rewind to the first frame
        self.__capture['cap'].set(cv2.CAP_PROP_POS_FRAMES, 0)

        # tqdm takes the expected number of iterations through `total`
        process_bar = tqdm(total=self.__number)

        while current_frame <= self.__number:
            # Read the next frame
            success, frame = self.__capture['cap'].read()

            if not success:
                break

            # Align the frame to the first frame
            frame = self.detect_compute(frame)

            # Write the aligned frame
            self.__capture['video'].write(frame)

            current_frame += 1

            process_bar.update(1)

        process_bar.close()
 
    # Stabilize the video
    def stable(self, input_path, output_path, number):
        self.__video_path = input_path
        self.__output_path = output_path
        self.__number = number

        self.__init_capture()
        self.__init_surf()
        self.__process()
        self.__release()
 
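    # Extract features from a frame and warp it onto the first frame
    def detect_compute(self, frame):

        frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # Compute keypoints and descriptors of the current frame
        kp, des = self.__surf['surf'].detectAndCompute(frame_gray, None)

        # FLANN nearest-neighbor matching against the first frame
        flann = cv2.FlannBasedMatcher(self.__config['index_params'], self.__config['search_params'])
        matches = flann.knnMatch(self.__surf['des'], des, k=2)

        # Ratio test: keep a match only if it is clearly better than the runner-up
        good_match = []
        for m, n in matches:
            if m.distance < self.__config['ratio'] * n.distance:
                good_match.append(m)

        # Keep only matches whose first-frame keypoint belongs to the static template
        p1, p2 = [], []
        for f in good_match:
            if self.__surf['kp'][f.queryIdx] in self.__surf['template_kp']:
                p1.append(self.__surf['kp'][f.queryIdx].pt)
                p2.append(kp[f.trainIdx].pt)

        # A homography needs at least 4 point pairs; otherwise leave the frame unchanged
        if len(p1) < 4:
            return frame

        # Homography that maps current-frame points (p2) onto first-frame points (p1)
        H, _ = cv2.findHomography(np.float32(p2), np.float32(p1), cv2.RHO)
        if H is None:
            return frame

        # Perspective warp of the current frame into the first frame's coordinates
        output_frame = cv2.warpPerspective(frame, H, self.__capture['size'], borderMode=cv2.BORDER_REPLICATE)

        return output_frame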
 
if __name__ == '__main__':

    if not os.path.exists(input_path):
        print(f'[ERROR] File "{input_path}" not found')
        # Non-zero exit code signals the failure to the caller
        exit(1)
    else:
        print(f'[INFO] Video "{input_path}" stable begin')

    s = Stable()
    s.stable(input_path, output_path, number)

    print('[INFO] Done.')
    exit(0)

Parameters:

-v  path of the input video (required)

-o  path of the output video (required)

-n  number of frames to process (optional); if not set, the video's actual frame count is used
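The script describes -v and -o as required but declares them with an empty default, so a missing path is only caught later by the file-existence check. One possible alternative, shown here as a sketch and not as the author's code, is to let `argparse` enforce this with `required=True`. The function name `build_parser` and the help strings are assumptions for illustration.

```python
import argparse

def build_parser():
    """Same three options as the script, with -v and -o enforced by argparse."""
    parser = argparse.ArgumentParser(description='Video stabilization options')
    parser.add_argument('-v', type=str, required=True, help='input video path')
    parser.add_argument('-o', type=str, required=True, help='output video path')
    parser.add_argument('-n', type=int, default=-1,
                        help='number of frames to process; -1 means all frames')
    return parser

# Parsing an explicit list instead of sys.argv keeps the example self-contained.
args = build_parser().parse_args(['-v', 'test.mp4', '-o', 'test_stable.mp4'])
print(args.v, args.o, args.n)
```

With this setup, omitting -v or -o makes `argparse` print a usage message and exit before any video is opened.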

Example invocation:

python3 stable.py -v=test.mp4 -o=test_stable.mp4