Python Example: Turning a Dance Video into a Word-Cloud Video

Updated: 2021-10-08 11:58:14 Author: 小張Python

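This article uses Python to build a word-cloud video, offering another way to watch a dance video. The left half shows the original dance video and the right half shows a word-cloud video generated from the dancer's movements. Feel free to use it as a reference.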

The process is divided into the following parts.

1. Downloading the video

First, download a dance video. I use the you-get tool here, which can be installed with Python's pip command:

pip install you-get

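you-get supports many platforms, including YouTube, Bilibili, TED, Tencent Video, Youku, and iQiyi, which covers most mainstream video sites.

Taking a YouTube video as an example, the you-get download command is shown below, where -o sets the directory to save the video in and -O sets the video file name:

you-get -o ~/Videos -O zoo.webm 'https://www.youtube.com/watch?v=jNQXAC9IVRw'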

Here the os module is used to run the you-get download command. The function takes three arguments:

1. the video URL;

2. the directory to save the video in;

3. the video file name.

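import os

def download(video_url, save_path, video_name):
    '''
    Download a video with you-get.
    :param video_url: video URL
    :param save_path: directory to save the video in
    :param video_name: output file name
    :return: None
    '''
    # Quote the URL so that characters such as & are not interpreted by the shell
    cmd = 'you-get -o {} -O {} "{}"'.format(save_path, video_name, video_url)
    res = os.popen(cmd)
    print(res.read())  # print you-get's output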

For more on using you-get, see the official site, which documents its usage in detail:

https://you-get.org/#getting-started
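
As an alternative sketch (not the approach used in this article), the same command can be assembled as an argument list and run with subprocess.run, so that no shell quoting is needed. The helper names build_you_get_command and download_with_subprocess are made up for illustration, and you-get is assumed to be installed and on the PATH.

```python
import os
import subprocess


def build_you_get_command(video_url, save_path, video_name):
    """Return the you-get invocation as an argument list (hypothetical helper)."""
    # Without a shell, "~" is not expanded automatically, so expand it here
    save_path = os.path.expanduser(save_path)
    # -o sets the output directory, -O sets the output file name
    return ['you-get', '-o', save_path, '-O', video_name, video_url]


def download_with_subprocess(video_url, save_path, video_name):
    """Run you-get without a shell and return its exit code."""
    cmd = build_you_get_command(video_url, save_path, video_name)
    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)  # show you-get's output
    return result.returncode
```

Because the URL is passed as a single list element, characters such as & in the query string reach you-get unchanged.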

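2. Downloading Bilibili danmaku

A word cloud needs text data, and here I use Bilibili danmaku (bullet comments) as the source material. A quick way to download them is to request the video's API endpoint with requests, which returns the danmaku for that video:

http://comment.bilibili.com/{cid}.xml # cid is the Bilibili video's cid number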

However, building the API URL requires the video's cid.

How to get a Bilibili video's cid:

Press F12 to open the developer tools -> Network -> XHR -> the v2?cid=… link. That URL contains a string of the form "cid=" followed by a run of digits; the digits after the equals sign are the video's cid.

In the example above, 291424805 is the video's cid.
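
Copying the number by hand works, but the same lookup can be sketched with a regular expression. The function name extract_cid and the sample URL are made up for illustration; only the cid value 291424805 comes from the article.

```python
import re


def extract_cid(request_url):
    """Return the digits that follow 'cid=' in a URL, or None if there are none."""
    match = re.search(r'[?&]cid=(\d+)', request_url)
    return match.group(1) if match else None


# Hypothetical request URL; paste the one shown in the Network panel instead
sample = 'https://api.example.com/player/v2?cid=291424805&aid=1'
print(extract_cid(sample))
```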

With the cid in hand, requesting the API endpoint with requests returns the danmaku data:

http://comment.bilibili.com/291424805.xml

import requests
import jieba
from bs4 import BeautifulSoup

def download_danmu():
    '''Download the danmaku and save the segmented words to danmu.txt'''
    cid = '141367679'  # cid of the target video; replace with your own
    url = 'http://comment.bilibili.com/{}.xml'.format(cid)
    res = requests.get(url)
    res.encoding = 'utf-8'
    soup = BeautifulSoup(res.text, 'lxml')
    items = soup.find_all('d')  # each d tag holds one comment
    # The with block closes the file even if an error occurs
    with open('danmu.txt', 'w', encoding='utf-8') as f:
        for item in items:
            text = item.text
            print('-' * 40)
            print(text)
            # Segment the comment into words (full mode) for the word cloud
            seg_list = jieba.cut(text, cut_all=True)
            for j in seg_list:
                print(j)
                f.write(j)
                f.write('\n')
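
To make the parsing step easier to try without a network request, here is a minimal offline sketch that uses the standard library's xml.etree.ElementTree instead of BeautifulSoup. The sample XML is made up and only assumes what the code above relies on: each comment is the text of a d element. The function name extract_danmaku is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample; real responses contain many more d elements
SAMPLE_XML = '''<i>
  <d p="1.5,1,25,16777215">好看</d>
  <d p="3.2,1,25,16777215">跳得真好</d>
</i>'''


def extract_danmaku(xml_text):
    """Return the text of every d element in document order."""
    root = ET.fromstring(xml_text)
    return [d.text for d in root.iter('d') if d.text]


for comment in extract_danmaku(SAMPLE_XML):
    print(comment)
```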

3. Splitting the video into frames and segmenting the person

Once the video is downloaded, first split it into individual frames:

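import os
import cv2

# video_path: the downloaded video; Pic_path: an existing folder for the frames
vc = cv2.VideoCapture(video_path)
c = 0
while vc.isOpened():
    rval, frame = vc.read()  # read the next frame
    if not rval:  # no more frames (or the read failed)
        break
    # Save the frame as an image named by its index, starting from 0
    cv2.imwrite(os.path.join(Pic_path, '{}.jpg'.format(c)), frame)
    c += 1
    print('Frame {} saved successfully!'.format(c))
vc.release()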

Next, the dancer in each frame is detected and extracted, that is, portrait segmentation. This step relies on the Baidu API:

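import base64
import os

import cv2
import numpy as np
from aip import AipBodyAnalysis

def get_file_content(file_path):
    '''Read an image as bytes (not shown in the original; assumed helper)'''
    with open(file_path, 'rb') as fp:
        return fp.read()

# Replace the credentials below with those of your own application
APP_ID = "23633750"
API_KEY = 'uqnHjMZfChbDHvPqWgjeZHCR'
SECRET_KEY = '************************************'
client = AipBodyAnalysis(APP_ID, API_KEY, SECRET_KEY)
# jpg_path: folder of frames; save_path: folder for the masks;
# crop_path: folder for cropped frames, or None to skip cropping
jpg_file = os.listdir(jpg_path)
for i in jpg_file:
    open_file = os.path.join(jpg_path, i)
    save_file = os.path.join(save_path, i)
    if not os.path.exists(save_file):  # skip frames that are already processed
        img = cv2.imread(open_file)
        if crop_path:  # crop only when crop_path is set
            crop_file = os.path.join(crop_path, i)
            # The frame is too large, so crop it; adjust the bounds to your own video
            img = img[100:-1, 300:-400]
            cv2.imwrite(crop_file, img)
            image = get_file_content(crop_file)
        else:
            image = get_file_content(open_file)
        height, width, _ = img.shape  # size of the image actually sent to the API
        res = client.bodySeg(image)  # call the Baidu API to segment the person
        labelmap = base64.b64decode(res['labelmap'])
        labelimg = np.frombuffer(labelmap, np.uint8)  # raw bytes as a uint8 array
        labelimg = cv2.imdecode(labelimg, 1)  # decode the bytes into an image
        labelimg = cv2.resize(labelimg, (width, height), interpolation=cv2.INTER_NEAREST)
        img_new = np.where(labelimg == 1, 255, labelimg)  # turn label 1 into 255
        cv2.imwrite(save_file, img_new)
        print(save_file, 'save successfully')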

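This converts each image containing a person into a binary image, with the person as the foreground and everything else as the background.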
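
The binarization can be tried on a tiny array without calling the API. The sketch below assumes, as the code above does, that the label map marks person pixels with 1 and background with 0. The helper names to_binary_mask and is_usable_mask are made up for illustration; the second one flags frames in which no person was found, which produce an all-black mask.

```python
import numpy as np


def to_binary_mask(labelmap):
    """Map a 0/1 label map to a 0/255 mask (person = 255)."""
    return np.where(labelmap == 1, 255, labelmap).astype(np.uint8)


def is_usable_mask(mask):
    """True if the mask holds only 0 and 255 and has at least one person pixel."""
    values = set(np.unique(mask).tolist())
    return values <= {0, 255} and 255 in values


labelmap = np.array([[0, 1, 0],
                     [1, 1, 0]], dtype=np.uint8)
mask = to_binary_mask(labelmap)
print(mask)
print(is_usable_mask(mask))
```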

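Before using the API, create a Body Analysis application on the Baidu AI Cloud platform with your own account. It provides the three parameters needed: ID, AK, and SK.

For details on using the Baidu API, see the official documentation.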

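4. Generating word clouds from the segmented images

Step 3 produces the portrait masks.

Using the wordcloud library and the collected danmaku, draw a word cloud for each binary image (before doing so, make sure every image is a binary image; images that are entirely black must be removed):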

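import collections
import os
import random
import re

import numpy as np
from PIL import Image
from wordcloud import WordCloud

# mask_path: folder of binary masks; cloud_path: folder for the word clouds
word_list = []
with open('danmu.txt', encoding='utf-8') as f:
    con = f.read().split('\n')  # one segmented word per line
    for i in con:
        if re.findall('[\u4e00-\u9fa5]+', str(i), re.S):  # keep only words containing Chinese characters
            word_list.append(i)
# Count once; the ranking does not change between frames
ranked_words = collections.Counter(word_list).most_common()
for i in os.listdir(mask_path):
    open_file = os.path.join(mask_path, i)
    save_file = os.path.join(cloud_path, i)
    if not os.path.exists(save_file):
        # Skip a random number (0 to 15) of the most frequent words so frames differ
        start = random.randint(0, 15)
        word_counts = dict(ranked_words[start:])
        # wordcloud leaves white (255) areas empty, so invert: the person becomes 0
        background = 255 - np.array(Image.open(open_file))
        wc = WordCloud(
            background_color='black',
            max_words=500,
            mask=background,
            mode='RGB',
            font_path="D:/Data/fonts/HGXK_CNKI.ttf",  # path to a font with Chinese glyphs
        ).generate_from_frequencies(word_counts)
        wc.to_file(save_file)
        print(save_file, 'Save Successfully!')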
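
The effect of slicing the ranked word list can be seen on a small example. This is a minimal sketch with made-up words; frequencies_after_skip is a hypothetical helper that takes the number of words to skip as a parameter instead of drawing it at random.

```python
import collections


def frequencies_after_skip(words, start):
    """Count the words, then drop the `start` most frequent ones."""
    counts = collections.Counter(words)
    return dict(counts.most_common()[start:])


words = ['好看', '好看', '好看', '哈哈', '哈哈', '加油']
print(frequencies_after_skip(words, 0))  # nothing skipped
print(frequencies_after_skip(words, 1))  # the most frequent word is dropped
```

Note that if start is at least the number of distinct words, the result is empty, so a very small danmaku file may leave nothing to draw.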

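5. Stitching images and composing the video

Once all the word clouds are generated, looking at them one by one is not much fun; combining them into a video is cooler!

For a before-and-after comparison I added one extra step: before composing the video, each original frame is stitched together with its word cloud: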

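import os

import cv2
import numpy as np

# origin_path: original frames; wordart_path: word clouds; video_path: output file
num_list = [int(str(i).split('.')[0]) for i in os.listdir(origin_path)]
fps = 24  # output frame rate; match the source video to keep the playback speed
# Height and width of one frame
height, width, _ = cv2.imread(os.path.join(origin_path, '{}.jpg'.format(num_list[0]))).shape
width = width * 2  # two images side by side
# Create the video writer
video_writer = cv2.VideoWriter(video_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (width, height))
for i in sorted(num_list):  # numeric order, so 2.jpg comes before 10.jpg
    i = '{}.jpg'.format(i)
    ori_jpg = os.path.join(origin_path, str(i))
    word_jpg = os.path.join(wordart_path, str(i))
    # com_jpg = os.path.join(Composite_path,str(i))
    ori_arr = cv2.imread(ori_jpg)
    word_arr = cv2.imread(word_jpg)
    # Stitch the two images horizontally with NumPy
    com_arr = np.hstack((ori_arr, word_arr))
    # cv2.imwrite(com_jpg,com_arr)  # optionally save the stitched image
    video_writer.write(com_arr)  # write the frame to the video stream
    print("{} Save Successfully---------".format(ori_jpg))
video_writer.release()  # finalize the video file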
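
The code converts file names to integers before sorting because plain string sorting would put the frames in the wrong order. A small sketch with made-up file names shows the difference; sorted_frame_names is a hypothetical helper.

```python
def sorted_frame_names(file_names):
    """Sort names such as '10.jpg' by their numeric part."""
    numbers = sorted(int(name.split('.')[0]) for name in file_names)
    return ['{}.jpg'.format(n) for n in numbers]


names = ['10.jpg', '2.jpg', '1.jpg']
print(sorted(names))              # string order: 10.jpg lands before 2.jpg
print(sorted_frame_names(names))  # numeric order: 1, 2, 10
```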

Add some background music and the video goes up another notch~