Making a Word-Cloud Video with Python: A Detailed Walkthrough
Third-party libraries used
Package         Version
--------------- ---------
baidu-aip       2.2.18.0
jieba           0.42.1
moviepy         1.0.3
numpy           1.20.2
opencv-python   4.5.1.48
Pillow          8.2.0
requests        2.25.1
wordcloud       1.8.1
you-get         0.4.1520


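Scraping Bilibili danmaku
Approach
Request the video's cid using its BV number, then use the cid to request the danmaku (bullet-comment) file. A regular expression extracts the comment text, and the matches are saved to a local file for later use. Both the code and the logic are simple, so they are not explained in further detail.
Implementation
cid request URL: https://api.bilibili.com/x/web-interface/view?bvid=
Danmaku request URL: https://api.bilibili.com/x/v1/dm/list.so?oid=
Reference code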
import json
import re

import requests

# The functions below are class methods (note the cls parameter);
# the enclosing class definition is omitted.
def get_cid(cls, bv):
    """Look up the cid of a video from its BV number."""
    url = "https://api.bilibili.com/x/web-interface/view?bvid=" + str(bv)
    response = requests.get(url)
    data = json.loads(response.text)
    cid = data['data']['cid']
    return str(cid)

def get_barrage(cls, bv, to_file_path):
    """Download the danmaku of a video and save one comment per line."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.98 Safari/537.36",
    }
    cid = cls.get_cid(bv)
    response = requests.get("https://api.bilibili.com/x/v1/dm/list.so?oid=" + cid, headers=headers)
    html_doc = response.content.decode('utf-8')
    # Each comment sits inside a <d ...>text</d> element
    regex = re.compile("<d.*?>(.*?)</d>")
    danmu = regex.findall(html_doc)
    with open(to_file_path, "w", encoding="utf_8") as f:
        for i in danmu:
            f.write(i)
            f.write("\n")
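The extraction step can be tried offline before touching the network. The sketch below is a minimal example, assuming the danmaku file is XML in which each comment sits in a <d> element; the sample document and the function name extract_danmaku are made up for illustration. It swaps the regular expression for the standard library's xml.etree.ElementTree, which also decodes XML entities such as &amp; that the regex would leave escaped.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the shape of a danmaku file (not real data).
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<i>
  <chatserver>chat.bilibili.com</chatserver>
  <d p="1.5,1,25,16777215">hello</d>
  <d p="3.0,1,25,16777215">rock &amp; roll</d>
  <d p="4.2,1,25,16777215"></d>
</i>
"""


def extract_danmaku(xml_text):
    """Return the text of every non-empty <d> element, in document order."""
    # Parse from bytes so the encoding declaration in the prolog is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [d.text for d in root.iter("d") if d.text]


if __name__ == "__main__":
    for line in extract_danmaku(SAMPLE_XML):
        print(line)
```

For a real download, the decoded response body would be passed to extract_danmaku in place of SAMPLE_XML.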
Downloading the video
Approach
Use the open-source third-party tool you-get to download the video.

Audio and video sites supported by you-get
| Site | URL | Videos? | Images? | Audios? |
|---|---|---|---|---|
| YouTube | https://www.youtube.com/ | ✓ | ||
| Twitter | https://twitter.com/ | ✓ | ✓ | |
| VK | http://vk.com/ | ✓ | ✓ | |
| Vine | https://vine.co/ | ✓ | ||
| Vimeo | https://vimeo.com/ | ✓ | ||
| Veoh | http://www.veoh.com/ | ✓ | ||
| Tumblr | https://www.tumblr.com/ | ✓ | ✓ | ✓ |
| TED | http://www.ted.com/ | ✓ | ||
| SoundCloud | https://soundcloud.com/ | ✓ | ||
| SHOWROOM | https://www.showroom-live.com/ | ✓ | ||
| Pinterest | https://www.pinterest.com/ | ✓ | ||
| MTV81 | http://www.mtv81.com/ | ✓ | ||
| Mixcloud | https://www.mixcloud.com/ | ✓ | ||
| Metacafe | http://www.metacafe.com/ | ✓ | ||
| Magisto | http://www.magisto.com/ | ✓ | ||
| Khan Academy | https://www.khanacademy.org/ | ✓ | ||
| Internet Archive | https://archive.org/ | ✓ | ||
| Instagram | https://instagram.com/ | ✓ | ✓ | |
| InfoQ | http://www.infoq.com/presentations/ | ✓ | ||
| Imgur | http://imgur.com/ | ✓ | ||
| Heavy Music Archive | http://www.heavy-music.ru/ | ✓ | ||
| Freesound | http://www.freesound.org/ | ✓ | ||
| Flickr | https://www.flickr.com/ | ✓ | ✓ | |
| FC2 Video | http://video.fc2.com/ | ✓ | ||
| Facebook | https://www.facebook.com/ | ✓ | ||
| eHow | http://www.ehow.com/ | ✓ | ||
| Dailymotion | http://www.dailymotion.com/ | ✓ | ||
| Coub | http://coub.com/ | ✓ | ||
| CBS | http://www.cbs.com/ | ✓ | ||
| Bandcamp | http://bandcamp.com/ | ✓ | ||
| AliveThai | http://alive.in.th/ | ✓ | ||
| interest.me | http://ch.interest.me/tvn | ✓ | ||
| 755 ナナゴーゴー | http://7gogo.jp/ | ✓ | ✓ | |
| niconico ニコニコ動畫 | http://www.nicovideo.jp/ | ✓ | ||
| 163 網易視頻 網易云音樂 | http://v.163.com/ http://music.163.com/ | ✓ | ✓ | |
| 56網 | http://www.56.com/ | ✓ | ||
| AcFun | http://www.acfun.cn/ | ✓ | ||
| Baidu 百度貼吧 | http://tieba.baidu.com/ | ✓ | ✓ | |
| 爆米花網 | http://www.baomihua.com/ | ✓ | ||
| bilibili 嗶哩嗶哩 | http://www.bilibili.com/ | ✓ | ✓ | ✓ |
| 豆瓣 | http://www.douban.com/ | ✓ | ✓ | |
| 斗魚 | http://www.douyutv.com/ | ✓ | ||
| 鳳凰視頻 | http://v.ifeng.com/ | ✓ | ||
| 風行網 | http://www.fun.tv/ | ✓ | ||
| iQIYI 愛奇藝 | http://www.iqiyi.com/ | ✓ | ||
| 激動網 | http://www.joy.cn/ | ✓ | ||
| 酷6網 | http://www.ku6.com/ | ✓ | ||
| 酷狗音樂 | http://www.kugou.com/ | ✓ | ||
| 酷我音樂 | http://www.kuwo.cn/ | ✓ | ||
| 樂視網 | http://www.le.com/ | ✓ | ||
| 荔枝FM | http://www.lizhi.fm/ | ✓ | ||
| 懶人聽書 | http://www.lrts.me/ | ✓ | ||
| 秒拍 | http://www.miaopai.com/ | ✓ | ||
| MioMio彈幕網 | http://www.miomio.tv/ | ✓ | ||
| MissEvan 貓耳FM | http://www.missevan.com/ | ✓ | ||
| 痞客邦 | https://www.pixnet.net/ | ✓ | ||
| PPTV聚力 | http://www.pptv.com/ | ✓ | ||
| 齊魯網 | http://v.iqilu.com/ | ✓ | ||
| QQ 騰訊視頻 | http://v.qq.com/ | ✓ | ||
| 企鵝直播 | http://live.qq.com/ | ✓ | ||
| Sina 新浪視頻 微博秒拍視頻 | http://video.sina.com.cn/ http://video.weibo.com/ | ✓ | ||
| Sohu 搜狐視頻 | http://tv.sohu.com/ | ✓ | ||
| Tudou 土豆 | http://www.tudou.com/ | ✓ | ||
| 陽光衛視 | http://www.isuntv.com/ | ✓ | ||
| Youku 優(yōu)酷 | http://www.youku.com/ | ✓ | ||
| 戰旗TV | http://www.zhanqi.tv/lives | ✓ | ||
| 央視網 | http://www.cntv.cn/ | ✓ | ||
| Naver 네이버 | http://tvcast.naver.com/ | ✓ | ||
| 芒果TV | http://www.mgtv.com/ | ✓ | ||
| 火貓TV | http://www.huomao.com/ | ✓ | ||
| 陽光寬頻網 | http://www.365yg.com/ | ✓ | ||
| 西瓜視頻 | https://www.ixigua.com/ | ✓ | ||
| 新片場 | https://www.xinpianchang.com/ | ✓ | ||
| 快手 | https://www.kuaishou.com/ | ✓ | ✓ | |
| 抖音 | https://www.douyin.com/ | ✓ | ||
| TikTok | https://www.tiktok.com/ | ✓ | ||
| 中國體育(TV) | http://v.zhibo.tv/ http://video.zhibo.tv/ | ✓ | ||
| 知乎 | https://www.zhihu.com/ | ✓ |
# Show video information
you-get -i https://www.bilibili.com/video/BV1f4411M7QC
# Download the video
you-get --format=flv -o E:\Desktop\output https://www.bilibili.com/video/BV1f4411M7QC
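To drive the same download from a Python script, one option is to build the command line and hand it to subprocess. The following is a minimal sketch that uses only the flags shown above (-i, --format, -o); the helper names build_you_get_command and download are hypothetical, and it assumes the you-get executable is on PATH.

```python
import subprocess


def build_you_get_command(url, output_dir=None, fmt=None, info_only=False):
    """Build the you-get argument list for a single URL."""
    cmd = ["you-get"]
    if info_only:
        cmd.append("-i")  # only print the video information
    else:
        if fmt:
            cmd.append("--format=" + fmt)
        if output_dir:
            cmd.extend(["-o", output_dir])
    cmd.append(url)
    return cmd


def download(url, output_dir, fmt=None):
    """Run you-get; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_you_get_command(url, output_dir, fmt), check=True)


if __name__ == "__main__":
    print(build_you_get_command(
        "https://www.bilibili.com/video/BV1f4411M7QC", info_only=True))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces or backslashes.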
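Cutting video and audio, and extracting audio
Approach
The requirement here is very simple: cut a segment out of a video or audio file and save it.
Python has a third-party library called moviepy that can cut and concatenate video, cut, concatenate, and extract audio, and merge audio with video.
Reference code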
from moviepy.editor import AudioFileClip, VideoFileClip

def cut_video(cls, origin_file_path, to_file_path, start, end):
    """
    Cut a segment out of a video
    :param origin_file_path: path of the source video file
    :param to_file_path: path to save the result to
    :param start: start time
    :param end: end time
    """
    clip = VideoFileClip(origin_file_path).subclip(start, end)
    clip.write_videofile(to_file_path)

def cut_audio(cls, origin_file_path, to_file_path, start, end):
    """
    Cut a segment out of an audio file
    :param origin_file_path: path of the source audio file
    :param to_file_path: path to save the result to
    :param start: start time
    :param end: end time
    """
    clip = AudioFileClip(origin_file_path).subclip(start, end)
    clip.write_audiofile(to_file_path)

def get_audio_from_video(cls, video_file_path, to_file_path):
    """
    Extract the audio track of a video
    :param video_file_path: path of the video file
    :param to_file_path: path of the audio file to write
    """
    video = VideoFileClip(video_file_path)
    video.audio.write_audiofile(to_file_path)
Extracting video frames
Approach
Open the video file with opencv-python (cv2), read it frame by frame, and save each frame to a folder.

Reference code
import os

import cv2

def split(cls, from_file_path, to_folder_path, frames=0):
    """
    Read a video frame by frame and save each frame as an image
    :param from_file_path: path of the video
    :param to_folder_path: folder to save the frames in
    :param frames: number of frames (images) to save; 0 saves all frames
    """
    vc = cv2.VideoCapture(from_file_path)  # open the video file with cv2
    c = 0
    while vc.isOpened():
        if 0 < frames == c:  # requested number of frames reached
            break
        ret, frame = vc.read()  # read the next frame
        if not ret:  # no more frames
            break
        # Save every frame that was read, starting with the first one
        cv2.imwrite(os.path.join(to_folder_path, '{}.jpg'.format(c)), frame)
        c += 1
        print('Image {} saved successfully!'.format(c))
    vc.release()
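Image binarization
Approach
There are two ways to binarize the images: one uses opencv, the other uses the body-segmentation API of Baidu AI Cloud.
Each has its pros and cons:
- opencv is fast, but it can only binarize the whole image and cannot isolate the main subject. It only suits images with a solid-color background and clear outlines; when the image has a busy background or other distracting content, the result is poor and not usable as a word-cloud mask.
- Baidu's body-segmentation API cuts the person out of the image and binarizes only the person, but it is slow (processing is slow, and the API also limits concurrency); a thousand images often take one to two hours.
So in practice, switch between the two depending on the video.
The results of the two methods are shown below (figure 1: cv2; figure 2: Baidu body segmentation).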


Reference code
import base64

import cv2
import numpy as np

def binary_option_cv2(cls, from_file_path, to_file_path):
    """
    Binarize an image and save it (using cv2)
    :param from_file_path: path of the source image
    :param to_file_path: path of the binarized image
    """
    img = cv2.imread(from_file_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    new_gray = np.uint8((255 * (gray / 255.0) ** 1.4))  # gamma correction
    dst = cv2.adaptiveThreshold(new_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 15, 1)
    dst = cv2.medianBlur(dst, 5)  # medianBlur returns a new image, so keep the result
    cv2.imwrite(to_file_path, dst)

def binary_option_baidu(cls, from_file_path, to_file_path):
    """
    Binarize an image and save it (using Baidu body segmentation)
    :param from_file_path: path of the source image
    :param to_file_path: path of the binarized image
    """
    def get_file_content(filePath):
        with open(filePath, 'rb') as fp:
            return fp.read()

    height, width, _ = cv2.imread(from_file_path).shape
    image = get_file_content(from_file_path)
    # cls.client is created elsewhere (presumably a baidu-aip AipBodyAnalysis instance)
    res = cls.client.bodySeg(image)  # one request per image
    labelmap = base64.b64decode(res['labelmap'])
    labelimg = np.frombuffer(labelmap, np.uint8)  # raw bytes to a uint8 array
    labelimg = cv2.imdecode(labelimg, 1)
    labelimg = cv2.resize(labelimg, (width, height), interpolation=cv2.INTER_NEAREST)
    img_new = np.where(labelimg == 1, 255, labelimg)  # map label 1 to 255
    cv2.imwrite(to_file_path, img_new)
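Since the segmentation API is slow and limits concurrency, a batch run tends to go more smoothly when requests are sent one at a time, with a pause between them and a retry on failure. The following is a minimal sketch of that idea using only the standard library; process_with_throttle, the 0.5-second interval, and the retry count are all assumptions for illustration, not values taken from Baidu's documentation.

```python
import time


def process_with_throttle(items, func, min_interval=0.5, retries=2, sleep=time.sleep):
    """Call func(item) sequentially, spacing calls and retrying failures.

    Returns a (results, failed) pair: results maps item -> return value,
    failed lists the items that still raised after all retries.
    """
    results, failed = {}, []
    for item in items:
        for attempt in range(retries + 1):
            try:
                results[item] = func(item)
                break
            except Exception:
                if attempt == retries:
                    failed.append(item)
            finally:
                sleep(min_interval)  # pause after every call, successful or not
    return results, failed
```

Here func would wrap the per-frame call (for example, a function that runs binary_option_baidu on one image path), and the failed list can be fed back in for another pass.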
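Generating the word-cloud images
Approach
Use the wordcloud library, with the Bilibili danmaku scraped earlier as the word-cloud content and the binarized images as masks.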
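As a rough illustration of this step, the sketch below shows one way it could be done, assuming jieba for Chinese word segmentation and a font file that contains Chinese glyphs. The function names, the font_path argument, the min_length filter, and the invert flag are hypothetical. wordcloud leaves pure-white mask pixels empty, so the mask may need inverting depending on which side of the binarized image is white.

```python
from collections import Counter


def danmaku_frequencies(lines, tokenize=None, min_length=2):
    """Count word frequencies across danmaku lines.

    tokenize defaults to jieba.lcut (imported lazily); tokens shorter
    than min_length characters are dropped.
    """
    if tokenize is None:
        import jieba
        tokenize = jieba.lcut
    counter = Counter()
    for line in lines:
        counter.update(t for t in tokenize(line.strip()) if len(t) >= min_length)
    return dict(counter)


def render_word_cloud(frequencies, mask_path, to_file_path, font_path, invert=False):
    """Draw one word-cloud frame shaped by a binarized mask image."""
    import numpy as np
    from PIL import Image
    from wordcloud import WordCloud

    mask = np.array(Image.open(mask_path).convert("L"))
    if invert:
        mask = 255 - mask  # swap the drawable and the blank regions
    wc = WordCloud(font_path=font_path, mask=mask, background_color="white")
    wc.generate_from_frequencies(frequencies)
    wc.to_file(to_file_path)
```

Computing the frequencies once and reusing them for every frame avoids re-segmenting the danmaku text for each image.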

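Stitching the original and word-cloud frames and assembling them into a video
Approach
Use numpy to stitch the images side by side, then use cv2 to write the stitched images into a video stream and save it.
To keep the video aligned with the audio track, the output video needs a suitable frame rate (the same as the original video). The original frame rate can be checked in a media player or read with cv2.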
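Reading the frame rate with cv2 can look like the sketch below. It is a minimal example: read_fps and duration_seconds are hypothetical helper names, and cv2 is imported inside the function so the arithmetic helper can be used on its own. The second function shows why the frame rate matters: the same frames written at a different rate give a different running time, which is what puts the picture out of step with the audio.

```python
def read_fps(video_path):
    """Return the frame rate that OpenCV reports for a video file."""
    import cv2

    cap = cv2.VideoCapture(video_path)
    try:
        if not cap.isOpened():
            raise IOError("cannot open video: {}".format(video_path))
        return cap.get(cv2.CAP_PROP_FPS)
    finally:
        cap.release()


def duration_seconds(frame_count, fps):
    """Running time of frame_count frames played back at fps."""
    return frame_count / fps


if __name__ == "__main__":
    # 900 frames extracted from a 25 fps source last 36 s,
    # but only 30 s if they are written out at 30 fps.
    print(duration_seconds(900, 25))
    print(duration_seconds(900, 30))
```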

Reference code
import os

import cv2
import numpy as np

def joint(cls, origin_folder, word_cloud_folder, to_file_path):
    """
    Stitch images in batch and assemble them into a video
    :param origin_folder: folder of the original frames
    :param word_cloud_folder: folder of the word-cloud images
    :param to_file_path: path to save the video to
    """
    num_list = [int(str(i).split('.')[0]) for i in os.listdir(origin_folder)]
    fps = 30  # video frame rate; adjust it to match the original video
    height, width, _ = cv2.imread(os.path.join(origin_folder, '{}.jpg'.format(num_list[0]))).shape  # frame height and width
    width = width * 2  # two images side by side
    # Create the video writer
    video_writer = cv2.VideoWriter(to_file_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (width, height))
    for i in sorted(num_list):
        i = '{}.jpg'.format(i)
        ori_jpg = os.path.join(origin_folder, str(i))
        word_jpg = os.path.join(word_cloud_folder, str(i))
        ori_arr = cv2.imread(ori_jpg)
        word_arr = cv2.imread(word_jpg)
        # Stitch the two images horizontally with numpy
        com_arr = np.hstack((ori_arr, word_arr))
        video_writer.write(com_arr)  # write each stitched frame into the video stream
        print("{} written to the video stream".format(ori_jpg))
    video_writer.release()  # finalize the output file
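Merging audio and video, and exporting the result
Approach
The approach is similar to the earlier step of stitching the original and word-cloud images into a video; here moviepy attaches the audio track to the generated video.
Reference code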
from moviepy.editor import AudioFileClip, VideoFileClip

def set_audio_for_video(cls, video_file_path, audio_file_path, to_file_path):
    """
    Merge audio and video
    :param video_file_path: path of the video file
    :param audio_file_path: path of the audio file
    :param to_file_path: path to save the result to
    """
    video = VideoFileClip(video_file_path)
    audio = AudioFileClip(audio_file_path)
    new_video = video.set_audio(audio)
    new_video.write_videofile(to_file_path)
Final result



