
Speech recognition in Python: the speech module

Updated: 2020-09-09 09:07:56 Author: uniquefu

This article walks through speech recognition in Python with the speech module, in enough detail to serve as a reference for study or work.

1. How it works

Voice control has two parts: speech recognition and speech synthesis (reading text aloud).

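Both parts would normally require natural language processing knowledge and a series of very complex algorithms, but this article skips all of that. If you are only interested in the algorithms and the linguistics, you will have to look elsewhere: not a word below covers them.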

As early as the 1990s, IBM released a very capable speech recognition system, ViaVoice, and related products have kept appearing and evolving ever since. Here we will use SAPI to implement the speech module.

2. What is SAPI?

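SAPI is the Microsoft Speech API, the speech interface published by Microsoft. Observant users will have noticed that Windows has shipped with speech recognition since Windows XP, yet it has seen very little use: it offers no user-friendly way to customize it, and the few built-in voice commands are of marginal value. The task of this article is to use SAPI for customized speech recognition.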
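As a minimal sketch of what calling SAPI from Python looks like, the snippet below speaks a phrase through the `SAPI.SpVoice` COM object, which the article's own code also uses. It assumes Windows with pywin32 installed. The helper names `make_voice` and `say` are made up for illustration, and the voice is passed in as a parameter so the logic can be exercised without a sound device.

```python
def make_voice():
    """Create the SAPI text-to-speech COM object (Windows with pywin32 only)."""
    import win32com.client  # imported lazily so this file loads on any platform
    return win32com.client.Dispatch("SAPI.SpVoice")


def say(phrase, voice=None):
    """Speak `phrase` aloud; `voice` is any object with a Speak(text) method."""
    if voice is None:
        voice = make_voice()
    voice.Speak(phrase)
    return phrase
```

On a Windows machine, calling `say("Speech recognition started")` should read the phrase aloud through the default system voice.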

Code

Prerequisite: turn on Speech Recognition in Windows 7 (Control Panel > Ease of Access > Speech Recognition).

#!/usr/bin/env python
# -*- coding: utf-8 -*-
'''
@author: Jeff LEE
@file: .py
@time: 2018-07-19 11:15
@desc:
'''
import os

import pythoncom
import win32com.client
from win32com.client import constants

# Text-to-speech voice used for the startup announcement
speaker = win32com.client.Dispatch("SAPI.SpVoice")


class SpeechRecognition:
    def __init__(self, wordsToAdd):
        self.speaker = win32com.client.Dispatch("SAPI.SpVoice")
        self.listener = win32com.client.Dispatch("SAPI.SpSharedRecognizer")
        self.context = self.listener.CreateRecoContext()
        self.grammar = self.context.CreateGrammar()
        # Turn off free dictation so only the words added below are recognized
        self.grammar.DictationSetState(0)
        self.wordsRule = self.grammar.Rules.Add(
            "wordsRule", constants.SRATopLevel + constants.SRADynamic, 0)
        self.wordsRule.Clear()
        # Register each command word as a transition out of the rule's initial state
        for word in wordsToAdd:
            self.wordsRule.InitialState.AddWordTransition(None, word)
        self.grammar.Rules.Commit()
        # Activate the rule, then commit again
        self.grammar.CmdSetRuleState("wordsRule", 1)
        self.grammar.Rules.Commit()
        self.eventHandler = ContextEvents(self.context)
        self.say("Started successfully")

    def say(self, phrase):
        self.speaker.Speak(phrase)


class ContextEvents(win32com.client.getevents("SAPI.SpSharedRecoContext")):
    def OnRecognition(self, StreamNumber, StreamPosition, RecognitionType, Result):
        newResult = win32com.client.Dispatch(Result)
        speechstr = newResult.PhraseInfo.GetText()
        print("You said:", speechstr)
        # Map the recognized phrase to the program to open
        if speechstr == "Notepad":
            os.system('notepad')
        elif speechstr == "WordPad":
            os.system('write')
        elif speechstr == "Paint":
            os.system('mspaint')


if __name__ == '__main__':
    speaker.Speak("Speech recognition started")
    # The original article registered the Chinese words 記事本, 寫字板 and 畫圖板;
    # the words presumably need to match the language of the installed recognizer.
    wordsToAdd = ["Notepad", "WordPad", "Paint"]
    speechReco = SpeechRecognition(wordsToAdd)
    # Pump COM messages so that OnRecognition events are delivered
    while True:
        pythoncom.PumpWaitingMessages()
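
The if/elif chain in OnRecognition can also be written as a lookup table, so that adding a command becomes a one-line change. The following is a sketch, not the author's code. The names `COMMANDS`, `command_for`, and `run_command` are hypothetical, and the English phrases are illustrative stand-ins for whatever words are registered in `wordsToAdd`. It also swaps `os.system` for `subprocess.Popen`, on the assumption that the handler should not wait for the launched program to exit.

```python
import subprocess

# Hypothetical mapping from a recognized phrase to the Windows program to start.
COMMANDS = {
    "Notepad": "notepad",
    "WordPad": "write",
    "Paint": "mspaint",
}


def command_for(phrase):
    """Return the executable for a recognized phrase, or None if it is unknown."""
    return COMMANDS.get(phrase.strip())


def run_command(phrase, launcher=subprocess.Popen):
    """Start the program mapped to `phrase`; return True if one was started."""
    executable = command_for(phrase)
    if executable is None:
        return False
    launcher([executable])  # Popen returns at once instead of waiting for exit
    return True
```

Inside OnRecognition, the whole chain would then reduce to a single call such as `run_command(speechstr)`.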

Problems encountered while debugging

When calling the speech module from Python, how do you fix the error "TypeError: NoneType takes no arguments"?

Cause: Python cannot call the speech development kit.
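
One likely mechanism behind this message, offered as an assumption and not something the article confirms: `win32com.client.getevents` can return `None` when no generated (makepy) wrapper for the SAPI type library is available, and a class statement whose base is `None` fails with this TypeError. The sketch below reproduces that failure in plain Python without any COM objects; `subclass_error` is a name made up for the example.

```python
def subclass_error(base):
    """Try to define a class deriving from `base`; return the error text, if any."""
    try:
        class Handler(base):  # mirrors `class ContextEvents(getevents(...))`
            pass
    except TypeError as exc:
        return str(exc)
    return None


print(subclass_error(object))  # a real base class works, so this prints None
print(subclass_error(None))    # a None base triggers the TypeError
```

If this is what is happening, printing the result of `win32com.client.getevents("SAPI.SpSharedRecoContext")` before the class definition should show `None`.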

Solution (if you have installed pyWin32, PythonWin was installed along with it):

1. In the python35 directory, find pythonwin.exe inside the pythonwin folder