
Recording with speech_recognition, converting audio files with ffmpeg, and recognizing speech with the Baidu speech SDK

Date: 2023-05-29 23:30:35

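Project description:

On Windows, speech_recognition records audio and saves it as a 16 kHz WAV file. ffmpeg then converts the WAV file to a PCM file, which is uploaded to Baidu's speech service; the service returns the recognized text. pyttsx3 adds a simple spoken interaction on top.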
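The WAV-to-PCM step can be sketched with an argument list passed to subprocess, which avoids shell quoting problems when a file name contains spaces. This is only a sketch: the helper names build_ffmpeg_command and convert are hypothetical, and the flags mirror the ffmpeg command used in this project's code.

```python
import os
import subprocess


def build_ffmpeg_command(wav_file):
    """Return (command, pcm_file) for converting a WAV file to raw 16 kHz mono PCM."""
    # Hypothetical helper: swap only the extension, so "./2.wav" becomes "./2.pcm".
    pcm_file = os.path.splitext(wav_file)[0] + ".pcm"
    command = [
        "ffmpeg", "-y",          # overwrite the output file without asking
        "-i", wav_file,          # input WAV file
        "-acodec", "pcm_s16le",  # 16-bit little-endian samples
        "-f", "s16le",           # raw PCM output, no WAV header
        "-ac", "1",              # mono
        "-ar", "16000",          # 16 kHz sample rate
        pcm_file,
    ]
    return command, pcm_file


def convert(wav_file):
    """Run ffmpeg and return the PCM path; raises CalledProcessError if ffmpeg fails."""
    command, pcm_file = build_ffmpeg_command(wav_file)
    subprocess.run(command, check=True)
    return pcm_file
```

Calling convert("./2.wav") would write "./2.pcm" next to the input, provided ffmpeg is installed and on the PATH.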

Required modules:

speech_recognition, pyttsx3, pyaudio, wave, aip, ffmpeg

Module installation:

speech_recognition: /project/SpeechRecognition/
pyttsx3: /dss_dssssd/article/details/82693742
pyaudio: /project/PyAudio/
aip: /docs#/ASR-Online-Python-SDK/top
ffmpeg (on Windows): note that ffmpeg must be added to the system environment variables, not the per-user PATH.

/zhuiqiuk/article/details/72834385
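Before running the project, it can help to confirm that ffmpeg is actually reachable from Python. The following is a minimal sketch using the standard library's shutil.which; the function names find_tool and require_ffmpeg are hypothetical and not part of the original project.

```python
import shutil


def find_tool(name="ffmpeg"):
    """Return the full path of an executable found on PATH, or None if it is missing."""
    return shutil.which(name)


def require_ffmpeg():
    """Return the path to ffmpeg, or raise a readable error if it is not on PATH."""
    path = find_tool("ffmpeg")
    if path is None:
        raise RuntimeError(
            "ffmpeg was not found on PATH; check the system environment variables"
        )
    return path
```

Calling require_ffmpeg() at program start turns a silent conversion failure into an immediate, explicit error.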

The code is as follows:

import speech_recognition as sr
import pyttsx3
import pyaudio
import wave
from aip import AipSpeech
import os


# Read a WAV file and play it back
def read_wav():
    CHUNK = 1024
    # Test recording
    wf = wave.open('./2.wav', 'rb')
    # Read the first chunk of frames
    data = wf.readframes(CHUNK)
    p = pyaudio.PyAudio()
    FORMAT = p.get_format_from_width(wf.getsampwidth())
    CHANNELS = wf.getnchannels()
    RATE = wf.getframerate()
    print('FORMAT: {} \nCHANNELS: {} \nRATE: {}'.format(FORMAT, CHANNELS, RATE))
    stream = p.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    frames_per_buffer=CHUNK,
                    output=True)
    # Play the stream chunk by chunk
    while len(data) > 0:
        stream.write(data)
        data = wf.readframes(CHUNK)
    # Release the audio device and the file
    stream.stop_stream()
    stream.close()
    p.terminate()
    wf.close()


def wav_to_pcm(wav_file):
    # Suppose wav_file = "audio.wav".
    # os.path.splitext(wav_file) gives ("audio", ".wav"); take the first part
    # and append ".pcm" to get "audio.pcm". Splitting on "." instead would
    # break for paths such as "./2.wav", which start with a dot.
    pcm_file = "%s.pcm" % os.path.splitext(wav_file)[0]
    # Have Python run the same command we would otherwise type into a cmd window
    os.system("ffmpeg -y -i %s -acodec pcm_s16le -f s16le -ac 1 -ar 16000 %s" % (wav_file, pcm_file))
    return pcm_file


def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()


""" Your APPID AK SK """
# Fill these in with the values you applied for
# APP_ID = 'your App ID'
# API_KEY = 'your Api Key'
# SECRET_KEY = 'your Secret Key'
# These are a test id and test keys
APP_ID = '14545668'
API_KEY = 'BLG4GIxozxXia9U8KKtLBl2j'
SECRET_KEY = 'z0ITqlx8OXiveTePBvD7jkSCdGKthZAy'


def speech_interaction():
    # Initialize the pyttsx3 engine
    engine = pyttsx3.init()
    # Record audio from the microphone
    r = sr.Recognizer()
    with sr.Microphone() as source:
        # print("Say something!")
        # Prompt: "A visitor is at the door. Should I open it? Please answer after one second."
        engine.say("门外有客人来访,需要开门吗, 请一秒后回答?")
        engine.runAndWait()
        r.adjust_for_ambient_noise(source)
        audio = r.listen(source)
    # "Recording finished, recognizing"
    engine.say("录音结束, 识别中")
    engine.runAndWait()
    # Save the recording to a WAV file at 16 kHz
    with open("2.wav", "wb") as f:
        f.write(audio.get_wav_data(convert_rate=16000))
    # Play the recording back
    read_wav()
    # Create the Baidu speech recognition client
    client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
    # Convert to PCM and upload it to Baidu for recognition
    pcmFile = wav_to_pcm("./2.wav")
    result = client.asr(get_file_content(pcmFile), 'pcm', 16000, {
        'dev_pid': 1537,
    })
    print(result)
    # print(result['err_msg'], result['result'][0])
    # Check the response and react to the recognized text
    try:
        success = result['err_msg'] == 'success.'
        print(success)
        if success:
            text = result['result'][0]
            if "不" in text:
                # Reply contains "no": "OK, please open the door yourself."
                engine.say("好的,那请您自己去开门")
                engine.runAndWait()
            elif "开" in text or '好' in text:
                # Reply contains "open" or "OK": "Please wait, I will open the door for you."
                engine.say("请您稍等,我去帮您开门,")
                engine.runAndWait()
        else:
            # "Speech recognition error"
            engine.say("语音识别错误")
            engine.runAndWait()
        # engine.say(text)
        # engine.runAndWait()
    except Exception as e:
        # "Sorry, recognition error"
        engine.say("抱歉, 识别错误")
        engine.runAndWait()


# Run the program
speech_interaction()
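The keyword matching inside speech_interaction can be pulled out into a small pure function, which makes it easy to test without a microphone or a network connection. This is a sketch only: the name classify_reply and the labels "decline", "open", and "unknown" are hypothetical, while the keywords and the order of the checks follow the code above.

```python
def classify_reply(text):
    """Map recognized Mandarin text to an action label.

    The negation "不" is checked first, so a reply such as "不开"
    (do not open) is treated as a refusal even though it contains "开".
    """
    if "不" in text:
        return "decline"
    if "开" in text or "好" in text:
        return "open"
    return "unknown"


if __name__ == "__main__":
    for reply in ["好的", "开门", "不用了", "不开"]:
        print(reply, classify_reply(reply))
```

Checking the negation first matters: reversing the two tests would classify "不开" as a request to open the door.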

Note:

The pyttsx3 engine initialization (pyttsx3.init()) must not be done inside a thread; doing so raises an error.
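One way to respect this restriction is to create the engine on the main thread and let worker threads hand over the text to speak through a queue. The sketch below assumes only that the engine object offers say() and runAndWait(), as in the code above; the names speech_loop and worker are hypothetical, and whether this pattern avoids the error on every platform has not been verified here.

```python
import queue
import threading


def speech_loop(engine, requests):
    """Speak queued texts on the calling thread until None is received."""
    spoken = []
    while True:
        text = requests.get()  # blocks until a worker sends something
        if text is None:       # None is the stop signal
            break
        engine.say(text)
        engine.runAndWait()
        spoken.append(text)
    return spoken


def worker(requests):
    """Example worker: asks the main thread to speak, then signals the end."""
    requests.put("录音结束, 识别中")
    requests.put(None)


# Usage (assumes pyttsx3 is installed; the engine is created on the main thread):
#   import pyttsx3
#   requests = queue.Queue()
#   threading.Thread(target=worker, args=(requests,)).start()
#   speech_loop(pyttsx3.init(), requests)
```

The worker never touches the engine, so all pyttsx3 calls stay on the thread that created it.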

Remarks:
If a timeout error is returned even though the network connection is working, try a different id and key.
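Switching to another id and key can be sketched as a loop over several credential sets. This sketch assumes that a failed request shows up in the returned dictionary (an 'err_msg' other than 'success.') rather than as an exception; the function name recognize_with_fallback and the recognize callback are hypothetical.

```python
def recognize_with_fallback(credential_sets, recognize):
    """Try each (app_id, api_key, secret_key) in order.

    Returns the first response whose 'err_msg' is 'success.',
    otherwise the last response received (or None if the list is empty).
    """
    last = None
    for app_id, api_key, secret_key in credential_sets:
        last = recognize(app_id, api_key, secret_key)
        if last.get('err_msg') == 'success.':
            return last
    return last


# Usage with the client from this project (pcm_file is the converted recording):
#   def recognize(app_id, api_key, secret_key):
#       client = AipSpeech(app_id, api_key, secret_key)
#       return client.asr(get_file_content(pcm_file), 'pcm', 16000, {'dev_pid': 1537})
#   result = recognize_with_fallback([(APP_ID, API_KEY, SECRET_KEY)], recognize)
```

Passing the recognizer in as a callback keeps the retry logic independent of the SDK, so it can be exercised with a stub.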

The project is on GitHub:

/MengRe/speech_commmunication/tree/master
