Speech Recognition (ASR) Notes

Automatic Speech Recognition

python_speech_features
https://github.com/jameslyons/python_speech_features

Speech Signal Processing (4): Mel-Frequency Cepstral Coefficients (MFCC)
http://blog.csdn.net/zouxy09/article/details/9156785/

A detailed walkthrough of MFCC speech feature extraction
https://my.oschina.net/jamesju/blog/193343
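
MFCC extraction relies on filters spaced evenly on the mel scale. The sketch below shows only that one step, using the common 2595 * log10(1 + f/700) form of the mel formula. The function names are hypothetical, and the linked articles or python_speech_features may use slightly different constants or edge handling.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filter_centers(num_filters, low_hz, high_hz):
    """Center frequencies (Hz) of filters spaced evenly on the mel scale.

    The two band edges (low_hz, high_hz) are excluded from the result.
    """
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_filters)]
```

For example, `mel_filter_centers(26, 0, 8000)` gives 26 center frequencies for 16 kHz audio; they are packed closely at low frequencies and spread out at high frequencies.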

Baidu's new speech recognition technology
http://blog.csdn.net/starzhou/article/details/53319472

Computing a spectrogram in Python
http://friskit.me/2015/01/29/spectrum-in-python/
Computing a spectrogram in MATLAB
http://blog.csdn.net/ziyuzhao123/article/details/11964459
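
A spectrogram is a sequence of short-time magnitude spectra. The sketch below is a minimal pure-Python version, assuming a Hamming window and a naive DFT so that it has no dependencies. It is not the code from either linked post, and the function name and default frame sizes are illustrative; real code would use an FFT library.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return one magnitude spectrum (bins 0..frame_len//2) per frame."""
    # Hamming window to reduce spectral leakage at the frame edges.
    window = [
        0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Naive O(N^2) DFT; fine for a sketch, too slow for real audio.
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames
```

Bin `k` corresponds to the frequency `k * sample_rate / frame_len`, so with 8 kHz audio and 64-sample frames a 1 kHz tone falls in bin 8.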

Unisound (云知声) speech synthesis; AISpeech (思必驰)
http://dev.hivoice.cn/exp_center/tts/tts.jsp

Tsinghua University Chinese speech data
http://data.cslt.org/thchs30/standalone.html
http://www.cslt.org/resources.php?Public%20data

Speech synthesis: WaveNet / Deep Voice / Tacotron
https://www.leiphone.com/news/201703/P1OEbKjpB0pHvHDA.html

Isolated-word corpora: TI 46-Word, NOISEX-92 (paid)
http://www.chineseldc.org/resource_list.php?begin=0&count=20

Speech Commands dataset
https://www.tensorflow.org/versions/master/tutorials/audio_recognition
http://chinagdg.org/2017/08/launching-the-speech-commands-dataset/

For Chinese, THCHS-30 seems to be the only open-source corpus at the time of writing
http://www.openslr.org/18/

VAD (Voice Activity Detection)
https://github.com/shiweixingcn/vad/blob/master/vad_baidu/wb_vad.c
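
To make the idea of VAD concrete, here is a minimal energy-threshold sketch. It is not the algorithm in the linked wb_vad.c; it is a much simpler stand-in, and the function names, the 160-sample frame (10 ms at 16 kHz) and the threshold value are all assumptions chosen for illustration.

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def energy_vad(samples, frame_len=160, threshold=1e-3):
    """One boolean per frame: True where frame energy exceeds threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]


def speech_segments(flags):
    """Merge runs of True flags into (start_frame, end_frame_exclusive)."""
    segments = []
    start = None
    for i, active in enumerate(flags):
        if active and start is None:
            start = i
        elif not active and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(flags)))
    return segments
```

A fixed threshold only works for clean recordings with samples scaled to roughly [-1, 1]; noisy audio usually needs an adaptive threshold or a more elaborate detector.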

Converting an MP3 to a WAV file at a given sample rate with avconv
avconv -i 1.mp3 -ar 16000 -y 1.wav
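
When converting many files it can help to build the same command from Python and then confirm the output's sample rate. The sketch below assumes avconv is installed and on the PATH; the helper names are hypothetical, and only the standard-library `wave` module is used to read the result.

```python
import wave


def avconv_command(src, dst, sample_rate=16000):
    """Build the argument list for the avconv call shown above."""
    return ["avconv", "-i", src, "-ar", str(sample_rate), "-y", dst]


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a WAV file's header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()
```

The list returned by `avconv_command("1.mp3", "1.wav")` can be passed to `subprocess.run(..., check=True)`, after which `wav_sample_rate("1.wav")` should report 16000.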


Published: 2017-10-12 10:42:18

Original link (please keep when reposting): http://www.multisilicon.com/blog/a24275128.html
