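Abstract: Phoneme segmentation is a major component of speech research and plays an important role in large-vocabulary continuous speech recognition and speech synthesis. This paper takes the Miao language of central Guizhou Province as its object of study and performs feature extraction and phoneme boundary detection on it. The spectral energy of each recording is averaged over low-, mid-, and high-frequency bands, and the abrupt change points in the curve formed by each band's mean values are taken as candidate boundaries. Boundaries narrower than 20 ms are discarded; the remaining boundary points are then sorted and screened once more so that only those wider than 20 ms are kept, which yields the final boundary points. Within a given tolerance, the accuracy reaches 83%.
Keywords: Miao speech; Praat annotation; spectrogram energy; speech segmentation
CLC number: TN912 Document code: A Article ID: 2096-4706(2020)03-0019-03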
Study on the Method of Phoneme Boundary Detection of Miao Language in the Middle of Guizhou Province
LI Xuelin, ZHAO Dongmei, LIANG Mingxiu
(Guizhou Minzu University, Guiyang 550025, China)
Abstract: Phoneme segmentation is a main component of speech research and plays an important role in large-vocabulary continuous speech recognition and speech synthesis. In this paper, the Miao language of central Guizhou Province is taken as the research object, and feature extraction and phoneme boundary detection are carried out on it. The mean spectral energy of each recording is calculated over low-, intermediate-, and high-frequency bands. The abrupt change points of the curve formed by the mean values of each band are taken as boundaries, and boundaries with a width of less than 20 ms are removed. The boundary points are then sorted and screened again so that only those with a width of more than 20 ms remain, giving the final boundary points. Within a certain tolerance, the accuracy reaches 83%.
Keywords: Miao speech; Praat annotation; spectrogram energy; speech segmentation
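The procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the band edges, frame and hop lengths, the change-point rule (a local peak in the absolute frame-to-frame difference above mean plus `k` standard deviations), and the 20 ms scoring tolerance are all hypothetical choices, and every function name is invented for the example. "Width below 20 ms" is read here as two boundaries lying less than 20 ms apart.

```python
import numpy as np

def band_energy_curves(signal, sr, frame_ms=20.0, hop_ms=5.0,
                       band_edges_hz=(0.0, 1000.0, 3000.0, 8000.0)):
    """Per-frame mean log spectral energy in low, mid and high bands.
    Frame length, hop and band edges are assumptions, not from the paper."""
    frame = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    window = np.hanning(frame)
    freqs = np.fft.rfftfreq(frame, d=1.0 / sr)
    n_frames = 1 + (len(signal) - frame) // hop
    curves = np.empty((len(band_edges_hz) - 1, n_frames))
    for i in range(n_frames):
        seg = signal[i * hop:i * hop + frame] * window
        power = np.abs(np.fft.rfft(seg)) ** 2
        for b in range(curves.shape[0]):
            mask = (freqs >= band_edges_hz[b]) & (freqs < band_edges_hz[b + 1])
            curves[b, i] = np.log(power[mask].mean() + 1e-10)
    times = (np.arange(n_frames) * hop + frame / 2) / sr  # frame centres (s)
    return times, curves

def candidate_boundaries(times, curve, k=1.0):
    """Abrupt change points of one band curve (assumed rule: local peaks of
    |difference| that exceed mean + k * std of the differences)."""
    diff = np.abs(np.diff(curve))
    thresh = diff.mean() + k * diff.std()
    idx = [i for i in range(1, len(diff) - 1)
           if diff[i] > thresh and diff[i] >= diff[i - 1] and diff[i] >= diff[i + 1]]
    # Place each boundary midway between the two frames it separates.
    return [float((times[i] + times[i + 1]) / 2) for i in idx]

def enforce_min_gap(points, min_gap=0.020):
    """Sort boundary points and drop any closer than min_gap to the last kept one."""
    kept = []
    for t in sorted(points):
        if not kept or t - kept[-1] >= min_gap:
            kept.append(t)
    return kept

def detect_boundaries(signal, sr, min_gap=0.020):
    """Filter each band's candidates, merge and sort them, then filter again."""
    times, curves = band_energy_curves(signal, sr)
    merged = []
    for curve in curves:
        merged.extend(enforce_min_gap(candidate_boundaries(times, curve), min_gap))
    return enforce_min_gap(merged, min_gap)

def boundary_accuracy(detected, reference, tol=0.020):
    """Fraction of reference boundaries with a detected boundary within tol seconds."""
    if not reference:
        return 0.0
    hits = sum(any(abs(d - r) <= tol for d in detected) for r in reference)
    return hits / len(reference)
```

In use, `detect_boundaries` would be applied to a mono recording loaded as a float array, and `boundary_accuracy` would compare its output against hand-labelled boundary times such as those exported from a Praat annotation. The two-pass filtering mirrors the abstract: one pass per band, then a second pass after the band results are merged and sorted.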
Funding: Guizhou Minzu University university-level research project ([2018]5 773-QN02)
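About the author: LI Xuelin (born February 1990), male, Han nationality, from Meitan, Guizhou; laboratory administrator and experimentalist; master's degree; research interests: risk management and statistical decision-making.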