(1. Department of Criminal Technology, Guizhou Police College, Guiyang, Guizhou 550005; 2. The Education University of Hong Kong, Hong Kong 999077)

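Abstract: To determine the regional origin of a criminal suspect's dialect and thereby provide important clues for solving cases, this study collected 600 speech samples from speakers of different ages and sexes in 6 regions of Guizhou and extracted Mel-frequency cepstral coefficients (MFCC). Principal component analysis and the data compression method proposed in this study were used to reduce the dimensionality of the MFCC, yielding the data set used to train a probabilistic neural network. The probabilistic neural network was then improved and used to build a dialect identification model for the Guizhou region. Simulation results show that the correlation coefficient R between the model's identification results and the actual results is 90%, and that the model can effectively identify dialects of the Guizhou region.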


CLC number: TP391.4         Document code: A        Article ID: 2096-4706(2019)06-0005-05

Identification of Guizhou Dialect Based on Data Compression and
Improved Probabilistic Neural Network
AI Hu¹, LI Fei²
(1. Department of Criminal Technology, Guizhou Police College, Guiyang 550005, China;
2. The Education University of Hong Kong, Hong Kong 999077, China)

Abstract: To determine the regional origin of a suspect's dialect and thereby provide important clues for case investigation, this study collected 600 speech samples from speakers of different ages and sexes in 6 regions of Guizhou and extracted Mel-frequency cepstral coefficients (MFCC) from them. Principal component analysis (PCA) and the data compression method proposed in this study are used to reduce the dimensionality of the MFCC, yielding the data set used to train a probabilistic neural network. The probabilistic neural network is then improved and used to construct an identification model for Guizhou dialects. The simulation results show that the correlation coefficient R between the model's identification results and the actual results is 90%, indicating that the model can effectively identify the dialects of Guizhou.

Keywords: Chinese dialect identification; Mel-frequency cepstral coefficients; principal component analysis; probabilistic neural network
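
For orientation, the pipeline summarized in the abstract (feature vectors, dimensionality reduction, probabilistic neural network classification) can be illustrated with a minimal sketch. It rests on several assumptions: it uses textbook PCA and a standard Gaussian-kernel probabilistic neural network, not the data compression method or the improved network proposed in this study; the data are synthetic random clusters standing in for MFCC-derived features; and the names `pca_fit`, `pca_transform`, `pnn_predict`, `run_demo`, the 39-dimensional feature size, the 5 retained components, and the smoothing parameter `sigma` are all hypothetical choices made for illustration.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the feature mean and the top principal axes of X (rows = samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    """Project samples onto the retained principal axes."""
    return (X - mean) @ components.T


def pnn_predict(X_train, y_train, X_test, sigma=1.0):
    """Standard PNN: average a Gaussian kernel per class, pick the largest."""
    classes = np.unique(y_train)
    # Pairwise squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    # Summation layer: mean kernel response over each class's training patterns.
    scores = np.stack(
        [kernel[:, y_train == c].mean(axis=1) for c in classes], axis=1
    )
    return classes[scores.argmax(axis=1)]


def run_demo(seed=0):
    """Synthetic stand-in: 6 classes x 100 samples of 39-dim feature vectors."""
    rng = np.random.default_rng(seed)
    n_classes, per_class, dim = 6, 100, 39  # assumed sizes, not from the study
    centers = rng.normal(0.0, 5.0, size=(n_classes, dim))
    y = np.repeat(np.arange(n_classes), per_class)
    X = centers[y] + rng.normal(0.0, 1.0, size=(y.size, dim))

    # Hold out 20% of the samples for testing.
    order = rng.permutation(y.size)
    test_idx, train_idx = order[:120], order[120:]

    # Fit PCA on the training split only, then reduce both splits.
    mean, comps = pca_fit(X[train_idx], n_components=5)
    Z_train = pca_transform(X[train_idx], mean, comps)
    Z_test = pca_transform(X[test_idx], mean, comps)

    preds = pnn_predict(Z_train, y[train_idx], Z_test, sigma=1.0)
    accuracy = float((preds == y[test_idx]).mean())
    return accuracy, preds


if __name__ == "__main__":
    acc, _ = run_demo()
    print(f"held-out accuracy on synthetic data: {acc:.3f}")
```

The accuracy printed by this sketch reflects only how well separated the synthetic clusters are and says nothing about the results reported in this study. With real speech, the rows of `X` would be per-utterance feature vectors derived from MFCC, and `sigma` would need to be tuned.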


