The Effect of Background Data Duration on Speaker Verification Performance

Cemal HANİLÇİ, Figen ERTAŞ

Abstract


Gaussian mixture models with a universal background model (GMM-UBM) and vector quantization with a universal background model (VQ-UBM) are two well-known classifiers used for speaker verification. Generally, the UBM is trained on many hours of speech from a large pool of different speakers. In this study, we analyze how the duration of the data used to train the UBM affects text-independent speaker verification performance with the GMM-UBM and VQ-UBM modeling techniques. Experiments carried out on the NIST 2002 speaker recognition evaluation (SRE) corpus show that the duration of the background data used to train the UBM has a small impact on recognition performance for both the GMM-UBM and VQ-UBM classifiers.

Keywords


Speaker verification, Gaussian mixture model, vector quantization, universal background model

Full Text:

PDF (Turkish)



The article was received on 1 November 2012, revised on 20 December 2012, and accepted on 21 December 2012.




This work is licensed under a Creative Commons Attribution 4.0 License.