№1, 2013

AZ-SRDAT – A SPEECH DATABASE FOR AZERBAIJANI LANGUAGE

Yadigar N. Imamverdiyev, Lyudmila V. Sukhostat

This paper describes AZ-SRDat (AZerbaijani language Speaker Recognition DATa), a speech database for speaker recognition. The database contains utterances in the Azerbaijani language produced by 86 speakers over two sessions. The main purpose of AZ-SRDat is to provide data for evaluating different speaker recognition methods. (pp. 67–73)

Keywords: speech database, speech database for Azerbaijani language, speaker recognition, International Phonetic Alphabet